Hello and welcome to my first blog post on the subject of Data Science Examples. This blog is intended to show simple data science examples using different technologies. The examples are things I have dabbled with in order to learn how to solve problems with different IT technologies. I’ll reference my sources, and hopefully there is a post or two that can help someone out with a question or project they are working on. The examples use technologies I have worked with and/or am interested in. These technologies include Java, Oracle, MongoDB, R, HBase, MapReduce and Spark. I have used Java and Oracle products in my professional career. I have not used MongoDB, R, HBase, MapReduce or Spark professionally, but I have studied them and am very interested in them.
Hi – I’m LeSean
I’m an IT professional living and working in a suburb of St. Paul, Minnesota, USA. I have been in the IT industry for over 20 years now. Over my professional career I have worked for five different large companies, on both operations and development teams. My operations work has included server administration (Novell and Microsoft), desktop engineering and WebSphere administration. My development work has mostly involved Java and Oracle technologies. I graduated with a Bachelor of Science from the University of Minnesota (Twin Cities campus) in December 1996. In May 2015, I graduated with a Master of Science in Software Engineering from St. Thomas University (St. Paul campus).
My “other” interests outside of IT are… Well, not much. 😉 But I am also a sports fan – primarily horse racing, football, baseball and hockey. And then other things that are part of life – family/friends gatherings, reading, movies and so on.
My Start with Data Science
My first two years (eight courses) in graduate school consisted of the general/required courses for the degree. The remaining courses in the degree program were electives. Although the required courses were good, I was questioning a little whether the degree program was worth the investment of time and money for me.
During my second year, Big Data (Hadoop, NoSQL) technologies were emerging in industry. St. Thomas University works with local businesses to update its courses and degree curriculum so they stay current with industry. St. Thomas also had an Academic Partnership with Cloudera, a leading company in data analytics. St. Thomas created new Data Science courses covering Hadoop programming, NoSQL databases and data visualization. The courses were challenging, but very interesting to someone who has primarily worked on Java/Web applications. The Data Science courses gave me experience with these new technologies that I would not have gotten from the projects I was working on at my regular job.
NoSQL and Unstructured Data Interest
I came up with an odd conspiracy theory when I first started using Facebook. The theory basically stated that Facebook was created by a group of women to help men remember birthdays and anniversaries. As I learned more about Facebook, I realized that my initial theory probably was not the case (although I haven’t totally dismissed that theory yet). Being open-minded, I learned that Facebook is a good communication tool for interesting people with exciting lives to share adventures, funny links to stories or video feeds, cute pictures of kids and so on. Although I don’t fall into that demographic, I now definitely see the value in Facebook.
I had a similar feeling when I first heard about NoSQL databases and unstructured data. I thought NoSQL technology was for developers and IT organizations that can’t define requirements or don’t have proper change management controls to deal with changes to their systems. Just grabbing a blob of unstructured data, throwing it into a storage container and figuring out what to do with it later did not seem like a good methodology for IT projects.
Full disclosure: My brain is still wired to a relational database / structured data world, and I still think of IT and data in those terms. Some examples will use Oracle / relational database technology. Oracle still has large market demand, and the problem solving is interesting. Still, after learning some business use cases for NoSQL databases / unstructured data, the world of unstructured data has opened up to me.