Monday, April 21, 2014

Udacity Exploratory Data Analysis Review


Exploratory Data Analysis is the third course released as part of Udacity's new data science focus area, which launched at the beginning of 2014. The course provides an overview of using R to explore data and focuses heavily on using the ggplot2 package to create data visualizations. Although it touches briefly on high-level theory and concepts like summary statistics, data transformation, correlation and linear regression, almost all of the quizzes and homework questions involve creating plots and making observations based on them. This is not necessarily a bad thing, since learning to plot in R is a valuable skill and an important part of exploratory data analysis, but the course should have spent a bit more time on high-level concepts and on numeric methods for exploring data, such as tables and summaries. Despite that quibble, this is a good course with a lot of high-quality, practical content. It moves slowly enough for you to get comfortable with basic plotting syntax before building up to more complex visualizations, but fast enough to keep you engaged.

Be aware that the course mainly uses two data sets to teach the material: a data set of diamond prices and characteristics, and a set of pseudo-Facebook data created by the instructors to mirror real Facebook data, with fields such as friend count, tenure on the site, user age and gender. Your enjoyment of the class will depend, in part, on your interest in the data.

I give the course 4.5 out of 5 stars: Great.