Sunday, February 15, 2015

Udacity - Model Building and Validation Review



Model Building and Validation is an advanced data science course provided by AT&T through the Udacity MOOC platform. The course is listed as "advanced" because it assumes prior knowledge of machine learning, statistics, linear algebra and calculus. Despite the stated prerequisites, math doesn't play a large role, so you should still be able to follow most of the content even if your only preparation is Udacity's intro to machine learning. The course spans 4 lessons that detail the process of extracting value from data through questioning, modeling and validation (QMV). Lesson 1 is a general introduction to the QMV process, and each of the following lessons digs into one component of QMV in more detail. The course somewhat oversells its length: none of the lessons takes more than a few hours, despite the course being listed at an estimated 8 weeks with 6 hours of study per week. Admittedly, I did not do the final project, which involves creating a fraud detection model and could take a significant chunk of time.


Model Building and Validation follows the same formula as other Udacity courses, with each lesson taking the form of a series of short lecture videos interspersed with quizzes. The lecturers are easy to understand and the video quality is generally good, although the videos and course materials have some glitches that need to be ironed out. I won't grade the course too harshly on bugs, since new courses tend to be buggy at the very beginning and the glitches will likely be fixed in the near future.


As for the content itself, the simple idea of framing a data analysis as a tree to track and organize the decisions you make along the way is probably the most useful thing you'll take away from this course. The course also does a good job of getting students to think about some of the high-level decisions that must be made when conducting a data analysis. The content gets rockier when it delves into specifics after lesson 1, particularly in the models lesson. The lectures occasionally dive too quickly into the low-level details of machine learning techniques that students may not have seen before. Additionally, the validation section focuses much more on model evaluation metrics, such as ROC curves, the confusion matrix and the metrics derived from it, than on validation itself.
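
For readers who haven't met these evaluation metrics before, here is a minimal sketch of how the usual derived metrics fall out of a binary confusion matrix. It is my own illustration rather than code from the course; the function names and the toy labels are made up for the example. Recall (the true positive rate) and the false positive rate are the two quantities an ROC curve plots as the classification threshold varies.

```python
def confusion_counts(y_true, y_pred):
    """Count the four confusion-matrix cells for binary labels (1 = positive, 0 = negative)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1  # correctly flagged positive
        elif predicted == 1 and actual == 0:
            fp += 1  # false alarm
        elif predicted == 0 and actual == 0:
            tn += 1  # correctly left alone
        else:
            fn += 1  # missed positive
    return tp, fp, tn, fn


def derived_metrics(tp, fp, tn, fn):
    """Compute common metrics from the four counts, returning 0.0 when a denominator is zero."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,  # true positive rate
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Toy labels invented for illustration only (not course data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
metrics = derived_metrics(tp, fp, tn, fn)
print(tp, fp, tn, fn)
print(metrics)
```

On a heavily imbalanced problem like fraud detection, accuracy alone can look excellent while recall is poor, which is one reason to look at precision and recall separately.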


Model Building and Validation is a good course that provides a nice framework for approaching data analysis, but it gets bogged down in some machine learning specifics that don't add much to the overarching theme.

I give Model Building and Validation 3.5 out of 5 stars: Good.
