Saturday, July 12, 2014

Johns Hopkins Coursera Data Science Specialization Track--Part 3



The first run of the Johns Hopkins Data Science Specialization on Coursera is drawing to a close. The final 3 courses of the 9-course series are just wrapping up, so it’s time for another batch of reviews.



Regression Models

Regression Models is the 7th course in the Johns Hopkins data science specialization track on Coursera. This course is essentially identical to the Statistical Inference course in terms of structure, presentation and quality: the entire course consists of dull, information-packed slides with mediocre voice-overs. It seems like half of the course consists of slides with verbose math expressions in summation notation and the instructor telling you that you don't really need to understand them unless you are interested in the math behind the models. As with other courses in the track, there are no in-lecture quizzes or interactive exercises and there is no instructor face time.

Overall, this is a disappointing course that probably won’t keep your interest long enough for you to bother completing all the videos, much less the quizzes and the project.

A few weeks after this course launched, Johns Hopkins did release an interactive learning package for R called swirl that provides a series of exercises for this course and some of their other Coursera offerings. The exercises in swirl aren't the best around, but they do help you understand the material a bit better than the main lecture content.

I give this course 2 out of 5 stars: Bad.



Practical Machine Learning

Practical Machine Learning is the 8th course in the 9-part data science specialization. It introduces machine learning in R, including the basics of prediction, splitting data into training and testing sets, regression, trees, random forests and boosting, all in the span of 4 weeks. The course focuses on using the caret package in R to apply machine learning algorithms.

Similar to other courses in the data science specialization, the course content is mainly static slides with voice-overs, but thankfully the slides are generally not overly cluttered and the voice-overs are of decent quality. The course has a lot of good information on how to use R to apply common machine learning techniques to data, but you aren't going to gain a deep understanding of how the machine learning methods work. "Practical" in this case means "learn how to use the tool, not how it works." I suspect students coming into this course with no prior knowledge of machine learning will find that the lectures jump from one topic to another too quickly as the course goes on. Taking a course that covers machine learning theory, like the 3-part machine learning series from Udacity, will give you a deeper understanding of the methods introduced in this course.

Practical Machine Learning does a pretty good job of introducing machine learning topics in a limited amount of time, but the coverage is too brief to gain a solid understanding of many of the methods presented. This course would have been much better if it were 8 weeks long and had at least 1 hour of solid lecture content per week with interactive exercises or homework. If you’re looking for an excellent practical machine learning course that spends enough time on each topic and has enough homework to really help students learn, check out MIT's Analytics Edge on edX.

I give this course 3 out of 5 stars: Satisfactory.



Developing Data Products

Developing Data Products is the final course in the 9-part data science specialization. The course introduces several tools you can use to put R code on the web, into slideshows and into R packages, including Shiny, rCharts, googleVis, Slidify and RStudio Presenter. Although the course is listed as 4 weeks, it only has 3 weeks of lecture content, with one week devoted to giving students time to work on the course project. Unlike previous courses in the data science specialization, this course is not taught by a single professor: each of the 3 professors involved in the specialization leads a few lectures.

This course provides a decent overview of some useful tools for integrating R with the web and in presentations, but it covers too many different tools in too short a time without any exercises to help students practice using the tools presented. You'll have to spend a lot of time on your own exploring the tools discussed to really learn how to use them. It's nice to be aware of the kinds of tools that are out there and have some basic information on each one to get started, but in keeping with the theme of the entire data science specialization, coverage is only skin deep.

I give this course 2 out of 5 stars: Bad.
