Thursday, July 19, 2018

edX: HarvardX Data Science: Data Wrangling Review


HarvardX Data Science: Data Wrangling is the 6th course in a 9-part data science professional certificate track offered by Harvard on the edX course platform. The course covers various topics related to data cleaning, including loading data in different formats, organizing data, and dealing with strings and dates. The course does not require any particular background besides basic knowledge of R. Unlike the first four courses in the specialization, Data Wrangling does not have programming assignments. Instead, grading is based on multiple-choice questions that appear under each lecture video.


Data wrangling is among the most important tasks in data science because essentially every data project, whether it be a simple exploratory analysis or a fully-fledged predictive modeling application, requires getting raw data into a clean form suitable for analysis. The course consists of 4 main topic sections: data importing, tidying, string processing, and dates. These sections delve into various subtopics, including reshaping data, joining tables, web scraping, and regular expressions. The lecture videos are well-made and the instructor explains concepts clearly, but the course tends to move from one topic to the next relatively quickly without giving you much opportunity for hands-on practice. The first four courses in the specialization used programming assignments administered through DataCamp to give you practice with the concepts presented in lecture, which greatly increased the value of those courses. This course has no programming assignments at all. It does encourage you to run R in a local environment and experiment as you follow along, but multiple-choice questions are not nearly as helpful as interactive programming assignments.


HarvardX Data Science: Data Wrangling is a fine course for learning about common data cleaning tasks and how to approach them using R, but the lack of programming assignments means you'll get little hands-on practice to consolidate your learning. Even conceptual, theory-heavy courses in computing and data science benefit from programming assignments; in a course on data cleaning, you'd expect the majority of your time to be spent actually cleaning data.


I give HarvardX Data Science: Data Wrangling 2.5 out of 5 stars: Disappointing.
