Thursday, June 15, 2017

edX: Python for Data Science Review


Python for Data Science is the first course in a new data science MicroMasters program offered by UC San Diego on edX. The course covers the basics of Python for data-related tasks at a pace suitable for beginners. You do need to know the basics of programming to take the course, but week 2 provides an optional review of Python, so you don't necessarily need to know Python coming in. The course is self-paced, but the course page lists the estimated effort needed to complete it at 8 to 10 hours per week for 10 weeks. This is probably an overestimate if you have some prior experience with data science, but it isn't a short course. Grading is based on periodic engagement checks interspersed throughout the course materials, 6 quizzes, 2 projects and a final exam. The quizzes allow unlimited attempts but the final does not.


Although the course is listed as running 10 weeks, the core lecture content spans 6 weeks. Week 2 is mostly optional background material, while weeks 6, 9 and 10 are devoted to the projects. Main course topics include: the data science process, Jupyter notebooks and numpy, pandas, data visualization, machine learning basics and working with text and databases. Each week has roughly 1.5 to 2 hours of lecture content along with programming notebooks you can download to follow along. The lectures are high quality, both in production value and instruction, and spend adequate time on each topic, so beginners shouldn't feel rushed, as can sometimes be the case with other introductory courses. The course avoids the pitfall of trying to cover too many different algorithms and techniques, instead focusing on the most important fundamentals. Although the engagement checks and quizzes essentially amount to "free points", they provide the sort of positive reinforcement that helps beginners avoid frustration and gain confidence when learning a new skill. The projects involve selecting data sets to work with and using the skills learned in lecture to explore and analyze them. The projects are peer graded, a necessary evil with open-ended MOOC assignments, but they only account for 30% of your final grade, so they won't be make or break in terms of passing the course.


Python for Data Science is a great first course for students new to data science who have a bit of prior experience with programming. A motivated student could probably learn programming while taking this course by going through a few intro lessons on Codecademy or taking the introductory Python programming class on Udacity. If you already have some knowledge of data science, this course might move too slowly for you, but you can always increase the video playback speed or jump around to topics that interest you.


I give Python for Data Science 4.5 out of 5 stars: Great.
