Tuesday, December 6, 2016

Coursera: Introduction to Data Science in Python Review


Introduction to Data Science in Python is a 4-week programming course offered by the University of Michigan on the Coursera MOOC platform that teaches data manipulation with Python. It is the first course in a new 5-part Applied Data Science with Python Specialization. The course assumes you have basic Python programming skills and focuses on teaching the pandas package rather than the base language. Grading is based on one quiz at the end of week one and three programming assignments.


Intro to Data Science in Python follows the same formula as most new Coursera courses, with 4 weeks of content that is all available immediately at the start of the course. Each week contains a mixture of lecture content and readings with an end-of-week assignment or quiz. The lectures themselves explain concepts clearly, but the lecturer tends to cover a lot of different topics relatively quickly, probably too quickly for many students who are new to pandas and data science. This tendency to try to do too much too fast is exemplified by the 4th week, which tries to introduce both probability distributions and statistical hypothesis testing in about 20 minutes of lecture.


If you plan to complete the course, the majority of your time will be spent working on the three programming assignments. Coursera has integrated Jupyter notebooks with its platform, so you can work on and submit the programming assignments in your browser without having Python on your local machine. This is a great feature, since it lets you work on the course from any computer and keeps students on the same page in terms of their programming environments. The assignments themselves focus on using pandas to perform various data manipulations on provided data sets. They achieve the goal of giving students practice working with data in Python, but they often require research outside of the course materials, to the point where you may end up learning more from the official pandas documentation than from the course materials themselves. In addition, unforgiving auto-graders combined with messy data, multi-step data manipulations and sometimes confusing instructions make it easy to get an answer wrong without it being obvious where you went wrong.
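To give a flavor of the kind of multi-step cleanup described above, here is a minimal sketch. The data, the column names and the `clean` helper are all invented for illustration and are not taken from the course assignments; the point is only that each step is a place where a small slip can silently change the final answer.

```python
import pandas as pd

# Hypothetical raw data (not from the course): names carry footnote digits
# and parenthetical notes, and missing values are written as "...".
raw = pd.DataFrame({
    "Country": ["Bolivia (Plurinational State of)", "Switzerland17", "Japan", "Chad"],
    "Continent": ["South America", "Europe", "Asia", "Africa"],
    "Energy Supply": ["336", "1113", "18984", "..."],
})

def clean(df):
    """Return a cleaned copy of the hypothetical raw table."""
    out = df.copy()
    # Step 1: strip parenthetical notes and trailing footnote digits from names.
    out["Country"] = (
        out["Country"]
        .str.replace(r"\s*\(.*\)", "", regex=True)
        .str.replace(r"\d+$", "", regex=True)
        .str.strip()
    )
    # Step 2: convert to numbers; anything unparseable (like "...") becomes NaN.
    out["Energy Supply"] = pd.to_numeric(out["Energy Supply"], errors="coerce")
    return out

cleaned = clean(raw)

# Step 3: aggregate per continent; sum() skips NaN values by default.
totals = cleaned.groupby("Continent")["Energy Supply"].sum()
```

Inspecting `cleaned` after each step, rather than only checking the final `totals`, is one way to narrow down where a wrong answer came from.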


Intro to Data Science in Python is a good course for those who want to build data manipulation skills with the pandas library and don't mind fast-moving lectures. The programming assignments are sometimes tedious and potentially frustrating, but they do give you a lot of practice using pandas.


I give Intro to Data Science in Python 3.5 out of 5 stars: Good.

