Thursday, December 8, 2016

edX: Using Python for Research Review



Using Python for Research is a 4-week programming course offered by Harvard on the edX platform that teaches the basics of using Python for data analysis. The course is geared toward Python novices who have already taken a beginner course on Python, although you could probably get through it without prior knowledge of Python if you are experienced in another language. The first week reviews Python basics, the second covers data-oriented functionality like the numpy and matplotlib libraries, and the final two weeks consist of 6 case studies that introduce various topics like language processing, classification and social network analysis. Grading is based on comprehension questions that appear after every lecture video and 8 programming assignments.


The lecture content in Using Python for Research is high quality and it covers topics at about the right pace for the target audience. It also doesn't skimp on videos as some newer courses do: each week has at least 3 topic sections and most sections have at least 4 lecture videos with comprehension questions following each one. The organization of topics gets a little muddled in the final two weeks because each section is a case study that may or may not be related to the sections before it, but that also means you can skip around to some extent. The course focuses on the nuts and bolts of dealing with data using base Python, numpy and pandas, often building functions to process data from scratch. For instance, the lectures walk through building a basic k-nearest neighbor classifier.
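To give a flavor of what "from scratch" means here, below is a minimal sketch of a k-nearest neighbor classifier built with base Python and numpy. This is my own illustration, not the course's code: the function names (`distance`, `majority_vote`, `knn_predict`) and the toy data are assumptions made up for the example.

```python
from collections import Counter

import numpy as np


def distance(p1, p2):
    """Euclidean distance between two points."""
    diff = np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float)
    return np.sqrt(np.sum(diff ** 2))


def majority_vote(votes):
    """Return the most common label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


def knn_predict(p, points, outcomes, k=5):
    """Predict the class of point p from its k nearest labeled points."""
    distances = np.array([distance(p, q) for q in points])
    nearest = np.argsort(distances)[:k]  # indices of the k closest points
    return majority_vote(outcomes[nearest].tolist())


# Hypothetical toy data: two well-separated clusters with labels 0 and 1.
points = np.array([[1, 1], [1, 2], [2, 1], [6, 6], [6, 7], [7, 6]])
outcomes = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(np.array([1.5, 1.5]), points, outcomes, k=3))
print(knn_predict(np.array([6.5, 6.5]), points, outcomes, k=3))
```

The whole thing is a distance function, a vote, and a sort, which is roughly the level of building block the course has you assemble yourself rather than pulling from a library.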


Programming assignments are administered through DataCamp, a third-party, web-based data science learning platform. DataCamp has its own interactive programming environment, so you can do all of the assignments in your browser without downloading anything. Each assignment is broken up into a series of short tasks that amount to implementing small functions and operations based on a paragraph of instructions in the sidebar. The hardest part of the assignments is understanding what the instructions are asking and getting your code to return the exact result the grader expects. If you try to get fancy or do things your own way, you're likely to run into errors. If you get stuck, you can click a button to view hints or the full answer. This is a nice feature that keeps you from wasting too much time tracking down errors. It can be annoying, however, not to have all of your code for the assignment in a single document, as it makes it harder to get a big-picture understanding of the code. DataCamp works well for relatively simple, rote exercises like those in the beginning of the course, but its shortcomings become increasingly noticeable as the assignments get longer and more complex.

Using Python for Research is a good introduction to data analysis in Python that moves at the right pace for Python novices. Experienced programmers who don't know Python but want to learn it for its data crunching prowess could also benefit from this course.


I give Using Python for Research 4.25 out of 5 stars: Very Good.
