Friday, June 8, 2018

edX: HarvardX Data Science: Inference and Modeling Review


Data Science: Inference and Modeling is the 4th course in the 9-course data science certificate program offered by Harvard on the edX platform. The course spans 7 sections that cover topics related to statistical inference, including polling, confidence intervals and Bayes' Rule, using election polling and results of the 2016 election as a main motivating example. The course builds on topics and programming constructs covered in previous courses in the track, so you should either take the lead-up courses or have a basic understanding of probability and familiarity with R, including the dplyr and ggplot2 packages, before taking this course. Grading is based on 8 programming assignments administered through DataCamp, an external learning platform with a robust interactive programming environment.


Data Science: Inference and Modeling follows the same pattern as previous courses in the Harvard Data Science track, separating topics into relatively small and manageable sections consisting of a few short lecture videos followed by programming exercises that let you get hands-on practice with the concepts. Despite the course having 7 sections, the listed time to completion on the course home page is 4 weeks, so you're expected to complete about 2 sections per week. The video lectures are high quality and the instruction is generally clear and concise, although certain topics could be explained in greater detail. The course touches on many important concepts like p-values, the t-distribution, Bayesian statistics and chi-squared tests, but many of these topics are given only a few minutes of attention in the lectures.


If you plan to complete the course, you will likely spend the great majority of your time on the programming assignments. DataCamp has a nice interface that puts everything you need to complete each exercise in one interactive window. Many of the exercises involve several steps that each need to be completed precisely in order to arrive at a correct result; the instructions generally do a good job specifying the steps you need to take, but there are occasional issues with problem instructions and grading that may cause you to get wrong answers. You will occasionally have to use formulas that are not shown in the instructions, which will likely leave you sifting through lecture videos to find them. Thankfully, you can always look at hints or full solutions if you get stuck, although doing so will cost you some points.


Data Science: Inference and Modeling provides a good introduction to a variety of topics in statistics, reinforced by substantial programming exercises with interesting motivating examples from the 2016 election, but many topics deserve more coverage than they get.


I give Data Science: Inference and Modeling 3.5 out of 5 stars: Good.



