Tuesday, April 24, 2018

Kaggle Learn: Pandas Level 1 Review



Kaggle has recently started developing and promoting educational materials to help you jump-start your data science skills. Kaggle's new "Learn" section already offers 6 tracks that aim to teach data science skills such as Python, R, SQL and deep learning using a hands-on approach: each course is delivered via a series of web-based programming notebooks that you can fork to modify and experiment with at your leisure. Kaggle Learn's Level 1 Pandas course is a series of 7 notebooks aimed at teaching the basics of data manipulation with Python's Pandas library. As non-traditional online courses, the Kaggle Learn offerings don't appear to offer any sort of tracked grading system or certificates, but each section of the Pandas series does include several programming exercises to work on.


The Level 1 Pandas course focuses on data manipulation tasks using dataframes and series, special data structures provided by Pandas whose data-science-oriented functionality makes them more convenient to work with than standard Python data structures like lists and dictionaries. Main course topics include: reading and writing data, indexing, summary functions, grouping and sorting, data types, missing data, combining data and method chaining. Each section actually consists of 2 programming notebooks: a workbook containing a short introduction, resource links and setup code followed by a series of programming exercises, and a reference notebook that contains most of the actual tutorial information you'll need to read through to complete the exercises. This two-notebook system is a little confusing at first, and you'll ideally have either 2 monitors or one screen with a lot of real estate so that you can keep both notebooks open at the same time and refer to the tutorial while working on the exercises.
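
To give a flavor of the topics listed above, here is a minimal sketch that touches each one on a tiny made-up table. The data, column names and variable names are my own placeholders for illustration and are not taken from the course notebooks.

```python
import io

import pandas as pd

# Hypothetical toy data (not from the course); the first row has no price.
csv_text = """country,points,price
Italy,87,
France,90,20.0
Italy,92,35.0
US,88,15.0
"""

# Reading data: read_csv accepts a file path or any file-like object.
reviews = pd.read_csv(io.StringIO(csv_text))

# Indexing: label-based with loc, position-based with iloc.
first_country = reviews.loc[0, "country"]
first_two_rows = reviews.iloc[:2]

# Summary functions and data types.
mean_points = reviews["points"].mean()
price_dtype = reviews["price"].dtype

# Missing data: count missing prices, then fill them with the median price.
n_missing = reviews["price"].isna().sum()
filled = reviews.assign(price=reviews["price"].fillna(reviews["price"].median()))

# Grouping, sorting and method chaining in a single expression.
best_by_country = (
    filled.groupby("country")["points"]
    .max()
    .sort_values(ascending=False)
)

# Combining data: stack two dataframes vertically.
combined = pd.concat([reviews, filled], ignore_index=True)

# Writing data: to_csv returns a string when no path is given.
csv_out = best_by_country.to_csv()
```

Each step is a one-liner that would take noticeably more code with plain lists and dictionaries, which is the convenience the course is trying to teach.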


The tutorials themselves are concise, with each one only tackling a handful of new concepts and functions. The information presented is generally clear, although it might take a little while for you to get used to the presentation style and flow if you haven't used programming notebooks much in the past. There is no video content in the course and since the tutorials are relatively short, you'll probably spend the lion's share of your time working through the exercises. The exercise instructions are short and there is no hint system or option to view solutions, but the course does offer commands to check your solution so you'll know when you hit on the expected result.


Kaggle's Level 1 Pandas learning track is a good introduction to data manipulation in Python, presented in a format that is more concise than standard lecture-style MOOCs and lends itself well to hands-on exercises. Splitting each lesson into two separate notebooks makes it easier to refer to the tutorial while working on the exercises if you have the screen real estate to accommodate them both at once, but the pacing might feel better if everything were contained in one notebook with exercises appearing as new concepts are introduced. As a course geared toward beginners, it would be nice to have the option to view exercise solutions, especially considering the instructions aren't always as precise as they could be and the answer checkers don't provide much feedback on where things went wrong.


I give Kaggle Learn's Level 1 Pandas series 3.5 out of 5 stars: Good.
