Monday, April 22, 2019

2019 Kaggle Career Con Day 3 Recap



The third and final day of the second annual data science career convention on Kaggle.com concluded today. The following is a recap and summary of the sessions and my main takeaways and criticisms. Note that I was not able to attend all of day 3 live, so certain parts I had to skim over before the live video feed went dark.



Session #1: Coding Workshop Part III: How to deploy your API on Heroku or Google Cloud.

This was the final part of the three-part live coding series on deploying a Python ML model as a working API in the cloud. I was not able to attend this session, but these coding sessions didn't have much to do with the job search process.


Session #2: A Behind the Scenes Look at Data Science Interviews with a Hiring Manager

This session was a presentation given by William Chen, a data science manager at Quora. It was full of information-packed slides, and unfortunately the videos and slides are not available now, so I'll just list the few key points that I managed to commit to my notes.

His general tips for data science: 1. Start with something simple. (Did you try logistic regression?) 2. Always try to think quantitatively and be prepared to justify statements, recommendations and conclusions based on data. 3. Focus on the why. 4. Lead the discussion; proactively communicate.

For data science interviews and technical skills, take-home challenges are becoming increasingly common. Testing of SQL knowledge is also common, especially for data analyst/analytics roles, and he feels SQL-related questions are among the easiest to study for because there aren't that many different design patterns in SQL. It is also valuable to know about experiment design principles for data analytics positions.

This was a good session with a ton of information--too much to really capture it all on a single viewing--so it is a shame Kaggle decided not to make the content available right away after the fact.


Session #3: Real Stories from a Panel of Successful Career Switchers

I was not able to catch this live. From my review of the video, it mostly focused on the backstories of the career switchers. The upshot is that you can come into data science from a diverse range of fields and different roles within data science require a diverse range of skills. For those getting started, how you think is often more important than what you know.


Session #4: So You Want to Become a Data Scientist? A Crash Course for Non-Engineers

This session was a lecture and Q&A given by Gary Kazantsev, Head of Machine Learning Engineering at Bloomberg. He believes there are three questions to ask yourself when figuring out what type of data science role suits you:

How much domain expertise do you have?
How much software engineering skill do you have?
How much math background do you have?

If you are from a non-engineering background, your role is probably going to be a jack-of-all-trades. Basically, he feels data scientists are divided into two groups: those who are expected to build production applications (ML engineers) versus those who are not (data scientists).

To be a data scientist you should be a scientist first. This means you should be familiar with the scientific method and know the fundamentals of statistics, such as hypothesis testing, statistical significance, what kind of metrics you can compute, how many measurements are needed to get a certain confidence interval, etc.
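To make the "how many measurements" point concrete, here is a rough sketch of my own (not something shown in the session). It uses the textbook normal-approximation formula n = (z * sigma / E)^2 for estimating a mean, and it assumes you already have a guess for the standard deviation. The function name `required_sample_size` and its parameters are just names I made up for illustration.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n so that a z-based confidence interval for a mean
    has a half-width of at most `margin`.

    Assumes a known (or pre-estimated) standard deviation `sigma`
    and a sample large enough for the normal approximation to hold.
    """
    if sigma <= 0 or margin <= 0:
        raise ValueError("sigma and margin must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up because a fractional measurement is not possible.
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Hypothetical numbers: standard deviation of 15, target half-width of 2.
    print(required_sample_size(sigma=15, margin=2))
```

Note that halving the margin roughly quadruples the number of measurements, since n grows with the square of 1/E.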



Session #5: Digital Career & Education Fair (on CareerCon's Slack)

This session provided an opportunity to reach out to hiring managers at various companies participating in Career Con on the CareerCon Slack channel. It did not feel particularly productive because it mainly took the form of users spamming chat channels with their information/resumes, while hiring managers generally just referred users to apply to open positions via hyperlinks that were available on the Career Con web page throughout the entire week.



Session #6: What are Hiring Managers Really Looking For?

The final session was an interview with Ruben Kogel, Head of Data Analytics at Lime. This interview was much more focused on data analytics than on engineering and machine learning, so his advice is probably more applicable to those looking for data analyst roles than data scientist/ML engineer roles.

According to him, students tend to fixate too much on advanced statistical/ML techniques, when in reality you can solve 80-90% of problems with simple counts and statistics.
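As a rough illustration of what "simple counts and statistics" can look like in practice, here is a small sketch of my own using only the Python standard library. The records, group labels and the `summarize` helper are all made up for the example and were not part of the interview.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical (group, value) records; the numbers are invented for illustration.
records = [("A", 12), ("A", 7), ("B", 30), ("A", 9), ("B", 14)]


def summarize(rows):
    """Return the count, mean and median of the values in each group."""
    counts = Counter(group for group, _ in rows)
    values_by_group = {}
    for group, value in rows:
        values_by_group.setdefault(group, []).append(value)
    return {
        group: {
            "count": counts[group],
            "mean": mean(values),
            "median": median(values),
        }
        for group, values in values_by_group.items()
    }


if __name__ == "__main__":
    for group, stats in summarize(records).items():
        print(group, stats)
```

A per-group count, mean and median like this is often enough to spot the obvious differences before reaching for anything more advanced.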

What do data analysts need to succeed?

SQL + knowledge of a data manipulation language. Basic statistical techniques and the scientific method along with high attention to detail and product/business sense. Causal inference, experimental design and basic regression as well as the ability to determine what's causing things you are seeing and the relative importance of those driving factors.

Soft skills such as communication, persistence, self-motivation and the ability to be an objective investigator are very important for data analysts. It is important to communicate well, establish relationships and provide useful recommendations to build credibility.

The interview process usually involves an initial phone screen and a technical screen to ensure that you meet some minimum bar of technical skills, followed by in-person interviews to assess everything else.

If you don't have a lot of experience, it is generally better to go to a big company that has other data workers you can learn from and a support system in place to get junior data workers up to speed. Startups generally can't afford to bring in very junior people. Ultimately, getting experience is the most important thing.

