Wednesday, January 23, 2019
Thoughts on the Kaggle Analytics Competition Format
Last week marked the end of the NFL Punt Analytics Competition, which was the first "analytics" competition hosted by the popular machine learning prediction competition website Kaggle.com. Unlike the typical Kaggle competition that pits data scientists against one another to optimize an objective scoring metric on a well-defined predictive modeling task, the NFL competition required users to create data analysis reports and presentations with rule change suggestions to reduce concussions on punts. The competition was of particular interest to me, because a judged competition opens the door for those with a mix of data science skills, creativity, domain knowledge and storytelling ability to compete with the teams of supercomputer-wielding PhDs that typically dominate prediction competitions.
In my last post, I transcribed my submission to the competition, in which I focused on creating a unique rule, the "Return Safe Zone", to differentiate my project from the rest of the pack. My logic was that since the competition had four winners, the judges might start by picking a couple of the most impressive in-depth analyses first even if they had "boring" rule suggestions, but that they might choose at least one winner with an outside-the-box solution.
When the winners were announced, I was a bit surprised because some of the best projects I had seen were overlooked and a couple of the winners had somewhat lackluster reports. There also weren't any particularly interesting rule suggestions: the two nicest winning kernels both ultimately suggested fair catch bonus yardage as the key way to reduce concussions. It struck me that since the competition consisted of both a data analysis kernel and a presentation, the judges may have put more weight on the summary slides than on the data analysis itself. That would actually make a lot of sense: flipping through a summary is a lot quicker than reading an entire report, and the presentations are ultimately what general audiences will see during the presentation event before the Super Bowl. Without an objective scoring metric or feedback from the judges, it is difficult to know exactly why certain projects were picked over others. The end result seemed like a rather disappointing finish to the competition for many of the competitors. It never feels good to lose, but it feels worse to lose without any real explanation for why you lost.
For my part, despite not being selected as a winner, I enjoyed the competition more than any of the prediction competitions I've done because it allowed for more creativity and learning. Although some users decried the relatively low public activity in the competition (most competitors kept their projects hidden until just before the deadline), for me it was refreshing to have a competition without a thousand entrants just copying and pasting someone else's work. Still, there were some hiccups along the way, and a few things I think could be improved for future analytics competitions.
Kaggle should strive to provide as much information about the judging process and judging criteria as possible. Without an objective scoring metric and a leaderboard that gives you a sense of where you stand before a competition ends, judged competitions are much more prone to leaving users disappointed when winners are announced than prediction competitions. Some disappointment is inevitable, but clearly stating who the judges are, how the judging will take place and what parts of the projects are being weighed most heavily could alleviate some of this feeling.
In the NFL competition, the judges were unknown. When a user asked who was evaluating the submissions, Kaggle gave an unhelpful response: "The audience is composed of people that include data scientists, sports officials, business people, and every NFL fan in the world (no pressure)." This tells us nothing about who actually looked through the projects and chose the winners. I can understand not wanting to name names, but saying something at least somewhat specific like: "one Kaggle employee and one NFL employee will look at each presentation to narrow the field down to 10 candidates, and then read each of those 10 kernels in their entirety to choose the 4 winners" would have alleviated many of the problems with this competition.
It would be helpful to give users who do not win some sort of feedback on their submissions. This could be as simple as one sentence stating what was good about the project and one sentence stating what was lacking and/or why the project was not picked. Admittedly it takes a lot of time to go through an entire project--perhaps too much time for judges to read through every single project start to finish--but you don't need to read through an entire project to get a sense of what it is about and come up with one or two things to say about it.
I also think it was a mistake to have a project with multiple parts, some of which were public and some of which were not. A project with multiple components automatically creates problems: users don't know where to focus their time, and having certain parts hidden from the public means users can't see exactly what the judges saw to arrive at their decisions. Making all project components public at the end of the competition could alleviate this issue.
I hope Kaggle continues to iterate on the analytics competition format because I think it has the potential to breathe new life into the platform, attracting a broader range of users beyond those interested primarily in predictive modeling. To do that, it needs to leave users feeling like they produced something worthwhile even if they didn't win and to give them a sense of where they could improve.