Data and Analytics: Competing to learn vs. competing to win
My daughter recently started competing in cross-country for her school. It is a very different sport from those I competed in - football and baseball - and is also distinct from the sports my other children have participated in, like volleyball and basketball. While cross-country is very much a team sport, you really compete only against yourself and the course. So far, my daughter hasn't won any races, and right now just aims to finish in the top half, but she gets better and stronger, and strives to set a personal record, with every race. And in every race, she understands more about how to judge a hill or navigate a creek bed. Yanni is learning how to push herself as a competitor, even when her lungs and legs yell at her to just walk.
Competing to learn
As I watched her compete at a big cross-country meet the other day, it occurred to me that we're doing something similar with the Cortana Intelligence Competitions. Yes, there is one winner, with a second and third place as well, but for the majority of people who enter, that isn't what these competitions are about. They are about learning how to become better at thinking about data and the predictive scientific method.
When you start the competition, just going through the process of getting a submission accepted seems like a big deal, similar to the feeling my daughter had of just finishing her first race. If you haven't done something before, it can seem scary or daunting. But once you get through the first experience, and it's not as bad as you thought it would be, your anxiety lessens, and may even vanish. For the Cortana Intelligence Competitions, there is a helpful FAQ that answers the most common questions and helps you get started. Here are the basics:
- Yes, it is free
- No, you don't have to be a data scientist to enter
- Yes, there are cash prizes for winners
- Most competitions are open for 8-12 weeks
Improving your performance
If you're at all like me, when you get your first score back, regardless of whether it's in the bottom 10% or the top 10%, you'll wonder how you can improve. I always think I can do better. One thing you can look at immediately is which features in your initial model seem to carry the most weight, and the "Permutation Feature Importance" module can help you determine this. You should also review your data set and use "Edit Metadata" modules to flag columns that should not be treated as features. Then, train two machine learning models on the same data and compare their scores against each other.
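To make the idea behind permutation feature importance more concrete, here is a minimal sketch in plain Python. It is not the Cortana Intelligence module itself, only an illustration of the general technique: shuffle one column at a time and see how much the model's error grows. The toy data, the two stand-in "models" (`model_a`, `model_b`), and every function name are assumptions made up for this example.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """For each column, return the average increase in error after shuffling it."""
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only this column, leaving the other columns untouched.
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            error = mean_squared_error(targets, [predict(r) for r in permuted])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical toy data: the target depends on column 0 only; column 1 is noise.
data_rng = random.Random(42)
rows = [[data_rng.uniform(0, 10), data_rng.uniform(0, 10)] for _ in range(200)]
targets = [3.0 * r[0] + data_rng.gauss(0, 0.5) for r in rows]


def model_a(row):
    """Stand-in for a trained model that has learned the real relationship."""
    return 3.0 * row[0]


def model_b(row):
    """Stand-in for a naive model that always predicts the same value."""
    return 15.0


importances = permutation_importance(model_a, rows, targets)
mse_a = mean_squared_error(targets, [model_a(r) for r in rows])
mse_b = mean_squared_error(targets, [model_b(r) for r in rows])

print("Importance per column:", importances)
print("Model A error:", mse_a)
print("Model B error:", mse_b)
```

A column whose shuffling barely changes the error is a candidate for dropping from your feature set, and the side-by-side error values are the same kind of comparison you make when you run two models against each other.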
I encourage you to participate in these competitions as a way to learn. While it's nice to win, simply entering and competing will build skills that can help your customers - and you - achieve more. And I think that's definitely a win.
Data science resources
- Introduction to data science and machine learning for partners
- Microsoft Professional Degree Program - Data Science Track
- Data Science Curriculum