CCNY has been working on giving you our predictions for this weekend’s Oscars. This, we thought, would be a fun data set to put into a magic predictive statistical whatcha-ma-callit.
After looking at our results, the best we could do was give the rest of the field a 50-52% chance of winning their category, with the most likely winner receiving a slightly better 61-68% chance.
This wasn’t making our data geeks happy. Why can’t we give better predictions? What is it about this data set that makes it so hard to give you accurate predictions with better odds?
The Oscars are so hard to predict because the sample changes every year: not only are the nominees different, but the composition of the Academy (the voters) is different as well. When we use historical data, along with data from other awards shows' voting bodies, narrowing the information down to a precise and accurate front-runner becomes almost impossible. To get the right sample, we would have to survey ONLY the members of this year's Academy.
What does this mean for you? This means SAMPLES and data sets are important. When you are thinking of taking a sample to make predictions, it is essential to ask for the right information from the right sources.
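For the data geeks in the audience, here is a minimal sketch of that sampling problem in Python. Everything in it is made up for illustration: the two voter pools, the 65% and 50% support levels, and the `estimate_support` helper are assumptions, not our actual Oscars data or model. It simply shows that a survey of the wrong voters tends to land near the wrong answer, no matter how carefully the survey is run.

```python
import random


def estimate_support(population, sample_size, rng):
    """Survey `sample_size` voters at random and return the share backing nominee A."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


# Fixed seed so the sketch gives the same numbers on every run.
rng = random.Random(0)

# Hypothetical voter pools: 1 = votes for nominee A, 0 = votes for someone else.
this_years_academy = [1] * 650 + [0] * 350  # assumed: 65% back nominee A
past_voting_bodies = [1] * 500 + [0] * 500  # assumed: 50% backed a similar nominee

right_sample = estimate_support(this_years_academy, 200, rng)
wrong_sample = estimate_support(past_voting_bodies, 200, rng)

print(f"Surveying this year's Academy: {right_sample:.0%} for nominee A")
print(f"Surveying past voting bodies:  {wrong_sample:.0%} for nominee A")
```

Both surveys are the same size and both are drawn fairly, yet only the first one is pointed at the people who actually cast this year's ballots.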
Let’s see how well we do with our predictions for this year for the top acting categories, knowing full well that the data geeks are mad they can’t give you better numbers.