Can predictive analytics reveal the winner of this year’s biggest football competition?

17 May 2018

Football fans from around the world are eagerly awaiting the kick-off of the sport’s biggest competition. But rather than just sit around and wait, Bisnode has tasked its data and analytics experts with developing an algorithm capable of predicting which national team will emerge as the winner on July 15, 2018.

Bisnode Group Analytics has joined forces with local analytics teams to see if data science and machine learning can be used to predict which national team will win football’s biggest competition. Drawing on historical data about national team games over the past four years, Bisnode has developed a model that estimates the probability of a win, draw or loss, as well as the goal difference, for any future game between the national teams involved in the competition.

“Our first objective was to develop a model able to predict the result of a single game, based on the teams’ characteristics,” says Pierre Deville, Head of Data Science and Analytics at Group Analytics. “The second objective was to determine the most likely scenario for the competition and other derived statistics by running large-scale simulations, considering the specifics of the tournament.”

The first model was based on historical data such as the type of tournament, the location of the game and the score. Team ratings were fed into a machine learning model using a technique known as eXtreme Gradient Boosting (XGBoost) to calculate probable outcomes for each game. Using this predictive model, Bisnode Group Analytics then ran simulations of the games scheduled to take place in the tournament.

“By generating millions of simulations, we derived probabilistic information associated to any team reaching any stage of the tournament,” says Goran Loncar, current Director of Group Analytics. “Our approach not only allowed us to estimate the probability of a team reaching a certain stage, but it also provided the most likely scenario for the entire tournament, as well as the overall chance for each team to lift the cup.”
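The simulation step described above can be illustrated with a minimal Python sketch. It is not Bisnode's implementation: the team names, the ratings and the win_probability function are hypothetical placeholders, and a simple logistic curve on the rating gap stands in for the trained match model. The bracket is a plain four-team knockout without a group stage or draws, and the number of simulations is kept small for speed.

```python
import random
from collections import Counter

# Hypothetical team ratings, used only for illustration (not Bisnode's data).
RATINGS = {"Team A": 1900, "Team B": 1800, "Team C": 1700, "Team D": 1600}


def win_probability(team_1, team_2):
    """Stand-in for the trained match model: logistic curve on the rating gap."""
    diff = RATINGS[team_1] - RATINGS[team_2]
    return 1.0 / (1.0 + 10 ** (-diff / 400))


def play(team_1, team_2, rng):
    """Draw one random result for a knockout game and return the winner."""
    return team_1 if rng.random() < win_probability(team_1, team_2) else team_2


def simulate_knockout(teams, rng):
    """Play one full bracket; the number of teams must be a power of two."""
    remaining = list(teams)
    while len(remaining) > 1:
        # Adjacent teams in the list meet each other in the current round.
        remaining = [
            play(remaining[i], remaining[i + 1], rng)
            for i in range(0, len(remaining), 2)
        ]
    return remaining[0]


def title_probabilities(teams, n_simulations=20000, seed=0):
    """Estimate each team's chance of winning as its share of simulated titles."""
    rng = random.Random(seed)
    wins = Counter(simulate_knockout(teams, rng) for _ in range(n_simulations))
    return {team: wins[team] / n_simulations for team in teams}


if __name__ == "__main__":
    results = title_probabilities(list(RATINGS))
    for team, probability in sorted(results.items(), key=lambda kv: -kv[1]):
        print(f"{team}: {probability:.3f}")
```

Counting how often each team also reaches the semi-final or final in the same loop would give the stage-by-stage probabilities mentioned in the quote.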

BISNODE USED MACHINE LEARNING, the subfield of computer science that “gives computers the ability to learn without being explicitly programmed”. Evolved from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning uses algorithms that can learn from and make predictions based on data.

BISNODE USED DATA MINING, an analysis technique that focuses on detecting patterns and trends in data, to feed the predictive model and rank the teams.
