Can predictive analytics reveal the winner of the World Football Championship?

17 May 2018

Football fans from around the world are eagerly awaiting the kick-off of the sport’s biggest competition. But rather than just sit around and wait, Bisnode has tasked its data and analytics experts with developing an algorithm capable of predicting which national team will emerge as the winner on July 15, 2018.

Bisnode Group Analytics has joined forces with local analytics teams to see whether data science and machine learning can predict which national team will win football’s biggest competition. Drawing on all historical data about national team games over the past four years, Bisnode has developed a model that estimates the probability of a win, draw or loss, as well as the goal difference, for any future game between the national teams involved in the competition.

“Our first objective was to develop a model able to predict a win, draw or loss, as well as goal difference for any future game between two national teams based on their characteristics,” says Gauthier Doquire, lead Data Scientist on the project. “The second objective was to determine the most likely scenario for the competition and other derived statistics by running large-scale simulations, considering the specifics of the tournament.”

The first model was based on historical data covering the type of tournament, the location of the game, the score, the ranking of the teams, and so on. These data were fed into a machine learning model built with a technique known as eXtreme Gradient Boosting (XGBoost) to calculate probable outcomes for each game. Using this predictive model, Bisnode Group Analytics then ran simulations of the actual games that will take place.
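To give a feel for the idea behind gradient boosting, here is a minimal sketch in plain Python. It is not Bisnode's model and not XGBoost itself: it uses simple one-split trees ("stumps") and a squared-error loss to predict goal difference, where each new stump is fitted to the errors left by the previous ones. The two features (ranking gap and home advantage) and all match data are invented for illustration.

```python
# Toy gradient boosting for goal difference (squared-error loss, one-split trees).
# Features and data are hypothetical: [ranking_gap, is_home] -> goal difference.

def fit_stump(rows, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for row, r in zip(rows, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return best[1:]  # (feature index, threshold, left value, right value)


def predict_stump(stump, row):
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def fit_boosting(rows, targets, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the remaining errors."""
    base = sum(targets) / len(targets)
    predictions = [base] * len(targets)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(targets, predictions)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, row)
                       for p, row in zip(predictions, rows)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Invented training data: ranking gap in favour of the team, home flag, goal difference.
ROWS = [[30, 1], [12, 1], [5, 0], [0, 0], [-4, 1], [-10, 0], [-25, 0], [18, 0]]
GOAL_DIFF = [3, 2, 1, 0, 0, -1, -2, 1]

model = fit_boosting(ROWS, GOAL_DIFF)
print("favourite at home:", round(predict(model, [30, 1]), 2))
print("underdog away:   ", round(predict(model, [-25, 0]), 2))
```

A production library such as XGBoost adds regularisation, deeper trees and losses suited to win/draw/loss classification, but the core loop of fitting each new tree to the current errors is the same.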

“By generating millions of simulations, we derived probabilistic information associated to any team reaching any stage of the tournament,” says Goran Loncar, Director of Group Analytics. “Our approach not only allowed us to estimate the probability of a team reaching a certain stage, but it also provided the most likely scenario for the entire tournament, as well as the overall chance for each team to lift the cup.” So, how did we do? All predictions will be available online, so you can see for yourself.
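The simulation step described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual Bisnode pipeline: the four teams, their strength ratings and the `win_probability` function are all hypothetical stand-ins for the fitted match model, and the bracket is a simple four-team knockout rather than the real tournament format.

```python
import random
from collections import Counter

# Hypothetical strength ratings; a real pipeline would query the match model instead.
STRENGTH = {"Team A": 1.8, "Team B": 1.4, "Team C": 1.0, "Team D": 0.7}


def win_probability(team, opponent):
    """Stand-in for the match model: chance that `team` wins a knockout game."""
    return STRENGTH[team] / (STRENGTH[team] + STRENGTH[opponent])


def play(team, opponent, rng):
    """Draw one random result for a single knockout game."""
    return team if rng.random() < win_probability(team, opponent) else opponent


def simulate_tournament(rng):
    """One run of a four-team knockout: two semi-finals, then a final."""
    finalist_1 = play("Team A", "Team D", rng)
    finalist_2 = play("Team B", "Team C", rng)
    champion = play(finalist_1, finalist_2, rng)
    return (finalist_1, finalist_2), champion


def estimate(n_runs=20000, seed=2018):
    """Repeat the tournament many times and count how often each stage is reached."""
    rng = random.Random(seed)
    finals, titles = Counter(), Counter()
    for _ in range(n_runs):
        finalists, champion = simulate_tournament(rng)
        finals.update(finalists)
        titles[champion] += 1
    reach_final = {team: finals[team] / n_runs for team in STRENGTH}
    win_title = {team: titles[team] / n_runs for team in STRENGTH}
    return reach_final, win_title


reach_final, win_title = estimate()
for team in STRENGTH:
    print(f"{team}: final {reach_final[team]:.1%}, title {win_title[team]:.1%}")
```

Counting outcomes over many runs turns per-game probabilities into stage-by-stage probabilities for every team, which is the kind of derived statistic the quote refers to.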

BISNODE USED MACHINE LEARNING, the subfield of computer science that “gives computers the ability to learn without being explicitly programmed”. Evolved from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning uses algorithms that can learn from and make predictions based on data.

BISNODE USED DATA MINING, an analysis technique that focuses on detecting patterns and trends in data, to feed the predictive model and rank the teams.

Read more about how we made the predictions and find out who the winner will be.