Comparison of xGoals at #WM2018

In this article we compare expected goals (xGoals for short) as published by the following three Twitter sites:

We will not explain the term xGoals or the different methods, which are described here:

For @GoalCharts we added 0.75 xG for every penalty. This comparison is not a scientific study, since too few data were available for a thorough statistical analysis. Instead, we aim to visualize the xG data in such a way that readers can interpret them for themselves.
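The penalty adjustment is simple arithmetic and can be written down in a few lines. The following is only a minimal sketch, assuming the published value does not yet contain penalties; the function name `add_penalty_xg` and the example numbers are hypothetical and not taken from the data.

```python
PENALTY_XG = 0.75  # xG added per penalty, as stated in the text


def add_penalty_xg(xg_without_penalties, n_penalties, penalty_xg=PENALTY_XG):
    """Return the xG of a team in one game, including its penalties."""
    return xg_without_penalties + n_penalties * penalty_xg


# Hypothetical example: 1.25 xG without penalties, plus one penalty.
print(add_penalty_xg(1.25, 1))
```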

The aim is to compare the xG numbers of the above-mentioned sites, as well as to compare how the teams at #WM2018 performed in terms of goals and xGoals, as shown in the graphic on the right. It shows the goals scored as red dots and the corresponding xGoals of each site as bars.

The teams are ordered by the mean of their xGoals, printed as small numbers next to the team hashtags. The total number of goals (122) over 48 games can be compared to the total xGoals of the three sites. It is immediately apparent that the totals of @Caley_Graphics and @11tegen11 are very close to the actual number of goals, while @StrataBet lies noticeably above.
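For readers who want to reproduce the ordering and the totals with their own numbers, here is a small sketch. The team names, site names and values are made up for illustration and are not the tournament data; sorting in descending order is also an assumption.

```python
from statistics import mean

# Hypothetical xGoal totals per team and site (NOT the real tournament data).
xgoals = {
    "#TeamA": {"site_1": 5.0, "site_2": 4.0, "site_3": 6.0},
    "#TeamB": {"site_1": 2.0, "site_2": 3.0, "site_3": 4.0},
    "#TeamC": {"site_1": 7.0, "site_2": 6.5, "site_3": 9.0},
}


def mean_xgoals(per_site):
    """Mean of a team's xGoals over all sites."""
    return mean(per_site.values())


# Order the teams by their mean xGoals (descending order is an assumption).
ranking = sorted(xgoals, key=lambda team: mean_xgoals(xgoals[team]), reverse=True)

# Sum the xGoals of all teams for each site, to compare with the total goals.
site_totals = {}
for per_site in xgoals.values():
    for site, value in per_site.items():
        site_totals[site] = site_totals.get(site, 0.0) + value

for team in ranking:
    print(f"{team:8s} {mean_xgoals(xgoals[team]):.2f}")
print(site_totals)
```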

The question therefore is: how well do the data describe the real outcome of a game? And might it be possible to decide whether the outcome is more a matter of luck or of skill?


Correlation of xGoals to goals

Let us first see how well the data describe the outcome of a game. We do this with a correlation analysis, under the reasonable assumption that goals depend linearly on xGoals. The outcome of a game is given by the goal difference, which we plot against the corresponding xGoal difference.

The linear fit models $$ Gd = G_0 + \alpha\cdot xGd $$ are shown as straight lines, and the corresponding values for the intercept $G_0$ and gradient $\alpha$, together with the $R^2$ values, are given in the legend. Without going into much detail: all fits are significant, and the errors on the gradient $\alpha$ are in the same range (0.16, 0.17, 0.14). The intercepts $G_0$ are consistent with zero for all fits.
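A fit of this kind can be sketched with ordinary least squares in plain Python. This is not necessarily the procedure used for the graphs above; the function name `linear_fit` and the example values are hypothetical, and the error on $\alpha$ is the usual standard error of the slope.

```python
import math


def linear_fit(xgd, gd):
    """Least-squares fit of gd = g0 + alpha * xgd.

    Returns intercept g0, gradient alpha, R^2 and the standard error of alpha.
    """
    n = len(xgd)
    mean_x = sum(xgd) / n
    mean_y = sum(gd) / n
    sxx = sum((x - mean_x) ** 2 for x in xgd)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xgd, gd))
    syy = sum((y - mean_y) ** 2 for y in gd)

    alpha = sxy / sxx
    g0 = mean_y - alpha * mean_x

    # Residual sum of squares of the fitted line.
    ss_res = sum((y - (g0 + alpha * x)) ** 2 for x, y in zip(xgd, gd))
    r2 = 1.0 - ss_res / syy
    alpha_err = math.sqrt(ss_res / (n - 2) / sxx)
    return g0, alpha, r2, alpha_err


# Hypothetical xGoal differences and goal differences (NOT the real data).
xgd = [-1.5, -0.5, 0.0, 0.5, 1.0, 2.0]
gd = [-2, 0, -1, 1, 1, 3]

g0, alpha, r2, alpha_err = linear_fit(xgd, gd)
print(f"G0 = {g0:.2f}, alpha = {alpha:.2f} +/- {alpha_err:.2f}, R^2 = {r2:.2f}")
```

An intercept that is consistent with zero means that a game with equal xGoals for both teams is, on average, expected to end in a draw.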

The fits for @11tegen11 and @Caley_graphics are very similar, more or less identical within statistical error, and differ from the fit for @StrataBet. This was to be expected from the different totals of xGoals. The former two xGoal models are probably designed to reproduce the number of goals that result from shots. The latter also counts chances that do not necessarily result in a shot, which is why its number of xGoals is higher.

To assess more precisely which model best describes the outcome of a game, we would need more data. In summary so far, it is quite reasonable to compare goals to xGoals. For the team comparison in the following we therefore use the mean of the three sites. The conclusions are not much different if we instead choose the xGoal model of @Goal_charts, which is the best one so far. Unfortunately, the number of games for each team is much too low to draw a serious conclusion about team efficiency (goals vs xGoals). Anyway, let us see how well the teams have done!


Offensive goal efficiency

To see how well the teams performed in scoring goals, we subtract xGoals from goals and plot the differences in descending order. The most efficient team is at the top of the list, the least efficient one at the bottom.


Yes, indeed: given the small numbers, you could just as well call them the luckiest and the unluckiest team.


Without any doubt, the most remarkable result is the last place of Germany, with a large gap to the second-to-last team, Iceland. At least for Germany, one is tempted to say that it was not only a lack of luck!
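The efficiency ranking used in this section (goals minus xGoals, in descending order) can be sketched as follows. The team names and numbers are hypothetical placeholders, not the values behind the graphic.

```python
# Hypothetical goals and mean xGoals per team (NOT the real tournament data).
teams = {
    "#TeamA": {"goals": 5, "xgoals": 3.5},
    "#TeamB": {"goals": 2, "xgoals": 4.0},
    "#TeamC": {"goals": 3, "xgoals": 3.0},
}


def efficiency(stats):
    """Goals minus xGoals: positive means more goals scored than expected."""
    return stats["goals"] - stats["xgoals"]


# Most efficient team first, least efficient team last.
offensive_ranking = sorted(
    ((team, efficiency(stats)) for team, stats in teams.items()),
    key=lambda item: item[1],
    reverse=True,
)

for team, value in offensive_ranking:
    print(f"{team:8s} {value:+.2f}")
```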


Defensive ranking in xGoals against a team

Next we look at the defensive strength of the teams and see how many xGoals they conceded. At the top of the list is Uruguay, the only team that did not concede a goal. Eight of the top ten teams qualified for the round of 16, and eight of the bottom ten teams were eliminated.


Defensive goal efficiency


Defensive efficiency is measured as the difference between the goals and the xGoals against a team. The ranking in ascending order is shown in the graphic on the right. Interestingly, some big teams (BRA, BEL, GER, ENG) are ranked in the middle, while many small teams (KOR, DEN, PER, IRN, SWE) are ranked in the upper part.
In the case of Germany, the big loser of the tournament, one might say that the failure was mainly caused by the offensive part of the team, whatever that means in particular.


Round of 16

The chart on the right shows the xGoal differences vs goal differences for all teams that qualified for the quarter-finals. Due to a lack of data from @GoalCharts, we use only those from @11tegen11 and @Caley_graphics, which we average. Every national flag marks one game of the corresponding team, so every flag appears exactly four times. Note that one flag for Belgium is almost completely hidden below the flag of Sweden, which means that the xG differences in these two games are more or less the same.

The upper right green area thus contains those games with a positive xGoal difference and a positive goal difference, which we call a deserved victory. As is immediately apparent from the chart, only Belgium deservedly won all of its games, and Russia is the luckiest team.
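The classification behind the chart can be sketched in a few lines. Only the label "deserved victory" comes from the text; the labels "lucky victory" and "no victory", the function names and the example games are assumptions made for illustration.

```python
def mean_xg_diff(xgd_site_a, xgd_site_b):
    """Average the xGoal differences of two sites for one game."""
    return (xgd_site_a + xgd_site_b) / 2.0


def classify(goal_diff, xg_diff):
    """Classify one game from the point of view of one team."""
    if goal_diff > 0 and xg_diff > 0:
        return "deserved victory"  # upper right area of the chart
    if goal_diff > 0:
        return "lucky victory"  # hypothetical label, not from the text
    return "no victory"  # hypothetical label for draws and defeats


# Hypothetical games: (goal difference, xGd of site A, xGd of site B).
games = [(2, 1.0, 1.5), (1, -0.5, -0.25), (0, 0.5, 0.25)]

for goal_diff, xgd_a, xgd_b in games:
    print(goal_diff, classify(goal_diff, mean_xg_diff(xgd_a, xgd_b)))
```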


Conclusion

After the tournament we will try to draw a conclusion.