Ranking Tutorial: The Strength Rating

Priors

In the game analysis section, I showed how likely a particular score is for a game between two opponents whose location-adjusted ranking difference is dr. It seems we're almost there -- just calculate the G(sa,sb) values for all games, and run with it! Not quite. Bayes' theorem does allow P(sa,sb|dr) to be inverted to P(dr|sa,sb), but there is an added term. Expanding dr into the difference between the two team ratings (a and b) plus the home field factor (h, assuming A is the home team), the correct equation for the inversion, up to a normalization constant that does not depend on a, b, or h, is:

   P(a,b,h|sa,sb) = P(sa,sb|a,b,h) P(a) P(b) P(h).

P(a), P(b), and P(h) are the probabilities of team A being ranked a, team B being ranked b, and the home field factor being h, in the absence of all data.

"But wait", you ask, "aren't computer ratings supposed to be unbiased? How can you justify built-in prejudices to the rankings?" Well, I justify it because Bayes' theorem demands it. If no team in the history of sport has ever achieved a ranking of 1000 sigma above the league average, the odds of one doing it now are quite slim. The trick is to define the prior in such a way as to not bias the ranking in favor of or against any team.

The way I address this problem is to use the same prior for all teams. (In college football, I use different priors for I-A, I-AA, and so on, but still rank all teams within I-A using the same prior.) This obeys the requirement that a prior be used, while not rating Ohio State better than Northern Illinois merely on the basis that Ohio State has historically been a better team. To calculate the prior mean and width, I first calculate rankings with no prior, estimate the mean and inherent spread in the rankings (equal to the standard deviation minus the uncertainties in quadrature), make that the prior, and recompute the rankings. If the mean is m and the standard deviation is d, the prior P(a) equals:

   P(a) = NP(-0.5*((a-m)/d)^2)

The prior for team B is the same (replacing a with b), and the prior for h is determined from many seasons of data.
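
The prior estimate described above can be sketched in Python. This is only an illustration: the function names are hypothetical, and it assumes that "minus the uncertainties in quadrature" means subtracting the mean squared uncertainty from the population variance of the no-prior ratings.

```python
import math

def estimate_prior(ratings, uncertainties):
    """Estimate the prior mean m and width d from ratings computed with no prior.

    Assumption: d^2 = (population variance of the ratings)
                      - (mean squared rating uncertainty).
    """
    n = len(ratings)
    m = sum(ratings) / n
    total_var = sum((r - m) ** 2 for r in ratings) / n
    noise_var = sum(u ** 2 for u in uncertainties) / n
    # Clamp at zero so noisy inputs cannot yield a negative variance.
    d = math.sqrt(max(total_var - noise_var, 0.0))
    return m, d

def neg2_log_prior(a, m, d):
    """Return -2 ln P(a) up to an additive constant: ((a - m) / d)^2."""
    return ((a - m) / d) ** 2

# Hypothetical no-prior ratings and their uncertainties.
m, d = estimate_prior([1.0, -1.0, 2.0, -2.0], [0.5, 0.5, 0.5, 0.5])
```

The second function returns the prior in the -2 ln form, which is the form that gets summed once all the pieces are combined.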

The Strength Rating

OK, now we have all the pieces. The probability of a set of team ratings and home field factor, given the results of the entire season, is proportional to the product of the probabilities of each game and all the priors. In other words, dropping the normalization constant:

   P(r1,r2,...rn,h) = prod(i=games) P(sai,sbi|rai,rbi,h) * prod(i=teams) P(ri) * P(h)

where ri designates the rating of team i, sai and sbi are the scores of the home and road teams in game i, rai and rbi are the ratings of the home and road teams in game i, and h is the home factor.

Most rating systems stop here, compute the maximum likelihood solution for the ratings and home field factor, and call it a ranking. It's possible to do much better, however. Recall that all of the probabilities are NP(x) functions, which can be trivially multiplied as NP(x)*NP(y)*NP(z) = NP(x+y+z). Taking -2 times the logarithm of the product therefore turns it into a sum, and we can rewrite the above equation as:

   -2 lnP = sum(i=games) (rai-rbi+h-G(sai,sbi))^2 + sum(i=teams) (ri-m)^2/d^2 + (h-hm)^2/dm^2

where hm and dm are the mean and width of the prior for the home field factor h.
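
As a rough sketch, the sum above can be evaluated directly for any trial set of ratings. The function name and the game format are assumptions made for illustration: each game is a tuple (home_index, road_index, g), where g stands for the precomputed value of G(sai,sbi).

```python
def neg2_log_posterior(ratings, h, games, m, d, hm, dm):
    """Evaluate -2 lnP (up to a constant) for trial ratings and home factor h.

    games: list of (home_index, road_index, g), with g = G(sa, sb)
           already computed for that game (hypothetical input format).
    """
    total = 0.0
    # One squared residual per game.
    for home, road, g in games:
        total += (ratings[home] - ratings[road] + h - g) ** 2
    # The same prior (mean m, width d) for every team.
    for r in ratings:
        total += (r - m) ** 2 / d ** 2
    # Prior on the home field factor.
    total += (h - hm) ** 2 / dm ** 2
    return total

# Two teams, one game; every number here is made up for illustration.
value = neg2_log_posterior([1.0, 0.0], 0.5, [(0, 1, 1.0)], 0.0, 1.0, 0.5, 1.0)
```

Minimizing this function over the ratings and h would give the maximum likelihood solution mentioned above.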

Multiplying all of this out, one finds that -2 lnP is a second-order polynomial and can be written as:

   -2 lnP = C + [ sum(i=teams) sum(j=teams) Mij ri rj ] + [ sum(i=teams) Vi ri ]

where C is a meaningless constant, and Mij and Vi are the polynomial coefficients (the home field factor h can be carried along as one more variable in these sums). From here, you can make successive integrations to marginalize each team's rating until only the team you are trying to rate remains. For example, if you want to marginalize team k, you would rewrite the above as:

   -2 lnP = C + D + Mkk rk^2 + [Vk + sum(i!=k) (Mik+Mki) ri] rk

where D contains the sums of all terms not containing rk. (Both Mik and Mki appear because the double sum includes ri rk and rk ri separately.) This can be rewritten as:

   -2 lnP = C + D - 0.25*[Vk + sum(i!=k) (Mik+Mki) ri]^2/Mkk + Mkk (rk + 0.5*[Vk + sum(i!=k) (Mik+Mki) ri]/Mkk)^2

Integrating P over rk reduces the last term to a constant, and we have eliminated team k from the integral. Repeating this process for all teams other than the one you are trying to rank results in:

   -2 lnP = A r^2 + B r + C

where C is different from the constant before (but equally insignificant). Completing the square one last time, this is, of course:

   P = NP(-0.5*A*(r + B/(2A))^2)

which is a Gaussian distribution centered on -B/(2A) (the team's rating) with width 1/sqrt(A) (the uncertainty in that rating).
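
The elimination procedure can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, M is assumed to be symmetric (so each cross term ri rk is counted twice in the double sum), and the example coefficients are made up. Note that completing the square in A r^2 + B r places the center at -B/(2A).

```python
import math

def marginalize(M, V, k):
    """Integrate variable k out of exp(-0.5 * (x.M.x + V.x)), with M symmetric.

    Completing the square in x_k leaves a quadratic in the other variables:
      M'[i][j] = M[i][j] - M[i][k] * M[k][j] / M[k][k]
      V'[i]    = V[i]    - M[i][k] * V[k]    / M[k][k]
    Constant terms are dropped.
    """
    keep = [i for i in range(len(V)) if i != k]
    M_new = [[M[i][j] - M[i][k] * M[k][j] / M[k][k] for j in keep] for i in keep]
    V_new = [V[i] - M[i][k] * V[k] / M[k][k] for i in keep]
    return M_new, V_new

def rate_one(M, V, target):
    """Eliminate every variable except `target`; return (rating, uncertainty)."""
    while len(V) > 1:
        k = 0 if target != 0 else 1   # any index other than the target
        M, V = marginalize(M, V, k)
        if k < target:
            target -= 1               # indices shift down after removal
    A, B = M[0][0], V[0]              # now -2 lnP = A r^2 + B r + constant
    return -B / (2.0 * A), 1.0 / math.sqrt(A)

# Hypothetical coefficients: two teams plus the home field factor as variable 2.
M = [[2.0, -1.0, 1.0], [-1.0, 2.0, -1.0], [1.0, -1.0, 2.0]]
V = [-2.0, 2.0, -3.0]
rating, sigma = rate_one(M, V, 0)
```

Calling rate_one once per team gives each team's rating and uncertainty in turn.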

Repeating this process for all teams in the rating, one can arrive at a statistically accurate rating, including uncertainty, that accounts for all interdependencies between the individual team ratings.
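
For a whole league, the same marginal ratings can also be read off in one pass, because the marginal means and widths of a multivariate Gaussian come from the inverse of its coefficient matrix. The sketch below uses that matrix-inverse shortcut in place of one-at-a-time integration. The function names and the (home_index, road_index, g) game format are hypothetical, M is built symmetric, and h is stored as the last variable.

```python
import math

def build_quadratic(n_teams, games, m, d, hm, dm):
    """Build symmetric M and V so that -2 lnP = C + x.M.x + V.x, x = (r1..rn, h)."""
    size = n_teams + 1                      # the last slot holds h
    M = [[0.0] * size for _ in range(size)]
    V = [0.0] * size
    for home, road, g in games:             # g = G(sa, sb), precomputed
        terms = ((home, 1.0), (road, -1.0), (n_teams, 1.0))
        for i, ci in terms:
            V[i] -= 2.0 * g * ci
            for j, cj in terms:
                M[i][j] += ci * cj
    for i in range(n_teams):                # same prior for every team
        M[i][i] += 1.0 / d ** 2
        V[i] -= 2.0 * m / d ** 2
    M[n_teams][n_teams] += 1.0 / dm ** 2    # prior on the home field factor
    V[n_teams] -= 2.0 * hm / dm ** 2
    return M, V

def invert(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def rate_all(M, V):
    """Return (ratings, uncertainties) for every variable, h included."""
    cov = invert(M)                         # covariance matrix of the Gaussian
    n = len(V)
    means = [-0.5 * sum(cov[i][j] * V[j] for j in range(n)) for i in range(n)]
    widths = [math.sqrt(cov[i][i]) for i in range(n)]
    return means, widths

# Two teams, one game; every number here is made up for illustration.
M, V = build_quadratic(2, [(0, 1, 1.0)], 0.0, 1.0, 0.5, 1.0)
ratings, sigmas = rate_all(M, V)
```

The last entry of each returned list is the home field factor and its uncertainty; the entries before it are the team ratings.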

This strength rating is not shown anywhere on my ranking pages, but underlies the standard, median likelihood, and predictive rankings.

Note: if you use any of the facts, equations, or mathematical principles on this page, you must give me credit.