W = exp(-days/c)
To calculate a team's predictive rating, I make two calculations, one using the weights described in the previous paragraph and one using a constant weight for all teams. The probabilities pertinent to a team's games, from the strength rating page, can be written as:
-2 lnP = sum(i=games) Wi (r-oi-G(sai,sbi))^2/(1+doi^2) + (r-m)^2/d^2
As before, this simplifies into a simple Gaussian of the team's rating r. Redefining Wi to include the 1/(1+doi^2) term, defining ri as oi+G, and defining PW to equal 1/d^2, this becomes:
-2 lnP = sum(i=games) Wi (r-ri)^2 + PW (r-m)^2
The probability is highest at:
r = [ PW m + sum(i=games) Wi ri ] / [ PW + sum(i=games) Wi ]
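As a minimal sketch of this closed-form estimate, the Python below computes r as a weighted mean of the per-game ratings ri and the prior mean m. The function and argument names are illustrative and are not taken from the actual ratings code; the weights are assumed to already include the 1/(1+doi^2) factor.

```python
def weighted_rating(game_ratings, weights, prior_mean, prior_weight):
    """Return r = [PW*m + sum(Wi*ri)] / [PW + sum(Wi)]."""
    if len(game_ratings) != len(weights):
        raise ValueError("need exactly one weight per game")
    numerator = prior_weight * prior_mean + sum(
        w * ri for w, ri in zip(weights, game_ratings)
    )
    denominator = prior_weight + sum(weights)
    return numerator / denominator


# Hypothetical inputs: three equally weighted games and a prior centered on 0.
r = weighted_rating([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], prior_mean=0.0, prior_weight=1.0)
```

With few games the prior term dominates and r stays near m; as games accumulate, r moves toward the weighted average of the ri.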
I should note that this is simplified from the full equation used for the strength ratings, and is slightly less accurate. This is why I calculate r twice, once using the weights and once using constant weights, and use only the difference between the two. The team's predictive rating is set equal to the team's strength rating plus the "weighted r" minus the "unweighted r". Thus, for baseball (where the weights are constant), the predictive rating equals the strength rating.
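The adjustment can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the time weights use W = exp(-days/c) with a caller-supplied c, the constant weight is taken to be 1, and the 1/(1+doi^2) factor is omitted for brevity. All names are hypothetical.

```python
import math


def weighted_rating(game_ratings, weights, prior_mean, prior_weight):
    """Return r = [PW*m + sum(Wi*ri)] / [PW + sum(Wi)]."""
    numerator = prior_weight * prior_mean + sum(
        w * ri for w, ri in zip(weights, game_ratings)
    )
    return numerator / (prior_weight + sum(weights))


def predictive_rating(strength, game_ratings, days_ago, c, prior_mean, prior_weight):
    """Strength rating plus (weighted r) minus (unweighted r)."""
    # Assumption: weights decay exponentially with the age of each game.
    time_weights = [math.exp(-d / c) for d in days_ago]
    # Assumption: the "constant weight" is 1 for every game.
    constant_weights = [1.0] * len(game_ratings)
    weighted_r = weighted_rating(game_ratings, time_weights, prior_mean, prior_weight)
    unweighted_r = weighted_rating(game_ratings, constant_weights, prior_mean, prior_weight)
    return strength + weighted_r - unweighted_r
```

When every game carries the same weight (for example, c set to infinity), the two r values coincide and the function returns the strength rating unchanged, matching the baseball case.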
The predictive rating calculates the odds that one team will beat another in a future game. A second piece of useful data is whether a team tends to play in high-scoring or low-scoring games. This is the P-SCR rating.
It might be tempting to calculate an offense and defense rating separately, but combining this with the predictive ratings gives three ratings for a team when there should be two. Instead, I use the predictive rating and the score rating to calculate the expected total score and score difference.
The basis of my score rating is the calculation of the "typical score" of two opponents, where "typical score" is defined such that:
m1 = sqrt(S) + x*sqrt(m1) and m2 = sqrt(S) - x*sqrt(m2),
Solving for x and requiring that it be the same in the two equations, one finds:
S = m1*m2.
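A quick numerical check of this result: with S = m1*m2, both defining equations should yield the same x. The helper name and the sample scores below are hypothetical and serve only as an illustration.

```python
import math


def typical_score_offsets(m1, m2):
    """Return the x implied by each defining equation when S = m1*m2."""
    root_s = math.sqrt(m1 * m2)
    x_from_team1 = (m1 - root_s) / math.sqrt(m1)  # from m1 = sqrt(S) + x*sqrt(m1)
    x_from_team2 = (root_s - m2) / math.sqrt(m2)  # from m2 = sqrt(S) - x*sqrt(m2)
    return x_from_team1, x_from_team2


# Hypothetical expected scores for the two teams.
x1, x2 = typical_score_offsets(28.0, 17.0)
```

Both values reduce algebraically to sqrt(m1) - sqrt(m2), which is why the two equations agree.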
The second piece of data comes from the predictive ratings, which are defined such that:
dr = (m1-m2)/sqrt(m1+m2),
The other loose end is how "S" is calculated from the team's scoring ratings. It should be defined such that two teams that tend to have average-scoring games will be predicted to have an average-scoring game. It should also be defined such that a team that tends to be in games with twice the average score playing a team that tends to be in games of the average score will be predicted to have twice the average score. Finally, regardless of how low the teams tend to score, S can never be negative. These constraints are satisfied by:
S = s1*s2,
where s1 and s2 are the teams' scoring ratings.
Working through a little bit of math, one finds:
m1+m2 = 0.5*dr^2+0.5*sqrt(dr^4+16*s1*s2)
m1-m2 = dr * sqrt [ 0.5*dr^2+0.5*sqrt(dr^4+16*s1*s2) ]
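For two evenly matched teams (dr = 0) with equal scoring ratings (s1 = s2), these expressions reduce to:
m1 = m2 = s1 = s2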
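The sum and difference formulas translate directly into code. The sketch below assumes dr is the rating difference from team 1's point of view and that s1 and s2 are positive; the function name is illustrative.

```python
import math


def expected_scores(dr, s1, s2):
    """Return (m1, m2) from the rating difference dr and scoring ratings s1, s2."""
    total = 0.5 * dr**2 + 0.5 * math.sqrt(dr**4 + 16.0 * s1 * s2)  # m1 + m2
    diff = dr * math.sqrt(total)                                   # m1 - m2
    m1 = 0.5 * (total + diff)
    m2 = 0.5 * (total - diff)
    return m1, m2


# Evenly matched teams with equal scoring ratings: m1 = m2 = s1 = s2.
even_m1, even_m2 = expected_scores(0.0, 3.0, 3.0)
```

For any dr, the results satisfy m1*m2 = s1*s2 and (m1-m2)/sqrt(m1+m2) = dr, so both constraints hold by construction.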
To calculate the teams' scoring ratings, one must maximize the likelihood of the outcomes of all games that season. As noted above, the probability that a team predicted to score "m1" times will score "n1" times AND that a team predicted to score "m2" times will score "n2" times is given by:
P(n1,n2|m1,m2) = exp(-m1-m2) * m1^n1 * m2^n2 / n1! / n2!
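This probability is the product of two independent Poisson terms. A minimal sketch follows; it works in log space and uses math.lgamma for the factorials so that large scores do not overflow. The function name is an assumption.

```python
import math


def score_probability(n1, n2, m1, m2):
    """P(n1, n2 | m1, m2) = exp(-m1-m2) * m1^n1 * m2^n2 / (n1! * n2!)."""
    log_p = (
        -m1 - m2
        + n1 * math.log(m1)
        + n2 * math.log(m2)
        - math.lgamma(n1 + 1)  # log(n1!)
        - math.lgamma(n2 + 1)  # log(n2!)
    )
    return math.exp(log_p)
```

Summed over all possible (n1, n2), the probabilities add up to 1, which is a convenient sanity check.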
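Setting the derivative of the log-likelihood with respect to a team's scoring rating s equal to zero gives:
0 = sum(i=games) [ (n1/m1-1) dm1/ds + (n2/m2-1) dm2/ds ].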
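One way to solve this condition numerically is to maximize the log-likelihood directly. The sketch below fits a single team's scoring rating while holding its opponents' ratings fixed, using a golden-section search. That is a simplification chosen for illustration: the full problem involves all teams at once, and the search assumes the log-likelihood has a single peak inside the bracket. All names and the bracket limits are hypothetical.

```python
import math


def expected_scores(dr, s1, s2):
    """Return (m1, m2) from the rating difference and the two scoring ratings."""
    total = 0.5 * dr**2 + 0.5 * math.sqrt(dr**4 + 16.0 * s1 * s2)
    diff = dr * math.sqrt(total)
    return 0.5 * (total + diff), 0.5 * (total - diff)


def log_likelihood(s, games):
    """games: list of (dr, opponent_s, n1, n2) from the team's point of view."""
    total = 0.0
    for dr, opponent_s, n1, n2 in games:
        m1, m2 = expected_scores(dr, s, opponent_s)
        # The factorial terms do not depend on s, so they are dropped.
        total += -m1 - m2 + n1 * math.log(m1) + n2 * math.log(m2)
    return total


def fit_scoring_rating(games, lo=0.01, hi=200.0, tol=1e-8):
    """Golden-section search for the s in [lo, hi] that maximizes the likelihood."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if log_likelihood(c, games) > log_likelihood(d, games):
            b = d  # the peak lies in [a, d]
        else:
            a = c  # the peak lies in [c, b]
    return 0.5 * (a + b)
```

For a single 3-3 game against an evenly matched opponent with scoring rating 3, the fit returns a value close to 3, where both (n/m - 1) terms vanish.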
A note must be made regarding overtime games. Naturally, overtime creates higher scores that do not reflect the number of points a particular team tends to score in its games. In sports with sudden-death overtime, one can use the knowledge that both teams were tied at the lower score after regulation and replace (n1+n2) accordingly. In sports with constant-length overtime periods, it is fair to scale the combined score (n1+n2) to reflect the extra time played. In college football, I simply ignore scores of overtime games because of the fundamental difference in the way the game is played.
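These three cases could be handled along the following lines. The rule names are hypothetical, and the linear scaling by playing time is an assumption about what "scale to reflect the extra time played" means in practice.

```python
def adjusted_total(n1, n2, overtime_rule, regulation_minutes=0.0, overtime_minutes=0.0):
    """Return the combined score to use for the scoring ratings, or None to skip the game."""
    total = n1 + n2
    if overtime_rule == "none":
        return float(total)
    if overtime_rule == "sudden_death":
        # Both teams were tied at the lower score when regulation ended.
        return 2.0 * min(n1, n2)
    if overtime_rule == "fixed_length":
        # Assumption: scoring is proportional to the time played.
        return total * regulation_minutes / (regulation_minutes + overtime_minutes)
    if overtime_rule == "ignore":
        # For example, college football overtime games are left out entirely.
        return None
    raise ValueError(f"unknown overtime rule: {overtime_rule}")
```

For example, a 3-2 sudden-death result is treated as a combined regulation score of 4.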
You may be wondering why I don't just calculate the score ratings at the same time as the predictive ratings. The main reason is that, while win/loss and G values are distributed according to the expected statistical distribution, there is more scatter than expected in the total scores. In other words, my statistical model works perfectly in calculating team predictive ratings and the odds of team A beating team B, but does not work perfectly in predicting expected total scores of future games. Thus I do what can be done perfectly first and follow with the imperfect so as to not hinder the accuracy of the predictive ratings.
Note: if you use any of the facts, equations, or mathematical principles on this page, you must give me credit.