In sports, most notably in chess, baseball and basketball, the Elo-rating system is used to rank players. The rating is also helpful in deducing win probabilities (see my blog post Elo-Rating and Win Probability for more details on that). Suppose two players or teams with the current ratings r(1) and r(2) compete in a match. What will be their updated ratings r'(1) and r'(2) after said match? Let’s do this step by step, first in general terms and then in a numerical example.

The **first step** is to compute the transformed rating for each player or team:

R(1) = 10^{r(1)/400}

R(2) = 10^{r(2)/400}

This is just to simplify the further computations. In the **second step** we calculate the expected score for each player:

E(1) = R(1) / (R(1) + R(2))

E(2) = R(2) / (R(1) + R(2))
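As a side note, dividing numerator and denominator by R(1) shows that the expected score depends only on the rating difference. This is the form of the formula you will often see quoted elsewhere, and it also makes clear that E(1) + E(2) = 1:

```latex
E(1) = \frac{R(1)}{R(1) + R(2)}
     = \frac{1}{1 + R(2)/R(1)}
     = \frac{1}{1 + 10^{(r(2) - r(1))/400}},
\qquad
E(1) + E(2) = 1
```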

Now we wait for the match to finish and set the actual score in the **third step**:

S(1) = 1 if player 1 wins / 0.5 if draw / 0 if player 2 wins

S(2) = 0 if player 1 wins / 0.5 if draw / 1 if player 2 wins

Now we can put it all together and in a **fourth step** find out the updated Elo-rating for each player:

r'(1) = r(1) + K * (S(1) – E(1))

r'(2) = r(2) + K * (S(2) – E(2))

What about the K that suddenly popped up? This is called the K-factor and is basically a measure of how strongly a match will impact the players’ ratings. If you set K too low, the ratings will hardly be impacted by the matches and will be very stable (too stable). On the other hand, if you set it too high, the ratings will fluctuate wildly according to the current performance. Different organizations use different K-factors; there’s no universally accepted value. In chess the ICC uses a value of K = 32. Other approaches can be found here.
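The four steps above can be sketched in a few lines of Python. This is only a minimal sketch: the function name `elo_update` and its parameter names are my own choices for illustration, and K simply defaults to the value 32 mentioned above.

```python
def elo_update(r1, r2, s1, k=32):
    """Return the updated ratings (r1_new, r2_new) after one match.

    s1 is player 1's actual score: 1 for a win, 0.5 for a draw, 0 for a loss.
    The name and signature are illustrative, not taken from any library.
    """
    # Step 1: transformed ratings
    big_r1 = 10 ** (r1 / 400)
    big_r2 = 10 ** (r2 / 400)

    # Step 2: expected scores
    e1 = big_r1 / (big_r1 + big_r2)
    e2 = big_r2 / (big_r1 + big_r2)

    # Step 3: actual scores (player 2's score is the complement of player 1's)
    s2 = 1 - s1

    # Step 4: updated ratings
    return r1 + k * (s1 - e1), r2 + k * (s2 - e2)
```

Since E(1) + E(2) = 1 and S(1) + S(2) = 1, whatever one player gains the other loses, so the sum of the two ratings stays the same after the update.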

—————————————–

*Now let’s do an example. We’ll adopt the value K = 32. Two chess players rated r(1) = 2400 and r(2) = 2000 (so player 2 is the underdog) compete in a single match. What will be the resulting rating if player 1 wins as expected? Let’s see. Here are the transformed ratings:*

*R(1) = 10^{2400/400} = 1,000,000*

*R(2) = 10^{2000/400} = 100,000*

*Onto the expected score for each player:*

*E(1) = 1,000,000 / (1,000,000 + 100,000) = 0.91*

*E(2) = 100,000 / (1,000,000 + 100,000) = 0.09*

*This is the actual score if player 1 wins:*

*S(1) = 1*

*S(2) = 0*

*Now we find out the updated Elo-rating:*

*r'(1) = 2400 + 32 * (1 – 0.91) = 2403*

*r'(2) = 2000 + 32 * (0 – 0.09) = 1997*

*Wow, that’s boring, the rating hardly changed. But this makes sense. By player 1 winning, both players performed according to their ratings. So no need for any significant changes.*

—————————————–

*What if player 2 won instead? Well, we don’t need to recalculate the transformed ratings and expected scores, these remain the same. However, this is now the actual score for the match:*

*S(1) = 0*

*S(2) = 1*

*Now onto the updated Elo-rating:*

*r'(1) = 2400 + 32 * (0 – 0.91) = 2371*

*r'(2) = 2000 + 32 * (1 – 0.09) = 2029*

*This time the rating changed much more strongly.*
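To double-check the two scenarios, here is a small sketch that computes the rating change K * (S – E) for each player without rounding the expected score first. The helper name `rating_change` is made up for this illustration.

```python
def rating_change(r_self, r_opp, score, k=32):
    """Rating change K * (S - E) for one player (hypothetical helper)."""
    transformed_self = 10 ** (r_self / 400)
    transformed_opp = 10 ** (r_opp / 400)
    expected = transformed_self / (transformed_self + transformed_opp)
    return k * (score - expected)


# Player 1 (rated 2400) against player 2 (rated 2000) with K = 32
for label, s1 in [("player 1 wins", 1), ("player 2 wins", 0)]:
    d1 = rating_change(2400, 2000, s1)
    d2 = rating_change(2000, 2400, 1 - s1)
    print(f"{label}: r'(1) = {2400 + d1:.0f}, r'(2) = {2000 + d2:.0f}")

# player 1 wins: r'(1) = 2403, r'(2) = 1997
# player 2 wins: r'(1) = 2371, r'(2) = 2029
```

Because E is not rounded to two decimals here, the changes come out as roughly 2.9 and 29.1 points rather than 2.88 and 29.12, but the rounded ratings agree with the ones above.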

—————————————–

Thanks! Very clearly put.

This was fantastic to help me set up a rating system for table tennis between fellow classmates. Thank you!

Thanks for this wonderful tutorial, but I have some questions. In a real-world scenario, how are r(1) and r(2) calculated? That is, in your example r(1) = 2400 and r(2) = 2000; how did you arrive at those values?

Hello Sura, I’m glad you liked the tutorial! For the demonstration the ratings r(1) and r(2) were chosen at random, but generally they are a result of all the previous matches. So to calculate the rating r(1) = 2400 of player 1, you’d have to know the starting point (initial rating before the first match) and go through the above calculation for each match that he or she has played. Hope that helps!
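As a rough sketch of what "going through the calculation for each match" could look like in code: start everyone at the same initial rating and replay the match history in order. The starting value of 1500, the player names and the function names are all just assumptions for this example.

```python
def expected_score(r_a, r_b):
    # Same as R(a) / (R(a) + R(b)) with R = 10^(r/400)
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))


def replay_history(matches, initial=1500, k=32):
    """Replay matches in order and return the resulting ratings.

    matches: list of (player_a, player_b, score_a) tuples, where score_a
    is 1, 0.5 or 0 from player_a's point of view.
    initial: assumed starting rating for anyone not seen before.
    """
    ratings = {}
    for player_a, player_b, score_a in matches:
        r_a = ratings.get(player_a, initial)
        r_b = ratings.get(player_b, initial)
        e_a = expected_score(r_a, r_b)
        # Player B's score and expectation are the complements of A's
        ratings[player_a] = r_a + k * (score_a - e_a)
        ratings[player_b] = r_b + k * ((1 - score_a) - (1 - e_a))
    return ratings


# Hypothetical match history: (player_a, player_b, score of player_a)
history = [("ann", "bob", 1), ("bob", "cat", 0.5), ("ann", "cat", 0)]
print(replay_history(history))
```

The order of the matches matters, since each update uses the ratings as they stand at that point.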

Is it possible to simply go back to the beginning and start everyone at even?

Thanks for the tutorial!

I have a rather silly question. What should the very first initial rating of the players be before they play any match? (not in chess but for any Elo-based rating).

Also, should there be any correlation between the K-factor and the initial rating?

I’d probably go with 2000, it’s a value of decent size and it means that changes don’t vary too much.

Some places use 1000, as it gives a good, easy starting point and also means that good and bad players separate quite quickly, so good players don’t play games in which they will destroy the opposition.

Love you man! You just made my day.

Been looking for this formula explanation for a long time!

Thanks for great tutorial and explanation.

I want to use this in my game, but my game has a maximum of 6 players who all play against each other. If any one of them loses all their lives, the rest of the players should get a win and the player who lost all their lives should get a loss. How do I calculate the Elo rating in that case?

Please give me the general form for when there are more than 2 players.

Is there a way to modify this formula to apply it to other sports such as hockey or baseball, where teams often end up winning by more than 1 point? I can’t seem to figure it out, and I imagine a team winning by 5 points should impact their rating more than winning by 1.

Thanks,

John

You could adjust the k-factor such that a larger margin of victory results in a higher k-factor and a smaller margin of victory results in a lower k-factor.
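One way to sketch this idea is to scale K by a factor that grows with the margin of victory. The scaling used here, 1 + ln(margin), is purely an illustrative assumption and not any standard; any increasing function would do, and the function names are made up.

```python
import math


def expected_score(r_a, r_b):
    # Same as R(a) / (R(a) + R(b)) with R = 10^(r/400)
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))


def update_with_margin(r_a, r_b, score_a, margin, k=32):
    """Elo update with a margin-dependent K-factor (illustrative only).

    score_a: 1, 0.5 or 0 from team A's point of view.
    margin: absolute point difference of the final result.
    """
    # Assumed multiplier: 1 for a margin of 0 or 1, growing slowly after that
    k_eff = k * (1 + math.log(max(margin, 1)))
    delta = k_eff * (score_a - expected_score(r_a, r_b))
    # Team B's change is the exact opposite of team A's
    return r_a + delta, r_b - delta


# Two equally rated teams: a 1-point win versus a 5-point win
print(update_with_margin(1500, 1500, 1, margin=1))
print(update_with_margin(1500, 1500, 1, margin=5))
```

With this choice a 1-point win behaves exactly like the plain formula, while larger margins move the ratings further; how fast the multiplier should grow is something you would have to tune for the sport in question.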

Thx.

Hi, great work bro… One question though: what is the base rating of a player who has just started? And secondly, in the formula R(1) = 10^(r(1)/400), why is it 10 raised to a power, and what is the significance of 400, if any? Can you explain it?

According to my understanding. 400 is the strength i.e who is the strongest person +400 trophies and weakest to you -400 trophies. Power to 10 defines distribution of trophies e.g range of trophies is 0-60 now how would you like user should be getting trophies let say with with power “2.17” distribution according to strength is 29,31,32 VS if 10 is the power the trophies will be 29,33,36.

Nice explanation Bro 🙂

Good job, this was really helpful!

I’m still not getting why we used a power of 10 and also that division by 400. Please explain it.

me either 😛

uhh where did the 400 in 10^2400/400 come from

I worked the numbers and it appears from the model using “e” as the exponent and with the appropriate “s” that goes along with this, then converting to base 10 and altering the “s” makes the model look simpler. It does not retain the standard deviation as set (as if was 200 in the base e model…I think it becomes something like 173 if I recall correctly).

Awesome explanation. I wonder how to implement the Elo rating system in motorsport races, where more than two players compete in the same match.

Great job! Very clear.

Where can I find a way to rate players in a team sport?

I wish you would provide one explicit example, and in generality per the cdf used.

Let’s say for the single variable there is a standard deviation “s” and then for the sum of cdf, the standard deviation turns into s*sqrt(2). With the convention mean =1500, go from the cdf model directly to a probability.

For example, with mean=1500 and standard deviation 200*sqrt(2), cdf(1700)=0.7 and cdf(1500)=0.5, how would you calculate Pr(1700 player beats 1500 player) regardless of the specific cdf form?