Topic: Rating system

Posted at: 2019-11-23, 23:32

As a chess player, I think the Glicko system (which is an improvement on the Elo system) is good.

I think the rating should be calculated after each game, with the possible exception of tournaments (where the rating change should be computed at the end of the tournament). If it is calculated after each game using only the ratings of the players in that game, it is NOT CPU intensive.

I played on lichess, and if I do not play for a long time, when I come back the deviation is bigger than when I left.

But if you take a look at how the FIDE rating works, there are only a few values for the deviation (called the "coefficient"): 40 for young players, 20 for experienced players, 10 for masters. Once a player gets a smaller "coeff", he cannot get a bigger one back.

Having only a few possible values for the deviation can make a lot of things easier (for instance: 200 for the first 5 games, 100 for the next ones, and 50 if people play more than 10 games in a month ...).
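To make the idea concrete, here is a minimal sketch of such a tiered deviation. The function name and the exact reading of the thresholds (career game count first, then monthly activity) are my assumptions; only the values 200, 100 and 50 come from the example above.

```python
def tiered_deviation(total_games: int, games_this_month: int) -> int:
    """Return one of a few fixed deviation values (hypothetical helper).

    Assumed reading of the example in the post:
    - 200 while the player has fewer than 5 games in total,
    - 50 once the player has more than 10 games in the current month,
    - 100 otherwise.
    """
    if total_games < 5:
        return 200
    if games_this_month > 10:
        return 50
    return 100


# A newcomer, a casual player and a very active player.
print(tiered_deviation(2, 2))
print(tiered_deviation(30, 3))
print(tiered_deviation(30, 12))
```

With only three possible values, the deviation becomes a simple lookup instead of something that has to be recomputed over a whole rating period.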

If the rating is computed after each game, it is possible to include team games in this rating (the variation has to take into account the average rating of each team, and the square of the "team deviation" should be the average of the squares of the individual deviations).
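A minimal sketch of that team aggregation, assuming plain lists of ratings and deviations (the function names are made up for illustration): the team rating is the arithmetic mean, and the team deviation is the root of the mean of the squared deviations.

```python
import math


def team_rating(ratings: list[float]) -> float:
    """Average rating of the team members."""
    return sum(ratings) / len(ratings)


def team_deviation(deviations: list[float]) -> float:
    """Team deviation whose square is the mean of the squared member deviations."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Two players: 1500 with deviation 30 and 1700 with deviation 40.
print(team_rating([1500, 1700]))  # mean of the two ratings
print(team_deviation([30, 40]))   # sqrt((900 + 1600) / 2)
```

Note that a team of players with identical deviations keeps that same deviation, which seems like a reasonable sanity check for this rule.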

Maybe team play should lead to 2 different ratings: an individual rating for each player, and a rating for the team.

Another thing: in chess, some players think that the White pieces have a slight advantage, estimated at 50 Elo. It is not taken into account because players play half of their games with White and half with Black. In Widelands, some players prefer a color... Maybe, on some maps, a starting position could have a little advantage over the other ones... A record of the maps played, with starting positions and player ratings, should be kept somewhere, just to check the balance of the maps.

Edited: 2019-11-23, 23:49

Posted at: 2019-11-24, 23:09

Current update

I'm working on implementing the user game submission now

@gnarfk

Yes, you're right, we could do some simplification, but now that Glicko is implemented, why go backward? :P

I think the rating should be calculated after each game, with possible exception of tournaments (the rating evolution should be computed at the end of the tournament). If calculated after each game with only the ratings of the players of the game, it is NOT cpu intensive.

Well, if you want to do pure Glicko, you just cannot take into account only the last game. That's the problem. Unless you have some experience with Glicko and have an idea for improving the current system?

I think we will do a sort of mix of both.

Maybe, in some maps, a starting position could have a little advantage over the other ones... a record of the maps played, with starting positions and player ratings, should be kept somewhere, just to check the balance of the maps.

Yep, that's definitely something we ought to do. And hopefully, by simplifying how people submit games, it will be easier to implement.

multiplayer

(variation has to take account of the average rating of each team, and the square of the "team deviation" should be the average of the squares of the individual deviations)

Yes exactly, see Einstein's paper on the first page of this thread if you want more detail on how he wanted to implement this aspect!

Maybe team play should lead to 2 different ratings : individual rating for each players, and a rating for the team.

Yes, like StarCraft?

Edited: 2019-11-24, 23:11

Attachment: Screenshot_2019-11-24_23-02-54.png (964.9 KB)

Posted at: 2019-11-25, 07:59

Well, if you want to do pure Glicko, you just cannot take into account only the last game. That's the problem. Unless you have some experience with Glicko and have an idea for improving the current system?

I think we will do a sort of mix of both.

Let me explain the FIDE rating system.

There is the coefficient (which is our "deviation").

When a player meets another, an "expected result" is calculated from the two ratings (for instance, if player A is 2000 with C_A = 40 and player B is 2196 with C_B = 20, A is expected to score 0.25 and B is expected to score 0.75).

The variation for A is A's coefficient multiplied by the difference between the actual result and the expected result. (If the game is a draw, the actual result is 0.5. The variation for A is Coeff_A x (0.5 - 0.25) = +10, and the variation for B is Coeff_B x (0.5 - 0.75) = -5.)
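Here is a small sketch of that calculation. I am assuming the usual Elo logistic curve for the expected result (the post does not say which formula is used); with it, A's expected score against a player rated 196 points higher comes out at about 0.245, which matches the 0.25 of the example once rounded. The function names are only illustrative.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Expected result (between 0 and 1) using the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def rating_change(coeff: float, actual: float, expected: float) -> float:
    """Variation = coefficient x (actual result - expected result)."""
    return coeff * (actual - expected)


# The example from the post: A (2000, coeff 40) draws B (2196, coeff 20).
print(expected_score(2000, 2196))     # roughly 0.245, rounded to 0.25 in the post
print(rating_change(40, 0.5, 0.25))   # variation for A with the rounded value
print(rating_change(20, 0.5, 0.75))   # variation for B with the rounded value
```

Because each player uses his own coefficient, the two variations do not have to cancel out: here A gains more than B loses.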

One problem with the FIDE system is that it does not take the opponent's coefficient into account, but that could be fixed ... For instance: coeff_A = 50 * Deviation_A / Deviation_B (where 50 is arbitrarily chosen and can be modified ...)
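As a sketch, the proposed fix could look like this. The function name and the guard against a non-positive opponent deviation are my additions; the constant 50 is the arbitrary value from the formula above.

```python
def opponent_aware_coeff(own_deviation: float,
                         opponent_deviation: float,
                         base: float = 50.0) -> float:
    """coeff_A = base * Deviation_A / Deviation_B (base = 50 is arbitrary)."""
    if opponent_deviation <= 0:
        raise ValueError("opponent deviation must be positive")
    return base * own_deviation / opponent_deviation


# An uncertain player (deviation 200) against a well-established one (deviation 50)
# moves a lot; the established player barely moves.
print(opponent_aware_coeff(200, 50))
print(opponent_aware_coeff(50, 200))
```

The effect is that a result against a player whose rating is well known counts for more than a result against a newcomer whose rating is still uncertain.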

Edited: 2019-11-25, 08:08

Posted at: 2019-12-08, 17:01

The FIDE system seems nice, but I do think Glicko is superior in the long term.

Though it seems pretty easy to implement, so maybe we could try adding it too at some point? After all, the code is organized from the start to support many different rating systems so that we can compare them.

Stage 0 - completed!

OK, user submission of games is now working as intended in my tests. I think it's finally time to test on the alpha website? That way, it would be easier to calculate the current score of the players, as everyone could add games :).

kaputtnik? Or someone else?

What does work

• a basic Glicko system. It can be CPU intensive in theory, and it needs to be run by the admin

• submission and removal of games by admin and users

• Adding tribes, game modes, seasons

NB: to add games, the users playing in the game need to be added first by the admin, in the alpha.

What needs to be worked on soon

• any non-1vs1 game: the UI is done, but the backend is lacking for the submission. Add a page for the different mode views. Add Einstein's calculations for Glicko

• Game link submission, submission of game in different parts

• stop the need to start from 0 when you make a mistake in the submission form

• More various fixes

• Less CPU-intensive Glicko submission, with the score updated after each added game!

• add the FIDE system

Posted at: 2019-12-10, 08:01

Glicko is superior to the FIDE system if you want every game in the same period to have the same value.

A modified FIDE-like system, as I said, is superior if you want an easy, game-by-game calculation. In this case, the most recent games have more importance in the rating (the weight of older games decreases exponentially, by a factor depending on the deviation).

At some point, if you use the Glicko system with calculation by period, you should think about a minimum number of games to play in order for the rating to be replaced by the new one. For instance, in the French Chess Federation (FFE) rating (which tends to be replaced by the FIDE rating), if a player plays more than 9 games, the rating is fully replaced. Otherwise, it is an average of the old rating and the new one. This is a point that makes me prefer the FIDE system: it has a "memory" of old games, which fades progressively as new games are played, not abruptly at the end of a period.

Edited: 2019-12-10, 09:45

Posted at: 2019-12-10, 23:35

I'm sold, we'll have to test that too!

I'm basing myself on Glicko and a simple ratio for now. But I'll add Elo and FIDE just to see what the difference would be. At that point, we could decide which seems the most appropriate. And as I said, maybe we'll mix some of them.

Anyway, got some new pages made, and thought this one might interest you people

and yes, those are test data, don't worry

Edited: 2019-12-12, 11:04

Attachment: Screenshot_2019-12-10_23-32-28.png (687.1 KB)

Posted at: 2019-12-11, 19:01

I have been thinking about FFA games. I can think of a way to do that (at least with the FIDE system, but it can be adapted to Glicko).

For instance:

players: A, B, C and D.

Suppose player A is the only winner.

Then A wins against everybody, and the others draw with each other. The full game counts as if it were: A wins against B; A wins against C; A wins against D; B draws C; B draws D; C draws D. But each of these pairings has to count only as a third of a game (with FIDE, we divide the "coeff" by 3).

This system also works if there are several winners (in autocrat, if 2 players survive but can't finish each other off, they draw with each other, they win against everyone else, and the other players draw with each other).
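A rough sketch of this pairwise decomposition, for anyone who wants to play with it. Assumptions on my side: the expected result uses the standard Elo logistic curve, the coefficient is divided by (number of players - 1) so that a whole FFA game weighs as much as one normal game for each player, and all names are made up for illustration.

```python
from itertools import combinations


def expected_score(rating: float, opponent_rating: float) -> float:
    """Expected result using the standard Elo logistic curve (assumption)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def ffa_rating_changes(ratings: dict[str, float],
                       coeffs: dict[str, float],
                       winners: set[str]) -> dict[str, float]:
    """Split an FFA game into virtual 1vs1 pairings and sum the variations.

    A winner beats every non-winner; two winners draw; two losers draw.
    Each pairing counts for 1 / (n - 1) of a game.
    """
    players = list(ratings)
    n = len(players)
    if n < 2:
        raise ValueError("need at least two players")
    changes = {p: 0.0 for p in players}
    for a, b in combinations(players, 2):
        if (a in winners) == (b in winners):
            score_a = 0.5  # both won or both lost: virtual draw
        elif a in winners:
            score_a = 1.0
        else:
            score_a = 0.0
        exp_a = expected_score(ratings[a], ratings[b])
        changes[a] += coeffs[a] / (n - 1) * (score_a - exp_a)
        changes[b] += coeffs[b] / (n - 1) * ((1.0 - score_a) - (1.0 - exp_a))
    return changes


# Four equally rated players, A is the only winner.
players = ["A", "B", "C", "D"]
result = ffa_rating_changes({p: 1500 for p in players},
                            {p: 30 for p in players},
                            {"A"})
print(result)
```

With equal ratings and equal coefficients, the winner gains exactly what the three others lose together, so the total number of rating points in the pool stays the same.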

Posted at: 2019-12-11, 20:22

trimard wrote:

Anyway, got some new pages made, and thought this one might interest you people (img)https://www.widelands.org/forum/attachment/5f5b207a9b581d3d4a76ad831c0a1549049d8eb0/(img)

That's interesting. Suggestions: include draws, and maybe Amazons.

gnarfk wrote:

I have been thinking about FFA games. I can think of a way to do that (at least with the FIDE system, but it can be adapted to Glicko).

For instance:

players: A, B, C and D.

Suppose player A is the only winner.

Then A wins against everybody, and the others draw with each other. The full game counts as if it were: A wins against B; A wins against C; A wins against D; B draws C; B draws D; C draws D. But each of these pairings has to count only as a third of a game (with FIDE, we divide the "coeff" by 3).

This system also works if there are several winners (in autocrat, if 2 players survive but can't finish each other off, they draw with each other, they win against everyone else, and the other players draw with each other).

I'm not sure this model can be that useful for an FFA match. After all, FFA can involve a lot of luck...

Wanted to save the world, then I got widetracked

Posted at: 2019-12-11, 20:40

WorldSavior wrote:

I'm not sure this model can be that useful for an FFA match. After all, FFA can involve a lot of luck...

Yes it is. It probably should be an independent rating. But if it is possible, why not ?

No one forces you to play ranked FFA games (as I said, players should decide before the game whether it is ranked or not). But some people might want to play this type of game.

Posted at: 2019-12-12, 09:56

gnarfk wrote:

I have been thinking about FFA games (...) players: A, B, C and D. Suppose player A is the only winner. Then A wins against everybody, and the others draw with each other. (...)

I know that this is one possible way to handle free-for-all games, but we already have a mathematical solution for that. Of course we can adopt multiple rating systems with multiple types of calculations, but it has to be designed smartly, because we don't have many coders.

Unfortunately, the server the paper was uploaded to is out of order. So anyone who wants to read what is inside, please send me a direct message with an email address and I will send the paper there.

einstein13
calculations & maps packages: http://wuatek.no-ip.org/~rak/widelands/
backup website files: http://kartezjusz.ddns.net/upload/widelands/