Topic: Rating system

einstein13
Joined: 2013-07-29, 00:01
Posts: 1088
Ranking
One Elder of Players
Location: Poland
Posted at: 2019-09-17, 00:17

So my question is still open: can your calculations be applied only to the participants of a game, without taking any other players into account?


einstein13
calculations & maps packages: http://wuatek.no-ip.org/~rak/widelands/

trimard
Joined: 2009-03-05, 22:40
Posts: 200
Ranking
Widelands-Forum-Junkie
Location: Paris
Posted at: 2019-09-18, 16:04

Yes, but that's not how Glicko is supposed to work, so the results will become meaningless pretty fast.

I don't see the problem with updating the standard deviation of other players. That's how the system is supposed to work. When you don't play while others are playing, it becomes harder to see where you stand compared to those players. So your standard deviation should go up.

I propose to update the rating period (and hence also the standard deviation) every x games instead of every x days.


einstein13
Joined: 2013-07-29, 00:01
Posts: 1088
Ranking
One Elder of Players
Location: Poland
Posted at: 2019-09-18, 17:57

trimard wrote:

I propose to update the rating period (and hence also the standard deviation) every x games instead of every x days.

This is a do-ocracy, so whatever the developer does is either accepted or not, and in most cases it is accepted. So whatever you decide to do, it will be done :) .



WorldSavior
Joined: 2016-10-15, 04:10
Posts: 1257
Ranking
One Elder of Players
Location: GER
Posted at: 2019-09-18, 18:41

trimard wrote:

Yes, but that's not how Glicko is supposed to work, so the results will become meaningless pretty fast.

I disagree. The chess site lichess.org works exactly like that, and there the Glicko numbers are not meaningless at all, they are very good.


“It's a threat to our planet to believe that someone else will save it.” - Robert Swan

einstein13
Joined: 2013-07-29, 00:01
Posts: 1088
Ranking
One Elder of Players
Location: Poland
Posted at: 2019-09-18, 20:04

Thanks WorldSavior, I was looking for a site that uses Glicko that way, and I was blind all along! I currently play with my colleagues on chess.com, and they use Glicko too ( https://www.chess.com/blog/kurtgodden/elo-to-glicko-your-rating-explained ). They don't use pure Glicko like trimard wants, but a modified version that rates one game at a time. Does it affect the RD of people who aren't playing? No, it doesn't (as far as I know).

Is it still consistent? Yes it is!

Assume that you have players:

Player  Rating  Rating Deviation
PL1     1000    100
PL2     1000    100
PL3     1000    100

After one game you will have something like:

Player  Rating  Rating Deviation
PL1     950     70
PL2     1050    70
PL3     1000    100

And when PL2 then plays against PL3, PL3's RD first has to be recalculated to its current state (it becomes 110 instead of 100) before the game is rated:

Player  Rating  Rating Deviation
PL1     950     70
PL2     1020    60
PL3     1040    80

Can you recognise in your code anything like what is described in "Step 1: Determine RD" on the Wikipedia page ( https://en.wikipedia.org/wiki/Glicko_rating_system )? That could solve all the problems we are dealing with here.
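For reference, a minimal sketch of that "Step 1" as defined for Glicko-1: RD = min(sqrt(RD0^2 + c^2 * t), 350), where t is the number of rating periods since the player's last game. The function name and the value of c are assumptions for illustration only; c is picked here so that one idle period turns an RD of 100 into 110, as in the PL3 example above.

```python
import math

MAX_RD = 350.0  # RD of an unrated player in Glicko-1


def inflate_rd(rd, idle_periods, c):
    """Glicko-1 step 1: grow RD with the number of rating periods without a game."""
    return min(math.sqrt(rd ** 2 + c ** 2 * idle_periods), MAX_RD)


# Hypothetical c, chosen only to reproduce the 100 -> 110 step from the example.
c = math.sqrt(110 ** 2 - 100 ** 2)
print(round(inflate_rd(100, 1, c)))  # 110
```

With this step applied just before a player's next game, nothing has to be stored or updated for players who are not playing.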

PS: I was a bit surprised, but even if you play a tournament game on chess.com, your rating changes from game to game. So I expect almost nobody in the world uses pure Glicko to determine the R and RD values.



trimard
Joined: 2009-03-05, 22:40
Posts: 200
Ranking
Widelands-Forum-Junkie
Location: Paris
Posted at: 2019-09-19, 20:34

Can you recognise in your code anything like what is described in "Step 1: Determine RD" on the Wikipedia page ( https://en.wikipedia.org/wiki/Glicko_rating_system )? That could solve all the problems we are dealing with here.

No, that doesn't look at all like what I implemented, but it's a really neat idea. I didn't think about checking Wikipedia for more precise calculations :P. I'll see how to calculate that with the current system this weekend.

This is a do-ocracy, so whatever the developer does is either accepted or not, and in most cases it is accepted. So whatever you decide to do, it will be done :) .

Thanks, I understand that and it's good to hear, but the goal is to have a system that everyone agrees on. Otherwise nobody will use a system that they think is rigged or something. We have to be careful with those decisions IMO.

About lichess and chess.com

Are you guys sure about what you say? Maybe they do update your rating deviation with time, but you guys play so much that you are never affected by such changes?

Calculation procedure

Let's sum up our options, shall we?

Option 1

We calculate the rating score after each game.

We update the rating deviation only after you have played a game, taking into account the last time you played:

  • We calculate your new standard deviation as if you hadn't played that game, as a function of the time since you last played, i.e. the more time has passed since your last game, the more we increase your standard deviation.

  • We then calculate the new rating score and standard deviation, using the aforementioned new standard deviation.

  • Your score and your opponent's score are updated.
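The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the simpler Glicko-1 formulas as a stand-in rather than the site's actual implementation, and the function names, the tuple layout and the constant c=30 are all hypothetical.

```python
import math

Q = math.log(10) / 400   # Glicko-1 scaling constant
MAX_RD = 350.0           # RD of an unrated player in Glicko-1


def g(rd):
    """Weight that shrinks the impact of a game against an uncertain opponent."""
    return 1 / math.sqrt(1 + 3 * Q ** 2 * rd ** 2 / math.pi ** 2)


def inflate_rd(rd, idle_periods, c):
    """Grow RD for the time since the player's last game (Glicko-1 step 1)."""
    return min(math.sqrt(rd ** 2 + c ** 2 * idle_periods), MAX_RD)


def rate_one_side(r, rd, r_opp, rd_opp, score):
    """Glicko-1 update for one player after a single game (score: 1, 0.5 or 0)."""
    expected = 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))
    inv_d2 = Q ** 2 * g(rd_opp) ** 2 * expected * (1 - expected)
    precision = 1 / rd ** 2 + inv_d2
    new_r = r + Q / precision * g(rd_opp) * (score - expected)
    return new_r, math.sqrt(1 / precision)


def rate_game(a, b, score_a, c=30.0):
    """Rate one game; only the two participants are touched.

    a and b are (rating, stored_rd, idle_periods) tuples; c is a hypothetical constant.
    """
    rd_a = inflate_rd(a[1], a[2], c)   # bring both RDs up to date first
    rd_b = inflate_rd(b[1], b[2], c)
    new_a = rate_one_side(a[0], rd_a, b[0], rd_b, score_a)
    new_b = rate_one_side(b[0], rd_b, a[0], rd_a, 1 - score_a)
    return new_a, new_b


winner, loser = rate_game((1000, 100, 0), (1000, 100, 0), score_a=1)
print(winner, loser)
```

Because the stored RD is only brought up to date when a game is rated, a player who has been idle for a long time can come out of a game with a larger RD than the one that was displayed before it.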

pro:

  • Less heavy calculation for the server +++

  • ??? You guys seem to like this system a lot, but I'm not sure why tbh. It is used in other games, so it might be more reliable?

con:

  • Further away from real Glicko, which means more tuning will be needed over time. Might give us some weird results at some point.

  • Your displayed standard deviation is actually out of date, and you may end up with a higher standard deviation after a game than before it.

Option 2

We calculate all the games again after each of your games, and group them by time period.

After each game:

  • Take all the games in date order.

  • Calculate all groups of games one by one, from oldest to newest.

  • The last games, which probably won't be enough to fill a new time period, are still treated as one time period.

  • All scores are updated.
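The grouping step could look something like the sketch below. It is a minimal illustration, assuming each game carries the date it was played on; the function name, the 7-day period and the sample games are made up. The actual rating of each period is left out.

```python
from datetime import date


def group_into_periods(games, period_days):
    """Split (played_on, game) pairs into consecutive rating periods, oldest first.

    Empty periods are kept as empty lists so that idle time stays visible.
    The last period may be incomplete and is still returned as its own group.
    """
    if not games:
        return []
    ordered = sorted(games, key=lambda item: item[0])
    first_day = ordered[0][0]
    last_index = (ordered[-1][0] - first_day).days // period_days
    periods = [[] for _ in range(last_index + 1)]
    for played_on, game in ordered:
        periods[(played_on - first_day).days // period_days].append(game)
    return periods


games = [
    (date(2019, 9, 20), "PL2 vs PL3"),
    (date(2019, 9, 1), "PL1 vs PL2"),
    (date(2019, 9, 3), "PL1 vs PL3"),
]
for index, period in enumerate(group_into_periods(games, period_days=7)):
    print(index, period)
```

Each group would then be fed to the rating calculation in order, so that every recalculation replays the whole history from the oldest period to the newest.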

pro:

  • Closer to Glicko

  • Easier to implement for me

  • Your displayed standard deviation is always correct

con:

  • Very server intensive: a lot of optimizations are possible, but they bring their own cons, see below.

  • People see their standard deviation change even when they don't play.

Option 2 prime

Same as above but we only calculate the new score at a precise moment of the day, to avoid constant server load.

con:

  • You'll have to wait one day before seeing your new score after a game

  • Still uses a lot of server resources at some point

Option 2 second

Same as option 2 but we only calculate the score after each rating period.

con:

  • Requires storing two tables: one for the currently displayed score, one for the score after the last rating period.

  • Requires the most work to implement

Option 3

We calculate the games one by one, and don't care about standard deviation changes for players that don't play a lot.

Or we regularly add x points to everyone's standard deviation artificially.

pro:

  • Simplicity +++++

con:

  • So far from Glicko that it will definitely not be very representative
Edited: 2019-09-19, 20:37

einstein13
Joined: 2013-07-29, 00:01
Posts: 1088
Ranking
One Elder of Players
Location: Poland
Posted at: 2019-09-19, 21:37

I am glad that you understand our point of view. I am not sure how you have implemented the procedures so far, but if you have made a separate table for ratings, we can use it to solve all the problems at once.

Assumptions / needs

  1. Players like to have their ratings updated after every game, but as you've pointed out, that is not Glicko any more. It is close, but not the real thing.
  2. We don't have many rated games right now, so the period for real Glicko calculations should be long enough to catch some games.
  3. We have all accepted that Glicko is the best rating model for official games (with strict rules).

Solution

My idea is to keep at least two scores: an official and an unofficial one.

The official one would be what you've already done: calculations for all players, which can be computing-intensive (because it is important for us). It can follow very strict rules that we have agreed on before and will agree on later.

The unofficial one would be updated after each game and only for the participants. It can follow any rules we agree on. Also, very important: we can't show the "RD" value on its own. It can be shown with a note that the value is as of the player's last game, OR it can be hidden completely in the ranking table.

Of course this idea can be extended to whatever we want. Do we need a separate table for 1 vs 1 games only? Let's add it (in the future). Do we want to see a periodic, real Glicko ranking for team games (e.g. 2 vs 2) only? Why not add it too?

Pros

  • Flexible idea, can fulfil all needs
  • Supports real Glicko and other rankings at once
  • Adding a new ranking is quite easy

Cons

  • Complex coding needed
  • Sometimes (when updating the real Glicko rankings) CPU-intensive for the server

Why do we like it?

You guys seem to like this system a lot, but I'm not sure why tbh.

We like it because you get direct feedback after the game: losing the game means losing points, winning means gaining them. If you have to wait, the effect is not direct and the stakes feel lower (to your "brain").

But providing both rankings gives another opportunity: in the short term you get a direct reward for your wins, and in the long term (official) you earn a real "king" position, which is a subtle but stable reward. You're rewarded for many wins.



trimard
Joined: 2009-03-05, 22:40
Posts: 200
Ranking
Widelands-Forum-Junkie
Location: Paris
Posted at: 2019-09-22, 19:25

Ok I'm convinced, the best system will probably be a mix of both worlds:

  • Calculate the new score after each game, using the official score as much as possible.

  • Every x time, calculate everyone's score in the stricter way.
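A rough sketch of how such a mix could be organised is shown below. Everything in it is hypothetical: the class and method names are invented for illustration, and the two rating functions are passed in from outside so the sketch does not depend on any particular formula.

```python
class RatingBook:
    """Keeps an 'official' and an 'unofficial' rating per player (hypothetical design)."""

    def __init__(self, rate_game, recalculate_all, default_rating):
        self.rate_game = rate_game              # (rating_a, rating_b, score_a) -> (new_a, new_b)
        self.recalculate_all = recalculate_all  # (list of games) -> {player: rating}
        self.default_rating = default_rating
        self.games = []
        self.official = {}
        self.unofficial = {}

    def _current(self, player):
        # Prefer the live value, then the official one, then the default.
        if player in self.unofficial:
            return self.unofficial[player]
        return self.official.get(player, self.default_rating)

    def record_game(self, a, b, score_a):
        """Immediate feedback: only the two participants change."""
        self.games.append((a, b, score_a))
        self.unofficial[a], self.unofficial[b] = self.rate_game(
            self._current(a), self._current(b), score_a
        )

    def close_period(self):
        """Strict recalculation for everyone; the live values restart from it."""
        self.official = self.recalculate_all(self.games)
        self.unofficial = dict(self.official)
```

Restarting the live values from the strict result at the end of each period is one way to keep the two scores from drifting apart; how often `close_period` runs would still need to be tested.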

Seems pretty straightforward, and doesn't seem to ask for that much work, as most of the tables already allow for more than one rating system.

I wouldn't display the score differently though. The "official" score shouldn't be too far away from the "unofficial" one if we manage to run a real calculation often enough.

The details will have to be tested on the alpha site. But I'm happy we know where to go at least :P

I will focus on the refactoring and on fixing the bugs people reported on GitHub for now :).

Can you recognise in your code anything like what is described in "Step 1: Determine RD" on the Wikipedia page ( https://en.wikipedia.org/wiki/Glicko_rating_system )? That could solve all the problems we are dealing with here.

This won't do. I read it too fast: it's for the Glicko-1 system, and we're on Glicko-2.

We like it because you get direct feedback after the game: losing the game means losing points, winning means gaining them. If you have to wait, the effect is not direct and the stakes feel lower (to your "brain").

This shouldn't have been a problem with the system I proposed; the problem is mostly how CPU-intensive such a system is for the server. So yes, we need a mix.

