Topic: Rating system

trimard
Joined: 2009-03-05, 22:40
Posts: 149
At home in WL-forums
Location: Paris
Posted at: 2019-07-12, 19:23

aaaahh ok! I get it! Well my table is correct then :)


trimard
Joined: 2009-03-05, 22:40
Posts: 149
At home in WL-forums
Location: Paris
Posted at: 2019-07-15, 11:38

Ok, I've been thinking about this over the last few days. Let's talk a bit about implementation and deadlines; otherwise the idea will die soon.

First implementation stage

Entirely web based. Allowing us to test the rating system as well without asking too much coding time from the game dev.

Deadline: Oktoberfest. I will only have a few hours on Sundays, but it should be manageable.

A few options:

  • we replace "playtime scheduler" with something like "let's play" --> an index page with "current ratings" and the rest of the playtime buttons.

  • We add a new "rating" button somewhere on the top right

  • We add a new rating button in the main menu, because we're badasses

Rating page

Four subpages, accessible via a small menu, like on the profile page:

  • 1vs1 rating

  • 2vs2 rating

  • all mode rating

  • upload a game

Ratings pages

One table with everyone's current rating, from best to worst.

If you are ranked, your name appears in green and your RD is shown.

If you're not ranked, the message "you are not yet ranked in this mode" appears on top of the table.

Upload page

Or maybe change the name; we might not need people to upload the replay every time?

A form:

  • map

  • type of game

  • player(s) you played against

  • player(s) on your team

  • upload replay, or replays if needed

  • Win? Tie? Loss?

  • exact game start time (the one shown in the first replay)

From the map, type, and other players, automatically infer which type of game this is (1vs1, 2vs2, all modes).

--> send a message to the other player(s) when a game has been submitted.
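The "infer which type of game" step could be sketched roughly like this (the function name, the field names, and the exact rules are my assumptions, not a settled design):

```python
def infer_game_type(map_name, game_mode, opponents, teammates):
    """Guess the rating category from the submitted form fields.

    map_name and game_mode would be validated against the arbiter's map
    pool; this sketch only looks at the team sizes.
    """
    if not teammates and len(opponents) == 1:
        return "1vs1"
    if len(teammates) == 1 and len(opponents) == 2:
        return "2vs2"
    # Anything else (free-for-all, bigger teams) falls into the catch-all.
    return "all modes"
```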

Arbiter pages

He will need a special role on the website.

He sees one more button on the index page, for moderation.

Set map pool

Two tables, 1vs1 and 2vs2, on which he can add/edit/remove a line for each map, with:

  • exact map name in the game

  • Game mode

Arbiter games

List of the last submitted games, from newest to oldest:

  • submitted by

  • type of game

  • map

  • result

  • link to replay (and number of replays)

  • start time

Highlight in red when the same game was submitted twice?
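One way to spot double submissions is to key each game on its player set plus the exact start time the form already asks for; a rough sketch (the data layout is hypothetical):

```python
def find_duplicates(submissions):
    """Return submissions that describe a game already submitted.

    Two submissions count as the same game when the player set and the
    exact start time match (this is one reason the form asks for it).
    """
    seen = {}
    duplicates = []
    for sub in submissions:
        # frozenset ignores the order in which the players were listed
        key = (frozenset(sub["players"]), sub["start_time"])
        if key in seen:
            duplicates.append(sub)
        else:
            seen[key] = sub
    return duplicates
```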

Options for each game:

  • Delete game from table --> send message to both players

  • punish a player (can't submit game for x hours)

  • send a message to both players

Second implementation stage

If we find that our rating system works, that's when we add everything that has been discussed.

Game options

  • surrender button

  • set game for later

Game info sent to the server after a game is played

  • unique hash
  • info on both players: their tribes, their rating at the start of the game? (Harder to implement, though?)
    • standard deviation, while we're at it
  • status (currently being played, paused for a reschedule, paused for a connection problem)
  • number of interruptions per player:
    • player1
    • player2
    • playerN (if needed)
  • result for:
    • player1
    • player2
    • playerN (if needed)
  • Time: When started
  • Time: When ended
  • Time: Last update
  • How many parts the game had (more if the game was interrupted)
  • How long the game lasted:
    • Game time (all the parts added together)
    • Real time
  • Substatus: win by surrender, win by manual arbiter, ...
  • Widelands version played
  • Map played
  • We'll certainly think of other things to put here in the meantime
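As a sanity check of the list above, here is what such a report could look like as JSON. Every field name and value is illustrative, not a fixed schema:

```python
import hashlib
import json

players = ["trimard", "WorldSavior"]

# Hypothetical post-game report covering the fields listed above.
game_report = {
    # unique hash, e.g. derived from the players and the start time
    "hash": hashlib.sha256(
        "|".join(players + ["2019-07-14T20:00"]).encode()
    ).hexdigest(),
    "players": [
        {"name": "trimard", "tribe": "barbarians",
         "rating_at_start": 1462, "rd_at_start": 167,
         "interruptions": 0, "result": "loss"},
        {"name": "WorldSavior", "tribe": "atlanteans",
         "rating_at_start": 2001, "rd_at_start": 172,
         "interruptions": 1, "result": "win"},
    ],
    "status": "finished",
    "substatus": "win by surrender",
    "started": "2019-07-14T20:00:00",
    "ended": "2019-07-14T22:30:00",
    "last_update": "2019-07-14T22:30:00",
    "parts": 2,              # more than 1 if the game was interrupted
    "game_time_s": 7200,     # all parts added together
    "real_time_s": 9000,
    "widelands_version": "build20",
    "map": "Crater",
}

print(json.dumps(game_report, indent=2))
```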

Automatic

  • Automatically add game result to current rating

  • If one player leaves the game and doesn't resume within 20 min --> he loses

  • The arbiter gets all the info from the games in a table, as described before, but now more complete.

  • The arbiter gets general game stats like best tribes, most played maps, etc.

Edit: I think I will try implementing the Glicko system in Python first. It will help me practice turning the math into code for einstein13's calculation afterward.
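As a warm-up for that, the expected-score part of Glicko-2 is small. A minimal sketch following the public Glicko-2 paper (only the scale conversion and the g and E functions; the full rating update with volatility is more work):

```python
import math

GLICKO2_SCALE = 173.7178  # conversion factor between Glicko and Glicko-2 scales

def g(phi):
    """Down-weights results against opponents whose rating is uncertain (high RD)."""
    return 1 / math.sqrt(1 + 3 * phi ** 2 / math.pi ** 2)

def expected_score(rating, opp_rating, opp_rd):
    """Expected score of the first player (roughly a win probability)."""
    mu = (rating - 1500) / GLICKO2_SCALE        # own rating on the Glicko-2 scale
    opp_mu = (opp_rating - 1500) / GLICKO2_SCALE
    opp_phi = opp_rd / GLICKO2_SCALE            # opponent's RD on the Glicko-2 scale
    return 1 / (1 + math.exp(-g(opp_phi) * (mu - opp_mu)))
```

For two fresh players (1500 rating, 350 RD) this gives exactly 0.5, as it should.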

Edited: 2019-07-16, 00:43
WorldSavior
Joined: 2016-10-15, 04:10
Posts: 1135
One Elder of Players
Location: GER
Posted at: 2019-07-16, 14:22

trimard wrote:

I'm impressed by your posts; you seem to do a lot for this.

  • we remove "playtime scheduler" and put something like "let's play" --> index page with "current ratings" and rest of playtime buttons.

Why remove the playtime scheduler? I'm rather against that.

  • We add a new "rating" button somewhere on the top right

Sounds like the best option to me.

Upload page

Or maybe change the name, we might not need people to upload the replay everytime?

I think that replays are more or less necessary, to prevent abuse...

Add red color when same game was submitted twice?

Isn't it easier to make it impossible for games to be submitted twice?

trimard wrote:

I didn't want to redo the calculation, because I'm not as good as einstein13 at this kind of thing. So I used this script. I didn't compact series of games together as recommended in the Glicko-2 paper; I actually calculated each map one by one. It's not yet clear to me how to do it otherwise. Does anyone know, btw?

Calculating each map 1 by 1 seems to make a lot of sense.

Constants used (recommended in the original Glicko-2 paper):

  • Starting rating: 1500
  • Starting deviation: 350
  • player volatility: 0.06
  • Tau: 1.0 (completely arbitrary, because I have no idea how to determine which value would fit best. It was the default in the script I used, so I stuck with it)

Result

Player             Rating     Deviation
worldsavior        2000.690   171.696
nemesis 26         1725.854   164.302
king of nowhere    1697.907   169.930
mars               1640.066   156.209
einstein13         1639.184   168.751
kaputtnik          1564.446   162.224
trimard            1462.397   167.161
tando              1426.643   156.455
Hasi50             1382.564   171.562
guncheloc          1331.528   213.339
animohim           1133.031   171.307
LAZA               1107.468   195.517

Looks as expected, good. Usually 6 games are not enough to produce a "stable" rating. The chess website Lichess.org considers a rating "provisional" (unstable) if the rating deviation is above 110; such players get a question mark behind their rating, and their ratings cannot appear in leaderboards because they are too uncertain.
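That provisional rule is easy to mirror; a small sketch of a Lichess-style display (the 110 threshold comes from the paragraph above, the function name is made up):

```python
def format_rating(rating, rd, threshold=110):
    """Render a rating Lichess-style: append '?' while it is still provisional."""
    text = f"{rating:.0f}"
    return text + "?" if rd > threshold else text
```

By this rule every player in the table above would still show a question mark.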

einstein13 wrote:

WorldSavior wrote:

How the rating and the RD will change for each player?

As simple as possible: according to the Glicko-2 system, you calculate the gains and losses for the game from the team scores. Then you add the gains to all winners and subtract the losses from the opposite side. If you recalculate the team ranks again (with the new scores), they will be as expected: higher or lower by the given values. This behaviour was proven in the document in point 4. d).

This sounds unfair to me. Should somebody who is ranked 1000 points below his teammate really get as many points as his teammate, even though he didn't have to do anything for the victory? ;)

And another thing looks weird: A 50 RD guy and a 130 RD guy form a team with less than 50 RD? Shouldn't it be in between?

Yes, I was a bit surprised too, but the standard deviation is sometimes counterintuitive. Let's make an example. Take a wooden stick of length 1 meter. Then pick something big, like a stadium (football, baseball, whatever), and try to measure its size with the stick. You will get some number (let it be 535 sticks) with a quite high possible error. But you can measure the same thing again (the new result might be 523 sticks). If you collect many of those experiments, each with a high standard deviation, you can be pretty sure that the stadium has a length of 530 m with a standard deviation of less than 1 meter. That is the power of collecting many (independent!) data points. That is also why in our case we get a lower RD than the initial RDs: the system is pretty sure that the new value is correct. I have experimented with the second RD and found that if it is very high (e.g. 300), the resulting RD is higher than 50.

Okay, maybe you are right.
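The stick-and-stadium argument above is just the standard error of the mean shrinking like σ/√n; a quick simulation with made-up numbers, assuming each single measurement is off by about 5 sticks:

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is reproducible

# Each single measurement with the short stick has an error of sigma = 5 sticks,
# but the mean of n independent measurements is off by only about sigma / sqrt(n).
true_length = 530.0
sigma = 5.0
n = 400

measurements = [random.gauss(true_length, sigma) for _ in range(n)]
mean = statistics.fmean(measurements)
standard_error = statistics.stdev(measurements) / n ** 0.5

print(round(mean, 1), round(standard_error, 2))
```

With 400 measurements the error of the mean drops to roughly 5/√400 = 0.25 sticks, which matches the intuition in the quote: many noisy but independent results give one precise one.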


“It's a threat to our planet to believe that someone else will save it.” - Robert Swan
