Topic: Rating system

king_of_nowhere
Joined: 2014-09-15, 18:35
Posts: 1668
Ranking
One Elder of Players
Posted at: 2019-07-05, 13:54

king_of_nowhere wrote:

GunChleoc wrote:

  • It is impossible to distinguish people quitting the game when they are losing (cheat) from a lost internet connection (no cheat)
  • If players lose their internet connection, they are asked to start a new game and resign to give the other player their points

well, there could be a system where you have some time to reconnect. So when you crash, you have, say, 10 minutes to come back in lobby and rehost from save. if you do, the game continues. if you don't, you lost. if you host and your opponent does not show up, your opponent lost.

This is so much better than the system of 0AD (if one cannot rehost more than a few times, let's say 2 or 3, to avoid abuse of that rule). I can not understand how there can be such a mistake in the system of 0AD...
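A minimal sketch of the reconnect rule with a rehost cap, as discussed above. The function name, the outcome labels, and the assumption that the opponent gets the same deadline as the crashed player are illustrative only; nothing like this exists in the game yet.

```python
from datetime import datetime, timedelta
from typing import Optional

RECONNECT_GRACE = timedelta(minutes=10)  # "say, 10 minutes" from the proposal
MAX_REHOSTS = 3                          # "let's say 2 or 3" (assumed value)


def resolve_disconnect(crashed_at: datetime,
                       crasher_back_at: Optional[datetime],
                       opponent_back_at: Optional[datetime],
                       rehosts_used: int) -> str:
    """Decide the outcome after a crash (hypothetical helper).

    Returns "continue", "crasher_loses" or "opponent_loses".
    """
    deadline = crashed_at + RECONNECT_GRACE
    # The rehost allowance is used up: count the crash as a loss to avoid abuse.
    if rehosts_used >= MAX_REHOSTS:
        return "crasher_loses"
    # The crashed player did not come back and rehost in time.
    if crasher_back_at is None or crasher_back_at > deadline:
        return "crasher_loses"
    # The game was rehosted, but the opponent did not show up (same deadline assumed).
    if opponent_back_at is None or opponent_back_at > deadline:
        return "opponent_loses"
    return "continue"
```

Whether the opponent should get a separate, later deadline is an open design question; the sketch just reuses the same one.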

On the other hand one could also think about using the system of your tournament which allows bigger breaks than 10 minutes (rescheduling).

that could be another feature of the ranked option. "finish later", an option requiring both players to consent that will allow them to leave and keep playing later. if they don't show up in, say, two weeks, the game is automatically considered a draw.


einstein13
Joined: 2013-07-29, 00:01
Posts: 1118
Ranking
One Elder of Players
Location: Poland
Posted at: 2019-07-05, 15:08

king_of_nowhere wrote:

(...)

that could be another feature of the ranked option. "finish later", an option requiring both players to consent that will allow them to leave and keep playing later. if they don't show up in, say, two weeks, the game is automatically considered a draw.

Yes! And there should be an option for disconnects: "finish later" for a player who has left. If he/she sees that the other one disconnected for other reasons (e.g. Internet connection or an emergency), the game can be rescheduled and continued later.


einstein13
calculations & maps packages: http://wuatek.no-ip.org/~rak/widelands/
backup website files: http://kartezjusz.ddns.net/upload/widelands/

trimard
Topic Opener
Joined: 2009-03-05, 22:40
Posts: 230
Ranking
Widelands-Forum-Junkie
Location: Paris
Posted at: 2019-07-05, 17:38

Ok, it's true it would be better to have a "finish later" button, but I don't think it's a "must have" feature. I think it could be implemented in a second step? GunChleoc seems to say it would be preferable that everything is installed in the first install though.

@einstein13

First, I don't know if lower means better or worse in the Glicko system. Second, I haven't said that a simple average is the best here. It can be any type of average, including our own, based on Glicko formulas. Third (and probably most important), teams mean teamwork. If a strong player wins over two weaker ones and the weakest didn't help, that is the problem of their team, not the rank - that is my opinion. And probably if you ask king_of_nowhere, he will admit that even killing one good soldier can sometimes decide whether you win or not. So even a weak player can help the strongest of us. That is how I understand the situation.

More points means better in glicko.

Ok I'm convinced about team play, you make a good point for it :P For simplification I think a global average would work pretty good at first. At worst it will mean changing some variable afterward, no big deal.

Different ladders

It seems everyone here is voting for 2vs2, 3vs3, and 4vs4 ladder too. So why not test with that.

I disagree with a difference between headquarters and the like. The score should only take into account who won or lost. Any other data would be hard to balance.

The connection problem and rescheduled games

For the connection problem I don't have a strong opinion on 1 hour or 10 minutes. Maybe 1 hour is a bit long to keep a player waiting for the new game though? Rescheduling a game would be awesome! So yes that would mean a new button. I think two weeks is pretty reasonable, maybe one week would be better? We have to keep in mind that the longer this delay, the less representative your score will be, because your level might change in between (might not be such a big deal after all).

I.e., all that would add some work for the game itself:

  • Adding an "abandon" button: quit the game and make the other player victorious
  • Adding a "reschedule game" button: ask the other player if they want to reschedule; if yes, the game is saved and quit.
    • Add a counter on the server for that game. on the webserver??
    • When the counter runs out, the game ends in a tie
  • Adding a hash for each game and info about it:
    • name of both players, their tribes?
    • status (currently being played, in pause for reschedule, in pause for connection problem and who's the player responsible, won/lose/tie by x player)
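The status values and the reschedule counter from the list above could be sketched like this. The enum members, the function name and the two-week default are assumptions for illustration (one week was also suggested); where the counter actually lives is still an open question.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional, Tuple


class GameStatus(Enum):
    PLAYING = "playing"
    PAUSED_RESCHEDULE = "paused_reschedule"
    PAUSED_CONNECTION = "paused_connection"
    FINISHED = "finished"


RESCHEDULE_LIMIT = timedelta(weeks=2)  # assumed default, up for discussion


def check_reschedule(status: GameStatus,
                     paused_at: datetime,
                     now: datetime,
                     limit: timedelta = RESCHEDULE_LIMIT
                     ) -> Tuple[GameStatus, Optional[str]]:
    """Return the (possibly updated) status and a result, if any.

    A game paused for rescheduling ends in a tie once the limit has passed;
    every other case is left untouched.
    """
    if status is GameStatus.PAUSED_RESCHEDULE and now - paused_at >= limit:
        return GameStatus.FINISHED, "tie"
    return status, None
```

A periodic job on the server could call `check_reschedule` for every paused game and store the returned status.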

Website side

Some work needed too for an admin interface:

  • list of all games and their status
  • list of all games done by player (to check for possible cheating between the same recurring players)
  • Ability to change who won or lost a game
  • List of uploaded replays

Apart from what's already been described for the player interface:

  • add the ability to upload replays?
  • At least 4 pages, one each for 1vs1, 2vs2, 3vs3, 4vs4

einstein13
Joined: 2013-07-29, 00:01
Posts: 1118
Ranking
One Elder of Players
Location: Poland
Posted at: 2019-07-05, 20:08

More points means better in glicko.
For simplification I think a global average would work pretty good at first.

So I will prepare some calculations for teams to see how it works. Hope I will find some time next week.

Please answer some questions:

  • How many points do the best players have with this system (for ELO I know it is around 2800)?
  • What initial rating will we apply here?
  • What is the lowest rating I should consider (I mean the worst player here)?

The answers don't have to be very accurate. I need them for examples only. :)

Also I would like to "simulate" a 1 vs 2 game. Let's imagine World Savior against 2 average people. I guess that World Savior's friend would be a "no AI" computer player, so IQ=0. How many points should I simulate for his friend? 0? Half of the initial rating? The initial rating? These considerations are only to show whether our system will also work for other types of games.

Unfortunately I don't know how to solve the "1 vs 1 vs 1" game yet, but I will work on that in the background. If I find proper general rules, all problems will be solved. :)

Different ladders

It seems everyone here is voting for 2vs2, 3vs3, and 4vs4 ladder too. So why not test with that.
(...)
* At least 4 page for each 1vs1, 2vs2, 3vs3, 4vs4

I disagree with that, but if most of us want it, why not?
My point of view reflects the current state: not every player wants to play in teams, and team games are not played very often, so a 1 vs 1 ladder would be quite accurate, while the others would not. My suggestion is to make an 'all types' ladder which will show total numbers for all types of games.

  • Adding a hash for each game and info about it:
    • name of both players, their tribes?
    • status (currently being played, in pause for reschedule, in pause for connection problem and who's the player responsible, won/lose/tie by x player)

Add also:

  • Time:
    • When started
    • When ended
    • Last update
  • How many parts of the game (more if the game was interrupted)
  • How long the game lasted:
    • Gametime
    • Real time
  • Substatus: win by surrender, win by manual arbiter, ...
  • Widelands version played

Why? The more we save about the game, the more likely we are to catch cheating.

Also providing all general data would be helpful if somebody in the future would like to recalculate the stats OR change them completely (for example to ELO ones).

  • At least 4 page for each 1vs1, 2vs2, 3vs3, 4vs4

I suggest one page (django view) where you have several tabs and you pick what you want to display. I even can think about the model where stats will be stored:

class Stats(...):
    player/user - one to one with User
    rank_type - integer (choice: Glicko 1vs1, Glicko 2vs2, Glicko 3vs3, ..., Glicko all_games, Glicko team_games, ...)
    main_value - integer (if float needed, just multiply by 1000 and use 3 digits after decimal point)
    secondary_value - integer (RD for Glicko); not mandatory for all rank types

If a player finishes at least one game in a given rank type, he/she should be displayed in that ranking, so a new record of this class should be created.

So if we store both the game results and the ranks separately, it will be easier to change something in the equations and quite fast to display the data.
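To make the storage idea concrete, here is a standalone sketch that mirrors the proposed fields without needing Django. The helper names, the rank-type strings and the choice to key records by (player, rank_type) are my assumptions; the last one follows from a player being able to appear in several rank types at once.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

SCALE = 1000  # "multiply by 1000 and use 3 digits after decimal point"


def to_fixed(value: float) -> int:
    """Convert a float rating to the stored integer form."""
    return round(value * SCALE)


def from_fixed(stored: int) -> float:
    """Convert the stored integer back to a float rating."""
    return stored / SCALE


@dataclass
class Stats:
    player: str                            # stands in for the User relation
    rank_type: str                         # e.g. "glicko_1vs1" (hypothetical label)
    main_value: int                        # rating, fixed-point
    secondary_value: Optional[int] = None  # RD for Glicko, fixed-point


# One record per (player, rank_type); created after the first finished game.
table: Dict[Tuple[str, str], Stats] = {}


def record_rating(player: str, rank_type: str,
                  rating: float, rd: Optional[float] = None) -> Stats:
    """Create or overwrite the stats record for one player in one rank type."""
    stats = Stats(player, rank_type, to_fixed(rating),
                  None if rd is None else to_fixed(rd))
    table[(player, rank_type)] = stats
    return stats
```

In a real Django model the same key could be expressed as a uniqueness constraint on the two fields, but that is a detail to decide when the model is actually written.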



WorldSavior
Joined: 2016-10-15, 04:10
Posts: 2091
OS: Linux
Version: Recent tournament version
Ranking
One Elder of Players
Location: Germany
Posted at: 2019-07-05, 21:46

king_of_nowhere wrote:

king_of_nowhere wrote:

GunChleoc wrote:

  • It is impossible to distinguish people quitting the game when they are losing (cheat) from a lost internet connection (no cheat)
  • If players lose their internet connection, they are asked to start a new game and resign to give the other player their points

well, there could be a system where you have some time to reconnect. So when you crash, you have, say, 10 minutes to come back in lobby and rehost from save. if you do, the game continues. if you don't, you lost. if you host and your opponent does not show up, your opponent lost.

This is so much better than the system of 0AD (if one cannot rehost more than a few times, let's say 2 or 3, to avoid abuse of that rule). I can not understand how there can be such a mistake in the system of 0AD...

On the other hand one could also think about using the system of your tournament which allows bigger breaks than 10 minutes (rescheduling).

that could be another feature of the ranked option. "finish later", an option requiring both players to consent that will allow them to leave and keep playing later. if they don't show up in, say, two weeks, the game is automatically considered a draw.

Good idea to make it optional

trimard wrote:

The connection problem and rescheduled games

For connection problem I don't have strong opinion on 1 hour or 10 minutes. Maybe 1 hour is a bit long to let a player waiting for the new game though?

Yes, it might be too long.

Rescheduling a game would be awesome! So yes that would mean a new button, I think two weeks is pretty reasonable, maybe one week better?

I'd say that one week is better

einstein13 wrote:

  • How many points do the best players have with this system (for ELO I know it is around 2800)?

Depends on the kind of game and on the number of players. Here at a big chess site the top numbers are approximately between 2300 and 3100, depending on the variant of chess: https://lichess.org/player

  • What initial rating will we apply here?

On that chess site, the players start with 1500 rating, so we could pick the same.

  • What is the lowest rating I should consider (I mean the worst player here)?

The chess site had - for whatever reason - a limit: 800 points. But too many players had 800 points, and I guess that the limit has recently been changed to 600 (probably for that reason).

Theoretically there doesn't have to be a limit, and negative numbers would also be possible?

The answers don't have to be very accurate. I need them for examples only. :)

Also I would like to "simulate" a 1 vs 2 game. Let's imagine World Savior against 2 average people.

In that case I win on almost all maps safely. But what would be my rating number? I don't know...

I guess that World Savior's friend would be a "no AI" computer player, so IQ=0. How many points should I simulate for his friend? 0? Half of the initial rating? The initial rating?

You mean rating points? In that case maybe 600 or less?

Unfortunately I don't know how to solve the "1 vs 1 vs 1" game yet, but I will work on that in the background. If I find proper general rules, all problems will be solved. :)

I think that it's just unsolvable. The rating is for matches with exactly two teams, not three or more.


Wanted to save the world, then I got widetracked

king_of_nowhere
Joined: 2014-09-15, 18:35
Posts: 1668
Ranking
One Elder of Players
Posted at: 2019-07-06, 02:47

einstein13 wrote:

More points means better in glicko.
For simplification I think a global average would work pretty good at first.

So I will prepare some calculations for teams to see how it works. Hope I will find some time next week.

Please answer some questions:

  • How many points do the best players have with this system (for ELO I know it is around 2800)?
  • What initial rating will we apply here?
  • What is the lowest rating I should consider (I mean the worst player here)?

The answers don't have to be very accurate. I need them for examples only. :)

assuming the same algorithms used in chess are used here, then it is a logarithmic scale. 350 points of difference mean that the stronger player is expected to score roughly 90% against the weaker. In chess the scale goes for almost ten orders of magnitude. It has been speculated that god (perfect game) would be somewhere around 3500. it would be impossible to go above that, as it would be impossible to score over 50% against another 3500 (all games would be draws, unless white can force a win somehow, then white would win all games).

Widelands is an entirely different game, and there is no prediction what limits it could have. On one hand, there is a random factor that limits what pure skill can consistently accomplish. On the other hand, it is possible to win 1v2 against decent opponents, while in chess that would be unthinkable even in the case of the world champion against regular club players.

I think worldsavior is pretty close to the strongest possible. I should be 200 to 300 points below him (25% score is 200 points of difference; I did win one game against ws, lost 3 or 4). The next top players (mars, einstein) should again be around 200-300 points below that. the weaker players could easily be 1000 points below
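For reference, the point gaps quoted above can be checked against the standard logistic Elo expectation formula. This is only a sketch: it assumes the plain Elo curve with the usual 400-point scale, and Glicko would additionally weight the difference by the opponent's rating deviation.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Expected score of a player rated `rating_diff` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def diff_for_score(score: float) -> float:
    """Rating difference that yields the given expected score (0 < score < 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


print(round(expected_score(350), 3))   # 0.882
print(round(expected_score(-200), 3))  # 0.24
print(round(diff_for_score(0.9)))      # 382
```

So under this formula 350 points correspond to about 88%, an exact 90% needs roughly 380 points, and a 200-point deficit gives about 24%, close to the 25% used in the estimate.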

Different ladders

It seems everyone here is voting for 2vs2, 3vs3, and 4vs4 ladder too. So why not test with that.
(...)
* At least 4 page for each 1vs1, 2vs2, 3vs3, 4vs4

I disagree with that, but if most of us wants it, why not?

it seems everyone wants different ladders? strange, I have quite the opposite impression. And team games, especially above 2v2, are too rare to have a dedicated ladder.

It is also true, though, that worldsavior would defeat most players 1v2, so taking a simple average of the ranks doesn't really work.

Or maybe it does? by my estimates before, we have some 2000 points between the strongest and weakest players. two players in the middle would average like ws and the very worst player.

But then, it is possible to do 1v7 against AI. so no, it does not work.
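A tiny illustration of what a plain average does and does not capture. The rating numbers are made up for the example (a 2000-point spread around 1500), and `team_rating_mean` is just a hypothetical helper name.

```python
from statistics import mean


def team_rating_mean(ratings):
    """Simple arithmetic mean of the team members' ratings."""
    return mean(ratings)


strongest_plus_weakest = [2500, 500]  # illustrative numbers only
two_average_players = [1500, 1500]
seven_ai_players = [600] * 7          # assumed rating for an AI player

# The mean cannot tell these two teams apart ...
print(team_rating_mean(strongest_plus_weakest), team_rating_mean(two_average_players))
# ... and it ignores team size: seven AIs rate the same as one AI.
print(team_rating_mean(seven_ai_players))
```

Any team formula meant to cover 1v2 or 1v7 games would therefore need some term for the number of players, which the plain mean lacks.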


WorldSavior
Joined: 2016-10-15, 04:10
Posts: 2091
OS: Linux
Version: Recent tournament version
Ranking
One Elder of Players
Location: Germany
Posted at: 2019-07-07, 16:21

king_of_nowhere wrote:

I think worldsavior is pretty close to the strongest possible.

Thanks

On the other hand, it is possible to win 1v2 against decent opponents, while in chess that would be unthinkable even in the case of the world champion against regular club players.

What would such a chess match look like? 16 pieces vs 32, on which kind of board, which moving rules?



BoeseKaiser
Joined: 2019-02-21, 11:03
Posts: 41
Ranking
Pry about Widelands
Posted at: 2019-07-07, 19:39

On the other hand, it is possible to win 1v2 against decent opponents, while in chess that would be unthinkable even in the case of the world champion against regular club players.

What would such a chess match look like? 16 pieces vs 32, on which kind of board, which moving rules?

https://en.wikipedia.org/wiki/Three-player_chess

I played that a couple of times but I find it rather uninteresting. I'd like seeing a GM (preferably Eric Hansen) play against 2 random opponents though.


trimard
Topic Opener
Joined: 2009-03-05, 22:40
Posts: 230
Ranking
Widelands-Forum-Junkie
Location: Paris
Posted at: 2019-07-08, 00:51

Nice!!

Ok, so integrating what's just been said:

Storing info of a game

Must contain:

  • unique hash
  • info of both players
    • their tribes
    • rating at start of the game? (Harder to implement though?)
    • standard deviation, while we're at it
  • status (currently being played, in pause for reschedule, in pause for connection problem)
  • number of interruptions by player:
    • player1
    • player2
    • playerN (if needed)
  • result for:
    • player1
    • player2
    • playerN (if needed)
  • Time: When started
  • Time: When ended
  • Time: Last update
  • How many parts of the game (more if the game was interrupted)
  • How long the game lasted:
    • Gametime (all the parts added together)
    • Real time
  • Substatus: win by surrender, win by manual arbiter, ...
  • Widelands version played
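The list above maps fairly directly onto a record type. The sketch below is one possible shape; all class and field names are invented for illustration, and the gametime of an interrupted game is taken as the sum of its parts, as suggested.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class PlayerEntry:
    name: str
    tribe: str
    rating_at_start: Optional[float] = None
    deviation_at_start: Optional[float] = None
    interruptions: int = 0
    result: Optional[str] = None  # e.g. "win", "loss", "tie"


@dataclass
class GameRecord:
    game_hash: str
    players: List[PlayerEntry]
    status: str                   # playing / paused for reschedule / ...
    started_at: datetime
    widelands_version: str
    ended_at: Optional[datetime] = None
    last_update: Optional[datetime] = None
    substatus: Optional[str] = None      # e.g. "win by surrender"
    real_time: Optional[timedelta] = None
    part_gametimes: List[timedelta] = field(default_factory=list)

    @property
    def parts(self) -> int:
        """Number of parts the game was played in."""
        return len(self.part_gametimes)

    @property
    def total_gametime(self) -> timedelta:
        """Gametime of all parts added together."""
        return sum(self.part_gametimes, timedelta())
```

Keeping the per-part gametimes rather than only the total makes it possible to recompute things later if the rules change.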

Different ladders

Ok sorry, I misunderstood what you guys said, so it would be more like:

  • 1vs1 ranked
  • All modes ranked
  • All modes unranked
  • What about a 2vs2 ranked? (If that one doesn't work it's sure the higher number won't either, so it could be a good test)

So are you guys ok with the idea of having selected maps and modes for 1vs1 and 2vs2? We could hold a simple vote on the forum to select the maps and modes that are proposed. There is a voting feature on the website IIRC

Interface

I suggest one page (django view) where you have several tabs and you pick what you want to display. I even can think about the model where stats will be stored:

Yes perfect!

Points and limit

For the extremes, I think it really depends on what psychological aspect we want to take into account. At some point, you don't want to give negative points to a player. I mean, seriously. If that happens to me, I can guarantee I'll stop playing right there.

So let's at least set a floor of 600, if that seems fair in other games!

assuming the same algorithms used in chess are used here, then it is a logarithmic scale.

From what I've understood, it's exactly as you described. But I can't guarantee that the same scale applies. No idea TBH. We should run some test calculations. I think the end of the tournament would make a good starting data set, btw! Maybe we could use the old ones too for testing.
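For such test calculations, a single-player rating update could look like the sketch below. It assumes the original Glicko (Glicko-1) formulas as I understand them from Glickman's published description; the thread has not settled on a variant, the function names are mine, and the step that inflates RD over idle time is left out.

```python
import math

Q = math.log(10) / 400


def g(rd: float) -> float:
    """Weight that shrinks the impact of opponents with uncertain ratings."""
    return 1 / math.sqrt(1 + 3 * Q ** 2 * rd ** 2 / math.pi ** 2)


def expected(r: float, r_j: float, rd_j: float) -> float:
    """Expected score against an opponent rated r_j with deviation rd_j."""
    return 1 / (1 + 10 ** (-g(rd_j) * (r - r_j) / 400))


def glicko1_update(r: float, rd: float, results):
    """Return (new_rating, new_rd) after one rating period.

    `results` is a list of (opponent_rating, opponent_rd, score) tuples,
    with score 1 for a win, 0.5 for a draw and 0 for a loss.
    """
    d2_inv = 0.0  # 1 / d^2
    delta = 0.0   # sum of g_j * (s_j - E_j)
    for r_j, rd_j, s in results:
        e = expected(r, r_j, rd_j)
        d2_inv += Q ** 2 * g(rd_j) ** 2 * e * (1 - e)
        delta += g(rd_j) * (s - e)
    denom = 1 / rd ** 2 + d2_inv
    return r + Q / denom * delta, math.sqrt(1 / denom)


# A 1500-rated player (RD 200) beats a 1400 and loses to a 1550 and a 1700.
games = [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
new_r, new_rd = glicko1_update(1500, 200, games)
print(round(new_r), round(new_rd, 1))
```

Feeding the tournament results through a function like this, game by game or per round, would show quickly whether the resulting spread looks like the estimates in this thread.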

Huh, ok I'll try in the next few days!

Unfortunately I don't know how to solve the "1 vs 1 vs 1" game yet, but I will work on that in the background. If I find proper general rules, all problems will be solved. :)

If you manage that would be awesome. I have no idea about that.^^

Edited: 2019-07-08, 00:54

einstein13
Joined: 2013-07-29, 00:01
Posts: 1118
Ranking
One Elder of Players
Location: Poland
Posted at: 2019-07-08, 02:01

Hey!

I was able to make a first attempt at the 2 vs 2 game problem, with small calculations for the 1 vs 2 problem too. You can find everything in this file:
http://wuatek.no-ip.org/~rak/widelands/docs/MultiplayerRatingSystem/MultiplayerRatingSystem_0.1.pdf
(also available on my site)

A standard arithmetic average is not enough for all possibilities. For almost equal ratings - yes. For bigger differences - no. Unfortunately the equations seem to be complicated (but for a computer they are quite easy and they don't rely on any complex functions!). The next step is to make a general recipe for any two-team game. Calculating the Standard Deviation from the equations should be another step to finish this part. I am sorry, but any further steps would be more complex than this one :( .

Storing info of a game (...)

What about the map the players took? I think that it is not needed for ratings, but it can be useful to see which maps are used the most and what positions are better than others. Just an idea :) .


