
Topic: Improving the AI

Tibor

Joined: 2009-03-23, 23:24
Posts: 1377
Ranking
One Elder of Players
Location: Slovakia
Posted at: 2020-04-26, 15:54

Currently each AI player calculates its score every 10 minutes and prints it to the log output. So the training script parses the game output and looks for the latest score for each player.
If it were in Lua, I can imagine calling Lua once (for each AI player) before the end of the game and then quitting the game.
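
The log-parsing step described above could look roughly like the following. This is only a minimal sketch: the log line format (`AI player N score: X`) and the function name are assumptions made for illustration, not the actual Widelands output or training script.

```python
import re

# Hypothetical log line format; the real AI log output may look different.
SCORE_LINE = re.compile(r"AI player (?P<player>\d+) score: (?P<score>-?\d+)")


def latest_scores(log_text):
    """Return {player_number: last score found in the log} (hypothetical helper)."""
    scores = {}
    for line in log_text.splitlines():
        match = SCORE_LINE.search(line)
        if match:
            # Later lines overwrite earlier ones, so the latest score wins.
            scores[int(match.group("player"))] = int(match.group("score"))
    return scores


sample_log = """\
AI player 1 score: 120
AI player 2 score: 95
some unrelated game output
AI player 1 score: 340
AI player 2 score: -15
"""

print(latest_scores(sample_log))
```

Since scores are printed every 10 minutes, only the last matching line per player matters, which is why the dictionary is simply overwritten.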


teppo

Joined: 2012-01-30, 09:42
Posts: 423
Ranking
Tribe Member
Posted at: 2020-04-26, 17:39

Tibor wrote:

Currently each AI player calculates its score every 10 minutes and prints it to the log output. So the training script parses the game output and looks for the latest score for each player. If it were in Lua, I can imagine calling Lua once (for each AI player) before the end of the game and then quitting the game.

Why not call the lua scorer once every 10 minutes? It cannot be too inefficient for that.

I guess that when mutating genes, some AIs fail early. On a seafaring map, for example, one could set training waypoints, like requiring a harbor and ships within some time, else Darwin award.

hessenfarmer wrote:

Branch / Pull request is ready for review. It would be good if somebody other than Tibor and me could test this.

What kind of behavior should one observe in that branch?


Tibor

Joined: 2009-03-23, 23:24
Posts: 1377
Ranking
One Elder of Players
Location: Slovakia
Posted at: 2020-04-26, 17:48

teppo wrote:

Why not call the lua scorer once every 10 minutes? It cannot be too inefficient for that.

Because you need just one score for each player. In C++ the frequency is hardcoded, as you don't know the duration of games beforehand.

I guess that when mutating genes, some AIs fail early. On a seafaring map, for example, one could set training waypoints, like requiring a harbor and ships within some time, else Darwin award.

For this purpose you can set shorter training games, like 90 minutes, and evaluate the best-performing player then.


hessenfarmer
Joined: 2014-12-11, 23:16
Posts: 2646
Ranking
One Elder of Players
Location: Bavaria
Posted at: 2020-04-26, 17:51

teppo wrote:

Tibor wrote:

Currently each AI player calculates its score every 10 minutes and prints it to the log output. So the training script parses the game output and looks for the latest score for each player. If it were in Lua, I can imagine calling Lua once (for each AI player) before the end of the game and then quitting the game.

Why not call the lua scorer once every 10 minutes? It cannot be too inefficient for that.

I guess that when mutating genes, some AIs fail early. On a seafaring map, for example, one could set training waypoints, like requiring a harbor and ships within some time, else Darwin award.

Sounds promising.

hessenfarmer wrote:

Branch / Pull request is ready for review. It would be good if somebody other than Tibor and me could test this.

What kind of behavior should one observe in that branch?

  1. The AI should not waste its whole power until it has no soldiers left to attack with.
  2. It should not attempt completely useless attacks (1 soldier against a building with at least 2 soldiers) - although this is not fully trained.
  3. AI behaviour in general should not be worse than in current trunk.
  4. Any observations of weird AI behaviour are welcome, even if they may be unrelated.

To test, I'd recommend watching the AI in a local multiplayer game as an observer, which means filling all slots of a map with AIs or closing them.


teppo

Joined: 2012-01-30, 09:42
Posts: 423
Ranking
Tribe Member
Posted at: 2020-04-26, 18:49

Tibor wrote:

Why not call the lua scorer once every 10 minutes? It cannot be too inefficient for that.

Because you need just one score for each player. In C++ the frequency is hardcoded, as you don't know the duration of games beforehand.

I still do not quite get it. The Lua scorer could, among its output, return a boolean like "final score (true) / undecided, go on please (false)". Done like that, you do not need to know the duration beforehand and still get only one score for each player, and do not run longer than necessary. The interface between the Lua scorer and the C++ part has not been discussed yet. What kind of value should the final score be: relative to other players in this round, or some universal performance goodness number?
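
The idea above, a scorer that is polled periodically and reports a "final" flag together with the score, might be sketched like this. Everything here is hypothetical: the function names, the `(score, final)` return shape, the time limit and the toy scorer are assumptions for illustration only; no such interface has been agreed on.

```python
from typing import Callable, Optional, Tuple


def run_until_final(
    scorer: Callable[[int], Tuple[int, bool]],
    interval_minutes: int = 10,
    max_minutes: int = 240,
) -> Tuple[int, Optional[int]]:
    """Poll a (hypothetical) scorer until it reports a final score.

    Returns (elapsed game minutes, last score seen).
    """
    elapsed = 0
    score = None
    while elapsed < max_minutes:
        elapsed += interval_minutes  # stand-in for letting the game run on
        score, final = scorer(elapsed)
        if final:
            break  # only this one score is used for the player
    return elapsed, score


def toy_scorer(minutes: int) -> Tuple[int, bool]:
    # Toy stand-in: score grows with time, declared final after 60 minutes.
    return minutes * 3, minutes >= 60


print(run_until_final(toy_scorer))
```

With this shape the caller does not need to know the game duration in advance; the hard limit only guards against a scorer that never reports a final score.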


Tibor

Joined: 2009-03-23, 23:24
Posts: 1377
Ranking
One Elder of Players
Location: Slovakia
Posted at: 2020-04-26, 19:04

teppo wrote:

Tibor wrote:

Why not call the lua scorer once every 10 minutes? It cannot be too inefficient for that.

Because you need just one score for each player. In C++ the frequency is hardcoded, as you don't know the duration of games beforehand.

I still do not quite get it. The lua scorer could, among its output, return a boolean like "final score (true) / undecided, go-on please (false)". Done like that, you do not know the duration beforehand and still get only one score for each player, and do not run longer than necessary.

Usually 4 AI players, each with its own mutations, are playing. Even if we decide that one or two players got stuck, the remaining ones can go on. Also, "stuck" players can get various scores too.

The interface between the Lua scorer and the C++ part has not been discussed yet. What kind of value should the final score be: relative to other players in this round, or some universal performance goodness number?

Well, scores are just relative. You run 12 games (in my case) with 4 AIs in each, and for each position you pick the one AI with the highest score and generate 4 new wai files for the next generation. One of the reasons for this is that positions are not equal on most maps.
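
The per-position selection described above could be sketched as follows. This is a minimal illustration only: the data layout, the AI identifiers and the function name are assumptions, and generating the mutated wai files for the next generation is not shown.

```python
def best_per_position(games):
    """Pick the highest-scoring AI for each starting position.

    games: list of dicts mapping position -> (ai_id, score), one dict per game.
    Returns a dict mapping position -> (ai_id, score) of the winner.
    """
    best = {}
    for game in games:
        for position, (ai_id, score) in game.items():
            # Scores are only compared within the same position,
            # because positions are not equal on most maps.
            if position not in best or score > best[position][1]:
                best[position] = (ai_id, score)
    return best


# Two toy games instead of 12; the identifiers are made up.
games = [
    {1: ("g0_p1", 340), 2: ("g0_p2", 95), 3: ("g0_p3", 210), 4: ("g0_p4", 50)},
    {1: ("g1_p1", 120), 2: ("g1_p2", 400), 3: ("g1_p3", 210), 4: ("g1_p4", 75)},
]

for position, (ai_id, score) in sorted(best_per_position(games).items()):
    print(position, ai_id, score)
```

On a tie the AI seen first is kept, which is an arbitrary choice in this sketch.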


hessenfarmer
Joined: 2014-12-11, 23:16
Posts: 2646
Ranking
One Elder of Players
Location: Bavaria
Posted at: 2020-04-28, 09:54

Just bumping this up again.

It would be really nice if someone, or even several people, could test the branch.
The Windows build can be found here: https://ci.appveyor.com/project/widelands-dev/widelands/builds/32450333/job/tku5xbaqdrvnkat7/artifacts
The git branch is here: https://github.com/widelands/widelands/tree/AI_attack_fixes

Without proper testing this will not make its way into b21. If we can get some feedback, we can probably still have it in.

Test goals are:

  1. The AI should not waste its whole power until it has no soldiers left to attack with.
  2. It should not attempt completely useless attacks (1 soldier against a building with at least 2 soldiers) - although this is not fully trained.
  3. AI behaviour in general should not be worse than in current trunk.
  4. Any observations of weird AI behaviour are welcome, even if they may be unrelated.

The best approach would be to run a four-player AI game (using each existing tribe) on a reasonable map (I normally use Full Moon) and compare the behaviour of the AI in current trunk (or any older version available) to the one in the branch, with a focus on attack management.


hessenfarmer
Joined: 2014-12-11, 23:16
Posts: 2646
Ranking
One Elder of Players
Location: Bavaria
Posted at: 2020-05-01, 11:28

Changes got into trunk. Training is underway. Thanks to all supporters in this task.

After b22 I'll have a look into the other open points. Observations for this are very welcome.

