
Topic: AI-Training

Tibor

Joined: 2009-03-23, 23:24
Posts: 1377
Ranking
One Elder of Players
Location: Slovakia
Posted at: 2019-04-28, 11:03

The current game code already has one scoring function hardcoded: https://bazaar.launchpad.net/~widelands-dev/widelands/trunk/view/head:/src/ai/ai_help_structs.cc#L562

I use this when I train the AI automatically. However, I frequently tinker with the scoring algorithm, because it can lead to stupid outcomes. For example, if military strength dominates the score, the AI learns not to attack at all, because attacking decreases the final score...
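A toy illustration of that failure mode, assuming a simple weighted-sum score. This is not the code in ai_help_structs.cc; the field names, weights and game results below are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class GameResult:
    # All fields are hypothetical end-of-game measurements.
    military_strength: int
    land_owned: int
    buildings: int


def score(result, w_military=10.0, w_land=1.0, w_buildings=1.0):
    """Weighted sum of end-of-game statistics (illustrative only)."""
    return (w_military * result.military_strength
            + w_land * result.land_owned
            + w_buildings * result.buildings)


# A player that never attacked keeps all its soldiers.
passive = GameResult(military_strength=40, land_owned=300, buildings=60)
# A player that attacked lost soldiers but conquered land and buildings.
attacker = GameResult(military_strength=25, land_owned=380, buildings=70)

# With a dominant military weight the passive player scores higher,
# so selection would favour DNA that avoids attacking.
dominant_favours_passive = score(passive) > score(attacker)

# With a smaller military weight the attacker comes out ahead.
balanced_favours_attacker = (score(attacker, w_military=2.0)
                             > score(passive, w_military=2.0))
```

With these made-up numbers, lowering the military weight is enough to flip which behaviour gets rewarded, which is the kind of tinkering described above.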

But when you are training manually, you don't need any formal score, of course...

I have about 10 different maps and mix them randomly. Yes, training on one map can "untrain" a skill learned on another map, but this cannot be helped. Every training run modifies the AI DNA only minimally, so it evolves a bit chaotically but generally improves.

Also note that I train each map for each tribe, so 10 maps means 40 combinations. If you run 15 games for each combination, you need 600 games to get one new generation from every map x every tribe. And from an evolutionary point of view, that is not much...
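The arithmetic can be spelled out in a few lines. The map names are placeholders, and the four tribe names are an assumption consistent with "10 maps means 40 combinations".

```python
from itertools import product

maps = [f"map_{i}" for i in range(1, 11)]  # placeholder names for the 10 maps
tribes = ["barbarians", "empire", "atlanteans", "frisians"]  # assumed 4 tribes
games_per_combination = 15

# Every map is paired with every tribe.
combinations = list(product(maps, tribes))
total_games = len(combinations) * games_per_combination

print(len(combinations), total_games)
```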


WorldSavior
Joined: 2016-10-15, 04:10
Posts: 2091
OS: Linux
Version: Recent tournament version
Ranking
One Elder of Players
Location: Germany
Posted at: 2019-04-28, 18:48

Tibor wrote:

The current game code already has one scoring function hardcoded: https://bazaar.launchpad.net/~widelands-dev/widelands/trunk/view/head:/src/ai/ai_help_structs.cc#L562

I use this when I train the AI automatically. However, I frequently tinker with the scoring algorithm, because it can lead to stupid outcomes. For example, if military strength dominates the score, the AI learns not to attack at all, because attacking decreases the final score...

Sounds very problematic.

But when you are training manually, you don't need any formal score, of course...

I have about 10 different maps and mix them randomly. Yes, training on one map can "untrain" a skill learned on another map, but this cannot be helped. Every training run modifies the AI DNA only minimally, so it evolves a bit chaotically but generally improves.

Is this really better than training the AI on one very hard map which contains most of the possible problems? It sounds very inefficient that the map gets changed all the time.

Also note that I train each map for each tribe, so 10 maps means 40 combinations. If you run 15 games for each combination, you need 600 games to get one new generation from every map x every tribe. And from an evolutionary point of view, that is not much...

It also sounds inefficient that the tribe has to be changed all the time. Why not train with Frisians only, as they are the hardest tribe?


Wanted to save the world, then I got widetracked

teppo

Joined: 2012-01-30, 09:42
Posts: 423
Ranking
Tribe Member
Posted at: 2019-04-28, 20:06

WorldSavior wrote:

I use this when I train the AI automatically. However, I frequently tinker with the scoring algorithm, because it can lead to stupid outcomes. For example, if military strength dominates the score, the AI learns not to attack at all, because attacking decreases the final score...

Sounds very problematic.

I tried manual training, just to understand this discussion better. After a few hundred hours, the players were not attacking each other, which is bad in Autocrat.

Also note that I train each map for each tribe, so 10 maps means 40 combinations. If you run 15 games for each combination, you need 600 games to get one new generation from every map x every tribe. And from an evolutionary point of view, that is not much...

It also sounds inefficient that the tribe has to be changed all the time. Why not train with Frisians only, as they are the hardest tribe?

The AI might then forget to build wineries before marble runs out, for example?

Nordfriese wrote:

On the other hand, the default wai files will be improved frequently, but the map's wai replacement files probably won't be. At some time the default AI might then perform better than the map's. I'm against map-specific AI files as it would require too much maintenance to be efficient…

This would, of course, need a way to detect and ignore outdated map-specific WAI files: updating all of them would clearly be impossible.

I still think that map-specific AI files would be good, but the penalties are real.


WorldSavior
Joined: 2016-10-15, 04:10
Posts: 2091
OS: Linux
Version: Recent tournament version
Ranking
One Elder of Players
Location: Germany
Posted at: 2019-04-28, 20:54

teppo wrote:

WorldSavior wrote:

I use this when I train the AI automatically. However, I frequently tinker with the scoring algorithm, because it can lead to stupid outcomes. For example, if military strength dominates the score, the AI learns not to attack at all, because attacking decreases the final score...

Sounds very problematic.

I tried manual training, just to understand this discussion better. After a few hundred hours, the players were not attacking each other, which is bad in Autocrat.

Yes, very bad. Hundreds of hours of training or hundreds of hours in a game?

Also note that I train each map for each tribe, so 10 maps means 40 combinations. If you run 15 games for each combination, you need 600 games to get one new generation from every map x every tribe. And from an evolutionary point of view, that is not much...

It also sounds inefficient that the tribe has to be changed all the time. Why not train with Frisians only, as they are the hardest tribe?

The AI might then forget to build wineries before marble runs out, for example?

One vineyard (why not two?) and one winery are part of the basic economy, which the AI should always build. I don't know whether training allows the AI to ignore that basic economy. (?)

Edit: And by the way, afaik the AI is able to handle the input and output of its buildings (theoretically). So why shouldn't the AI, in case it becomes strong, be able to recognize that it has a huge demand for marble input (construction sites), and that it needs wine output because of this?

Edited: 2019-04-28, 21:06

Wanted to save the world, then I got widetracked

Tibor

Joined: 2009-03-23, 23:24
Posts: 1377
Ranking
One Elder of Players
Location: Slovakia
Posted at: 2019-04-28, 21:19

Training is not about duration; usually 2 hours for a small map and up to 5 hours for a big map is enough. It is about the number of AI players to pick from. For example, if you have 50 players and only 2 of them do something critical, you use their DNA for the next round, and hopefully the portion of AI players doing that desired thing will be higher. And repeat and repeat...
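A minimal sketch of one such round, assuming the DNA is just a list of integers and that the trainer has already picked the interesting players by hand. The function name, mutation rate and step size are invented for the example and do not come from the Widelands code.

```python
import random


def next_generation(population, selected, mutation_rate=0.05, step=1, rng=None):
    """Build a new population from the DNA of the selected players only."""
    if rng is None:
        rng = random.Random()
    parents = [population[i] for i in selected]
    children = []
    for _ in range(len(population)):
        child = list(rng.choice(parents))  # copy one parent's DNA
        for gene in range(len(child)):
            # Mutate only a few genes, and only slightly.
            if rng.random() < mutation_rate:
                child[gene] += rng.choice((-step, step))
        children.append(child)
    return children


rng = random.Random(0)
# 50 AI players, each with 8 hypothetical "magic numbers".
population = [[rng.randint(-100, 100) for _ in range(8)] for _ in range(50)]
# Players 3 and 17 did the desired thing, so only their DNA is carried over.
children = next_generation(population, selected=[3, 17], rng=rng)
```

Because every child starts as a copy of one of the two chosen players, the desired behaviour should be more common in the next round, while the small mutations keep some variety to pick from.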


hessenfarmer
Joined: 2014-12-11, 23:16
Posts: 2646
Ranking
One Elder of Players
Location: Bavaria
Posted at: 2019-04-28, 22:08

From the discussion it seems that there might be some misconceptions about how the AI works and how this relates to training.
The AI is not built to simulate a human player. It works in cycles in which it always evaluates the same things according to the defined rules and their weighting factors.
Tribe-specific special cases are reflected in the AI hints of the tribe, which lead to different weighting or to overruling some evaluations.
The DNA of an AI consists of the weighting factors used (magic numbers) and some neurons which decide which evaluations are relevant.
In every cycle the AI checks a couple of building spots for the buildings which can be built there. Afterwards it evaluates the necessity of each possible building and calculates a score. If some score is high enough, the building gets built. Of course this is just a brief summary of what is going on. However, due to the vast amount of numbers and their range, it is hard to find optimal values, so this takes a lot of time / evolutions. On the other hand, the improvement over the last iterations has been very impressive. The risk in training the AI lies in choosing the scoring used to promote a generation. In the beginning it was probably about solving specific issues; now it should be a more general view, but this is harder to evaluate than improvement in one specific issue.
From my point of view, at least, the risk of losing some abilities is quite low, as every "magic number" affects just a single aspect, so improving on one issue does not necessarily affect any other aspect of the decisions.
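The cycle summarized above could be sketched roughly like this. It is a strong simplification under stated assumptions: the two features, the weight names and the threshold are hypothetical, the real AI evaluates far more inputs, and the neurons are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    building: str
    spot: tuple          # (x, y) of the building spot
    need: float          # how badly the economy needs this building
    spot_quality: float  # how well the spot suits the building


def decide(candidates, weights, threshold):
    """Return the best-scoring candidate above the threshold, or None."""
    best, best_score = None, threshold
    for c in candidates:
        # The weights play the role of the trained "magic numbers".
        s = weights["need"] * c.need + weights["spot_quality"] * c.spot_quality
        if s > best_score:
            best, best_score = c, s
    return best


weights = {"need": 3.0, "spot_quality": 1.0}
candidates = [
    Candidate("lumberjack", (10, 4), need=5, spot_quality=8),
    Candidate("quarry", (12, 7), need=9, spot_quality=2),
    Candidate("tavern", (3, 3), need=1, spot_quality=4),
]

choice = decide(candidates, weights, threshold=20)      # the quarry wins here
nothing = decide(candidates, weights, threshold=50)     # no score is high enough
```

In this picture, training only changes the numbers in `weights` (and the threshold), not the evaluation itself, which is why a change to one number tends to affect one kind of decision.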

Wow, a lot of text and still so much left unexplained. Hope this helps anyway. At least it should explain why things are as they are and that any attempt to improve one special ability might not work. It also means that map-specific wai files might improve the AI on a specific map, but would increase the training effort exponentially. If at all, this should only be attempted after we get stuck with the general AI training.
One idea might be to use the server for AI training, as the server is capable of compiling and running Widelands as far as I know.
Another idea might be to generate a map of all magic numbers versus the decisions they affect (I always wanted to start this but had no time) to get a clue where to kickstart / boost some improvement in mutation.

Edited: 2019-04-28, 22:10

Tibor

Joined: 2009-03-23, 23:24
Posts: 1377
Ranking
One Elder of Players
Location: Slovakia
Posted at: 2019-04-28, 22:28

For training you don't need the ability to compile. However, on Linux for example, you need an X server or the game will not even start...


Tibor

Joined: 2009-03-23, 23:24
Posts: 1377
Ranking
One Elder of Players
Location: Slovakia
Posted at: 2019-04-29, 08:39

WorldSavior wrote:

Is this really better than training the AI on one very hard map which contains most of the possible problems? It sounds very inefficient that the map gets changed all the time.

I think it is close to impossible to come up with such a map. I think the AI needs to be trained on different terrain layouts, with different distances to enemies, and so on...

It also sounds inefficient that the tribe has to be changed all the time. Why not train with Frisians only, as they are the hardest tribe?

The same as with maps. Tribes differ and the AI needs to be tribe-agnostic, so why pick just one tribe?

BUT - not every map x tribe combination needs to be weighted equally; you can set up the training environment so that 50% of training games are with Frisians and 50% are on difficult maps. My collection of maps tries to be balanced, containing various terrains in reasonable proportions.
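One way such a biased setup could look, as a sketch only. It reads the two 50% shares as independent draws, which is an assumption; the map names, the "difficult" subset and the function name are placeholders.

```python
import random


def pick_setup(rng, maps, hard_maps, tribes, frisian_share=0.5, hard_map_share=0.5):
    """Pick (map, tribe) for one training game with biased proportions."""
    if rng.random() < frisian_share:
        tribe = "frisians"
    else:
        tribe = rng.choice([t for t in tribes if t != "frisians"])
    if rng.random() < hard_map_share:
        game_map = rng.choice(hard_maps)
    else:
        game_map = rng.choice([m for m in maps if m not in hard_maps])
    return game_map, tribe


maps = [f"map_{i}" for i in range(1, 11)]   # placeholder map names
hard_maps = ["map_8", "map_9", "map_10"]    # hypothetical "difficult" subset
tribes = ["barbarians", "empire", "atlanteans", "frisians"]

rng = random.Random(42)
# One setup per game, for a batch of 600 training games.
schedule = [pick_setup(rng, maps, hard_maps, tribes) for _ in range(600)]
```

Every map and every tribe still shows up in the schedule, so the AI stays tribe-agnostic while the harder cases get more training time.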


GunChleoc
Joined: 2013-10-07, 15:56
Posts: 3324
Ranking
One Elder of Players
Location: RenderedRect
Posted at: 2019-04-29, 08:54

teppo wrote:

It also sounds inefficient that the tribe has to be changed all the time. Why not train with Frisians only, as they are the hardest tribe?

The AI might then forget to build wineries before marble runs out, for example?

I have an idea on how we could give the AI some information on whether a building is involved in the production chain for a construction material. I need to get some other branches in first though before I can start working on this.


Busy indexing nil values

hessenfarmer
Joined: 2014-12-11, 23:16
Posts: 2646
Ranking
One Elder of Players
Location: Bavaria
Posted at: 2019-04-29, 09:31

Currently it recognizes shortfalls in the economy (high demand, low stock), then it looks for buildings that produce the missing ware. This generates demand for the inputs of the newly built building, which leads to building a building producing that input. However, this is somewhat inefficient. Therefore knowledge of the production chain might help. However, this would be a new decision (magic number) to be taken into account.
However, the described deadlock is dealt with in the basic economy concept, as this is defined as everything necessary to avoid deadlocks. This concept works very well, as you can see with the Frisians. They are the tribe with the most possible deadlocks and are handled very well by the AI. Additionally, construction materials are already identified by the AI. So almost everything is currently about tweaking the numbers, unless we identify a new aspect that the AI needs to evaluate. The difference between former times and now is that in the past these numbers were hardcoded and adjusted manually, while now they are trained over a lot of runs to see which numbers fit best.
I recently tried, for example, to tweak the numbers to get rid of the weird behaviour of dismantling military buildings near the frontier, but I only learned that this had already improved in the last training run (the related numbers were touched in this run), and the behaviour is now much better, although not ideal. This proves Tibor's concept of genetics. So from my perspective we just need processor capacity to do the training.
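To make the shortfall-driven behaviour from the first paragraph concrete, here is a toy model of it. The production data is a simplified, hypothetical table loosely based on the wine/marble example from this thread, not real game data, and the function names are invented.

```python
# Hypothetical, simplified production data: building -> (inputs, outputs).
BUILDINGS = {
    "vineyard": ((), ("grape",)),
    "winery": (("grape",), ("wine",)),
    "marble_mine": (("wine",), ("marble",)),
}


def producers_of(ware):
    """All buildings in the table that output the given ware."""
    return [b for b, (_, outputs) in BUILDINGS.items() if ware in outputs]


def plan_builds(stock, demand, existing):
    """Follow shortfalls one step per pass, as a stand-in for AI cycles."""
    built = set(existing)
    demand = dict(demand)
    planned = []
    progress = True
    while progress:
        progress = False
        for ware, wanted in list(demand.items()):
            if stock.get(ware, 0) >= wanted:
                continue  # no shortfall for this ware
            for building in producers_of(ware):
                if building not in built:
                    built.add(building)
                    planned.append(building)
                    # The new building creates demand for its own inputs,
                    # which is only noticed on a later pass.
                    for needed in BUILDINGS[building][0]:
                        demand[needed] = demand.get(needed, 0) + 1
                    progress = True
                    break
    return planned


plan = plan_builds(stock={"marble": 0}, demand={"marble": 5}, existing=[])
```

Each pass only reacts to the shortfall created by the previous one, so the chain is discovered one building at a time. Knowing the production chain up front would let all three buildings be requested at once.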

