AI Improvement thread

  • AI Improvement thread

    I can only hope Firaxis is reading this stuff and it's helping at all.

    Anyway, this may not be for everyone; I'm weird and enjoy programming and such. It may also get very complicated, because weighted feedback neural networks are not something you can just pick up.

    But here goes: My attempt at creating a meaningful dialog with the aim of improving the AI.

    I recently had an interesting idea presented to me (by Blake, I believe; I will correct this and give due credit). The idea is to have a number of "tiles of interest" that the AI thinks about. This limits the thought chain to a few hundred tiles instead of all 16,384 tiles on a 128*128 map.

    First, we can realise that not all the tiles are going to be of interest. Plain ocean tiles halfway around the map from me are of no interest, and neither are any tiles I can't influence.
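    To make the "tiles of interest" idea concrete, here is a rough sketch. It assumes that "tiles I can influence" can be approximated as tiles within a fixed number of steps of something I own. The names Tile, tiles_of_interest and reach are made up for illustration and are not part of the game's code, and the sketch ignores map wrapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    x: int
    y: int


def tiles_of_interest(tiles, my_positions, reach):
    """Keep only tiles within `reach` steps of a position I own (assumed proxy for influence)."""
    def distance(tile, pos):
        # Chebyshev distance: a diagonal step costs the same as a straight one.
        return max(abs(tile.x - pos[0]), abs(tile.y - pos[1]))

    return [t for t in tiles if any(distance(t, p) <= reach for p in my_positions)]


# A 128*128 map with two hypothetical cities of mine.
world = [Tile(x, y) for x in range(128) for y in range(128)]
interesting = tiles_of_interest(world, my_positions=[(10, 10), (100, 60)], reach=5)
print(len(world), len(interesting))
```

    With two cities and a reach of 5, the list drops from 16,384 tiles to 242, which is the "few hundred" range the idea is aiming for.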

    So, now that we have reduced the number of tiles we have to think about, we can start weighing the decisions. I think we can all agree that we are working toward a weighted feedback neural network, or "Learning AI". The problem I come to is feedback: how do we provide meaningful feedback?

    For example, if you stick your hand in a hole and feel a pain, you pull it out and make a decision about that hole. However, if you stick your hand in a hole and then three weeks later you get hives, you're going to wonder what happened. If we don't provide feedback in a relevant timeframe, we lose the association (action = failure).
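    One possible way to keep feedback tied to recent actions is to shrink the blame (or credit) an action receives the longer ago it happened. This is time-discounted credit assignment, a technique borrowed from reinforcement learning rather than anything settled in this thread. The function name, the action names and the discount value below are all assumptions for illustration.

```python
def assign_credit(actions, outcome, outcome_turn, discount=0.5):
    """Spread `outcome` over earlier actions, shrinking it with the turns elapsed."""
    credit = {}
    for name, turn in actions:
        if turn > outcome_turn:
            continue  # an action taken after the outcome cannot have caused it
        age = outcome_turn - turn
        credit[name] = credit.get(name, 0.0) + outcome * discount ** age
    return credit


# Hypothetical action log: (action name, turn it was taken).
actions = [("settle_near_rival", 10), ("build_warrior", 29), ("attack_city", 30)]

# A failure (outcome -1.0) is observed on turn 30.
credit = assign_credit(actions, outcome=-1.0, outcome_turn=30)
for name, value in credit.items():
    print(name, value)
```

    The attack on turn 30 takes the full penalty, the warrior built a turn earlier takes half, and the settling decision twenty turns back is barely touched. The obvious weakness is that an old decision that really was the cause gets let off, so the discount would need tuning per class of decision.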

    Also, since Civ is a game of several thousand components spread over several turns of play, if we wait too long we run the risk of punishing acceptable behavior simply because it occurred in the same game as unacceptable behavior. So in order to preprocess the AI, it must play several thousand games to weight the decisions properly. But that means that in order to compensate for a given user, it would have to play thousands more games to overcome the thousands it has already played.

    This is usually handled in a short-term/long-term feedback environment. The feedback is processed by the short-term memory with a rather severe weighting, but the weight has a decay value, so after each game the weight comes down: not all the way, but some. The same goes for the long-term memory, but with much smaller weight values. The long-term memory also has a transfer period and a transfer ratio: if the short-term memory value stays over the threshold for the transfer period, it gets added to the long-term memory according to the transfer ratio.

    This mimics short-term experiences and allows them to fade, while a trend will still influence long-term play.
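    Here is a minimal sketch of that short-term/long-term scheme for a single decision weight. It is my reading of the description above, not a reference implementation: the class name, the parameter names and every numeric default are assumptions, and I have assumed that long-term memory also receives each feedback signal directly with a much smaller gain.

```python
class DecisionMemory:
    """One decision's weight, split into short-term and long-term parts."""

    def __init__(self, short_gain=1.0, long_gain=0.1,
                 short_decay=0.5, long_decay=0.99,
                 threshold=0.5, transfer_period=3, transfer_ratio=0.25):
        self.short_gain, self.long_gain = short_gain, long_gain
        self.short_decay, self.long_decay = short_decay, long_decay
        self.threshold = threshold
        self.transfer_period = transfer_period
        self.transfer_ratio = transfer_ratio
        self.short = 0.0
        self.long = 0.0
        self.games_over_threshold = 0

    def feedback(self, signal):
        # Short-term memory reacts severely, long-term only slightly.
        self.short += self.short_gain * signal
        self.long += self.long_gain * signal

    def end_game(self):
        # Count consecutive games in which short-term memory stayed over the threshold.
        if abs(self.short) > self.threshold:
            self.games_over_threshold += 1
        else:
            self.games_over_threshold = 0
        # After a full transfer period, move a share of short-term into long-term.
        if self.games_over_threshold >= self.transfer_period:
            self.long += self.transfer_ratio * self.short
            self.games_over_threshold = 0
        # Both memories decay after each game, short-term much faster.
        self.short *= self.short_decay
        self.long *= self.long_decay

    def weight(self):
        return self.short + self.long


# One bad experience, followed by uneventful games.
one_off = DecisionMemory()
one_off.feedback(-1.0)
for _ in range(5):
    one_off.end_game()

# The same bad experience repeated in five games in a row (a trend).
trend = DecisionMemory()
for _ in range(5):
    trend.feedback(-1.0)
    trend.end_game()

print(one_off.short, one_off.long)
print(trend.short, trend.long)
```

    In the one-off case the short-term value fades to almost nothing and long-term memory is barely moved. In the trend case the short-term value stays over the threshold for three games running, so a share of it is transferred and the long-term value ends up well below the one-off case.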

    So who can take it from here?

    Any ideas on feedback mechanisms?

    Movement weight values?

    Classes of decision? (Unit movement, city build, GP use, peace/war, etc.)

    Tom P.