Announcement

Collapse
No announcement yet.

Testing the civ3 random number generator

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Testing the civ3 random number generator

    Since Hurricane mentioned in another thread what the basic pseudo-random number generator used by civ3 was, I decided to test it for 'streakiness'.

    First off, the exact algorithm wasn't revealed, but linear congruent generators all work in the same way AFAIK. My implementation was as follows (in C):

    unsigned int seed; /* global variable */

    unsigned int prng(void)
    {
    unsigned long long dummy;
    unsigned int tmp;

    dummy = ((unsigned long long) seed * 1103515245) + 12345;
    seed = dummy & 0xffffffff;
    tmp = seed & 0xffc00000;
    return(seed >> 22);
    }

    (on my machine, int is 32 bits, long long is 64 bits).

    The bottom 32 bits of 'dummy' are kept as the next seed, and the higest 10 bits of seed are returned are to generate a random number between 0 and 1023 inclusive. As I understand it, this is how the civ3 PRNG works. If any of the programmers wish to correct me...

    This PRNG generates a nice flat distribution, with all values being equally likely. The question for combat in civ3 is whether the generator is streaky - whether it has a tendency to produce say 3 or 4 low numbers in a row. Of course there are various tests that can be done on this, but I set it up to simulate 100,000,000 combats in civ3 between a 4 hp, 6.0 attack unit attacking a 4 hp 2.75 defense unit (i.e. an infantry attacking a fortified spearman across a river).

    The test is to compare the final hp's left for the winner vs. the distribution expected if the numbers were perfectly random. If there is any streakiness in the random numbers, then we expect to see an excess of results where one of the units is left on 4 hp, and a defecit of results where the surviving unit is left on 1hp (a fact which I confirmed by running 10,000,000 combats with a PRNG with streakiness deliberately inserted).

    The results for 100,000,000 battles fought (numbers given are the percentage chance of a given outcome).

    spearman
    hp left expected prob observed prob
    1 6.249 6.249 +/- 0.003
    2 4.551 4.550 +/- 0.002
    3 2.652 2.650 +/- 0.002
    4 0.966 0.967 +/- 0.001

    infantry
    hp left
    1 13.686 13.691 +/- 0.004
    2 21.829 21.824 +/- 0.005
    3 27.854 27.855 +/- 0.005
    4 22.214 22.215 +/- 0.005

    The uncertainties quoted in the observed probabilities are 1 sigma errors (meaning that you'd expect the 'expected' and 'observed' values to match within this error in roughly 2/3 of the cases).

    Pretty obviously, there is no significant deviation away from the prediction of purely random numbers (i.e. no streakiness). I also did a simulation with the spearman fighting 2 attackers in a row, which also showed no significant deviation from the expected distribution.

    So, assuming my version of the civ3 PRNG is accurate, it is pretty safe to say that there is no undue streakiness causing strange combat results in civ3; whatever streakiness there is in the generator doesn't show up in 100,000,000 combats (probably 20,000-200,000 games worth, depending on how often you go to war).


    So, if you are convinced that you are getting unusual results too often, then you'll have to chalk it up to either a bug or a deliberate cheat on the part of the programmers (or more plausbily, the fact that our 'natural' understanding of randomness *very* badly underestimates the number of long streaks of odd results you can get).

  • #2
    Very interesting. As I expected, the random number generator provides random numbers over a huge number of combats. The question is whether the generator is "streaky" over a much smaller number? Is there a tendency in a 50-50 proposition for the generator to assign clusters of hits to one side or another? With a very large n the numbers may appear random, but that doesnt mean there isnt clustering of the numbers.
    We need seperate human-only games for MP/PBEM that dont include the over-simplifications required to have a good AI
    If any man be thirsty, let him come unto me and drink. Vampire 7:37
    Just one old soldiers opinion. E Tenebris Lux. Pax quaeritur bello.

    Comment


    • #3
      Vulture,

      In the example you used, the results look correct.

      Have you tried with more equally matched attack and defensive strengths? And I see that you did not include the 10% terrain bonus the defender gets for just standing there.

      In the 8 months or so of game play, I have noticed there is a definite pattern to the combat results. There will be strings where the defending unit is down to it's last hp, while the attacking unit has 3 or 4 hp, and the attacking unit will die, so your next attacking unit will certainly kill the defender.

      In naval combat the results are totally predictable: if you attack a unit of equal defensive strength, 7 out 8 times, your unit dies, often without doing even 1 hp of damage. The defender always has that 10% defensive bonus, which apprears to make all the difference in the world.

      I am going to start recording the combat results as evidence of this, since it occurs at all difficulty levels. I think 10 games played to conclusion should generate about 200+ combat results per game and that should make for a significant sample.

      And you could very well be right that there is no bias in the combat when all factors are taken into account, but only when you take into account all of the combat between AI Civ's as well. Only seeing your own results may be the source of our bias on the randomness of the results.


      D.
      "Not the cry, but the flight of the wild duck,
      leads the flock to fly and follow"

      - Chinese Proverb

      Comment


      • #4
        Recording combat results seems a good thing. I'm sometimes awed by the results... how could have my Longbowman survived an attack of a Swodsman, for instance?
        Solver, WePlayCiv Co-Administrator
        Contact: solver-at-weplayciv-dot-com
        I can kill you whenever I please... but not today. - The Cigarette Smoking Man

        Comment


        • #5
          Originally posted by SpencerH
          Very interesting. As I expected, the random number generator provides random numbers over a huge number of combats. The question is whether the generator is "streaky" over a much smaller number? Is there a tendency in a 50-50 proposition for the generator to assign clusters of hits to one side or another? With a very large n the numbers may appear random, but that doesnt mean there isnt clustering of the numbers.
          En contraire, the point of this test was exactly to look at clustering on small scales. The fact that I used a lot of results may obscure this, but it's still true (unless I'm hopelessly wrong). The point is that I am simulating combat between 4 hp units. Streakiness in the range of 2-4 numbers will show up most strongly in this region.

          The first test I did, checking that each of the integers 0-1023 was generated with equal probability, is the test that your complaint would be valid for. More subtle defects in PRNGs produce e.g. a tendency to produce pairs of similar numbers, or a set of 3 numbers might be preferentially followed by a particular set of another 3 numbers (the sequence 576 431 972 might be followed by 543 1010 966 more often than expected by chance - and actually with a linear cogruent generator this is likely to be the case (although obviously not necessarily with the numbers I picked here)). The combat test here is sensitive to those kind of defects to some degree, at least to the kind where one unit will lose several hp in a row more often than you would expect.

          As I mentioned, I did a second test where the spearman fought infantry in two consecutive battles (to test the proposition that a spearman that wins one implausible battle (taking minimal damage) is quite likely to win the next one as well - something which in my games certainly feels as though it is true sometimes). The test showed that a spearman won against two consecutive attacks no more often than would be expected, and that the hp-distribution of the spearman in the battles it survived matched the predictions. If there was any streakiness that showed up at the level of a spearman winning two fights in a row, this would have been noticable (it would increase the number of times the spearman survived with 3 or 4 hp f'rinstance).

          The point is that any kind of clustering (except in high dimensions, where linear congruential generators are known to perform very badly) would show up in these tests. The higher dimensions don't show up exactly because I am testing clustering on shorter scales, by essentially studying the properties of small groups of random numbers (i.e. what I'm testing is what you suggest I should be testing).

          Comment


          • #6
            Originally posted by Gen.Dragolen
            Vulture,

            In the example you used, the results look correct.

            Have you tried with more equally matched attack and defensive strengths? And I see that you did not include the 10% terrain bonus the defender gets for just standing there.
            I used unbalanced stats because I wanted to test the proposition that streaks on unlikely results are over-represented; it seemed easiest to do that by having a scenario where one of the combat results was obviously more unlikely. And yes, I forgot one of the defensive bonuses, and gave 10% for terrain and 25% for fortifying, and that was it. Mea culpa.

            In the 8 months or so of game play, I have noticed there is a definite pattern to the combat results. There will be strings where the defending unit is down to it's last hp, while the attacking unit has 3 or 4 hp, and the attacking unit will die, so your next attacking unit will certainly kill the defender.
            There are bound to be some kinds of oddities in the combat system. The seed is 32 bit. This means that there are only 2^32 possible sequences of numbers (or less if the PRNG is not well chosen). I am told that the civ3 combat system generated numbers in the 0-1023 range. Now consider a sequence of 4 numbers in the range 0-1023 - the should be 2^40 possibilities. But our PRNG only has 2^32 possible states. What does this mean in practice? It means that if you look at the first three numbers in the sequence, there are only 4 possible values for the last number rather than 1024. For a typical fight, say involving 6 rounds of combat, there are only 2^32 different outcomes from the PRNG rather than the 2^60 that their ought to be if it was truly random. In practice for civ3, this doesn't make a huge difference it seems, since there are only 8 observable different outcomes to the infantry-spearman fight. And only 2^6 sequences of hp loss. So as long as the 2^32 PRNG states map more or less equally between these observable states, there is not a problem. Problems can arise when the limited number of strings that can actually be generated are not 'evenly' distributed amongst the theoritcally possible strings (in the example above, with only 1/256 of the possible 4 number sequence actually being produced by the PRNG, it is possible that the strings generated are clustered in certain parts of the phase space; hopefully the constants used in the PRNG are chosen to avoid this kind of clustering as much as possible).

            Determining just where the limitations of the PRNG start to show up isn't easy, which is why I did the simulation. I expect that if you look at the distribution of results of 4 or 5 successive combats you will start to see breakdowns occur, with some sequences of results being impossible. But I doubt that this is going to be anywhere near the kind of effect to produce the subjective impression of streakiness that many people seem to have (and as I mentioned elsewhere, I get it too, but I'm more inclined to consider it more a problem of perception than of the PRNG).

            Keeping notes on the combat results in games would be a good test, as long as you are very careful to decide before hand what criteria you are going to use for what to record. When I tried it I ended up only recording my attacks on the AI, 'cos I knew I could take the time to record what happened, while when you are the defender you don't always have time, and this probably introduces a bias towards certain types of result.

            BTW only 200 combats in a game? Are you some kind of raving pacifist or what!?

            Comment


            • #7
              Originally posted by vulture

              by essentially studying the properties of small groups of random numbers (i.e. what I'm testing is what you suggest I should be testing).

              100,000,000 combats is a small number ? Since my small experience with stats is in small numbers say 5-10 subjects/group with 3-4 groups maybe I have an advantage in looking at anomolies

              I'm thinking of clustering in this way, we're familiar with the concept that you can compare two groups and come up with a probability whether or not they are the same. In some cases the they can appear different but that difference may disappear as n becomes larger (and obviously the reverse occurs). So what I'm suggesting (and I may be out to lunch here) is that you run this test with 1000, 100K, 10M, 100M combats and look at the results in this way.
              We need seperate human-only games for MP/PBEM that dont include the over-simplifications required to have a good AI
              If any man be thirsty, let him come unto me and drink. Vampire 7:37
              Just one old soldiers opinion. E Tenebris Lux. Pax quaeritur bello.

              Comment


              • #8
                Vulture,

                I should have defined my terms better: to me "one combat" is one battle involving alot of units where I am on the offensive. And usually they involve sacking an AI Civ's city...

                You described exactly what I suspect happens with the random number generation function:

                "Problems can arise when the limited number of strings that can actually be generated are not 'evenly' distributed amongst the theoritcally possible strings (in the example above, with only 1/256 of the possible 4 number sequence actually being produced by the PRNG, it is possible that the strings generated are clustered in certain parts of the phase space"

                What you have described corresponds exactly to the sequences I see repeated though out the game. Not being a programmer by trade or being able to remember much from the statistical analysis course at university, I don't know much about how they coded this RNGFn but there are two ways that I know of: first start with the same seed each time or two, use the previous outcome as the seed for the next.

                Your code above uses the second method and since there is some severe truncation of the outcome, and the results match what we have experienced, what can we do to fix it ?

                In game play, I tend to use obselete units to soap off the first bad result or two and then commit the regular troops to the assault. Saves me the frustration of loosing a cavalry regiment attacking a division of longbowmen in open ground...

                Do you have any suggestions about which test results would be useful so you could compare them to yours? I was looking at recording the terrain, the units attack vs defense, number of hp of each unit and the number of turns the combat takes to end. From that I should be able to calculate the percentage chance for the observed outcome using a combat results calculator I have from another thread, and should be able to write a small batch file to do the work for me.

                BTW, Vulture, what do you do for a living ? You write like a career engineer or mathematician.


                D.
                "Not the cry, but the flight of the wild duck,
                leads the flock to fly and follow"

                - Chinese Proverb

                Comment


                • #9
                  Vulture,

                  To explain the whining over this thing: the perception is driven by the timing of the obscene results.

                  These results usually happen at the very start of a game, where despite a promising starting location, you loose all of your archers attacking warriors in open ground, and your enemey's warrior takes your capital despite being defended by a spearman on the walls and behind a river, or when the AI Civ benfits when you are trying to take out that one last infantryman in a target city, and you are down to your last unit while the rest of your forces are destroyed or have minimal hit points.

                  Too many times you see the defender down to his last hit point and your unit still has full hit points, only to have your unit destroyed before combat ends, and the next unit to attack will destroy the defender, even if the defender has healed to full strength again. Hence my impression of the streak.

                  Our passions and comprehension of reality are too easily offended when the improbable occurs in a game like this one. I just have to keep reminding myself all the time that it's just a game.


                  D.
                  "Not the cry, but the flight of the wild duck,
                  leads the flock to fly and follow"

                  - Chinese Proverb

                  Comment


                  • #10
                    Originally posted by Gen.Dragolen
                    Vulture,

                    I should have defined my terms better: to me "one combat" is one battle involving alot of units where I am on the offensive. And usually they involve sacking an AI Civ's city...
                    Ah, the best kind of combat

                    Do you have any suggestions about which test results would be useful so you could compare them to yours? I was looking at recording the terrain, the units attack vs defense, number of hp of each unit and the number of turns the combat takes to end. From that I should be able to calculate the percentage chance for the observed outcome using a combat results calculator I have from another thread, and should be able to write a small batch file to do the work for me.

                    BTW, Vulture, what do you do for a living ? You write like a career engineer or mathematician.
                    Actually I'm an astrophysicist... (but it's not as bad as it sounds).

                    After pondering this on the way back from the restaurant on the bus, a possibility has occured to me. As I said, I do see what seem to be certain patterns in the combat results. One pattern seems to be that if a defender wins an unlikely combat taking minimal damage, it does seem very likely to win the next one as well. Also, sequence of units losing hps alternatively seem to happen a lot (although you'd expect them to under a lot of circumstances).

                    So one interesting test would be to look at sequences of the order in which hps are lost (e.g. veteran tank attacks veteran rifleman, results are ADDAAA - A means the attacker won a round, D means the defender, and at the end of the combat the tank is alive and on 2 hp). If there is anything screwy going on, some of these strings (either for one combat or N consecutive combats) should be over-represented compared to the random probabilities. (Aside, SpencerH's point seems to be that some of the strings can be over-represented but that the overall distribution of strings is such that the probability distribution I present earlier is unaffected - I'm not convinced that this can be the case, but since it's not obviously false then we can't dismiss the possibility yet; I was working earlier in the belief that any bias in the string distribution would show up in the prob. tables, 'cos that seems intuitive to me).

                    I'll go away and implement this test with the PRNG that I gave earlier, and see what turns up.

                    Comment


                    • #11
                      One pattern that I've convinced myself that I see (whether or not it exists) is that the the first defender in a city has some extra bonus ie the first defender seems more likely to cause damage to attackers than subsequent defenders, even if the defending units appear identical - has anyone else seen this?
                      "An Outside Context Problem was the sort of thing most civilisations encountered just once, and which they tended to encounter rather in the same way a sentence encountered a full stop" - Excession

                      Comment


                      • #12
                        deleted
                        Seemingly Benign
                        Download Watercolor Terrain - New Conquests Watercolor Terrain

                        Comment


                        • #13
                          Originally posted by vulture
                          One pattern seems to be that if a defender wins an unlikely combat taking minimal damage, it does seem very likely to win the next one as well.
                          That would require the PRNG to "know" that the call was for such a specific game event. The PRNG "knows" nothing about the purpose to which the result will be used. Only if the programmer made the result of the first round affect succeeding rounds would that be relevant. In other words, even if the PRNG were streaky, the streaks would be randomly applied to each round. So sometimes the streak would start at the beginning of the unit's combat, other times not.

                          People see patterns where none exist. They see faces in the clouds and animal shapes in the stars. That's just the way the mind works.

                          Comment


                          • #14
                            Originally posted by vulture
                            So one interesting test would be to look at sequences of the order in which hps are lost (e.g. veteran tank attacks veteran rifleman, results are ADDAAA - A means the attacker won a round, D means the defender, and at the end of the combat the tank is alive and on 2 hp). If there is anything screwy going on, some of these strings (either for one combat or N consecutive combats) should be over-represented compared to the random probabilities.
                            There is a small rounding error in the attack modifers, factored in after the PRNG. This could slightly change the ratio of A/D. However, it would not increase streakiness, just the ratio.

                            (By the way, good job on the analysis. )

                            Comment


                            • #15
                              Originally posted by vulture
                              Actually I'm an astrophysicist... (but it's not as bad as it sounds).
                              What do you mean? Astrophysics sounds bad?
                              Are you a practical or theoretical physisist? I just love physics, especially astro-.

                              BTW, nice work with the RNG. It's nice to know that it's just the limits of programming and not a bug or AI cheat causing these streaks.
                              I AM.CHRISTIAN

                              Comment

                              Working...
                              X