Apologies for not posting anything in a while. Anyway, I agree that there are two discussions going on here, one about the technical feasibility of different options, and another about what's really better in a gameplay sense. I'll try to address the former first.
The tradeoff between speed and memory that's been discussed is not equivalent to a decision between tiles and areas. The possible speed problem I see with Vet's (sorry, I just can't use the second person form after all this time! It would be weird...) compression scheme is that it really attempts to minimize the memory, and this means that the computer has to uncompress the data to a usable form when processing it, and then compress the results back into a very small space. This can be alleviated by making the internal representation of the data "partially compressed", so that the conversion between compressed and uncompressed data is either very fast, or so that parts of the uncompressed data can be used directly without conversion. This would happen at the expense of memory, but then again, "there is no such thing as a free lunch". For example, one of the biggest obstacles that Vet just shrugs off is the data about adjacent regions. I just can't see any reasonably fast way of extracting that information from the data structure he proposed. Hence, it would have to be stored separately, adding several bytes to the memory consumption estimate for an area.
I made a few calculations of my own, by the way. The purpose of these is to illustrate approximately where tiles might be a more appropriate way of storing information, and where the opposite is true. I've thought about four possible levels of compression and named them min, med, max and uncompressed. The names refer to memory consumption, so that min is the smallest possible and uncompressed the largest (but perhaps fastest). The first two basically describe regions built from hexagonal tiles (just like Vet's example used rectangular tiles), and the other two are for a truly free-form map. But more about that later.
The basis for these compression schemes is that each area consists of
Properties + Adjacent area references + Origin + Border
Properties (denoted by a capital P from now on) are the properties of the area in a layer (I am not defining layers here... I will give examples of different values for P). The references to adjacent areas (A) are 2-byte numbers (based on the assumption that there are no more than 65536 areas in a layer), and their count is not constant. For the purposes of the examples I've assumed that each area has six adjacent areas (that's the case with a hexagonal grid at least), so A always equals 2*6 = 12 bytes. The origin (O), on the other hand, always consists of (x,y)-coordinates in the map, and presumably each coordinate can be represented with 2 bytes (giving a resolution of 65536*65536, enough for our purposes). So O is always 4 bytes. Thus, the number of bytes required to store an area is
Size = P+12+4+B = 16+P+B
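To make that layout a bit more concrete, here is a rough C sketch of what one such area record might look like. The field names, the 2-byte property used as an example, and the length prefix in front of the border are my own assumptions for illustration only (the "min" scheme below ends the border with a special code instead of storing a length):

```c
#include <stdint.h>

/* One area record, roughly matching the estimate above:
 * P (properties) + 12 (six 2-byte adjacency references) + 4 (origin) + B (border). */
typedef struct {
    uint8_t  properties[2];   /* P: the layer's data for this area (2 bytes here)   */
    uint16_t adjacent[6];     /* A: indices of the six neighbouring areas, 12 bytes */
    uint16_t origin_x;        /* O: map coordinate where the border walk starts     */
    uint16_t origin_y;
    uint16_t border_len;      /* bookkeeping only, not counted in the 16+P+B figure */
    uint8_t  border[];        /* B: packed edge codes, format depends on the scheme */
} Area;
```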
The length of the border depends on the shape and size of an area. To make my estimates, I can bracket the number from above and below by assuming the worst and best case scenarios: in the worst case, every area is a very thin strip only one hex wide (when hexes are used); in the best case they'd all be hexagonal (in case of a hex map) or close to circular (in a free-form vector map). If N is the number of tiles that are used, then
Length of border = 4*N+2 (in hexagonal map, worst case)
or
L = 6*sqrt((4*N-1)/3) (hex map, best case)
With free-form vectors the comparison is a little bit more difficult, so I'll pass on it for now.
Min:
This compression scheme is pretty much the same as Vet proposed, except in hexes: each edge of a hexagon is represented by two bits (four values). If you look at a hexagonal grid, you notice that every intersection of edges offers only 3 directions to continue in... the fourth value is used to denote the end of the border (thus, no need to check if it has reached the origin). There are 8 bits in a byte, so the space reserved by the border will be
B = L/4
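As a sketch of what this 2-bit packing could look like in code (the helper names and the choice of which value means what are my own illustration, not a settled format):

```c
#include <stdint.h>
#include <stddef.h>

enum { EDGE_END = 3 };              /* the fourth value: end of the border */

/* Store 2-bit edge code number i (four codes per byte). */
static void put_edge(uint8_t *border, size_t i, unsigned code)
{
    size_t byte = i / 4;
    unsigned shift = (i % 4) * 2;
    border[byte] = (uint8_t)((border[byte] & ~(3u << shift)) | ((code & 3u) << shift));
}

/* Read 2-bit edge code number i back out. */
static unsigned get_edge(const uint8_t *border, size_t i)
{
    return (border[i / 4] >> ((i % 4) * 2)) & 3u;
}

/* Walking a border is then: for (i = 0; get_edge(border, i) != EDGE_END; i++) { ...take step i... } */
```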
It follows that:
S = 16 + P + L/4
In worst case this means that
S = 16 + P + (4*N+2)/4
To determine the minimum size an area should have in order to be as memory-efficient as tiles, we just have to compare this value to the size it would take to store N properties individually:
(in my notation "=<" means "smaller or equal to" and ">=" is "greater or equal to". "=>" is just an implication.)
S =< N*P
=>
16 + P + (4*N+2)/4 =< N*P (multiply by 2 to get rid of the fractional 1/2)
=>
33 + 2P + 2N =< 2NP (skipping the *-sign, I think it's redundant)
=>
2NP - 2N = N(2P - 2) >= 2P + 33 (divide by 2P - 2, it must be that P > 1)
=>
N >= (2P + 33)/(2P - 2)
Now, let's consider the different values of P:
P=1: For example, elevation can be represented adequately with one byte. I'm sure there are a lot of examples of properties that take one byte or less, but the point is that the above calculation shows there is no size of an area at which areas would be guaranteed to take less space than tiles. Remember that this is the worst case!
P=2: N >= (2*2+33)/(2*2-2) = 37/2 = 19 (rounding up). So the "diversity", that is, the average size of an area/region, has to be at least 19 to give an undoubted benefit to storing any 2-byte property in areas instead of tiles. Far from Vet's estimate above, but one has to consider that this is the absolute maximum, being the "worst case". As for practical examples, I think that population density is something that requires two bytes to be realistic.
P=3: N >= (2*3+33)/(2*3-2) = 39/4 = 10. This is getting better, but properties that conceivably take 3 bytes are hard to imagine, unless we count combinations like "age, occupation, religion". And combinations are prone to be very diverse.
P=4: N >= (2*4+33)/(2*4-2) = 41/6 = 7.
P=5: N >= (2*5+33)/(2*5-2) = 43/8 = 6.
And so on. Now, as for the best case scenario where every area is approximately a hexagon, the comparison of when areas take less space than tiles is a little bit more complex:
S =< NP
=>
16 + P + (6sqrt((4N-1)/3))/4 =< NP (multiply by 2 to get rid of the fraction)
=>
32 + 2P + 3sqrt((4N-1)/3) =< 2NP (toss the square root to one side)
=>
3sqrt((4N-1)/3) =< 2NP - 2P - 32 (raise to second power)
=>
9((4N-1)/3) =< 4(P^2)(N^2) - (8(P^2) + 128P)N + (4(P^2) + 128P + 1024)
=>
12N - 3 =< 4(P^2)(N^2) - (8(P^2) + 128P)N + (4(P^2) + 128P + 1024)
=>
0 =< 4(P^2)(N^2) - (8(P^2) + 128P + 12)N + (4(P^2) + 128P + 1027)
This looks rather tricky due to the parameter P... therefore I decided to just substitute different values of P and then solve the resulting second-degree inequality by guessing.
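Instead of trusting my algebra, the same minimum sizes can also be found by brute force: just walk N upwards until storing the area (16 + P + border bytes) is no larger than N tiles of P bytes each. Here's a throwaway sketch of that, parameterised by bits per edge so the same check also covers the "med" (4 bits) and "max" (8 bits) schemes further down; the numbers it prints should agree with the ones I work out below:

```c
#include <stdio.h>
#include <math.h>

/* Border length in edges for an area of n hexes. */
static double border_worst(int n) { return 4.0 * n + 2.0; }                     /* thin strip */
static double border_best(int n)  { return 6.0 * sqrt((4.0 * n - 1.0) / 3.0); } /* hexagonal  */

/* Smallest n for which the area takes no more space than n tiles of p bytes each,
 * or -1 if it never pays off (as with P=1 in the worst case). */
static int min_area_size(int p, double bits_per_edge, double (*border)(int))
{
    for (int n = 1; n <= 100000; n++)
        if (16.0 + p + border(n) * bits_per_edge / 8.0 <= (double)n * p)
            return n;
    return -1;
}

int main(void)
{
    const double bits[] = { 2.0, 4.0, 8.0 };            /* min, med, max */
    const char  *name[] = { "min", "med", "max" };
    for (int s = 0; s < 3; s++)
        for (int p = 1; p <= 5; p++)
            printf("%s P=%d: worst %3d  best %3d\n", name[s], p,
                   min_area_size(p, bits[s], border_worst),
                   min_area_size(p, bits[s], border_best));
    return 0;
}
```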
P=1: 0 =< 4(N^2) - (8 + 128 + 12)N + (4 + 128 + 1027) = 4(N^2) - 148N + 1159
One can see that if N = 25 the inequality is false, but if N = 26 it is true. So the minimum size of an area representing a one-byte property is 26 tiles, even in the best possible case. In reality such a situation would never exist, so this result can be interpreted to mean that if the diversity of a 1-byte property is less than 26, then areas will never be beneficial no matter what the area shapes are.
P=2: 0 =< 16(N^2) - (32 + 256 + 12)N + (16 + 256 + 1027) = 16(N^2) - 300N + 1299
It seems that the inequality is true if N >= 12. Combined with the earlier result that in the worst case the diversity of a 2-byte property has to be at least 19, we can conclude that if the areas representing this property are smaller than 12 tiles, then tiles are more efficient, and if the areas are larger than 19, then areas are the more efficient way to store the data. Between 12 and 19 the result depends on the shapes on the map and as such cannot be estimated very well. Again, think of population density as a convenient example.
P=3: 0 =< 36(N^2) - (72 + 384 + 12)N + (36 + 384 + 1027) = 36(N^2) - 468N + 1447
The inequality is true if N >= 8. Hence the diversity of a 3-byte property has to be at least 8 tiles per area to make areas even in theory a competitor to tiles. As a comparison point with the worst case scenario, if the size is larger than 10 then areas will always be the better choice.
P=4: 0 =< 64(N^2) - (128 + 512 + 12)N + (64 + 512 + 1027) = 64(N^2) - 652N + 1603
This is true if N >= 7. Interestingly, this is the same number that was in the worst case scenario; when the memory needed by the property itself goes up, it matters less what the shapes of regions are. Rounding errors may also have something to do with it.
P=5: 0 =< 100(N^2) - (200 + 640 + 12)N + (100 + 640 + 1027) = 100(N^2) - 852N + 1767
Here N >= 5. Due to rounding errors the number is again smaller than the one in the worst case (6), but that is not very relevant. At least this gives some lower boundaries to the number of tiles that need to belong to a region to make it feasible.
Med:
I can imagine that squeezing data into single bits like this may not be the fastest way for algorithms to grasp the boundary information: of the three directions at each intersection they'd have to somehow calculate the next move, and because the directions aren't the same at every intersection (just picture a hexagonal grid to visualize this), it means extra calculation. It would be more convenient if, instead of an enumerated direction, we could directly get an (x,y)-offset. The individual coordinates in this offset would be either 0, 1 or -1... so we can use two bits per coordinate and four bits per edge. In other words, the border would take twice as much space as it did with the "min" compression. Also, there is room for some extra information, since the areas strictly speaking wouldn't have to stick with hexes anymore: there could be eight possible directions to go to, like in civ-style maps.
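A small sketch of that encoding, assuming each coordinate is stored as the offset plus one (so -1, 0, +1 become 0, 1, 2, and the leftover value 3 can mark the end of the border, as in the "min" scheme); the exact packing is again just my own illustration:

```c
#include <stdint.h>

/* Pack one border step (dx, dy), each in {-1, 0, +1}, into a 4-bit nibble:
 * low two bits = dx + 1, high two bits = dy + 1. Two steps fit in a byte. */
static uint8_t pack_step(int dx, int dy)
{
    return (uint8_t)(((dy + 1) << 2) | (dx + 1));
}

static void unpack_step(uint8_t nibble, int *dx, int *dy)
{
    *dx = (int)(nibble & 3u) - 1;
    *dy = (int)((nibble >> 2) & 3u) - 1;
}
```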
I'll do my estimates in a similar manner as before, starting with the worst case scenario. All that changes is the denominator, from 4 to 2 (now we can only put 2 edges in one byte, compared to 4 before).
S =< NP
=>
16 + P + (4N+2)/2 = 17 + P + 2N =< NP
=>
NP - 2N = N(P - 2) >= P + 17 (divide by P - 2, it must be that P > 2)
=>
N >= (P + 17)/(P - 2)
P=1, P=2: Both cases are in the grey area, and there is no guarantee that using areas will pay off with this compression scheme. I'd say that if we go for the layered view, most properties would fall into this category: only composite properties need more bytes.
P=3: N >= (3 + 17)/(3 - 2) = 20. Twice the size we got with the "min" compression.
P=4: N >= (4 + 17)/(4 - 2) = 21/2 = 11. The equivalent value with the more efficient compression was 7... so it appears that the more bytes the property requires, the less difference there is between the compression schemes (as long as the data is stored in areas; it becomes increasingly cumbersome to store the same information in tiles). Makes perfect sense.
P=5: N >= (5 + 17)/(5 - 2) = 22/3 = 8.
Now for the best case scenario.
S =< NP
=>
16 + P + (6sqrt((4N-1)/3))/2 =< NP
=>
16 + P + 3sqrt((4N-1)/3) =< NP (toss the square root to one side)
=>
3sqrt((4N-1)/3) =< NP - P - 16 (raise to second power)
=>
9((4N-1)/3) =< (P^2)(N^2) - (2(P^2) + 32P)N + ((P^2) + 32P + 256)
=>
12N - 3 =< (P^2)(N^2) - (2(P^2) + 32P)N + ((P^2) + 32P + 256)
=>
0 =< (P^2)(N^2) - (2(P^2) + 32P + 12)N + ((P^2) + 32P + 259)
And the solutions for different values of P:
P=1: 0 =< (N^2) - (2 + 32 + 12)N + (1 + 32 + 259) = (N^2) - 46N + 292
The equation is true when N >= 39. I'd say that it is a rather large value for e.g. elevation... but that's just my gut feeling.
P=2: 0 =< 4(N^2) - 84N + 327, N >= 16.
P=3: 0 =< 9(N^2) - 126N + 364, N >= 10.
P=4: 0 =< 16(N^2) - 172N + 403, N >= 8.
P=5: 0 =< 25(N^2) - 222N + 444, N >= 6.
Just as a reminder, these are the region sizes that indicate the minimum diversity at which areas might be better than tiles for storing data; below these figures tiles will always be better than areas, from a memory consumption point of view.
Max:
The previous compression is already quite handy, but it works well only with hexes (or rectangular tiles). If we want to use totally free-form polygons, it requires a little bit more memory. One neat way to compress such polygons, however, would be to use (x,y)-offsets just like I did with hexes before, but so that the offset can be more than just -1 or 1 in each direction. If we use 4 bits per coordinate, we can have offsets varying from (-7, -7) to (7, 7), and each edge would be a line somewhere in this 15x15 grid. If we decide to use a free-form map, I think this compression could be quite feasible. On the other hand, if we use tiles then it might be a little bit faster to use full bytes instead of groups of 2 or 4 bits, because computers (and programming languages) work with bytes.
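As a sketch, each such edge could then take exactly one byte, one signed 4-bit value per coordinate (my own packing; -8 is left unused and could serve as an end-of-border mark):

```c
#include <stdint.h>

/* One free-form edge per byte: low nibble = dx, high nibble = dy,
 * each as a 4-bit two's-complement value covering -8..7 (we only need -7..7). */
static uint8_t pack_edge(int dx, int dy)
{
    return (uint8_t)(((dy & 0xF) << 4) | (dx & 0xF));
}

static void unpack_edge(uint8_t e, int *dx, int *dy)
{
    *dx = e & 0xF;  if (*dx > 7) *dx -= 16;   /* sign-extend the low nibble  */
    *dy = e >> 4;   if (*dy > 7) *dy -= 16;   /* sign-extend the high nibble */
}
```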
I'll estimate this compression the same way I did before, to give some comparison points. The free-form nature of this compression, however, makes it a little bit silly, because the comparison to hexes becomes less meaningful, and some edges are longer than others.
S =< NP
=>
16 + P + (4N+2) = 18 + P + 4N =< NP
=>
NP - 4N = N(P - 4) >= P + 18 (divide by P - 4, P > 4)
=>
N >= (P + 18)/(P - 4)
P=1, 2, 3 or 4: With these values there is no size at which areas are guaranteed to beat tiles (the denominator P - 4 is zero or negative), so the compression isn't clearly sensible. But then again, the shapes the free-form areas would allow will most likely be more efficient than the worst case scenario of the hexagonal alternative.
P=5: N >= (5 + 18)/(5 - 4) = 23. The size cannot clearly be thought of as "tiles", but perhaps as something equivalent to the size of a tile...
And the best case scenario, which gives more sensible values (because what was best case for hexes may be average for free-form polygons):
S =< NP
=>
16 + P + 6sqrt((4N-1)/3) =< NP (toss the square root to one side)
=>
6sqrt((4N-1)/3) =< NP - P - 16 (raise to second power)
=>
36((4N-1)/3) =< (P^2)(N^2) - (2(P^2) + 32P)N + ((P^2) + 32P + 256)
=>
48N - 12 =< (P^2)(N^2) - (2(P^2) + 32P)N + ((P^2) + 32P + 256)
=>
0 =< (P^2)(N^2) - (2(P^2) + 32P + 48)N + ((P^2) + 32P + 268)
And now the interesting part (one can always dream...):
P=1: 0 =< (N^2) - (2 + 32 + 48)N + (1 + 32 + 268) = (N^2) - 82N + 301
The inequality is true if N >= 79. This could be the average size of a region, so the equivalent of a 1 million tile map would be roughly 12500 areas for each 1-byte property. Not too bad.
P=2: 0 =< 4(N^2) - 120N + 336, N >= 27. Even better!
P=3: 0 =< 9(N^2) - 162N + 373, N >= 16.
P=4: 0 =< 16(N^2) - 208N + 412, N >= 11.
P=5: 0 =< 25(N^2) - 258N + 453, N >= 9.
(Yes, I realize that these figures are getting more and more irrelevant to most readers. I'm just trying to come up with some hard numbers that would justify design decisions, and maybe to amuse Vet in the process.)
Uncompressed:
If we're using fully free-form vectors, the most obvious storage structure is to just store each vertex of the border as an absolute (x,y)-coordinate. That would take 4 bytes per vertex (or per edge, since there are as many of them). The usefulness of being so spendthrift is that the transformation to visible polygons on the screen would be quite easy. Other than that, I just cannot see the point, because storing one origin and then a list of offsets is good enough for most algorithms, I think. I was originally going to write an estimate for this compression level similar to the earlier ones, but I'm getting a little bit exhausted with writing and frankly cannot see the point in throwing more strange numbers at you.
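For completeness, a sketch of that transformation: the compressed (offset-based) border can be expanded into the absolute vertex list whenever a polygon is actually needed for drawing. The types and names are again just illustrative:

```c
#include <stdint.h>
#include <stddef.h>

typedef struct { uint16_t x, y; } Vertex;   /* 4 bytes per vertex (or edge) */

/* Expand a border stored as (dx, dy) offsets from the origin into the
 * absolute vertices a renderer would want. 'out' must have room for n entries. */
static void expand_border(uint16_t origin_x, uint16_t origin_y,
                          const int8_t *dx, const int8_t *dy, size_t n,
                          Vertex *out)
{
    uint16_t x = origin_x, y = origin_y;
    for (size_t i = 0; i < n; i++) {
        x = (uint16_t)(x + dx[i]);
        y = (uint16_t)(y + dy[i]);
        out[i].x = x;
        out[i].y = y;
    }
}
```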
* * *
There are also other sorts of possibilities, like storing borders separately and then referencing them from areas. I haven't given much thought to that suggestion, but if Vet can come up with some more concrete ideas then I'm of course eager to read them. I also didn't take into account here the possibility of having tiles reference regions, and not storing any geographical data in the regions' properties... that approach would require that there are very few layers of complex data. It's a little bit difficult to estimate such systems. Well, it is difficult to estimate any system because the models are so premature... we're just poking in the dark here. Anyway, I believe that the system we decide to use shouldn't be too complicated. We seriously don't have any excess of competent programmers, or very motivated ones, so I believe that the safest alternative that all the programmers (me and Vet?) can agree upon is the one that should be chosen, even if it isn't the theoretically most efficient one.
I was going to write something about the gameplay issues, but I notice that I don't have the time right now. Thank you to anyone who bothered to read through this drivel. Let me emphasize that this was far too technical, and it doesn't yet address my opinions on whether tiles or areas should actually be used... it's more like a guideline for those issues. I'll be back.