**pdescobar** · July 8, 2003, 00:12

I take a similar approach to your #2 in Perl. Note that the hardcoding of the sections does not actually seem like that big a deal to me, because the developers normally try to maintain forward compatibility. What I do, algorithmically, is this:

1. Read the first 4 bytes to see if it is BIC/BICX. If not, run it through the decompressor and try again. If still not, give up.
2. Read the next 4 bytes. If those correspond to one of the 27 sections I think I can handle (e.g. VER# or BLDG but not ZZJY), I hand it off to a generic parser function, telling that function what the section ID is. If they don't correspond, I spit out a warning but hand it off to the generic parser anyhow. Since the generic parser always gets the next data, I could probably just read it there, but this is a legacy from my original code, which was similar to option 1, and I'm not changing it if it still works.
3. The generic parser is now in control. It reads the 4 bytes for the number of things to process and then loops over them, reading the length of the entry and then calling a specific parser to handle the details of the entry (e.g. if this is the BLDG section, a parseBLDG() function is called during each loop iteration, storing the first 64 bytes as a 'description' string, the next 32 bytes as a 'name' string, etc.). The specific parsers return the number of bytes they actually processed, and the generic parser then error-checks this, spitting out warnings on mismatches and shoving any unused data into an 'excess' area of the data structure for that entry. The generic parser makes a valiant attempt at handling unknown sections by simply storing the entire data length as a hex string, in a manner similar to the excess data area on known sections.
4. The specific parsers are pretty well hardcoded. They assume the data is in a given format and parse it accordingly. They should probably check the given entry length to make sure they don't try to read more than is there, but it's currently not that robust. In a half-dozen or fewer cases where I know of major differences between version 4.01 and 11.18, I make checks against the VER# entry to see what version this is and handle it accordingly.
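For readers who want to see the shape of that loop, here is a minimal sketch in Python rather than the Perl described above. It assumes a section is a 4-byte tag, a 4-byte little-endian count, and then entries that each start with a 4-byte little-endian length. The names `parse_section`, `parse_bldg` and `SPECIFIC_PARSERS` are made up for illustration, the text fields are assumed to be NUL-padded, and unlike the description above the tag is read inside the generic function for brevity.

```python
import io
import struct

def parse_bldg(data):
    """Specific parser (hypothetical): returns (fields, bytes_consumed)."""
    if len(data) < 96:
        return {}, 0  # too short for the two text fields; leave it all as excess
    description, name = struct.unpack_from("<64s32s", data, 0)
    fields = {
        # Assumption: fixed-length text is padded with NUL bytes.
        "description": description.split(b"\0", 1)[0].decode("latin-1"),
        "name": name.split(b"\0", 1)[0].decode("latin-1"),
    }
    return fields, 96

SPECIFIC_PARSERS = {b"BLDG": parse_bldg}

def parse_section(stream, warnings):
    """Generic parser: tag, entry count, then length-prefixed entries."""
    tag = stream.read(4)
    (count,) = struct.unpack("<I", stream.read(4))
    parser = SPECIFIC_PARSERS.get(tag)
    if parser is None:
        warnings.append(f"unknown section {tag!r}")
    entries = []
    for _ in range(count):
        (length,) = struct.unpack("<I", stream.read(4))
        data = stream.read(length)
        if parser is None:
            # Unknown section: keep the whole entry as a hex string.
            entries.append({"excess": data.hex()})
            continue
        fields, used = parser(data)
        if used != length:
            warnings.append(f"{tag!r}: parsed {used} of {length} bytes")
        fields["excess"] = data[used:].hex()  # whatever the parser did not consume
        entries.append(fields)
    return tag.decode("ascii"), entries
```

Because every entry is read by its declared length before any specific parser runs, a parser that understands too little still leaves the stream positioned at the next entry.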

Several assumptions are made here. I assume the VER# section has already been handled, I assume that all BICs are essentially the same as each other, as are all BIXes, etc. These kinds of shortcuts can be taken because the userbase is small. 99% of the things I parse are going to be created by the editor, and I know what to expect from it; most people who have programs that write BICs are also going to try to keep the format as close as possible to that produced by the editor (and posted here), to be sure that the game engine reads it correctly.

As for data storage, Perl allows me great flexibility and I use that to my advantage. The entire BIC file is stored in a single hash. The first-level entries are one for each section type. The second-level entries are a count of how many things of this type I have and an individual entry for each thing. The third-level entries are then broken up by data type, simply to make it easier for me to read the values out on a printed dump of the structure; the fourth level is essentially a variable name and the fifth level is the actual data. Normally this is a scalar, but in some situations it is a hash as well. So, if I want to know how many BLDG definitions there are, I check $BIC{'BLDG'}{'count'}, and if I want to know what the movement cost of the first terrain type is, I check $BIC{'TERR'}{0}{'value'}{'movement_cost'}.
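The same layering can be sketched with nested Python dicts standing in for the Perl hash. The `add_entry` helper, the `"string"` data-type key and all the values below are placeholders invented for illustration; only the `count`, `value` and `movement_cost` keys come from the description above.

```python
def add_entry(bic, section, fields_by_type):
    """Append one entry to a section, keeping the per-section count in step."""
    sec = bic.setdefault(section, {"count": 0})
    index = sec["count"]
    sec[index] = fields_by_type  # level 3: data type, level 4: variable name
    sec["count"] += 1
    return index

bic = {}
# Placeholder values, not real game data.
add_entry(bic, "BLDG", {"string": {"name": "first building"}})
add_entry(bic, "BLDG", {"string": {"name": "second building"}})
add_entry(bic, "TERR", {"string": {"name": "first terrain"},
                        "value": {"movement_cost": 1}})

building_count = bic["BLDG"]["count"]
first_terrain_cost = bic["TERR"][0]["value"]["movement_cost"]
```

Keeping the count next to the integer-keyed entries mirrors the `$BIC{'BLDG'}{'count'}` lookup, at the cost of mixing string and integer keys in one dict.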

I think option 2 makes far more sense than option 3 because you are less likely to run into problems. The only way the format will change is if a game patch changes it. These are infrequent occurrences, and odds are you will know about them and be able to adjust if necessary. Here's an admittedly contrived example to show the problem with option 3. Say I add a Technocrat citizen and I just slap TECH into the name entries on it as a placeholder. You now run the risk of mistakenly thinking that's the start of the TECH section. It seems to me that the odds of that happening are far higher than the odds of the section definitions changing so completely that well-written option 2 code breaks.
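The Technocrat example can be reproduced in a few lines of Python, assuming option 3 means locating sections by searching the raw bytes for their four-letter tags, which is how this reply reads it. The byte layout below is invented purely to trigger the collision and is not the real citizen or TECH layout; `scan_for_tag` is a hypothetical helper.

```python
def scan_for_tag(data, tag):
    """Option-3-style scan: return every offset where the 4-byte tag appears."""
    offsets = []
    pos = data.find(tag)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(tag, pos + 1)
    return offsets

# A citizen entry whose 32-byte name field was filled with the placeholder "TECH",
# followed later in the file by the genuine TECH section header.
citizen_name = b"TECH".ljust(32, b"\0")
fake_file = b"CTZN" + citizen_name + b"TECH" + b"\x00\x00\x00\x00"

hits = scan_for_tag(fake_file, b"TECH")
# The first hit is inside the citizen name, not at the section header.
```

A scanner that trusts the first hit would start parsing the TECH section 32 bytes too early, whereas a reader that walks the file by section and entry lengths never looks inside the name field for tags.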

If they really mess with the BIC format in the future (like for example when they removed some initial unknown values in CULT) I will simply adjust to it and/or cut off support for the old versions. I don't think you really need all that much flexibility; it's not like you're getting a patch a week with format changes. I am quite willing to only support vanilla version 1.29f and tailor my PTW support to the latest patch as well, handling small inconsistencies with previous versions as they show up and are reported. Granted, my scripts will have a far smaller user base once they are released than something like Gramphos' Multitool, but I think the same general approach would apply.

**Gramphos** · July 8, 2003, 12:34

In C3MT the approach is like your #2, with the addition that it ensures that it always reads exactly the length of a section. That way it handles many versions by simply stopping reading when the length is used up, or skipping the bytes at the end when the length isn't reached, while all known data is still loaded.
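A rough sketch of that length-bounded idea, in Python rather than the VB that C3MT is written in. The field list is hypothetical and `read_known_fields` is an invented name. The point is only that the declared length, not the field list, decides where reading stops.

```python
import struct

# Hypothetical field layout in file order: (name, struct format).
FIELDS = [("a", "<I"), ("b", "<I"), ("c", "<H")]

def read_known_fields(data, fields):
    """Read as many known fields as fit in data; report trailing bytes skipped."""
    result = {}
    offset = 0
    for name, fmt in fields:
        size = struct.calcsize(fmt)
        if offset + size > len(data):
            break  # shorter (older) entry: stop once the length is used up
        (result[name],) = struct.unpack_from(fmt, data, offset)
        offset += size
    skipped = len(data) - offset  # longer (newer) entry: ignore unknown tail
    return result, skipped

old_entry = struct.pack("<I", 7)                       # only the first field
new_entry = struct.pack("<IIH", 1, 2, 3) + b"\xff\xff"  # two unknown extra bytes

old_fields, old_skipped = read_known_fields(old_entry, FIELDS)
new_fields, new_skipped = read_known_fields(new_entry, FIELDS)
```

Fields missing from a short entry are simply absent from the result here; a real tool would probably fill in defaults for them.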

However, the system I use to load the data isn't optimized for the file structure, and can't handle a large expansion of the format. Therefore I'm planning to rebuild the BIC/BIX handling part of my tool. For now I'm following the principle "If it works, don't change it." However, I think I'll have to change it by the time Conquest appears, so I should probably start building a new system before it arrives.

One of the problems that I deal with is that the file format isn't very VB friendly...

**vovan** · July 8, 2003, 21:22

Thanks for the responses, pdescobar and Gramphos.

So, it looks like there isn't any particularly easy way to have the flexibility in the code, yet it also seems there isn't THAT much need for it. Well, I guess then I can leave my code for loading the BIC alone for now.

**Gramphos** · July 11, 2003, 17:41

Yes. I don't think we will see any big changes to the BIC format before Conquest gets out. But by then I'll probably need to have changed my code to a somewhat more flexible way of loading; otherwise it might take extra long to make C3MT compatible with Conquest, or I'll have to resort to some really bad workarounds to keep building on my current system, which got close to its limits with the release of PTW.

**BlueWlvrn** · July 12, 2003, 20:44

In my "tinker" program, I am also using perl.

I took the approach of using perl's OO programming capabilities and created a Base class that handles all of the basic file I/O.

What I do is create an array of format strings. These format strings are the same strings used by pack and unpack. So you'll end up with an array something like ("a4", "V", "V", "C", "v")... If you know perl, these will make sense. ;-) Then, using this array, I read in the appropriate number of bytes from the file, and then unpack it into another array at the same position.

So in the derived classes, this format array is defined as well as any "access" functions for particular data fields.
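Here is a loose Python analogue of that design, using the `struct` module in place of Perl's `pack`/`unpack`. The class names, the accessor names and the meaning assigned to each position are assumptions for illustration; the format list `["4s", "<I", "<I", "B", "<H"]` is just the Python spelling of the ("a4", "V", "V", "C", "v") example.

```python
import io
import struct

class Record:
    """Base class: handles the file I/O, driven by a per-class format list."""
    FORMATS = []  # overridden in derived classes

    def __init__(self):
        self.values = [None] * len(self.FORMATS)

    def read(self, stream):
        # Read the right number of bytes for each format and store the
        # unpacked value at the same position in self.values.
        for i, fmt in enumerate(self.FORMATS):
            raw = stream.read(struct.calcsize(fmt))
            (self.values[i],) = struct.unpack(fmt, raw)
        return self

class Terrain(Record):
    # Hypothetical layout, not the real TERR record.
    FORMATS = ["4s", "<I", "<I", "B", "<H"]

    def tag(self):
        return self.values[0].decode("ascii")

    def movement_cost(self):
        return self.values[3]

sample = io.BytesIO(struct.pack("<4sIIBH", b"TERR", 10, 20, 3, 500))
terrain = Terrain().read(sample)
```

Since the format list is plain data, the same list could drive a `write` method with `struct.pack`, which is what makes this approach attractive for writing files back out.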

I currently only have the terrain section completed, as I was tinkering with the concept of a new map generation algorithm.

I would imagine that with C++ you could do something similar.

There are really only a small number of data types.

- fixed-length text
- 4-byte integer (little endian, Intel/VAX order)
- 2-byte integer (little endian)
- 1-byte int/char (I treat them as unsigned for now)

I think that was it.
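Those four types map onto Python `struct` codes roughly as sketched below. NUL padding and Latin-1 encoding for the text fields are assumptions, and `decode_text` and `encode_text` are invented helper names.

```python
import struct

def decode_text(raw):
    """Fixed-length text: keep everything before the first NUL (assumed padding)."""
    return raw.split(b"\0", 1)[0].decode("latin-1")

def encode_text(text, size):
    """Inverse of decode_text: truncate or NUL-pad to exactly size bytes."""
    return text.encode("latin-1")[:size].ljust(size, b"\0")

# Little endian means the least significant byte comes first in the file.
(four_byte,) = struct.unpack("<I", b"\x2a\x00\x00\x00")  # unsigned 4-byte integer
(two_byte,) = struct.unpack("<H", b"\x01\x02")           # unsigned 2-byte integer
(one_byte,) = struct.unpack("B", b"\xff")                # unsigned single byte

name_field = encode_text("Desert", 32)
round_trip = decode_text(name_field)
```

The `<` prefix also turns off alignment padding in `struct`, so several fields can be combined into one format string without gaps appearing between them.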

**pdescobar** · July 12, 2003, 22:51

BlueWlvrn: I am pleasantly surprised to see that someone else is using perl to play around with the BICs.

It sounds like you've got a "cleaner" method for dealing with the data than I use, though, as it'll be a real pain for me to add writing the BIC data back out (9,000,000 pack statements would be added to undo my 9,000,000 unpacks with my current method).

To go a bit off-topic, how are you handling resource allocation on your map generator? The standard game allocator confuses me greatly.

**BlueWlvrn** · July 13, 2003, 14:29

> Originally posted by pdescobar
>
> To go a bit off-topic, how are you handling resource allocation on your map generator? The standard game allocator confuses me greatly

I haven't actually gotten that far. I was mostly tinkering with the landscape first. (And coming up w/a clean way to read and write the BIC)

I haven't done much with it in a week or so.

Reading the BIC file: a programmer's perspective
