This article explains how INI sections and tags are stored in MiniINI and compares it to more obvious ideas of how to implement INI data storage. Reading this article isn't necessary to learn to use MiniINI. It should explain MiniINI workings to those who would wish to understand its workings or to modify it.
The main purpose of INIFile is to provide access to INI data and make sure the user can't manipulate it directly to prevent errors. INIFile handles creation and deletion of INISections and any resulting errors so even larger changes to this code won't break the interface. Other than that it's just a simple container, storing INISections in an array of pointers.
Normally, to implement INI data storage, one could use something like this:
class INISection { private: std::map<std::string, std::string> TagValues; };
This is a very simple implementation to code and might be "good enough" in most cases. However, there is actually quite large overhead here.
Let's look at memory usage. Each tag is stored as two std::strings. On a 64bit machine, you can usually expect std::string to contain at least the following data members:
32bit unsigned UsedBytes; 32bit unsigned AllocatedBytes; 64bit pointer String;
We also have to count the trailing zero character, which adds one more byte. As each tag uses two C++ strings, this adds up to 34 bytes for every single tag in the loaded INI file. Also, as the actual string data is in dynamically allocated buffers, we also have to count with some bookkeeping bytes to increase this even further. Now, your typical INI tag, such as:
tagname=value
might have about 10-20 bytes of data on average. This means that the size of stored INI data is usually more than doubled.
If you want to support tags with multiple values, such as:
tagname=val1,val2,val3
this can get even worse, as you'd have to use something like:
std::map<std::string, std::vector<std::string>> TagValues;
In this case, even for tags with just one value, you can expect 16 more bytes which is usually the size of std::vector. So 50+ bytes per, say, 16 bytes of useful data. Also, the data that is actually useful can be spread all over the memory, which won't be good for cache performance.
Allocating all these little blocks of memory also takes a lot of time, and STL classes do a lot of copying and reallocation when you use them, which will slow down the process of actually parsing and accessing the data significantly.
Of course, this won't affect you if you're using 20kiB INI files on a modern PC, but could provide quite a hit if you're coding for a platform with less resources or using INI files to store large amounts of data.
The first thing that comes to mind is to minimize usage of STL classes. For instance, just getting rid of std::strings should save a lot of memory. INISection could be reimplemented as follows:
class INISection { struct TagValue { unsigned Size; char ** TagAndValues; }; private: std::vector<TagValue> TagValues; };
where TagAndValues is a pointer to a buffer of pointers, first of which points to the tag name, and others point to values (we suport multi value tags here). Size is number of pointers held by the TagAndValues buffer.
Memory overhead for each tag would be at least 4 + 8 + Size * (8 + 1) bytes, or 30+ bytes per tag. This is a lot better than 50+ bytes in the previous case, but we can still expect INI data size to double. As we're now forced to handle memory manually, we can also cut down on STL's reallocation and copying, at the cost of considerably uglier code.
We also haven't solved the greatest problem of previous implementation, which is the fact that INI data is spread over a large number of very small buffers, each of which requires an 8-byte pointer and 1-byte trailing zero character, plus unknown number of bookkeeping bytes for the dynamically allocated memory. Still, cache performance should be a bit better simply because of smaller data size.
If you measure memory usage of MiniINI (using tools such as Massif), you might notice that it usually uses about as much memory as the size of files it loads. So how does MiniINI get rid of all that overhead we mentioned?
If you look at INISection implementation in MiniINI, it might not be very descriptive. Even though it supports tags with multiple values, you'd see something similar to this:
class INISection { private: unsigned Length; char ** Tags; };
There is apparently just a single dynamically allocated array of strings. So where did the vector of arrays of strings go?
The answer lies in how MiniINI stores tags. In MiniINI, each tag with all of its values is stored in a single buffer. Normally, we expect to store just one string in a buffer. However, thanks to the fact that C/C++ recognizes a zero character as the end of the string, we can actually have multiple strings in a single buffer. Each tag in MiniINI consists of tag name, followed by a trailing zero, followed by a variable number of tag values, each followed by a zero character, and the last value is followed by two zero characters. So a tag like:
tag=val1,val2,val3
would be encoded like this:
"tag\0val1\0val2\0val3\0\0"
This removes the ability to directly access each value, which would seemingly slow down the implementation. This is not necessarily the case, however. If we're reading a single value tag, the cost of skipping the first string is insignificant compared to the time it took to search for the tag with requested name. It is also insignificant to time that would be spent on allocation of extra buffers in the previous case. When reading multiple values from a single tag, we must skip one string per every value, instead of allocating one buffer for every value at load. Decreased memory use and improved cache performance helps as well.
In this case, memory overhead per tag is 8 + (number of values + 1) + 1. For a tag with one value, this is 11 bytes. That's a lot less than previous 30+. Noticed the lack of the + ? That's because there is one last important thing about how MiniINI stores data, the Allocator class.
Even if we implement the optimizations shown above, we still have a lot of small buffers in different memory locations, each needing to be separately allocated and deleted and each with its bookkeeping data. The Allocator class minimizes this overhead.
Allocator itself is a pretty straightforward piece of code. Each INIFile has a single Allocator that is created when file is loaded, and asks Allocator to allocate an area of memory (usually roughly the size of loaded file). Allocator allocates this memory in a few large blocks (usually about 16) and any INISections created request memory from Allocator instead of allocating it themselves. If there is no space left, a new block is allocated with sufficient size and when all memory in a block is not used anymore, that block is deleted. This way we reduce previously needed 1 new/delete pair per tag to about 16 new/delete pairs overall, saving a lot of computation time, getting rid of most bookkeeping data overhead and greatly improving cache performance since the data is close together.