Decompressing Grids in save files

Started by Spitfire, November 18, 2019, 02:11:13 PM


Spitfire

Hey guys!

I'm trying to decompress the grids in the save files, especially the terrainGrid top.

What I've gathered from these old forum posts:
https://ludeon.com/forums/index.php?topic=1083.0
https://ludeon.com/forums/index.php?topic=37928.0
https://ludeon.com/forums/index.php?topic=37089.0

My understanding:
The map is data in base 64, where one value corresponds to one square.
This data is compressed with DEFLATE.
To make the save file grid readable, one needs to:

  • Convert the base64 data into binary.
  • Decompress the binary using INFLATE.
  • Convert the decompressed binary into base64.
  • Correspond each value to a map feature.

Let me demonstrate the problem I run into.

The terrainGrid for flat land is:
<terrainGrid>
<topGridDeflate>
7c5RCQAhFAAwuCo2OcFEYgR7mMZafokRnuBYgZW/Nm4xc/QAAAAAAAAAAAAAAAAAgG306AEAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMCLvhQ94Bh9AQ==
</topGridDeflate>


First, convert to binary:
11101101110011100101000100001001000000000010000100010100000000000011000010111000001010100011011000111001110000010100010001100010000001000111101110011000110001100101101001111110100010010001000110011110111000000101100010000001100101011011111100110110011011100011000101110011111101000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000011011011111010011101000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000110000001000101110111110000101000011110111100000000110000111110100000001

Second, decompress the binary with Python:
import zlib
with open("binary.txt", "rb") as binary_file:
    data = binary_file.read()
decompressed_data = zlib.decompress(data)


The result is:
Error -3 while decompressing data: incorrect header check

And I understand why!

Every Grid in every save file I found starts with "7". "7" in Base64 has binary 111011, according to the Base64 Wikipedia entry. So every Grid in binary has "11" as its second and third bit. According to the DEFLATE Wikipedia entry, each Deflate-stream starts with a three-bit header. Second and third bit "11" means "reserved, don't use". So it's no wonder I get an "incorrect header check".
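The header reasoning above can be checked directly in Python. This is a minimal sketch, not a definitive diagnosis: it relies on RFC 1951 packing DEFLATE header bits starting from the least-significant bit of each byte, and on RFC 1950 requiring a zlib stream to begin with a two-byte header whose compression method is 8 and whose value is a multiple of 31. The variable names are illustrative only.

```python
import base64

# First four base64 characters of the grid above decode to three bytes.
first_bytes = base64.b64decode("7c5R")
b0, b1 = first_bytes[0], first_bytes[1]

# Raw DEFLATE view: header bits are read from the least-significant bit upward.
bfinal = b0 & 0b1          # 1 = this is the last block
btype = (b0 >> 1) & 0b11   # 0 stored, 1 fixed Huffman, 2 dynamic Huffman, 3 reserved

# zlib-wrapper view: the first two bytes would have to form a valid header.
cm = b0 & 0x0F                         # compression method, must be 8 for zlib
fcheck_ok = (b0 * 256 + b1) % 31 == 0  # header check required by the zlib format

print(f"first byte 0x{b0:02x}: BFINAL={bfinal}, BTYPE={btype}")
print(f"zlib header plausible: {cm == 8 and fcheck_ok}")
```

Read this way, the first byte gives BFINAL=1 and BTYPE=2 (dynamic Huffman) rather than the reserved value, and the first two bytes do not form a zlib header. That would be consistent with a headerless DEFLATE stream being handed to a function that expects the zlib wrapper.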

I have found no clues about Rimworld-Savegame using a special header. According to neitsa:
Quote: It's a "simple" deflate algorithm, nothing fancy. I guess it can be opened with common tools that handle the ZIP format.
Quote: The compression is just a zlib compression, without specific headers.

My question is:
How can I decompress the grids in the save files?

Thank you for reading and thinking along!




I looked into jamessimo's code of Rimmap, and he started with the first two steps as well. But I'm not familiar with JavaScript. I'm not really a programmer, and I was hoping to do everything in Python. But I can't recreate jamessimo's code.

decompress(rawGrid) {
  //TOPGRID
  //DECODE BASE 64 TO BINARY
  let binary = atob(rawGrid);
  let output = [];
  //INFLATE/DECOMPRESS TOPGRID
  try {
    output = pako.inflate(binary, {
      raw: true
    });
  } catch (err) {
    console.log(err);
  }
  return (this.delaceArray(output));
}


Step one: let binary = atob(rawGrid);   I can follow here.
Step two: output = pako.inflate(binary, {raw: true})   I can't follow this step. pako.min.js is too confusing for me to understand what's happening.
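For comparison, here is a rough Python counterpart of those two steps. It is a sketch under one assumption: that the grid is a raw DEFLATE stream with no zlib header, which is what `raw: true` in the snippet above suggests. The function name `decompress_grid` is made up for this example. Passing a negative window size to `zlib.decompress` is the standard-library way to request headerless inflation.

```python
import base64
import zlib


def decompress_grid(raw_grid):
    """Turn the base64 text of a grid into decompressed bytes."""
    # Step one: base64 text -> compressed bytes (the role of atob(rawGrid)).
    # Line breaks inside the text are skipped by b64decode by default.
    compressed = base64.b64decode(raw_grid)
    # Step two: inflate without expecting a zlib header
    # (the role of pako.inflate(binary, {raw: true})).
    return zlib.decompress(compressed, -zlib.MAX_WBITS)
```

Calling `decompress_grid(text)` with the text between the `<topGridDeflate>` tags should then return a `bytes` object rather than raising the header error, provided the assumption about the stream format holds.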

Somehow, jamessimo managed to decompress the grids in the save files. To quote XKCD: "Who were you, jamessimo? What did you see?!"

ForeverZer0

You are confusing some terms, which is tripping you up. "Binary" here does not mean converting the data so you can view it in binary notation; it means "raw data". The grid is simply compressed and then base64-encoded so that it can be written to an XML file, since base64 does not use any characters that are forbidden in XML.


  • Load the data as a string
  • Decompress it into raw bytes with whatever base64 decode method you are using
  • Deserialize those bytes into the respective game objects
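The three steps above could look roughly like this in Python. Treat it as a sketch with two loudly stated assumptions: the stream is raw DEFLATE (no zlib header), and each map cell is stored as one unsigned 16-bit little-endian value. Neither is confirmed in this thread, and the function name `load_grid` is hypothetical.

```python
import base64
import struct
import xml.etree.ElementTree as ET
import zlib


def load_grid(xml_text, tag="topGridDeflate"):
    """Return the grid stored under `tag` as a tuple of integers."""
    # 1. Load the data as a string: take the text of the first matching element.
    node = next(ET.fromstring(xml_text).iter(tag), None)
    if node is None or node.text is None:
        raise ValueError(f"no <{tag}> element with text found")
    # 2. Base64 text -> compressed bytes -> decompressed bytes.
    #    ASSUMPTION: raw DEFLATE, hence the negative window size.
    raw = zlib.decompress(base64.b64decode(node.text), -zlib.MAX_WBITS)
    # 3. Deserialize the bytes.
    #    ASSUMPTION: one unsigned 16-bit little-endian value per cell.
    count = len(raw) // 2
    return struct.unpack(f"<{count}H", raw[: count * 2])
```

If the cell format turns out to be different, only the `struct.unpack` format string in step 3 needs to change. Mapping each number to a map feature is a separate lookup that this sketch does not attempt.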

I am not a fan of Python, so I'm not entirely sure which methods/API to tell you to use specifically. I'm more familiar with C/C++, C#, and Ruby, if an example in any of those would help you understand.

Spitfire

Thank you for your answer!

Quote: Decompress it into raw bytes with whatever base64 decode method you are using

That's the step I still don't understand, as the problem of an incorrect header still persists.

Quote: more familiar with C/C++, C#, and Ruby if any of those would help you to understand with example.

Yes, an example would help very much. I think C/C++ will work.

ForeverZer0

Quote from: Spitfire on November 19, 2019, 02:30:31 AM
Thank you for your answer!

Quote: Decompress it into raw bytes with whatever base64 decode method you are using

That's the step I still don't understand, as the problem of an incorrect header still persists.

Base64 does not use a header, so you should not be getting an error decoding it.


Quote: more familiar with C/C++, C#, and Ruby if any of those would help you to understand with example.

This is some pseudo-code using what I recall off the top of my head from Ruby; it might not be exact.

require 'base64'
require 'zlib'

# Assume that "xml_string" is the string that you read from the file, a bunch of random-looking characters

# "decoded" will still be a bunch of raw data, still compressed
decoded = Base64.decode64(xml_string)

# "uncompressed" will still be raw data, but is now uncompressed, and ready to be deserialized into an object
# Negative window bits ask for raw DEFLATE (no zlib header), matching "raw: true" in the Rimmap code
uncompressed = Zlib::Inflate.new(-Zlib::MAX_WBITS).inflate(decoded)


Like I said, I'm not super familiar with Python, but also make sure that none of the strings you are using get encoded into a text encoding (ASCII, UTF-8, etc.). You want them all to be byte strings (i.e. raw, without encoding).
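To make the byte-string point concrete, here is a small Python sketch of the difference between a printed string of 0s and 1s and the raw bytes it depicts. The helper name `bits_to_bytes` is made up for this example, and the digits are the first 24 from the binary listing earlier in the thread.

```python
import base64


def bits_to_bytes(bit_string):
    """Turn a printed string of '0'/'1' characters back into the bytes it depicts."""
    bit_string = "".join(bit_string.split())  # drop spaces and line breaks
    if len(bit_string) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    return int(bit_string, 2).to_bytes(len(bit_string) // 8, "big")


printed = "111011011100111001010001"  # first 24 digits of the binary listing
as_text = printed.encode("ascii")     # what a text file full of digits contains
as_bytes = bits_to_bytes(printed)     # the raw data those digits describe

print(len(as_text), len(as_bytes))           # 24 3
print(as_bytes == base64.b64decode("7c5R"))  # True
```

A file of digit characters is eight times longer than the data it represents, and a decompressor reads it as the ASCII codes for "1" and "0". Decoding the base64 text straight into a `bytes` object skips that detour entirely.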