Lossless Data Compression for Extremely Big Data - Planetary Artificial Intelligence

I want to create a planet-sized artificial intelligence environment that simulates underground life in a very large world. According to Wikipedia, planet Earth has an area of 510,072,000 km2; I want to create a square with similar proportions, maybe larger. I will store one meter per bit, where 0 means dirt and 1 means a wall of dirt.
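
The one-bit-per-meter layout can be sketched roughly as follows. This is only an illustration: the class name BitRow and its methods are hypothetical, and it assumes bit 0 of each byte holds the lowest-numbered meter.

```csharp
using System;

// Hypothetical helper: packs one row of the world at one bit per meter.
public static class BitRow
{
    // Bytes needed to hold `meters` cells, rounded up to a whole byte.
    public static long BytesFor(long meters) => (meters + 7) / 8;

    // True if cell x is a wall (bit set), false if it is dirt.
    public static bool Get(byte[] row, long x) =>
        (row[x >> 3] & (1 << (int)(x & 7))) != 0;

    // Sets or clears the bit for cell x.
    public static void Set(byte[] row, long x, bool wall)
    {
        byte mask = (byte)(1 << (int)(x & 7));
        if (wall) row[x >> 3] |= mask;
        else row[x >> 3] &= unchecked((byte)~mask);
    }

    public static void Main()
    {
        byte[] row = new byte[BytesFor(20)]; // 20 meters fit in 3 bytes
        Set(row, 10, true);
        Console.WriteLine(Get(row, 10));
        Console.WriteLine(Get(row, 9));
    }
}
```

Adding water and lava would need two bits per meter, so the same idea applies with four cells per byte instead of eight.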

First, let us work out how to store one row of this square. One row is 510,072,000,000 m long, and each byte stores 8 meters, so one row takes 63,759,000,000 B (59.38 GB), and the whole world will be 3.44 PB. I would also like to store at least water and lava for every square meter, which needs two bits per meter and so doubles these figures.
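
As a rough check of this arithmetic, the sketch below recomputes the row size and the total. The class name WorldSize is made up for the example, and it assumes the square has 510,072,000,000 rows of 510,072,000,000 m each. The per-row figure comes out as stated, but multiplying it by the number of rows gives about 3.25 * 10^22 bytes, far more than 3.44 PB; that smaller number appears to come from squaring the 59.38 GB figure rather than multiplying by the row count.

```csharp
using System;

// Hypothetical helper for checking the storage estimate.
public static class WorldSize
{
    public const long SideMeters = 510_072_000_000L;

    // One bit per meter, eight meters per byte, rounded up.
    public static long BytesPerRow(long sideMeters) => (sideMeters + 7) / 8;

    // Rows times bytes per row. The result overflows a 64-bit long,
    // so decimal is used here.
    public static decimal TotalBytes(long sideMeters) =>
        (decimal)BytesPerRow(sideMeters) * sideMeters;

    public static void Main()
    {
        long row = BytesPerRow(SideMeters);
        decimal total = TotalBytes(SideMeters);
        Console.WriteLine($"bytes per row: {row}");
        Console.WriteLine($"GiB per row:   {row / Math.Pow(2, 30):F2}");
        Console.WriteLine($"total bytes:   {total}");
        Console.WriteLine($"total ZiB:     {(double)total / Math.Pow(2, 70):F2}");
    }
}
```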

I need to compress this information with a lossless compression algorithm. I first tried a very direct approach with 7-Zip on a smaller world where one row is 6375 B. In theory the world should be 6375^2 B = 38.76 MB, but the file I actually get is 155 MB, and I do not know where the difference comes from. When I compress it with 7-Zip, I get a 40.1 MB file. That is a huge difference, and at this ratio my 3.44 PB world file would shrink to about 912.21 TB.
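
Because the test file is filled with random numbers, it may help to see how a general-purpose compressor treats random bytes compared with uniform ones. The sketch below uses .NET's DeflateStream as a stand-in for 7-Zip (a different algorithm, so the exact numbers will differ), and the class name CompressDemo is hypothetical. Random bytes should barely shrink at all, while a buffer of identical bytes shrinks to almost nothing.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical demo: Deflate on random data versus uniform data.
public static class CompressDemo
{
    // Returns the size in bytes of `data` after Deflate compression.
    public static long DeflatedSize(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionLevel.Optimal, true))
            {
                deflate.Write(data, 0, data.Length);
            } // disposing the DeflateStream flushes the final block
            return output.Length;
        }
    }

    // Fills a buffer with pseudo-random bytes from a fixed seed.
    public static byte[] RandomBytes(int count, int seed)
    {
        var buffer = new byte[count];
        new Random(seed).NextBytes(buffer);
        return buffer;
    }

    public static void Main()
    {
        const int n = 1 << 20; // 1 MiB test buffers
        byte[] random = RandomBytes(n, 42);
        byte[] zeros = new byte[n]; // all dirt

        Console.WriteLine($"random: {n} -> {DeflatedSize(random)} bytes");
        Console.WriteLine($"zeros:  {n} -> {DeflatedSize(zeros)} bytes");
    }
}
```

A real world map with large uniform regions would sit somewhere between these two extremes, so a test with random fill says little about how well the actual data would compress.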

My first thought: why is the file so large when the math tells me it should be smaller? Maybe the problem is in the code, or maybe I made a mistake in the math. The code is as follows (C#):

// 510072000000m each line = 63759000000B
const long SIZE = 6375;

// Create the new, empty data file.
string fileName = tbFile.Text;

FileStream fs = new FileStream(fileName, FileMode.Create);

// Create the writer for data.
BinaryWriter w = new BinaryWriter(fs);

// Use random numbers to fill the data
Random random = new Random();
// Write data to the file.
for (int i = 0; i < SIZE; i++)
{
    for (int j = 0; j < SIZE; j++)
    {
        w.Write(random.Next(0,256));
    }
}

w.Close();

fs.Close();

And the math is so basic that if I made a mistake somewhere, I cannot see it.

In C#, an int is 4 bytes (6375 * 6375 * 4 bytes = 155 MB). Here, Write is being called with a 32-bit integer.
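
The effect can be checked on a small scale with a sketch like the one below. Random.Next(0, 256) returns an int, so BinaryWriter picks its Write(int) overload and emits 4 bytes per value; casting to byte selects Write(byte) and emits 1 byte. The class and method names are hypothetical, and a MemoryStream stands in for the file.

```csharp
using System;
using System.IO;

// Hypothetical demo: bytes written by Write(int) versus Write(byte).
public static class WriteOverloads
{
    // Writes `count` random values as 32-bit ints, as the question's code does.
    public static long SizeWhenWritingInts(int count)
    {
        var random = new Random(1);
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms))
        {
            for (int i = 0; i < count; i++)
                w.Write(random.Next(0, 256));       // Write(int): 4 bytes each
            w.Flush();
            return ms.Length;
        }
    }

    // Writes the same kind of values as single bytes.
    public static long SizeWhenWritingBytes(int count)
    {
        var random = new Random(1);
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms))
        {
            for (int i = 0; i < count; i++)
                w.Write((byte)random.Next(0, 256)); // Write(byte): 1 byte each
            w.Flush();
            return ms.Length;
        }
    }

    public static void Main()
    {
        Console.WriteLine(SizeWhenWritingInts(1000));
        Console.WriteLine(SizeWhenWritingBytes(1000));
    }
}
```

With the cast in place, the 6375 x 6375 test file should come out at the expected 40,640,625 bytes (38.76 MB) instead of four times that.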
