SQL dump files are highly repetitive. I have been using bzip2 (pbzip2) for over 20 years, but is it really a sane choice in 2025?
We already know about gzip (even older than bzip2): it is super fast, but its compression ratio is not great. So the contenders in question are zstd, xz (LZMA2), and brotli.
zstd -19 -T12 myfile.bin (-19 is the highest regular compression level, lower values such as -5 are faster; -T12 uses 12 threads; the original file is kept by default)
unzstd myfile.bin.zst (or zstd -d)
xz -k -T0 -9 myfile.bin (-T0 auto-detects the thread count, -9 is the highest compression level, -k keeps the original file)
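To sanity-check either tool before trusting it with a real dump, a quick round trip on a throwaway file works. A minimal sketch, assuming the zstd binaries are installed; the sample paths under /tmp are made up for illustration:

```shell
# Create a small repetitive sample file (a stand-in for a SQL dump)
seq 1 10000 > /tmp/sample.sql

# Compress with zstd (keeps the original), then decompress to a new name
zstd -q -19 -T0 /tmp/sample.sql -o /tmp/sample.sql.zst
unzstd -q /tmp/sample.sql.zst -o /tmp/sample.restored.sql

# Verify the round trip was lossless
cmp /tmp/sample.sql /tmp/sample.restored.sql && echo "round trip OK"
```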
Compressing a 200 GB SQL file on an Intel i7-4930K using 8 of its 12 threads:
zstd at compression level 19 (max) = 20.63% (40 GB)
xz at compression level 9 (max) = 38 GB (19%)
So xz at max compression took much longer to finish, but produced a file that was 2 GB smaller.
zstd at compression level 15 seems to be the sweet spot. I will redo the experiments soon and publish more precise numbers on compression level versus ratio and time.
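To pick a level methodically rather than by feel, a simple loop can time each level and report the resulting size. A sketch, assuming zstd is installed; "myfile.sql" is a placeholder for your actual dump:

```shell
# Try a range of zstd levels on the same input and record time + size
for level in 5 10 15 19; do
    start=$(date +%s)
    zstd -q -f -"$level" -T0 myfile.sql -o "myfile.sql.$level.zst"
    echo "level $level: $(( $(date +%s) - start ))s, $(wc -c < "myfile.sql.$level.zst") bytes"
done
```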
There is not much to it: the tar command piped into the compression program of your choice.
For speed, rather than using gzip, you can use pigz (to employ multiple processors / processor cores), or pbzip2, which is slower but compresses better.
cd into the directory that contains your folder, then:
tar -c mysql | pbzip2 -vc > /hds/dbdirzip.tar.bz2
For more compression:
tar -c mysql | pbzip2 -vc -9 > /hds/dbdirzip.tar.bz2
For more compression while limiting the CPUs to 6 instead of 8 (or 3 instead of 4, or whatever you want to use), since the default is to use all cores:
tar -c mysql | pbzip2 -vc -9 -p6 > /hds/dbdirzip.tar.bz2
The same with pigz:
tar cvf - mysql | pigz -9 > /hds/dbdirzip.tar.gz
Or to limit the number of processors to 6, for example:
tar cvf - mysql | pigz -9 -p6 > /hds/dbdirzip.tar.gz
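To restore any of these archives later, plain tar can read the gzip/bzip2 streams that pigz/pbzip2 produce, or you can pipe through the parallel tool yourself. A sketch, reusing the archive paths from the examples above; /restore/target is a placeholder:

```shell
# Extract the pigz-made archive; tar's built-in gunzip is single-threaded
tar -xzf /hds/dbdirzip.tar.gz -C /restore/target

# Or decompress in parallel by piping through pigz / pbzip2
pigz -dc /hds/dbdirzip.tar.gz | tar -x -C /restore/target
pbzip2 -dc /hds/dbdirzip.tar.bz2 | tar -x -C /restore/target
```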
Now, if you want to compress a single file into a different directory:
pbzip2 -cz somefile > /another/directory/compressed.bz2
I am getting old and my brain isn't as bright as it used to be. For example, I am so used to using
mysqldump --opt -u root --password="LaooIa12@Hsu" mytodolist | gzip > mytodolist.bin.gz
to compress the database while dumping it, so that the uncompressed dump never hits the disk (this is called piping).
The problem is that I hadn't thought to check whether pbzip2 supports this too, so that I could use all 6 CPU cores (12 vCPUs).
It turns out that pbzip2 (parallel bzip2) does work in a pipe:
mysqldump --opt -u root --password="LaooIa12@Hsu" mytodolist | pbzip2 -vc > mytodolist.bin.bz2
Now I can get it done much faster, roughly 6 to 12 times faster.
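For completeness, the same pipe works in reverse when restoring: decompress in parallel and feed the SQL straight into mysql, again without an uncompressed copy touching the disk. A sketch, using the database name and credentials from the example above:

```shell
# Parallel decompression piped directly into the mysql client
pbzip2 -dc mytodolist.bin.bz2 | mysql -u root --password="LaooIa12@Hsu" mytodolist
```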