Lzip has been designed, written, and tested with great care to be the standard general-purpose compressor for unix-like systems. On this page you can find some (totally unscientific[1]) tests comparing (de)compression speeds and sizes of gzip, bzip2, and lzip. In short, lzip is the perfect replacement for most uses of gzip and bzip2. It can be about as fast as gzip or can compress more than bzip2 (but not at the same time).
Lzip is probably the best compressor for local online documentation (texinfo manuals and man pages). On average, it produces compressed texinfo manuals 19% smaller and man pages 6% smaller than gzip, without noticeable differences in decompression speed. It also requires very little memory to decompress. For example, 'lzip.1.lz' is decompressed on my machine in 2 ms using 82 kB of RAM.
In the tests below, times are measured compressing or decompressing from RAM to /dev/null on an idle machine and taking the best of three trials.
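The measurement procedure described above (data held in RAM, output discarded, best of three trials) can be sketched as follows. This is only an illustration in Python using the standard gzip module, not the harness used for the tables on this page; the function name best_of and the sample data are assumptions.

```python
import gzip
import time

def best_of(func, arg, trials=3):
    """Return the shortest wall-clock time of `trials` calls to func(arg)."""
    best = float("inf")
    for _ in range(trials):
        start = time.perf_counter()
        func(arg)                      # result is discarded, like writing to /dev/null
        best = min(best, time.perf_counter() - start)
    return best

# Sample data already in RAM (hypothetical; the real tests use the files listed on this page).
data = b"The quick brown fox jumps over the lazy dog.\n" * 20000
compressed = gzip.compress(data)

print(f"compress:   {best_of(gzip.compress, data):.4f} s")
print(f"decompress: {best_of(gzip.decompress, compressed):.4f} s")
```

Taking the minimum rather than the mean filters out trials disturbed by other activity on the machine.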
The compressors tested are:
  gzip-1.11
  bzip2-1.0.6
  lzip-1.23

The files tested are:
  cantrbry.tar
  gcc-4.7.2.tar
  gmp-5.0.1.tar
  hawaii-c              Digital elevation map (DEM) of Hawaii from the USGS database, "HAWAII - C HI NE05-01W"
  icecat-3.5.3-x86.tar
  solfege-3.14.6.tar
The bzip2 format stores neither the uncompressed nor the compressed size of each block in the file. Therefore it can't list the file sizes. Bzip2 does not even provide a '--list' option.
The gzip format only stores the uncompressed size truncated to 32 bits. Therefore it can only accurately list files with an uncompressed size smaller than 4 GiB, and with at most one non-empty member. In all other cases, the uncompressed size reported is wrong:
$ gzip-1.11 --list 4GiBzeros.gz
         compressed        uncompressed  ratio uncompressed_name
            4168175                   0   0.0% 4GiBzeros
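The zero in the 'uncompressed' column above follows directly from the format: the last four bytes of a gzip member hold the uncompressed size modulo 2^32 (the ISIZE field of RFC 1952), and 4 GiB is exactly 2^32. A minimal Python sketch, assuming the helper name trailer_size and small in-memory test data:

```python
import gzip
import struct

def trailer_size(blob: bytes) -> int:
    """Read ISIZE: the last 4 bytes of a gzip file, little-endian, size mod 2**32."""
    return struct.unpack("<I", blob[-4:])[0]

one = gzip.compress(b"a" * 1000)
two = one + gzip.compress(b"b" * 10)     # two concatenated members

print(trailer_size(one))                 # 1000: correct for a single small member
print(trailer_size(two))                 # 10: only the last member is reported
print(len(gzip.decompress(two)))         # 1010: the real size of both members
print((4 * 2**30) % 2**32)               # 0: what a 32-bit field keeps of 4 GiB
```

The second and third lines of output show the multimember case mentioned above: the trailer at the end of the file knows nothing about earlier members.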
Recent versions of the gzip tool try to overcome the limited uncompressed size in the format by making '--list' decompress the file and report the full sizes. This works, and for small files the overhead is small, but for files of about 4 GiB of uncompressed size it is about 10_000 times slower than reading the sizes from the member trailer (21 seconds instead of 2 ms):
$ gzip-1.13 --list 4GiBzeros.gz
         compressed        uncompressed  ratio uncompressed_name
            4168175          4294967296  99.9% 4GiBzeros
The lzip format provides a distributed index with 64-bit fields allowing it to efficiently print correct uncompressed and compressed sizes even for multimember files. Lzip provides an efficient and reliable '--list' option:
$ lzip --list 2x4GiBzeros.lz 4GiBzeros.lz
   uncompressed     compressed   saved  name
     8589934592        1211910  99.99%  2x4GiBzeros.lz
     4294967296         605955  99.99%  4GiBzeros.lz
    12884901888        1817865  99.99%  (totals)
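The sizes can be read without decompressing because each lzip member ends with a 20-byte trailer holding the CRC32 of the data, the uncompressed data size, and the total member size, the last two as 64-bit little-endian integers. A reader can therefore hop from trailer to trailer starting at the end of the file. The Python sketch below illustrates the idea only; it is not how lzip itself is written, and fake_member is a hypothetical helper that wraps a dummy payload (not real LZMA data) in a header and trailer so that the example runs without lzip installed.

```python
import struct

HEADER_SIZE = 6                      # "LZIP", version byte, coded dictionary size
TRAILER = struct.Struct("<IQQ")      # CRC32, data size, member size (20 bytes)

def fake_member(data_size: int, payload_len: int) -> bytes:
    """Hypothetical helper: lzip-shaped member with a dummy (non-LZMA) payload."""
    member_size = HEADER_SIZE + payload_len + TRAILER.size
    header = b"LZIP" + bytes([1, 0x0C])
    return header + bytes(payload_len) + TRAILER.pack(0, data_size, member_size)

def list_sizes(blob: bytes):
    """Return (uncompressed, compressed, members) by walking trailers backwards."""
    pos, total, members = len(blob), 0, 0
    while pos > 0:
        if pos < HEADER_SIZE + TRAILER.size:
            raise ValueError("truncated member")
        _crc, data_size, member_size = TRAILER.unpack_from(blob, pos - TRAILER.size)
        start = pos - member_size
        if (member_size < HEADER_SIZE + TRAILER.size or start < 0
                or blob[start:start + 4] != b"LZIP"):
            raise ValueError("bad member")
        total += data_size           # 64-bit field, so 4 GiB does not wrap
        members += 1
        pos = start                  # jump to the end of the previous member
    return total, len(blob), members

two = fake_member(4 * 2**30, 100) + fake_member(4 * 2**30, 100)
print(list_sizes(two))               # (8589934592, 252, 2)
```

Only one trailer per member is read, so the cost depends on the number of members and not on the amount of data.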
Gzip decompresses most files faster than lzip, but lzip is fast enough that when reading from, or writing to, storage media the speed difference is significantly reduced. Thanks to its distributed index, multimember lzip files can be decompressed in parallel, allowing plzip to decompress as fast as gzip when plzip uses two processors, or faster if more processors are available. (This test does not preload the compressed files into RAM):
$ ls -go linux-libre-3.12.5-gnu.tar*
-rw-r--r-- 1 535347200 Dec 12  2013 linux-libre-3.12.5-gnu.tar
-rw-r--r-- 1 112399638 Dec 12  2013 linux-libre-3.12.5-gnu.tar.gz
-rw-r--r-- 1  74330266 Dec 12  2013 linux-libre-3.12.5-gnu.tar.lz
-rw-r--r-- 1  75127916 Dec 12  2013 linux-libre-3.12.5-gnu-mm.tar.lz
$ time gzip -t linux-libre-3.12.5-gnu.tar.gz
real    0m5.295s
user    0m3.580s
sys     0m0.050s
$ time lzip -t linux-libre-3.12.5-gnu.tar.lz
real    0m7.227s
user    0m7.080s
sys     0m0.070s
$ time plzip -t linux-libre-3.12.5-gnu-mm.tar.lz
real    0m5.642s
user    0m8.450s
sys     0m0.220s
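To put the timings above on a common scale, the uncompressed size can be divided by the elapsed ('real') time. The short Python calculation below uses only the numbers from the listing; the dictionary layout and function name are assumptions of the sketch.

```python
TAR_SIZE = 535347200                     # bytes, linux-libre-3.12.5-gnu.tar

# tool: (real seconds, user seconds), copied from the listing above
timings = {
    "gzip -t":  (5.295, 3.580),
    "lzip -t":  (7.227, 7.080),
    "plzip -t": (5.642, 8.450),
}

def throughput_mb_s(size, seconds):
    """Decompressed output rate in MB/s (1 MB = 10**6 bytes)."""
    return size / seconds / 1e6

for tool, (real, user) in timings.items():
    # a user/real ratio above 1 means more than one processor was busy
    print(f"{tool:9} {throughput_mb_s(TAR_SIZE, real):6.1f} MB/s  user/real = {user / real:.2f}")
```

For plzip the user time exceeds the real time, which is the signature of the parallel decompression described above.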
The following table shows that "lzip -0" is comparable both on compression ratio and compression speed with gzip's default compression level. Lzip decompression is slower here than in the bzip2 table below because it speeds up with compression ratio.
file          cantrbry        gcc        gmp   hawaii-c     icecat    solfege
size           2821120  529940480   12687360    9840640   32419840   15964160

gzip
  size          739064  107838136    2652995    1772672   12085988    3606713
  time          0.212s    23.112s     0.568s     0.843s     2.457s     0.556s
  time -d       0.023s     3.457s     0.087s     0.067s     0.305s     0.097s

lzip -0
  size          589704  106353465    2607594    1595903   11234174    3321985
  time          0.144s    22.265s     0.543s     0.384s     2.196s     0.699s
  time -d       0.057s     8.939s     0.221s     0.149s     0.921s     0.284s
Bzip2, having an algorithm very different from those of gzip and lzip, is more difficult to match. "lzip -3" seems to be the closest replacement for "bzip2 -9", even if variations between the two are notable. Note that lzip decompresses about 3 times faster than bzip2.
file          cantrbry        gcc        gmp   hawaii-c     icecat    solfege
size           2821120  529940480   12687360    9840640   32419840   15964160

bzip2 -9
  size          570856   82994239    2006109     708873   11162963    3047512
  time          0.383s      2m10s     2.882s     6.712s     7.488s     7.859s
  time -d       0.138s    27.838s     0.673s     0.548s     2.430s     0.801s

lzip -3
  size          519202   86981371    2063379    1261624   10070541    2811227
  time          0.760s      1m55s     2.716s     1.712s    13.197s     2.576s
  time -d       0.054s     7.710s     0.184s     0.122s     0.848s     0.256s
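The "about 3 times faster" figure can be checked from the "time -d" rows of the table above. The Python sketch below computes the per-file speedup and its geometric mean; the choice of the geometric mean as the summary is an assumption, not something stated on this page.

```python
from statistics import geometric_mean

files = ["cantrbry", "gcc", "gmp", "hawaii-c", "icecat", "solfege"]
bzip2_d = [0.138, 27.838, 0.673, 0.548, 2.430, 0.801]   # "time -d" for bzip2 -9
lzip_d  = [0.054,  7.710, 0.184, 0.122, 0.848, 0.256]   # "time -d" for lzip -3

# speedup = bzip2 decompression time / lzip decompression time
speedups = [b / l for b, l in zip(bzip2_d, lzip_d)]

for name, s in zip(files, speedups):
    print(f"{name:9} {s:.1f}x")
print(f"geometric mean {geometric_mean(speedups):.1f}x")
```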
Lzip goes beyond gzip and bzip2 on compression ratio. Here is the complete range of compressed sizes produced from the files above.
file          cantrbry        gcc        gmp   hawaii-c     icecat    solfege
size           2821120  529940480   12687360    9840640   32419840   15964160

gzip -9         736221  106713313    2632726    1574440   12041750    3561394
bzip2 -9        570856   82994239    2006109     708873   11162963    3047512
lzip -0         589704  106353465    2607594    1595903   11234174    3321985
lzip -1         583538  100356610    2393196    1664538   10881157    2977186
lzip -2         554104   94105269    2254242    1564685   10529621    2900473
lzip -3         519202   86981371    2063379    1261624   10070541    2811227
lzip -4         498387   78981585    1877795    1133250    9596939    2758507
lzip -5         488380   73417637    1748853     905509    9323190    2625566
lzip -6         486875   68613820    1679403     749030    9118578    2583494
lzip -7         484132   63126266    1653092     710414    9046260    2477311
lzip -8         482663   61649754    1642662     702582    8980468    2462646
lzip -9         481413   60880185    1639086     700891    8975976    2455262
Xz has a complex format, partially specialized in the compression of executables and designed to be extended by proprietary formats. Of the four compressors tested here, xz is the only one alien to the Unix concept of "doing one thing and doing it well". It is inadequate for long-term archiving and inadvisable for data sharing and for free software distribution. If you are distributing software in xz format, please consider using lzip instead. See "Xz format inadequate for long-term archiving".
Don't get me wrong. I am very grateful to Igor Pavlov for inventing/discovering LZMA, but xz is the third attempt of his followers to take advantage of the popularity of 7-zip and replace gzip and bzip2 with inappropriate or poorly designed formats. In particular, it is regrettable that support for lzma-alone was implemented in both GNU and Linux.
But some users have asked how lzip compares with xz, so I have added some tests.
I have downloaded the latest version of all the projects I could find in ftp.gnu.org that are distributing xz tarballs but are not yet distributing lzip tarballs. Below is a directory listing containing the downloaded tar.xz files and their tar.lz versions produced with "lzip -9". [Note: since 2015-07-08 I just add or remove projects to this list as needed. Keeping all projects updated to the latest version is too much work.]
Total size of tar.lz files = 228_324_089 bytes. Total size of tar.xz files = 233_953_780 bytes.
More than 2% of bandwidth could be saved if only the maintainers of these projects changed their automake setting from "dist-xz" to "dist-lzip".
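The percentage follows from the two totals above. A minimal Python check (the variable names are assumptions):

```python
lz_total = 228_324_089   # bytes, all tar.lz files
xz_total = 233_953_780   # bytes, all tar.xz files

saved = xz_total - lz_total
percent = 100 * saved / xz_total
print(f"{saved} bytes saved, {percent:.2f}% of the xz total")
```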
Note that each and every one of the 50 tar.lz files is smaller than its tar.xz version. The case of glibc-2.20.tar.lz is especially interesting. Lzip compressed it better than xz in spite of xz being invoked with the non-standard "--extreme" option and using twice as much RAM as lzip.
-rw-r--r-- 1  1209917 Apr 25  2012 autoconf-2.69.tar.lz
-rw-r--r-- 1  1214744 Apr 25  2012 autoconf-2.69.tar.xz
-rw-r--r-- 1   611691 Mar 20  2016 autoconf-archive-2016.03.20.tar.lz
-rw-r--r-- 1   613612 Mar 20  2016 autoconf-archive-2016.03.20.tar.xz
-rw-r--r-- 1  1014605 Aug 30  2014 autogen-5.18.4.tar.lz
-rw-r--r-- 1  1017936 Aug 30  2014 autogen-5.18.4.tar.xz
-rw-r--r-- 1  1485345 Dec 24  2013 automake-1.14.1.tar.lz
-rw-r--r-- 1  1488984 Dec 24  2013 automake-1.14.1.tar.xz
-rw-r--r-- 1   584053 Mar 30  2013 barcode-0.99.tar.lz
-rw-r--r-- 1   586028 Mar 30  2013 barcode-0.99.tar.xz
-rw-r--r-- 1   183037 May 12  2015 bool-0.2.2.tar.lz
-rw-r--r-- 1   183576 May 12  2015 bool-0.2.2.tar.xz
-rw-r--r-- 1   521980 Oct 11  2011 cflow-1.4.tar.lz
-rw-r--r-- 1   526880 Oct 11  2011 cflow-1.4.tar.xz
-rw-r--r-- 1   793511 Aug  1  2013 combine-0.4.0.tar.lz
-rw-r--r-- 1   794716 Aug  1  2013 combine-0.4.0.tar.xz
-rw-r--r-- 1   399832 Nov  2  2013 complexity-1.1.tar.lz
-rw-r--r-- 1   401220 Nov  2  2013 complexity-1.1.tar.xz
-rw-r--r-- 1  5364984 Jul 19  2014 coreutils-8.23.tar.lz
-rw-r--r-- 1  5375612 Jul 19  2014 coreutils-8.23.tar.xz
-rw-r--r-- 1   513690 Mar 16  2013 cppi-1.18.tar.lz
-rw-r--r-- 1   515664 Mar 16  2013 cppi-1.18.tar.xz
-rw-r--r-- 1  1423335 Mar  4  2012 dico-2.2.tar.lz
-rw-r--r-- 1  1445224 Mar  4  2012 dico-2.2.tar.xz
-rw-r--r-- 1  1192566 Mar 24  2013 diffutils-3.3.tar.lz
-rw-r--r-- 1  1197832 Mar 24  2013 diffutils-3.3.tar.xz
-rw-r--r-- 1 34000192 Mar 11  2013 emacs-24.3.tar.lz
-rw-r--r-- 1 35565352 Mar 11  2013 emacs-24.3.tar.xz
-rw-r--r-- 1  1887433 Aug 29  2019 findutils-4.7.0.tar.lz
-rw-r--r-- 1  1895048 Aug 29  2019 findutils-4.7.0.tar.xz
-rw-r--r-- 1  1464600 Jun  2  2010 gcal-3.6.tar.lz
-rw-r--r-- 1  1516104 Jun  2  2010 gcal-3.6.tar.xz
-rw-r--r-- 1 75482104 Jul  5  2017 gcc-6.4.0.tar.lz
-rw-r--r-- 1 76156220 Jul  5  2017 gcc-6.4.0.tar.xz
-rw-r--r-- 1 13824725 Mar  4  2012 gcide-0.51.tar.lz
-rw-r--r-- 1 14343984 Mar  4  2012 gcide-0.51.tar.xz
-rw-r--r-- 1 17512274 Jul 29  2014 gdb-7.8.tar.lz
-rw-r--r-- 1 17664316 Jul 29  2014 gdb-7.8.tar.xz
-rw-r--r-- 1   582706 Jun  4  2019 gengetopt-2.23.tar.lz
-rw-r--r-- 1   584860 Jun  4  2019 gengetopt-2.23.tar.xz
-rw-r--r-- 1 12267027 Sep  7  2014 glibc-2.20.tar.lz
-rw-r--r-- 1 12283992 Sep  7  2014 glibc-2.20.tar.xz
-rw-r--r-- 1 14275212 Jan  1  2013 gnu-ghostscript-9.06.0.tar.lz
-rw-r--r-- 1 15659620 Jan  1  2013 gnu-ghostscript-9.06.0.tar.xz
-rw-r--r-- 1   702944 Mar 23  2014 gnu-pw-mgr-1.2.tar.lz
-rw-r--r-- 1   705448 Mar 23  2014 gnu-pw-mgr-1.2.tar.xz
-rw-r--r-- 1  1232290 Jun  3  2014 grep-2.20.tar.lz
-rw-r--r-- 1  1237196 Jun  3  2014 grep-2.20.tar.xz
-rw-r--r-- 1  5058491 Jun 28  2012 grub-2.00.tar.lz
-rw-r--r-- 1  5136412 Jun 28  2012 grub-2.00.tar.xz
-rw-r--r-- 1  1134733 Oct  3  2015 gslip-1.0.2.tar.lz
-rw-r--r-- 1  1136628 Oct  3  2015 gslip-1.0.2.tar.xz
-rw-r--r-- 1   925244 Aug 12  2014 gtypist-2.9.5.tar.lz
-rw-r--r-- 1   929356 Aug 12  2014 gtypist-2.9.5.tar.xz
-rw-r--r-- 1   722360 Jun 10  2013 gzip-1.6.tar.lz
-rw-r--r-- 1   725084 Jun 10  2013 gzip-1.6.tar.xz
-rw-r--r-- 1   157458 Jul 26  2014 help2man-1.46.1.tar.lz
-rw-r--r-- 1   158796 Jul 26  2014 help2man-1.46.1.tar.xz
-rw-r--r-- 1   997137 Feb  3  2012 idutils-4.6.tar.lz
-rw-r--r-- 1  1001496 Feb  3  2012 idutils-4.6.tar.xz
-rw-r--r-- 1   613281 Sep  6  2018 indent-2.2.12.tar.lz
-rw-r--r-- 1   620280 Sep  6  2018 indent-2.2.12.tar.xz
-rw-r--r-- 1  1327183 Jan 13  2014 inetutils-1.9.2.tar.lz
-rw-r--r-- 1  1331608 Jan 13  2014 inetutils-1.9.2.tar.xz
-rw-r--r-- 1  4206085 Nov  8  2019 libredwg-0.9.2.tar.lz
-rw-r--r-- 1  4582968 Nov  8  2019 libredwg-0.9.2.tar.xz
-rw-r--r-- 1   854695 Oct 18  2011 libtool-2.4.2.tar.lz
-rw-r--r-- 1   868760 Oct 18  2011 libtool-2.4.2.tar.xz
-rw-r--r-- 1  1860152 Jul  8  2015 libunistring-0.9.6.tar.lz
-rw-r--r-- 1  1960488 Jul  8  2015 libunistring-0.9.6.tar.xz
-rw-r--r-- 1  1144715 Sep 22  2013 m4-1.4.17.tar.lz
-rw-r--r-- 1  1149088 Sep 22  2013 m4-1.4.17.tar.xz
-rw-r--r-- 1  2059431 Sep  8  2010 mailutils-2.2.tar.lz
-rw-r--r-- 1  2268636 Sep  8  2010 mailutils-2.2.tar.xz
-rw-r--r-- 1  1071213 Mar 13  2013 mpfr-3.1.2.tar.lz
-rw-r--r-- 1  1074388 Mar 13  2013 mpfr-3.1.2.tar.xz
-rw-r--r-- 1  1145822 Jul 16  2011 myserver-0.11.tar.lz
-rw-r--r-- 1  1176472 Jul 16  2011 myserver-0.11.tar.xz
-rw-r--r-- 1  1407848 Mar 31  2017 nano-2.8.0.tar.lz
-rw-r--r-- 1  1413796 Mar 31  2017 nano-2.8.0.tar.xz
-rw-r--r-- 1  1591986 Jul 29  2014 parted-3.2.tar.lz
-rw-r--r-- 1  1655244 Jul 29  2014 parted-3.2.tar.xz
-rw-r--r-- 1   672538 Sep 12  2012 patch-2.7.tar.lz
-rw-r--r-- 1   674544 Sep 12  2012 patch-2.7.tar.xz
-rw-r--r-- 1   593559 Jul  7  2010 rush-1.7.tar.lz
-rw-r--r-- 1   600248 Jul  7  2010 rush-1.7.tar.xz
-rw-r--r-- 1  1161654 Jan  4  2017 sed-4.3.tar.lz
-rw-r--r-- 1  1167168 Jan  4  2017 sed-4.3.tar.xz
-rw-r--r-- 1  1082263 Oct 19  2013 sharutils-4.14.tar.lz
-rw-r--r-- 1  1089052 Oct 19  2013 sharutils-4.14.tar.xz
-rw-r--r-- 1  3443366 Apr  8  2013 smalltalk-3.2.5.tar.lz
-rw-r--r-- 1  3513508 Apr  8  2013 smalltalk-3.2.5.tar.xz
-rw-r--r-- 1  2052694 Dec 17  2017 tar-1.30.tar.lz
-rw-r--r-- 1  2108028 Dec 17  2017 tar-1.30.tar.xz
-rw-r--r-- 1   222604 Jun 12  2013 teseq-1.1.tar.lz
-rw-r--r-- 1   223360 Jun 12  2013 teseq-1.1.tar.xz
-rw-r--r-- 1  3960770 Jun 26  2015 texinfo-6.0.tar.lz
-rw-r--r-- 1  4086712 Jun 26  2015 texinfo-6.0.tar.xz
-rw-r--r-- 1   326754 Jan 14  2017 vc-dwim-1.8.tar.lz
-rw-r--r-- 1   327492 Jan 14  2017 vc-dwim-1.8.tar.xz
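The claim that every tar.lz file is smaller than its tar.xz counterpart can be checked mechanically from a listing like the one above. The Python sketch below parses 'ls -go' style lines and reports any pair where the lzip file is not smaller; the function names are assumptions, and only four rows are included here to keep the example short.

```python
def parse_listing(text):
    """Map each file name in an 'ls -go' style listing to its size in bytes."""
    sizes = {}
    for line in text.strip().splitlines():
        fields = line.split()            # perms, links, size, month, day, year, name
        sizes[fields[-1]] = int(fields[2])
    return sizes

def larger_lz(sizes):
    """Return the tar.lz files that are NOT smaller than their tar.xz counterpart."""
    return [name for name, size in sizes.items()
            if name.endswith(".tar.lz") and size >= sizes[name[:-2] + "xz"]]

# Four rows copied from the listing above; paste the whole listing to check all 50 pairs.
sample = """
-rw-r--r-- 1 12267027 Sep  7  2014 glibc-2.20.tar.lz
-rw-r--r-- 1 12283992 Sep  7  2014 glibc-2.20.tar.xz
-rw-r--r-- 1  2059431 Sep  8  2010 mailutils-2.2.tar.lz
-rw-r--r-- 1  2268636 Sep  8  2010 mailutils-2.2.tar.xz
"""
print(larger_lz(parse_listing(sample)))   # [] means every .lz in the sample is smaller
```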
"xz -9" uses a dictionary size twice as large as "lzip -9" (and twice as large as "lzma -9"). This makes it appear as if xz could compress large files a little more than lzip. But if you pass to lzip the arguments equivalent to those of "xz -9" (or to xz the arguments equivalent to those of "lzip -9"), lzip will usually compress more than xz:
linux-libre-3.12.5-gnu.tar (size 535347200)
"lzip -m64 -s64MiB"                  74192464    9m16s
"xz -9"                              74306080    9m 7s
"lzip -9"                            74330266   10m53s
"xz --lzma2=nice=273,dict=32MiB"     74563636   10m15s
Note that using plain "-9" on both compressors, lzip usually compresses large files about as much as xz, while using half the RAM to compress and requiring half the RAM to decompress.
(This test was made using lzip-1.19 and xz-5.2.1).
If your unxz applet in busybox seems to decompress faster than the lunzip applet, it may be because you are trying to decompress standard xz files (those produced by default by xz-utils), whose integrity the unxz applet can't verify. Creating an xz file with the correct check type for the unxz applet (CRC32) usually makes it decompress slower than lunzip:
"busybox unxz -t linux-libre-3.12.5-gnu.tar.xz"          8.331s
"busybox lunzip -t linux-libre-3.12.5-gnu.tar.lz"        8.714s
"busybox unxz -t linux-libre-3.12.5-gnu.tar.crc32.xz"    9.723s
Note that error detection in the xz format is silently broken. Both xz-utils and the unxz applet ignore the recommendations of the xz format specification. Xz-utils uses by default an optional check type (CRC64) in the files it produces, preventing decompressors that do not support the optional check types from verifying the integrity of the data. The unxz applet does not warn if it finds an unsupported check type, which greatly increases the probability of corruption going unnoticed. It is unsafe to decompress standard xz files with busybox; even less safe than decompressing lzma-alone files. Corruption in compressed LZMA2 packets is detected about as unsafely as in lzma-alone, but the integrity of the uncompressed LZMA2 packets can't be verified at all, making corruption undetectable in a potentially large fraction of the file.
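The check type of an xz file can be read from the second byte of the Stream Flags field, which follows the 6-byte magic in the stream header (0x01 is CRC32, 0x04 is CRC64, and 0x0A is SHA-256 in the xz format specification). The Python sketch below shows the kind of warning a decoder supporting only CRC32 could give. The function names and the warning policy are assumptions of the sketch, not the behaviour of any existing tool.

```python
import lzma

XZ_MAGIC = b"\xfd7zXZ\x00"
CHECK_NAMES = {0x00: "none", 0x01: "CRC32", 0x04: "CRC64", 0x0A: "SHA-256"}

def xz_check_type(blob: bytes) -> int:
    """Return the check ID stored in the stream header of an xz file."""
    if blob[:6] != XZ_MAGIC:
        raise ValueError("not an xz stream")
    return blob[7] & 0x0F            # low 4 bits of the second Stream Flags byte

def can_verify(blob: bytes, supported=(0x01,)) -> bool:
    """True if a decoder supporting only `supported` check IDs can verify the data."""
    return xz_check_type(blob) in supported

default_xz = lzma.compress(b"example data")                      # default check: CRC64
crc32_xz = lzma.compress(b"example data", check=lzma.CHECK_CRC32)

for label, blob in (("default", default_xz), ("crc32", crc32_xz)):
    name = CHECK_NAMES.get(xz_check_type(blob), "unknown")
    note = "" if can_verify(blob) else "  (warning: integrity can't be verified)"
    print(f"{label}: {name}{note}")
```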
For example, sampling the files in the first test above with unzcrash[2] at 16 KiB intervals has found that unxz does not detect corruption in the following fractions of each file:
automake-1.14.1.tar.xz            4%
combine-0.4.0.tar.xz             15%
emacs-24.3.tar.xz                 0.7%
gcc-6.4.0.tar.xz                  0.2%
gcide-0.51.tar.xz                25%
gnu-ghostscript-9.06.0.tar.xz    10%
grub-2.00.tar.xz                  3%
octave-4.0.0.tar.xz               3.5%
texinfo-6.0.tar.xz                3%
A quick search revealed some more xz files unsafe for busybox:
cairo-1.14.6.tar.xz                                43%
firefox-47.0.1.source.tar.xz                       14%
firefox-kde-opensuse-47.0.1-1-x86_64.pkg.tar.xz    26%
gimp-2.8.18-i586-1.txz                              1.4%
gtk+-3.21.4.tar.xz                                  1.4%
libvorbis-1.3.5.tar.xz                              9.6%
linux-3.16.35.tar.xz                                0.3%
MPlayer-1.2_20160125-i586-3.txz                     1.3%
php-7.0.9.tar.xz                                    5%
Python-3.5.2.tar.xz                                 4.6%
ruby-2.3.1.tar.xz                                   2.6%
And here is a quick test of the lack of safe interoperability between xz-utils and xz-embedded (busybox):
# First create an xz file containing an uncompressed LZMA2 chunk.
$ echo 'The quick brown fox jumps over the lazy dog.' | xz > fox.xz
# Now open fox.xz with a hex editor and modify any character in the
# sentence above (which xz stores uncompressed). When you try to
# decompress the modified file you'll notice that xz-utils detects the
# corruption, but busybox's xz does not:
$ xz -t fox.xz
xz: fox.xz: Compressed data is corrupt
$ echo $?
1
$ busybox unxz -t fox.xz
$ busybox unxz -cd fox.xz
The quick brown fox jumps over the lazy fog.
$ echo $?
0
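The xz-utils half of this experiment can be reproduced without a hex editor using Python's lzma module, which wraps the same liblzma library that xz-utils uses. This sketch does not exercise busybox, so it only shows the side where corruption is detected; the helper name and the fallback offset are assumptions.

```python
import lzma

SENTENCE = b"The quick brown fox jumps over the lazy dog.\n"

def corrupt_sentence(blob: bytes) -> bytes:
    """Change 'dog' to 'fog' if the text is stored verbatim, else flip one bit mid-file."""
    pos = blob.find(b"dog")
    damaged = bytearray(blob)
    if pos >= 0:
        damaged[pos] = ord("f")
    else:                                   # text was not stored uncompressed
        damaged[len(blob) // 2] ^= 0x01
    return bytes(damaged)

fox = lzma.compress(SENTENCE)               # xz container, CRC64 check by default
assert lzma.decompress(fox) == SENTENCE     # the undamaged file round-trips

try:
    lzma.decompress(corrupt_sentence(fox))
    print("corruption NOT detected")
except lzma.LZMAError as err:
    print("corruption detected:", err)
```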
(This test was made using busybox-1.25.0 with lzip support).
[1] Paraphrasing John von Neumann, there's no sense in being precise when you don't even know what kind of hardware or compiler the person reading this will use. But in case you need a reference, this test was run on an AMD Athlon 64 X2 Dual Core Processor 5200+ running in 64 bit mode, and lzip was compiled out of the box with gcc-6.1.0.
[2] The unzcrash tool is included in the lziprecover package.
Copyright © 2024 Antonio Diaz Diaz.
You are free to copy, modify, and distribute all or part of this article without limitation.
Updated: 2024-01-20