An interesting feature of ZFS is its support for transparent compression. Unlike typical file compression, ZFS compresses each record (block) it writes individually; the record size is variable in ZFS and depends on the data and the file size. Because the compression and decompression algorithm has to be fast to keep the overhead low compared to uncompressed file access, one cannot expect compression results similar to, for example, bzip at its highest compression level. In addition, the data files of the LHC experiments are ROOT files, which already store data in a compressed format.

Therefore, I was not expecting any benefit from enabling compression on our servers. But since the newly implemented LZ4 algorithm has nearly no overhead even for incompressible data, it shouldn't hurt to enable it, especially since our storage servers have dual CPUs with 12 cores each that sit idle most of the time.
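To illustrate why already compressed data gains little from per-record compression, here is a minimal Python sketch. It is not how ZFS is implemented: zlib at level 1 is used as a stand-in for LZ4 (which is not in the Python standard library), random bytes stand in for an already compressed payload, and the function name and sample data are made up for this example.

```python
import os
import zlib

RECORD_SIZE = 128 * 1024  # 128 KiB, the ZFS default recordsize


def per_record_ratio(data: bytes, record_size: int = RECORD_SIZE) -> float:
    """Return logical size / stored size when each record is compressed on its own."""
    stored = 0
    for offset in range(0, len(data), record_size):
        record = data[offset:offset + record_size]
        packed = zlib.compress(record, 1)  # level 1: fast, stand-in for LZ4
        # Simplification: keep the raw record whenever compression does not shrink it.
        stored += min(len(packed), len(record))
    return len(data) / stored if stored else 1.0


# Repetitive, text-like data compresses well.
text_like = b"run=1234 event=5678 pt=42.0 eta=0.5\n" * 30000
# Random bytes stand in for an already compressed (high-entropy) payload.
high_entropy = os.urandom(1024 * 1024)

print(f"text-like:    {per_record_ratio(text_like):.2f}x")
print(f"high-entropy: {per_record_ratio(high_entropy):.2f}x")
```

The text-like sample shrinks considerably, while the high-entropy sample stays at a ratio of essentially 1, which is the behaviour one would expect for ROOT files that are already compressed internally.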

After enabling the default LZ4 compression on four machines that had already been migrated to ZFS and copying data onto them, the first compression results look like this:

NAME       SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
tank-2TB  32.5T  8.73T  23.8T         -   15%  26%  1.00x  ONLINE  -
tank-8TB   116T  24.0T  92.0T         -   10%  20%  1.00x  ONLINE  -
NAME                    PROPERTY       VALUE  SOURCE
tank-2TB                compressratio  1.03x  -
tank-2TB/gridstorage01  compressratio  1.03x  -
tank-2TB/gridstorage02  compressratio  1.03x  -
tank-2TB/gridstorage03  compressratio  1.03x  -
tank-2TB/gridstorage04  compressratio  1.03x  -
tank-8TB                compressratio  1.03x  -
tank-8TB/gridstorage01  compressratio  1.03x  -
tank-8TB/gridstorage02  compressratio  1.03x  -
tank-8TB/gridstorage03  compressratio  1.03x  -
tank-8TB/gridstorage04  compressratio  1.03x  -
tank-8TB/gridstorage05  compressratio  1.04x  -
tank-8TB/gridstorage06  compressratio  1.03x  -
tank-8TB/gridstorage07  compressratio  1.03x  -
tank-8TB/gridstorage08  compressratio  1.03x  -
tank-8TB/gridstorage09  compressratio  1.03x  -
tank-8TB/gridstorage10  compressratio  1.03x  -
tank-8TB/gridstorage11  compressratio  1.03x  -

NAME       SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
tank-2TB  32.5T  8.45T  24.0T         -   11%  26%  1.00x  ONLINE  -
tank-8TB   116T  24.1T  91.9T         -    7%  20%  1.00x  ONLINE  -
NAME                    PROPERTY       VALUE  SOURCE
tank-2TB                compressratio  1.03x  -
tank-2TB/gridstorage01  compressratio  1.03x  -
tank-2TB/gridstorage02  compressratio  1.03x  -
tank-2TB/gridstorage03  compressratio  1.03x  -
tank-2TB/gridstorage04  compressratio  1.04x  -
tank-8TB                compressratio  1.03x  -
tank-8TB/gridstorage01  compressratio  1.03x  -
tank-8TB/gridstorage02  compressratio  1.03x  -
tank-8TB/gridstorage03  compressratio  1.03x  -
tank-8TB/gridstorage04  compressratio  1.03x  -
tank-8TB/gridstorage05  compressratio  1.03x  -
tank-8TB/gridstorage06  compressratio  1.03x  -
tank-8TB/gridstorage07  compressratio  1.03x  -
tank-8TB/gridstorage08  compressratio  1.03x  -
tank-8TB/gridstorage09  compressratio  1.03x  -
tank-8TB/gridstorage10  compressratio  1.03x  -
tank-8TB/gridstorage11  compressratio  1.03x  -

NAME       SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
tank-4TB   127T  9.05T   118T         -    3%   7%  1.00x  ONLINE  -
NAME                    PROPERTY       VALUE  SOURCE
tank-4TB                compressratio  1.03x  -
tank-4TB/gridstorage01  compressratio  1.03x  -
tank-4TB/gridstorage02  compressratio  1.03x  -
tank-4TB/gridstorage03  compressratio  1.04x  -
tank-4TB/gridstorage04  compressratio  1.02x  -
tank-4TB/gridstorage05  compressratio  1.03x  -
tank-4TB/gridstorage06  compressratio  1.03x  -
tank-4TB/gridstorage07  compressratio  1.03x  -
tank-4TB/gridstorage08  compressratio  1.03x  -
tank-4TB/gridstorage09  compressratio  1.03x  -
tank-4TB/gridstorage10  compressratio  1.04x  -
tank-4TB/gridstorage11  compressratio  1.03x  -
tank-4TB/gridstorage12  compressratio  1.02x  -
tank-4TB/gridstorage13  compressratio  1.03x  -
tank-4TB/gridstorage14  compressratio  1.03x  -

NAME       SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
tank-2TB  63.5T  15.4T  48.1T         -   11%  24%  1.00x  ONLINE  -
NAME                    PROPERTY       VALUE  SOURCE
tank-2TB                compressratio  1.03x  -
tank-2TB/gridstorage01  compressratio  1.03x  -
tank-2TB/gridstorage02  compressratio  1.04x  -
tank-2TB/gridstorage03  compressratio  1.03x  -
tank-2TB/gridstorage04  compressratio  1.03x  -
tank-2TB/gridstorage05  compressratio  1.03x  -
tank-2TB/gridstorage06  compressratio  1.03x  -
tank-2TB/gridstorage07  compressratio  1.03x  -

Although there is not much data stored on each of the machines so far, this shows that we can still reduce the used disk space by a few percent, here 2-4% depending on the file system and the data on it.
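The 2-4% figure follows from the compressratio values, since ZFS reports compressratio as uncompressed size divided by compressed size. A small Python sketch of the conversion, using three lines copied from the listings above as sample input (the helper names are made up for this example):

```python
def parse_ratio(value: str) -> float:
    """Turn a compressratio value such as '1.03x' into a float."""
    return float(value.rstrip("x"))


def space_saved_percent(compressratio: float) -> float:
    """Share of the uncompressed size that is saved, in percent."""
    return (1.0 - 1.0 / compressratio) * 100.0


# Three lines taken from the `compressratio` listings above.
sample = """\
tank-4TB/gridstorage03 compressratio 1.04x -
tank-4TB/gridstorage04 compressratio 1.02x -
tank-4TB/gridstorage05 compressratio 1.03x -
"""

for line in sample.splitlines():
    name, _prop, value, _source = line.split()
    ratio = parse_ratio(value)
    print(f"{name}: {ratio:.2f}x saves {space_saved_percent(ratio):.2f}%")
```

Strictly speaking, a ratio of 1.04x corresponds to slightly less than 4% saved space, and 1.02x to slightly less than 2%.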

We have a bit more than 1 PB of disk storage in total at our site, and the servers with 2 TB disks provide about 50 TB of usable storage each. If we could get 4% compression for all the data, we would gain nearly the space provided by one of the 2 TB-disk servers for free, without the cost of a new machine, power, or extra disks! And that is just with the default compression, while the compression level could also be tuned in ZFS.
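As a rough back-of-the-envelope check of this estimate, here is a short Python sketch. It assumes "a bit more than 1 PB" means about 1000 TB and applies the 4% saving uniformly to all data; both are simplifications, and the function name is made up for this example.

```python
def saved_space_tb(total_tb: float, saving_fraction: float) -> float:
    """Space gained if the given fraction of the total storage is saved."""
    return total_tb * saving_fraction


TOTAL_STORAGE_TB = 1000.0   # assumption: "a bit more than 1 PB" taken as 1000 TB
SAVING_FRACTION = 0.04      # the optimistic 4% case from the text
SERVER_CAPACITY_TB = 50.0   # usable storage of one 2 TB-disk server

saved = saved_space_tb(TOTAL_STORAGE_TB, SAVING_FRACTION)
print(f"saved space: {saved:.0f} TB")
print(f"equivalent 2 TB-disk servers: {saved / SERVER_CAPACITY_TB:.1f}")
```

Under these assumptions the saving comes to about 40 TB, or roughly 0.8 of one such server.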

This saving could be even bigger considering that in the future sites will also store more non-LHC data, for example for LSST, which uses a different and possibly uncompressed file format.

Another positive aspect of compression is that it reduces disk I/O, since fewer data blocks need to be read from disk.

It will be interesting to see what the compression ratio will be after all our servers have been switched over to ZFS.