Data compression
Building a data compression ecosystem
Compression algorithms are used in a vast number of protocols and file formats throughout all of computing. The libraries that implement them are mostly written in C, and they regularly encounter security issues despite receiving extensive industry-wide scrutiny.
Our initiative aims to create memory-safe implementations of compression libraries:
- zlib: a widely-used compression library, used primarily on the web to provide gzip compression to the text/html/js/css we send around.
- bzip2: a file compression program that is widely deployed and supported e.g. as part of zip.
- xz: a compression format that provides very good compression, but is comparatively slow. Commonly used for large file downloads.
- zstd: a modern successor to zlib, providing better compression faster.
What We've Done
For zlib, we've created an initial implementation based on zlib-ng, called zlib-rs, with a focus on maintaining excellent performance while introducing memory safety. The initial development of zlib-rs was started and partly funded by Prossimo.
In April 2024, an early release of zlib-rs was integrated in flate2. In November 2024, an audit by ISRG was successfully completed, and optimizations for WebAssembly were included in a new release.
The development of bzip2, the second project in this initiative, started in October 2024. Unlike with zlib-rs, we will use c2rust to translate the original bzip2 C code to Rust. A first release is expected in February 2025.
What's Next
We're currently seeking funding to complete the work necessary to make the initial implementation of zlib ready for production and to start work on xz and zstd.
The high level goals for the four projects are:
- provide performance on par with the C/C++ counterparts
- provide a dynamic library that is a drop-in replacement for the C library, but contains compiled memory-safe Rust code
- dramatically reduce attack surface through memory safety, improved tooling and a robust build system
- provide a pure Rust implementation to Rust users that integrates with the existing ecosystem
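To illustrate the drop-in replacement goal, here is a minimal sketch of how a memory-safe Rust core can sit behind a C-compatible function. It uses the Adler-32 checksum, which zlib uses internally, as a small example. The names `adler32_update`, `hypothetical_adler32` and `checksum_of` are assumptions made for illustration; they are not the zlib-rs API, and the code does not claim to match zlib's exact signatures or edge-case behavior.

```rust
use std::slice;

/// Adler-32 modulus: the largest prime below 2^16.
const MOD_ADLER: u32 = 65_521;

/// Safe core (hypothetical name): update a running Adler-32 checksum.
/// All indexing is bounds-checked by the slice iterator.
fn adler32_update(adler: u32, data: &[u8]) -> u32 {
    let mut a = adler & 0xffff;
    let mut b = (adler >> 16) & 0xffff;
    for &byte in data {
        a = (a + u32::from(byte)) % MOD_ADLER;
        b = (b + a) % MOD_ADLER;
    }
    (b << 16) | a
}

/// Hypothetical C-ABI wrapper around the safe core. A real drop-in
/// library would additionally export this under the unmangled symbol
/// name that C callers expect; that attribute is omitted in this sketch.
///
/// # Safety
/// `buf` must be null or point to `len` readable bytes.
pub unsafe extern "C" fn hypothetical_adler32(
    adler: u32,
    buf: *const u8,
    len: usize,
) -> u32 {
    if buf.is_null() || len == 0 {
        // Assumption for this sketch: empty input leaves the checksum as-is.
        return adler;
    }
    // SAFETY: the caller guarantees `buf` points to `len` readable bytes.
    let data = unsafe { slice::from_raw_parts(buf, len) };
    adler32_update(adler, data)
}

/// Helper showing a call through the C-ABI boundary, starting from the
/// conventional initial Adler-32 value of 1.
fn checksum_of(bytes: &[u8]) -> u32 {
    // SAFETY: the pointer and length come from a valid slice.
    unsafe { hypothetical_adler32(1, bytes.as_ptr(), bytes.len()) }
}
```

The only `unsafe` code is the thin boundary that turns a raw pointer and length into a slice; everything behind it is ordinary safe Rust. Rust users can call the safe function directly, while C callers go through the wrapper.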
Project status "Data compression"
status | funding target | funded |
---|---|---|
in progress | € 495.000 | 12% |
Work plan
For per-project details, see the work plan.
Links
Blog and news
- The fastest WASM zlib
  WASM has its own SIMD instructions these days. We know that SIMD is incredibly effective for the zlib algorithms, and were excited to use the WASM SIMD instructions.
- Trifecta Tech Foundation is the new home for memory safe zlib
  Zlib-rs, an open source memory safe implementation of zlib, has a new long-term home at the Trifecta Tech Foundation.
- Current zlib-rs performance
  A crucial aspect of making zlib-rs successful is solid performance. In this post we'll see how the implementation performs today, and how we measure that performance.
- flate2 release v1.0.29 with new `zlib-rs` feature
  With the new zlib-rs feature, a new backend is enabled that brings in a SIMD-accelerated Rust implementation.
- xz incident shows the need for structural change
  At Sovereign Tech Fund, we're following the xz incident closely and listening to the many voices in the FOSS maintainer community.