Joe T. Sylve, Ph.D.

Digital Forensic Researcher and Educator

Transparent Compression (DECMPFS)

APFS supports transparent file compression through the DECMPFS (Decompression File System) framework, shared with HFS+. Compressed files appear normal to applications but store their data in a compressed form on disk, significantly reducing space usage on system volumes. This post covers the on-disk format, compression types, and how to parse compressed files.

Overview

A compressed file is identified by the UF_COMPRESSED BSD flag set in its inode record. When this flag is present, the file’s actual data is stored either in an extended attribute named com.apple.decmpfs (for small files) or in the file’s resource fork (for larger files). The kernel transparently decompresses data on read, so applications never see the compressed form.

The decmpfs_disk_header

The com.apple.decmpfs extended attribute begins with a fixed header:

#define DECMPFS_MAGIC 0x636d7066 // 'cmpf'

typedef struct {
    uint32_t compression_magic;  // 0x00
    uint32_t compression_type;   // 0x04
    uint64_t uncompressed_size;  // 0x08
    uint8_t attr_bytes[];        // 0x10
} decmpfs_disk_header;

The maximum size of the entire com.apple.decmpfs extended attribute is 3802 bytes. If the compressed data exceeds this limit, it must be stored in the resource fork.
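As an illustration, the header can be decoded from the raw attribute bytes along the following lines. This is a minimal sketch, not Apple's implementation: it assumes the fields are little-endian on disk, and decmpfs_view, rd_le32, rd_le64, and decmpfs_parse_header are hypothetical names introduced here.

```c
#include <stddef.h>
#include <stdint.h>

#define DECMPFS_MAGIC          0x636d7066u /* 'cmpf' */
#define DECMPFS_HEADER_SIZE    16u         /* offset of attr_bytes */
#define DECMPFS_MAX_XATTR_SIZE 3802u       /* limit on the whole attribute */

/* Hypothetical host-order view of the header; not an Apple-defined type. */
typedef struct {
    uint32_t compression_type;
    uint64_t uncompressed_size;
    const uint8_t *attr_bytes; /* points into the caller's buffer */
    size_t attr_len;           /* bytes following the 16-byte header */
} decmpfs_view;

/* Read little-endian integers byte by byte, so host endianness and
 * buffer alignment do not matter (assumption: fields are LE on disk). */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t rd_le64(const uint8_t *p)
{
    return (uint64_t)rd_le32(p) | ((uint64_t)rd_le32(p + 4) << 32);
}

/* Returns 0 on success, -1 if the attribute is too short, larger than
 * the 3802-byte limit, or does not start with DECMPFS_MAGIC. */
static int decmpfs_parse_header(const uint8_t *xattr, size_t len,
                                decmpfs_view *out)
{
    if (len < DECMPFS_HEADER_SIZE || len > DECMPFS_MAX_XATTR_SIZE)
        return -1;
    if (rd_le32(xattr) != DECMPFS_MAGIC)
        return -1;
    out->compression_type = rd_le32(xattr + 4);
    out->uncompressed_size = rd_le64(xattr + 8);
    out->attr_bytes = xattr + DECMPFS_HEADER_SIZE;
    out->attr_len = len - DECMPFS_HEADER_SIZE;
    return 0;
}
```

A tool that wants to recover data from damaged volumes might prefer to report an oversized attribute rather than reject it outright; the strict length check here is a design choice of the sketch.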

Compression Types

| Type | Algorithm | Location | Notes |
|------|-----------|----------|-------|
| 1 | None | xattr | Small files stored uncompressed inline |
| 3 | zlib | xattr | Small zlib-compressed files |
| 4 | zlib | resource fork | Larger zlib-compressed files |
| 5 | Dataless | none | Data fetched on demand (iCloud/network) |
| 7 | LZVN | xattr | Fast LZ77 variant (macOS 10.9+) |
| 8 | LZVN | resource fork | Larger LZVN files |
| 9 | None | xattr | Uncompressed data stored inline |
| 10 | None | resource fork | 64KB chunks, uncompressed |
| 11 | LZFSE | xattr | High-efficiency entropy-coded (macOS 10.11+) |
| 12 | LZFSE | resource fork | Larger LZFSE files |
| 13 | LZBITMAP | xattr | Block bitmap compression |
| 14 | LZBITMAP | resource fork | Larger LZBITMAP files |

With the exception of type 5, which stores no data locally, odd-numbered types (1, 3, 7, 9, 11, 13) store data inline in the extended attribute. Even-numbered types (4, 8, 10, 12, 14) store data in the resource fork.
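The table can be turned into a small lookup, sketched here under the assumption that only the listed type values matter. The enum names and the function decmpfs_classify are hypothetical; unlisted types are reported as unknown rather than guessed from parity.

```c
#include <stdint.h>

/* Hypothetical enums mirroring the table's Algorithm and Location columns. */
typedef enum { ALG_UNKNOWN, ALG_NONE, ALG_ZLIB, ALG_LZVN,
               ALG_LZFSE, ALG_LZBITMAP, ALG_DATALESS } decmpfs_alg;
typedef enum { LOC_UNKNOWN, LOC_XATTR, LOC_RSRC, LOC_NONE } decmpfs_loc;

/* Map a compression_type to its algorithm and data location.
 * Returns 0 for a type listed in the table, -1 otherwise. */
static int decmpfs_classify(uint32_t type, decmpfs_alg *alg, decmpfs_loc *loc)
{
    *alg = ALG_UNKNOWN;
    *loc = LOC_UNKNOWN;
    switch (type) {
    case 1:  case 9:  *alg = ALG_NONE;     *loc = LOC_XATTR; break;
    case 10:          *alg = ALG_NONE;     *loc = LOC_RSRC;  break;
    case 3:           *alg = ALG_ZLIB;     *loc = LOC_XATTR; break;
    case 4:           *alg = ALG_ZLIB;     *loc = LOC_RSRC;  break;
    case 5:           *alg = ALG_DATALESS; *loc = LOC_NONE;  break;
    case 7:           *alg = ALG_LZVN;     *loc = LOC_XATTR; break;
    case 8:           *alg = ALG_LZVN;     *loc = LOC_RSRC;  break;
    case 11:          *alg = ALG_LZFSE;    *loc = LOC_XATTR; break;
    case 12:          *alg = ALG_LZFSE;    *loc = LOC_RSRC;  break;
    case 13:          *alg = ALG_LZBITMAP; *loc = LOC_XATTR; break;
    case 14:          *alg = ALG_LZBITMAP; *loc = LOC_RSRC;  break;
    default:          return -1; /* not in the table */
    }
    return 0;
}
```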

Dataless Files

Special compression types represent files whose content is not stored locally:

#define DATALESS_CMPFS_TYPE     0x80000001
#define DATALESS_PKG_CMPFS_TYPE 0x80000002

These are placeholders for iCloud-synced or network-mounted content. The metadata (size, permissions) exists locally, but the data is fetched on demand.
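A parser will usually want to recognize these types before it looks for any payload. The check below is a minimal sketch; decmpfs_is_dataless is a hypothetical helper name, and treating type 5 as dataless follows the table in this post.

```c
#include <stdbool.h>
#include <stdint.h>

#define DATALESS_CMPFS_TYPE     0x80000001u
#define DATALESS_PKG_CMPFS_TYPE 0x80000002u

/* True when the compression type marks a placeholder whose content is
 * not stored locally, so there is nothing on disk to decompress. */
static bool decmpfs_is_dataless(uint32_t compression_type)
{
    return compression_type == 5u ||
           compression_type == DATALESS_CMPFS_TYPE ||
           compression_type == DATALESS_PKG_CMPFS_TYPE;
}
```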

Parsing a Compressed File

  1. Check the UF_COMPRESSED flag (bit 5 of bsd_flags in j_inode_val_t).
  2. Read the com.apple.decmpfs extended attribute from the File System Tree.
  3. Verify compression_magic equals DECMPFS_MAGIC.
  4. Read compression_type to determine the algorithm and data location.
  5. Locate the compressed data:
    • Inline (types 1, 3, 7, 9, 11, 13): Data follows the header in attr_bytes.
    • Resource fork (types 4, 8, 10, 12, 14): Data is in the com.apple.ResourceFork extended attribute.
  6. Decompress using the appropriate algorithm.
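Steps 1 through 5 can be sketched as a single decision function. This is an illustration under stated assumptions rather than a reference implementation: the caller has already read bsd_flags and the raw com.apple.decmpfs value (NULL if absent), header fields are taken to be little-endian, and decmpfs_source and decmpfs_locate are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#ifndef UF_COMPRESSED
#define UF_COMPRESSED 0x00000020u /* bit 5 of bsd_flags */
#endif
#define DECMPFS_MAGIC 0x636d7066u /* 'cmpf' */

typedef enum {
    SRC_NOT_COMPRESSED, /* step 1: flag clear, read the file normally */
    SRC_INVALID,        /* steps 2-4: xattr missing, bad magic, or unlisted type */
    SRC_INLINE,         /* step 5: payload follows the header in attr_bytes */
    SRC_RESOURCE_FORK   /* step 5: payload is in com.apple.ResourceFork */
} decmpfs_source;

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decide where the compressed payload lives. On SRC_INLINE or
 * SRC_RESOURCE_FORK, *type_out holds the compression type. */
static decmpfs_source decmpfs_locate(uint32_t bsd_flags, const uint8_t *xattr,
                                     size_t len, uint32_t *type_out)
{
    if ((bsd_flags & UF_COMPRESSED) == 0)
        return SRC_NOT_COMPRESSED;
    if (xattr == NULL || len < 16 || rd_le32(xattr) != DECMPFS_MAGIC)
        return SRC_INVALID;

    *type_out = rd_le32(xattr + 4);
    switch (*type_out) {
    case 1: case 3: case 7: case 9: case 11: case 13:
        return SRC_INLINE;
    case 4: case 8: case 10: case 12: case 14:
        return SRC_RESOURCE_FORK;
    default:
        return SRC_INVALID; /* dataless or unlisted: nothing to decompress */
    }
}
```

Step 6 is left out on purpose, since the decompressor depends on which zlib, LZVN, LZFSE, or LZBITMAP library the tool links against.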

Resource Fork Chunking

Resource fork compression types split data into 65,536-byte (64 KB) chunks, each compressed independently. The number of chunks is ceil(uncompressed_size / 65536). Two chunking schemes exist, selected by compression_type alone (not by any field in the resource fork):
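The chunk arithmetic is simple but easy to get wrong at the boundaries. The sketch below computes the chunk count with integer math and the uncompressed length of an individual chunk; both function names are hypothetical. For example, a 150,000-byte file has three chunks of 65,536, 65,536, and 18,928 bytes.

```c
#include <stdint.h>

#define DECMPFS_CHUNK_SIZE 65536u

/* ceil(uncompressed_size / 65536) without floating point and without
 * the overflow risk of (size + 65535) / 65536. */
static uint64_t decmpfs_num_chunks(uint64_t uncompressed_size)
{
    return uncompressed_size / DECMPFS_CHUNK_SIZE +
           (uncompressed_size % DECMPFS_CHUNK_SIZE != 0);
}

/* Uncompressed length of chunk i: 64 KB for every chunk except possibly
 * the last one. Returns 0 if i is out of range. */
static uint32_t decmpfs_chunk_plain_len(uint64_t uncompressed_size, uint64_t i)
{
    if (i >= decmpfs_num_chunks(uncompressed_size))
        return 0;
    uint64_t remaining = uncompressed_size - i * DECMPFS_CHUNK_SIZE;
    return remaining < DECMPFS_CHUNK_SIZE ? (uint32_t)remaining
                                          : DECMPFS_CHUNK_SIZE;
}
```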

Fixed-offset scheme (Type 4, zlib)

The resource fork begins with a 256-byte classic HFS+ resource-fork header whose big-endian data_offset equals 0x104 (struct decmpfs_rsrc_chunk_table_fixed). A little-endian uint32_t chunk count is stored at fixed resource-fork offset 0x104, and a chunk table follows at 0x108: one 8-byte entry per chunk, each an [offset, length] pair of little-endian uint32_t values. Each chunk’s offset is relative to 0x104, so chunk i’s compressed bytes start at resource-fork byte 0x104 + offset[i] and run for length[i] bytes. The resource fork ends with a constant 50-byte HFS+ resource-map trailer.
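A lookup for the fixed-offset scheme might look like the following. It is a sketch under the layout described above and nothing more: rsrc_fixed_chunk and rd_le32 are hypothetical names, the whole resource fork is assumed to be in memory, and every offset is bounds-checked because on-disk values cannot be trusted.

```c
#include <stddef.h>
#include <stdint.h>

#define FIXED_TABLE_BASE 0x104u /* chunk count lives here; offsets are relative to it */

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Locate chunk i of a type 4 resource fork. On success returns 0 and
 * sets *start (absolute resource-fork offset) and *length (compressed
 * bytes). Returns -1 if i is out of range or the table is inconsistent. */
static int rsrc_fixed_chunk(const uint8_t *rsrc, size_t rsrc_len, uint32_t i,
                            size_t *start, size_t *length)
{
    if (rsrc_len < FIXED_TABLE_BASE + 4u)
        return -1;
    uint32_t count = rd_le32(rsrc + FIXED_TABLE_BASE);
    if (i >= count)
        return -1;

    /* Entry i sits at 0x108 + 8 * i: [offset, length], both little-endian. */
    uint64_t entry = (uint64_t)FIXED_TABLE_BASE + 4u + 8u * (uint64_t)i;
    if (entry + 8u > rsrc_len)
        return -1;
    uint32_t off = rd_le32(rsrc + entry);
    uint32_t len = rd_le32(rsrc + entry + 4u);

    uint64_t abs_start = (uint64_t)FIXED_TABLE_BASE + off;
    if (abs_start + len > rsrc_len)
        return -1;
    *start = (size_t)abs_start;
    *length = len;
    return 0;
}
```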

Absolute-offset scheme (Types 8, 10, 12, 14)

There is no resource-fork header, no stored chunk count, and no cmpf resource map. At resource-fork byte 0, an array of num_chunks + 1 little-endian 4-byte absolute offsets begins (struct decmpfs_rsrc_chunk_table_abs), where num_chunks = ceil(uncompressed_size / 65536). Read offsets[i] at byte 4 * i. Chunk i spans [offsets[i], offsets[i+1]), so its compressed length is offsets[i+1] - offsets[i]. The final entry offsets[num_chunks] equals the total resource-fork size.
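The same lookup for the absolute-offset scheme is sketched below. The function name rsrc_abs_chunk is hypothetical, and the sanity checks (offsets must be ordered, lie past the table, and stay inside the fork) are defensive additions of this sketch rather than anything the format mandates.

```c
#include <stddef.h>
#include <stdint.h>

#define DECMPFS_CHUNK_SIZE 65536u

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Locate chunk i of a type 8, 10, 12, or 14 resource fork. The chunk
 * count is derived from uncompressed_size because none is stored.
 * Returns 0 and sets *start / *length on success, -1 otherwise. */
static int rsrc_abs_chunk(const uint8_t *rsrc, size_t rsrc_len,
                          uint64_t uncompressed_size, uint64_t i,
                          size_t *start, size_t *length)
{
    uint64_t n = uncompressed_size / DECMPFS_CHUNK_SIZE +
                 (uncompressed_size % DECMPFS_CHUNK_SIZE != 0);
    /* The table holds n + 1 four-byte entries and must fit in the fork. */
    if (i >= n || n >= rsrc_len / 4)
        return -1;
    uint64_t table_bytes = 4 * (n + 1);

    uint32_t begin = rd_le32(rsrc + 4 * i);
    uint32_t end = rd_le32(rsrc + 4 * (i + 1));
    if (begin < table_bytes || begin > end || end > rsrc_len)
        return -1;
    *start = begin;
    *length = (size_t)(end - begin);
    return 0;
}
```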

Stored chunks

In both schemes a chunk that would not shrink under compression is stored uncompressed, flagged by a marker as its first byte. When a chunk’s first byte equals the algorithm’s marker, skip that byte and copy the remaining length - 1 bytes verbatim; otherwise decompress with the type’s algorithm. The markers are zlib/LZFSE/LZBITMAP = 0xFF, LZVN = 0x06, and none (type 10) = 0xCC.
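The marker test can be expressed as a small helper that runs before any decompressor is called. This is a sketch using the marker values listed above for the resource-fork types; chunk_is_stored and stored_marker are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Marker byte for a resource-fork compression type, or -1 if the type
 * is not one of the chunked types covered here. */
static int stored_marker(uint32_t type)
{
    switch (type) {
    case 4: case 12: case 14: return 0xFF; /* zlib, LZFSE, LZBITMAP */
    case 8:                   return 0x06; /* LZVN */
    case 10:                  return 0xCC; /* none */
    default:                  return -1;
    }
}

/* Returns 1 if the chunk is stored verbatim (payload and payload_len
 * then describe the bytes after the marker), 0 if it must be
 * decompressed with the type's algorithm, and -1 on an empty chunk or
 * an unsupported type. */
static int chunk_is_stored(uint32_t type, const uint8_t *chunk, size_t len,
                           const uint8_t **payload, size_t *payload_len)
{
    int marker = stored_marker(type);
    if (marker < 0 || len == 0)
        return -1;
    if (chunk[0] != (uint8_t)marker)
        return 0;
    *payload = chunk + 1;   /* skip the marker byte */
    *payload_len = len - 1; /* copy the rest verbatim */
    return 1;
}
```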

Interaction with APFS

For compressed files, the kernel hides the com.apple.decmpfs and com.apple.ResourceFork extended attributes from userland. This means forensic tools accessing raw APFS structures will see these attributes, but tools going through the VFS layer will not. The INODE_HAS_UNCOMPRESSED_SIZE flag (0x40000) in internal_flags indicates the inode’s uncompressed_size field is valid.

On sealed volumes, compressed data integrity is verified through the sealed volume’s hash tree. The apfs_verify_uncompressed_data function checks decompressed blocks against expected hashes.

Forensic Considerations

Conclusion

DECMPFS provides transparent, per-file compression that is deeply integrated into APFS through extended attributes and resource forks. Understanding the compression types and chunking schemes is essential for any tool that needs to access file contents on APFS volumes, particularly system volumes where compression is the default.
