Joe T. Sylve, Ph.D.

2022 APFS Advent Challenge Day 22 - Retrospective

2022-12-30T00:00:00+00:00

As 2022 ends, so does my APFS Advent Challenge. Deciding at the last minute to write this series of blogs turned out to be even more challenging than expected. Life tends to find a way to complicate things, and December was no exception for me this year. I am glad I stuck with the challenge and hope that the information provided in the series was of some value to you.

Donations

To help keep me honest and support a worthy cause, I pledged to donate $100 to the Ukraine Humanitarian Fund for each day I failed to write a post.

Early on, I decided to change the challenge’s parameters from posting every day until Christmas to posting every weekday in December. Because that changed the maximum number of posts from 24 to 22, I donated $200 on December 3rd.

I donated an additional $100 per day on days 10 and 19 when my recently diagnosed carpal tunnel syndrome symptoms were especially bothersome.

Because I like round numbers, support the cause, and I’m not sure if today’s post counts, I have donated an additional $100, bringing my total contribution to the fund to $500 for this challenge.

If it is within your means, please donate to help the Ukrainian people. Regardless of your politics, the civilians that have lost everything due to this senseless conflict are blameless and deserving of our support.

What happens next?

I decided to start this blog as part of my resolution to write more in 2023 and share my research. The advent challenge was a good way of kick-starting that effort. I plan on continuing to post, albeit at a much less demanding pace. If there are any topics about APFS or anything else in the realm of digital forensics that you are interested in learning more about, please feel free to reach out to me. I’ve decided to sunset my Twitter account, but you’ll find me active on Mastodon @jtsylve@infosec.exchange.

This post is part of my 2022 APFS Advent Challenge

Every weekday in the month of December, I will attempt to post a blog about APFS internals. For each day that I miss a post, I will donate $100 to support humanitarian aid for the Ukrainian people. If you find value in this series, and would like to support this effort, please consider donating to the GoFundMe. Slava Ukraini! 🇺🇦

2022 APFS Advent Challenge Day 21 - Fusion Containers

2022-12-29T00:00:00+00:00

As we discussed in an earlier post, Apple’s Fusion Drives combine the storage capacity of a hard disk drive (HDD) with the faster access speed of a solid state drive (SSD). The HDD is the primary storage device, and the SSD acts as a cache for recently accessed data. However, the Fusion Drive does not have built-in caching logic, and the operating system treats the two drives as separate storage devices. Apple created Core Storage to support the desired caching capabilities and the ability to pool the storage of each device into a single logical volume. APFS removes the need for Core Storage by having first-class support for this tiered storage model. This post will go into more detail about APFS Fusion Containers.

Physical Stores

Both the SSD and HDD of a Fusion Drive appear to macOS as separate physical disk devices. Both disks are GPT partitioned with a standard EFI partition and a second, larger partition, which takes up the bulk of the space on disk. For example, running the command diskutil list may show the HDD as /dev/disk0 with its primary partition as /dev/disk0s2 and the SSD as /dev/disk1 and /dev/disk1s2. These two partitions make up the physical stores of the Fusion Container.

Each physical store is formatted separately in much the same way as any other APFS container. Both will share the same nx_uuid in their NX Superblocks and have a separate, nearly-identical UUID in the nx_fusion_uuid field, with the most significant bit being cleared on the tier1 SSD partition and set on the tier2 HDD partition. The combination of these UUIDs can be used to identify the physical storage tiers of the container.

Synthesized Container

Both tiers are mapped together as a single “synthesized” container and are presented to macOS as a single logical block device (for example, /dev/disk2). The tier1 blocks are mapped at logical byte offset zero, and the tier2 blocks at 4 EiB. The offsets within the exabyte-scale gap between the two sets of blocks cannot be read.

APFS objects and blocks can be stored on either (or both) tiers, and their physical addresses will require some simple translation as follows:

#define FUSION_TIER2_DEVICE_BYTE_ADDR 0x4000000000000000ULL
const paddr_t first_tier2_block = FUSION_TIER2_DEVICE_BYTE_ADDR / nxsb->block_size;

if (paddr < first_tier2_block) {
  tier1->read_block(paddr);
} else {
  tier2->read_block(paddr - first_tier2_block);
}

The exabyte-scale logical gap separating the two tiers presents a unique problem during digital forensic imaging of Fusion Containers. To preserve the logical offsets of the evidence without having to use a data center’s worth of storage, you must use an evidence storage format that supports sparse imaging. As long as this is considered along with the additional physical address translation described above, analyzing Fusion Containers does not generally differ from analyzing other APFS containers.
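The address translation above can also be written as a small self-contained function. This is only a sketch: the FusionTier and FusionLocation types and the translate_fusion_paddr name are hypothetical, and the block size is assumed to come from the container's NX Superblock.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset at which the tier2 blocks are mapped (4 EiB).
constexpr uint64_t kFusionTier2DeviceByteAddr = 0x4000000000000000ULL;

enum class FusionTier { Tier1, Tier2 };

// Hypothetical result type: which physical store to read, and the block
// number relative to the start of that store.
struct FusionLocation {
  FusionTier tier;
  uint64_t block;
};

// Translate a block address in the synthesized container into a block
// address on one of the two physical stores.
inline FusionLocation translate_fusion_paddr(uint64_t paddr,
                                             uint32_t block_size) {
  assert(block_size != 0);
  const uint64_t first_tier2_block = kFusionTier2DeviceByteAddr / block_size;

  if (paddr < first_tier2_block) {
    return {FusionTier::Tier1, paddr};
  }
  return {FusionTier::Tier2, paddr - first_tier2_block};
}
```

Returning the tier alongside the translated block number keeps the decision in one place, so an imaging or analysis tool can route the read to whichever device handle it holds for that tier.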

2022 APFS Advent Challenge Day 20 - Snapshot Metadata

2022-12-28T00:00:00+00:00

Our previous post discussed how Object Maps facilitate the implementation of point-in-time Snapshots of APFS file systems by preserving File System Tree Nodes from earlier transactions. In that discussion, I outlined the on-disk structure of the Object Map Snapshot Tree and how it can be used to enumerate the transaction identifiers of each Volume Snapshot. Today, we will briefly discuss two other sources of information that store additional metadata about each Snapshot.

Snapshot Metadata Tree

The Snapshot Metadata Tree is a B-Tree whose physical address can be located by reading the apfs_snap_meta_tree_oid field of the Volume Superblock. It stores two types of objects, structured as File System Records.

Snapshot Metadata Records

Snapshot Metadata Records store the bulk of metadata about Volume Snapshots. The key-half is a j_snap_metadata_key_t structure with an encoded type of APFS_TYPE_SNAP_METADATA.

typedef struct j_snap_metadata_key {
  j_key_t hdr;           // 0x00
} j_snap_metadata_key_t; // 0x08

hdr: The record’s header. The object identifier in the header is the snapshot’s transaction identifier.

The value-half of the record is a j_snap_metadata_val_t structure and is immediately followed by the UTF-8 encoded name of the snapshot.

typedef struct j_snap_metadata_val {
  oid_t extentref_tree_oid;       // 0x00
  oid_t sblock_oid;               // 0x08
  uint64_t create_time;           // 0x10
  uint64_t change_time;           // 0x18
  uint64_t inum;                  // 0x20
  uint32_t extentref_tree_type;   // 0x28
  uint32_t flags;                 // 0x2C
  uint16_t name_len;              // 0x30
  uint8_t name[0];                // 0x32
} j_snap_metadata_val_t;

extentref_tree_oid: The physical object identifier of the B-Tree that stores extent references for the snapshot.
sblock_oid: The physical object identifier of a backup of the snapshot’s Volume Superblock
create_time: The time when the snapshot was created
change_time: The time that this snapshot was last modified
inum: reserved
extentref_tree_type: The type of the Extent Reference Tree
flags: A bit field that contains additional information about a snapshot metadata record
name_len: The length of the name that follows this structure (in bytes)

Snapshot Metadata Record Flags

Name	Value	Description
SNAP_META_PENDING_DATALESS	0x00000001	This snapshot is dataless, meaning that it does not preserve the file extents
SNAP_META_MERGE_IN_PROGRESS	0x00000002	The snapshot is in the process of being merged with another
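As a rough illustration of the layout above, here is a minimal sketch that decodes the value-half of a Snapshot Metadata Record from raw bytes. The SnapshotMetadata type and the helper names are hypothetical. The sketch assumes little-endian fields and treats a trailing NUL byte in the name as optional, which is an assumption rather than something stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

constexpr uint32_t SNAP_META_PENDING_DATALESS = 0x00000001;
constexpr uint32_t SNAP_META_MERGE_IN_PROGRESS = 0x00000002;

// Hypothetical decoded view of a j_snap_metadata_val_t and its name.
struct SnapshotMetadata {
  uint64_t extentref_tree_oid = 0;
  uint64_t sblock_oid = 0;
  uint64_t create_time = 0;
  uint64_t change_time = 0;
  uint32_t flags = 0;
  std::string name;

  bool pending_dataless() const {
    return (flags & SNAP_META_PENDING_DATALESS) != 0;
  }
  bool merge_in_progress() const {
    return (flags & SNAP_META_MERGE_IN_PROGRESS) != 0;
  }
};

// Read an n-byte little-endian unsigned integer at offset off.
inline uint64_t read_le(const std::vector<uint8_t>& buf, size_t off, size_t n) {
  assert(n <= 8 && off + n <= buf.size());
  uint64_t v = 0;
  for (size_t i = 0; i < n; ++i) {
    v |= static_cast<uint64_t>(buf[off + i]) << (8 * i);
  }
  return v;
}

// Returns std::nullopt if the buffer is too short for the header or name.
inline std::optional<SnapshotMetadata> parse_snap_metadata_val(
    const std::vector<uint8_t>& val) {
  constexpr size_t kHeaderSize = 0x32;  // offset of name[]
  if (val.size() < kHeaderSize) return std::nullopt;

  const size_t name_len = static_cast<size_t>(read_le(val, 0x30, 2));
  if (val.size() - kHeaderSize < name_len) return std::nullopt;

  SnapshotMetadata out;
  out.extentref_tree_oid = read_le(val, 0x00, 8);
  out.sblock_oid = read_le(val, 0x08, 8);
  out.create_time = read_le(val, 0x10, 8);
  out.change_time = read_le(val, 0x18, 8);
  out.flags = static_cast<uint32_t>(read_le(val, 0x2C, 4));
  out.name.assign(reinterpret_cast<const char*>(val.data() + kHeaderSize),
                  name_len);

  // Assumption: if the stored name ends in a NUL byte, drop it.
  if (!out.name.empty() && out.name.back() == '\0') out.name.pop_back();
  return out;
}
```

Reading each field by offset avoids relying on compiler struct packing, which matters here because the name begins at the unaligned offset 0x32.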

Snapshot Name Records

Snapshot Name Records are used to map snapshot names to their transaction identifiers. The key-half of the record is a j_snap_name_key_t structure with an encoded type of APFS_TYPE_SNAP_NAME. It is followed by the UTF-8 encoded name of the snapshot.

typedef struct j_snap_name_key {
  j_key_t hdr;        // 0x00
  uint16_t name_len;  // 0x08
  uint8_t name[0];    // 0x0A
} j_snap_name_key_t;

hdr: The record’s header. The object identifier can be ignored.
name_len: The length of the name (in bytes)
name: The start of the UTF-8 encoded name

The value-half is a j_snap_name_val_t structure.

typedef struct j_snap_name_val {
  xid_t snap_xid;    // 0x00
} j_snap_name_val_t; // 0x08

snap_xid: The transaction identifier of the snapshot

Snapshot Extended Metadata Object

Each snapshot has a virtual Snapshot Extended Metadata Object in the volume’s Object Map. The virtual object identifier of this object is stored in the apfs_snap_meta_ext_oid field of the Volume Superblock. There are multiple versions of this object whose transaction identifiers correspond to each snapshot.

typedef struct snap_meta_ext_obj_phys {
  obj_phys_t smeop_o;        // 0x00
  snap_meta_ext_t smeop_sme; // 0x20
} snap_meta_ext_obj_phys_t;  // 0x48

smeop_o: The object’s header
smeop_sme: The snapshot’s extended metadata

typedef struct snap_meta_ext {
  uint32_t sme_version; // 0x00
  uint32_t sme_flags;   // 0x04
  xid_t sme_snap_xid;   // 0x08
  uuid_t sme_uuid;      // 0x10
  uint64_t sme_token;   // 0x20
} snap_meta_ext_t;      // 0x28

sme_version: The version of this structure (currently 1)
sme_flags: A bitfield of flags (none are currently defined)
sme_snap_xid: The transaction identifier of the snapshot
sme_uuid: The unique identifier of the snapshot
sme_token: An opaque token (reserved)
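One way to sanity-check the offsets annotated above is to let the compiler confirm them. The sketch below redeclares the structures inside a hypothetical apfs_sketch namespace and uses static_assert to check each offset. The obj_phys_t layout is not shown in this post and is assumed here to be the standard 32-byte object header from Apple's reference. The UUID type is declared as a plain 16-byte array to avoid clashing with any system uuid_t.

```cpp
#include <cstddef>
#include <cstdint>

namespace apfs_sketch {

using oid_t = uint64_t;
using xid_t = uint64_t;

// Assumed layout of the common object header (32 bytes).
struct obj_phys_t {
  uint8_t o_cksum[8];  // 0x00
  oid_t o_oid;         // 0x08
  xid_t o_xid;         // 0x10
  uint32_t o_type;     // 0x18
  uint32_t o_subtype;  // 0x1C
};

struct snap_meta_ext_t {
  uint32_t sme_version;  // 0x00
  uint32_t sme_flags;    // 0x04
  xid_t sme_snap_xid;    // 0x08
  uint8_t sme_uuid[16];  // 0x10
  uint64_t sme_token;    // 0x20
};

struct snap_meta_ext_obj_phys_t {
  obj_phys_t smeop_o;         // 0x00
  snap_meta_ext_t smeop_sme;  // 0x20
};

// Compile-time checks that the compiler's layout matches the offsets above.
static_assert(sizeof(obj_phys_t) == 0x20, "object header size");
static_assert(offsetof(snap_meta_ext_t, sme_snap_xid) == 0x08, "sme_snap_xid");
static_assert(offsetof(snap_meta_ext_t, sme_uuid) == 0x10, "sme_uuid");
static_assert(offsetof(snap_meta_ext_t, sme_token) == 0x20, "sme_token");
static_assert(sizeof(snap_meta_ext_t) == 0x28, "snap_meta_ext_t size");
static_assert(offsetof(snap_meta_ext_obj_phys_t, smeop_sme) == 0x20,
              "smeop_sme");
static_assert(sizeof(snap_meta_ext_obj_phys_t) == 0x48, "object size");

}  // namespace apfs_sketch
```

If any offset were wrong, the build would fail instead of the parser silently reading the wrong bytes.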

2022 APFS Advent Challenge Day 18 - Decryption

2022-12-26T00:00:00+00:00

Now that we know how to parse the File System Tree, Analyze Keybags, and Unwrap Decryption Keys, it’s time to put it all together and learn how to decrypt file system metadata and file data on encrypted volumes in APFS.

Tweaks

All encryption in APFS is based on the XTS-AES-128 cipher, which uses a 256-bit key and a 64-bit “tweak” value. This tweak value is position dependent. It allows the same plaintext to be encrypted and stored in different locations on disk and have drastically different ciphertext while using the same AES key. Every 512 bytes of encrypted data uses a tweak based on the container offset of the block’s initial storage.

Knowledge of the AES key alone is not always enough for successful decryption. If the encrypted block is ever relocated on disk, the data is not guaranteed to be re-encrypted with a new tweak. In these cases, the tweak cannot be inferred from the block’s on-disk location, so we must learn the original tweak value used for encryption.

Identifying Encrypted Blocks

There are primarily two sets of data protected with the APFS Volume Encryption Key: File System Tree Nodes and File Extents. As we’ve discussed, File System Tree Nodes store the File System Records that contain the file system’s metadata, and File Extents contain the bulk of the data stored in a file’s Data Streams.

Encrypted FS-Tree Nodes

A volume’s Object Map is never encrypted, but its referenced virtual objects may be, as is the case with FS-Tree Nodes on encrypted volumes.

Let’s revisit the value half of an Object Map entry.

typedef struct omap_val {
  uint32_t ov_flags; // 0x00
  uint32_t ov_size;  // 0x04
  paddr_t ov_paddr;  // 0x08
} omap_val_t;        // 0x10

If the ov_flags bit-field member has the OMAP_VAL_ENCRYPTED flag set, then the virtual object located at ov_paddr is encrypted. These objects are never relocated without being re-encrypted, so the tweak of the first 512 bytes of data can be determined from the physical location of the data using the following logic. The tweak value is then incremented by one for each subsequent 512 bytes of data:

uint64_t tweak0 = (ov_paddr * block_size) / 512;
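To make the per-unit tweak sequence concrete, here is a minimal sketch that lists the tweak for every 512-byte unit of an encrypted FS-Tree Node. The function names are hypothetical, and the sketch assumes the object size (ov_size) is a multiple of 512 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint64_t kCryptoUnitSize = 512;

// Tweak for the first 512-byte unit of the object stored at ov_paddr.
inline uint64_t first_tweak(uint64_t ov_paddr, uint32_t block_size) {
  return (ov_paddr * block_size) / kCryptoUnitSize;
}

// Tweaks for each consecutive 512-byte unit of an object of ov_size bytes.
inline std::vector<uint64_t> node_tweaks(uint64_t ov_paddr, uint32_t ov_size,
                                         uint32_t block_size) {
  assert(ov_size % kCryptoUnitSize == 0);

  const uint64_t tweak0 = first_tweak(ov_paddr, block_size);
  std::vector<uint64_t> tweaks;
  for (uint64_t i = 0; i < ov_size / kCryptoUnitSize; ++i) {
    tweaks.push_back(tweak0 + i);  // one step per 512 bytes
  }
  return tweaks;
}
```

For a 4096-byte node, this yields eight consecutive tweak values, one for each 512-byte unit passed to the XTS decryption routine.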

Encrypted Extents

Extent data can be relocated on disk and is not guaranteed to be re-encrypted. Due to this, the initial tweak value is stored in the crypto_id field of the j_file_extent_val_t file system record:

typedef struct j_file_extent_val {
  uint64_t len_and_flags;  // 0x00
  uint64_t phys_block_num; // 0x08
  uint64_t crypto_id;      // 0x10
} j_file_extent_val_t;     // 0x18

Conclusion

We’ve now discussed all of the information needed to access data on software-encrypted APFS volumes. This decryption requires the knowledge of the password of any user on the system or one of the various recovery keys. While APFS hardware encryption works in largely the same manner, the encryption also depends on keys that are stored within the specific security chip on a given system. There are currently no known methods of extracting these chip-specific keys; therefore, the data on hardware-encrypted devices must be decrypted at acquisition time on the device itself. The only software that I am aware of that is capable of this is Cellebrite’s Digital Collector.

Full disclosure: I currently work for Cellebrite and helped develop these capabilities. I do not directly profit from the sales of Digital Collector but felt it appropriate to disclose my association when linking to a commercial product. I am not trying to sell you anything. Unfortunately, I am also not at liberty to discuss the methodology used to facilitate this decryption.

Update: Blazingly Fast-er SIMD Checksums

2022-12-24T00:00:00+00:00

This is a quick update to yesterday’s post on using std::experimental::simd to speed up APFS Fletcher-64 calculations. It turns out that there were still some low-hanging optimizations that could be used to improve my code. I got better performance from my code by using a simple loop unrolling technique.

Here’s the new version of the function. Notice that the only difference is that I’m now processing more data per iteration of the loop. I’m using a lambda here to avoid code duplication, but the compiler will gladly inline the code.

static uint64_t fletcher64_simd(std::span<const uint32_t, 1024> words) {
  vu64 sum1{};
  sum1[0] = -(static_cast<uint64_t>(words[0]) + words[1]);

  vu64 sum2{};
  sum2[0] = words[1];

  const auto calc = [&](size_t n) {
    sum2 += vu32::size() * sum1;

    const vu64 all{reinterpret_cast<const uint64_t*>(std::addressof(words[n])),
                  stdx::vector_aligned};

    const vu64 evens = all & max32;
    const vu64 odds = all >> 32;

    sum1 += evens + odds;
    sum2 += evens * even_m + odds * odd_m;
  };

  for (size_t n = 0; n < words.size(); n += vu32::size()) {
    calc(n);
    calc(n += vu32::size());
    calc(n += vu32::size());
    calc(n += vu32::size());
    calc(n += vu32::size());
    calc(n += vu32::size());
    calc(n += vu32::size());
    calc(n += vu32::size());
  }

  // Fold the 64-bit overflow back into the 32-bit value
  const auto fold = [&](uint64_t x) {
    x = (x & max32) + (x >> 32);
    return (x == max32) ? 0 : x;
  };

  const uint64_t low = fold(stdx::reduce(sum1));
  const uint64_t high = fold(stdx::reduce(sum2));

  const uint64_t ck_low = max32 - ((low + high) % max32);
  const uint64_t ck_high = max32 - ((low + ck_low) % max32);

  return ck_low | ck_high << 32;
}

Updated Results

Here are the updated relative performance statistics with the updated code running on the same hardware as yesterday’s tests. Amazing!

Target Architecture	Time per Checksum	Throughput	Speedup
SSE	217ns	17.5543 GiB/s	3.4x
AVX2	105ns	36.2421 GiB/s	7x
AVX-512	75ns	50.7305 GiB/s	9.7x
NEON	171ns	22.273 GiB/s	2.7x

2022 APFS Advent Challenge Day 17 - Blazingly Fast Checksums with SIMD

2022-12-23T00:00:00+00:00

Today’s post will take on a bit of a different style than the previous posts in this series. Among other things, I spent my day putting off writing the final APFS encryption blog post by pursuing another one of my New Year goals. Along the way, I wrote a Fletcher64 hashing function that can validate APFS objects at over 31 GiB/s on my 2017 iMac Pro. Rather than fighting my procrastination, I decided it would be better to share my findings. Given that my chosen learning path was directly relevant to APFS, I’m counting this as a valid APFS Advent Challenge post (and you can’t stop me!). I hope you enjoy this brief detour into the dark arts of cross-platform SIMD programming.

SIMD Background

I’ve recently become interested in learning more about SIMD programming and how to utilize it to make my code faster. SIMD stands for “Single Instruction, Multiple Data.” It is a technique used in computer architectures to perform the same operation on multiple data elements in parallel, using a single instruction.

Here’s an example to help illustrate the concept:

Imagine that you have a list of numbers and want to add 1 to each of them. Without SIMD, you would have to write a loop that goes through each number in the list and performs an increment operation. This may be a very time-consuming process if the list is long.

On the other hand, if your computer has SIMD support, it can simultaneously perform the same operation on multiple numbers using a single instruction. This process, known as vectorization, can significantly speed things up, especially for long lists of numbers. We’re not limited to simple increment operations; SIMD supports many arithmetic and logical operations on most architectures.

Speeding up Fletcher64

During my journey, I came across some prior work by James D Guilford and Vinodh Gopal describing using SIMD for the Fast Computation of Fletcher Checksums in ZFS. While ZFS uses a different variant of a Fletcher checksum than APFS, this seemed like a great first project to get my hands dirty with vectorization.

Portability Concerns

The authors of the Intel whitepaper use hand-coded AVX assembly instructions to perform their vectorization. OpenZFS seems to have taken the same approach, with independent implementations full of inline assembly for Intel’s SSE, AVX2, and AVX-512 and ARM’s NEON architectures. Apple takes a similar approach. The Intel version of apfs.kext contains SSE, AVX, and AVX2 vectorized implementations and a fallback serialized version if, for some reason, none of these instruction sets are supported. The arm64 version of the KEXT uses NEON vectorization instructions.

While these approaches work and produce high-performing and optimized code, having to hand-code an implementation for each instruction set seems to defeat the purpose of writing code in a portable language like C++. Besides, I’m a programmer, and programmers, by our very nature, are lazy. The compiler knows what it’s doing and usually can generate better-performing code than I could hand optimize, so let’s find a way to let it do its job.

C++ TS N4808 and std::experimental::simd

It turns out that the C++ standardization committee, along with many brilliant minds, has been working on this problem for years. Document N4808, the Working Draft, C++ Extensions for Parallelism Version 2, is a proposal to add (among other things) support for portable data parallel types to the C++ standard library.

This technical document proposes a generalized model of the most common SIMD operations that standard library implementations can use to allow programmers to write vectorized code that can be compiled to architecture-specific instructions without requiring architecture-specific inline assembly. That sounds like exactly what we want! While this has not officially been added to the language, GCC’s libstdc++ and Clang’s libc++ have at least partial implementations in their std::experimental namespace. GCC support seems the most complete, so I decided to experiment with gcc-12.

The Implementation

std::experimental::simd allows you to define native C++ vector types whose storage capacity depends on the underlying target architecture. For example, NEON supports 128-bit SIMD registers, holding two 64-bit or four 32-bit integers. AVX2 supports twice the storage with 256-bit registers, and the aptly named AVX-512 supports 512-bit registers. We can write code once, and the size of the vectors will be architecture specific.

namespace stdx = std::experimental;

// SIMD vector of 64-bit unsigned integers
using vu64 = stdx::native_simd<uint64_t>;

// SIMD vector of 32-bit unsigned integers
using vu32 = stdx::native_simd<uint32_t>;

These SIMD vectors can be used almost exactly like native integer types, and once I got over the lack of documentation, I found that they were pretty easy to use. By taking lessons from the existing vectorized implementations and making some improvements of my own, this is what I was able to come up with:

// N, N-2, N-4, ..., 2
static const vu64 even_m{[](const auto i) { return vu32::size() - (2 * i); }};

// N-1, N-3, N-5, ..., 1
static const vu64 odd_m = even_m - 1;

static constexpr auto max32 = std::numeric_limits<uint32_t>::max();

static uint64_t fletcher64_simd(std::span<const uint32_t, 1024> words) {
  vu64 sum1{};
  sum1[0] = -(static_cast<uint64_t>(words[0]) + words[1]);

  vu64 sum2{};
  sum2[0] = words[1];

  for (size_t n = 0; n < words.size(); n += vu32::size()) {
    sum2 += vu32::size() * sum1;

    const vu64 all{reinterpret_cast<const uint64_t*>(std::addressof(words[n])),
                   stdx::vector_aligned};

    const vu64 evens = all & max32;
    const vu64 odds = all >> 32;

    sum1 += evens + odds;
    sum2 += evens * even_m + odds * odd_m;
  }

  // Fold the 64-bit overflow back into the 32-bit value
  const auto fold = [&](uint64_t x) {
    x = (x & max32) + (x >> 32);
    return (x == max32) ? 0 : x;
  };

  const uint64_t low = fold(stdx::reduce(sum1));
  const uint64_t high = fold(stdx::reduce(sum2));

  const uint64_t ck_low = max32 - ((low + high) % max32);
  const uint64_t ck_high = max32 - ((low + ck_low) % max32);

  return ck_low | ck_high << 32;
}

Results

Below are the speed comparisons between the above SIMD function and the following serialized implementation (non-threaded, single core performance). The times reported are the average time per checksum calculation of a 4KiB APFS object.

static uint64_t fletcher64_serial(std::span<const uint32_t, 1024> words) {
  uint64_t sum1 = -(static_cast<uint64_t>(words[0]) + words[1]);
  uint64_t sum2 = words[1];

  for (const uint32_t word : words) {
    sum1 += word;
    sum2 += sum1;
  }

  sum1 %= max32;
  sum2 %= max32;

  const uint64_t ck_low = max32 - ((sum1 + sum2) % max32);
  const uint64_t ck_high = max32 - ((sum1 + ck_low) % max32);

  static constexpr size_t high_shift = 32;
  return ck_low | ck_high << high_shift;
}
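For context on how such a function would be used during validation, here is a hedged sketch that checks a 4 KiB object against the checksum stored in its first eight bytes. It repeats the serialized arithmetic on a std::array so that it stands alone. The names verify_object, stored_checksum, and seal_object are hypothetical, and the sketch assumes the words were read on a little-endian host.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

using BlockWords = std::array<uint32_t, 1024>;  // one 4 KiB object
constexpr uint64_t kMax32 = std::numeric_limits<uint32_t>::max();

// Same arithmetic as the serialized implementation above. The initial values
// cancel out the contribution of the first two words (the checksum field).
inline uint64_t fletcher64(const BlockWords& words) {
  uint64_t sum1 = uint64_t{0} - (static_cast<uint64_t>(words[0]) + words[1]);
  uint64_t sum2 = words[1];

  for (const uint32_t word : words) {
    sum1 += word;
    sum2 += sum1;
  }

  sum1 %= kMax32;
  sum2 %= kMax32;

  const uint64_t ck_low = kMax32 - ((sum1 + sum2) % kMax32);
  const uint64_t ck_high = kMax32 - ((sum1 + ck_low) % kMax32);
  return ck_low | (ck_high << 32);
}

// The checksum field occupies the first 8 bytes (two words) of the object.
inline uint64_t stored_checksum(const BlockWords& words) {
  return words[0] | (static_cast<uint64_t>(words[1]) << 32);
}

// An object is considered valid when the computed and stored values agree.
inline bool verify_object(const BlockWords& words) {
  return fletcher64(words) == stored_checksum(words);
}

// Hypothetical helper for building test data: writes the checksum in place.
inline void seal_object(BlockWords& words) {
  const uint64_t ck = fletcher64(words);
  words[0] = static_cast<uint32_t>(ck & kMax32);
  words[1] = static_cast<uint32_t>(ck >> 32);
  assert(verify_object(words));
}
```

Because the computation ignores the checksum field itself, the same function serves both to generate a checksum and to validate one.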

My 2017 iMac Pro supports enabling 128-bit SSE, 256-bit AVX2, and 512-bit AVX-512, so it’s a great candidate to show the speedups that can be achieved via vectorization.

Target Architecture	Time per Checksum	Throughput	Speedup
Serialized	730ns	5.21734 GiB/s	-
SSE	509ns	7.49126 GiB/s	1.4x
AVX2	292ns	13.0277 GiB/s	2.5x
AVX-512	122ns	31.1448 GiB/s	6x

The relative performance of my 2021 M1 Max MacBook Pro is somewhat less impressive due to the ARM NEON architecture being limited to only 128-bit vector registers. This computer is still very fast, and I love it.

Target Architecture	Time per Checksum	Throughput	Speedup
Serialized	458ns	8.31391 GiB/s	-
NEON	368ns	10.3417 GiB/s	1.2x

Conclusion

For the proper application, SIMD vectorization can provide fantastic performance benefits. In my testing, I demonstrated a 6x speedup and hashed APFS objects at over 31 GiB/s on an iMac Pro from 2017! The proposed SIMD additions to the C++ standard library are easy to use and generate high-performing, portable code. I absolutely will be using this whenever I can.

Update (December 24, 2022)

I further optimized this code to achieve even better performance!

2022 APFS Advent Challenge Day 16 - Wrapped Keys

2022-12-22T00:00:00+00:00

In our last post, we discussed both Volume and Container Keybags and how they protect wrapped Volume Encryption and Key Encryption Keys. Depending on whether the encrypted volume was migrated from an HFS+ encrypted Core Storage volume, there are subtle differences in how these keys are used. In this post, we will discuss the structure of these wrapped keys and how they can be used to access the raw Volume Encryption Keys that encrypt data on the file system.

Key Encryption Key Blobs

Each Key Encryption Key (KEK) is encoded in a binary DER blob with the following structure:

KEKBLOB ::= SEQUENCE {
    unknown [0] INTEGER
    hmac    [1] OCTET STRING
    salt    [2] OCTET STRING
    keyblob [3] SEQUENCE {
        unknown     [0] INTEGER
        uuid        [1] OCTET STRING 
        flags       [2] INTEGER
        wrapped_key [3] OCTET STRING
        iterations  [4] INTEGER
        salt        [5] OCTET STRING
    }
}

The keys begin with a header that contains an HMAC-SHA256 hash of the key blob data. The HMAC key is generated from the SHA-256 hash of a magic value concatenated with the given salt.

hmac_key := SHA256("\x01\x16\x20\x17\x15\x05" + salt)

The key blob encodes the wrapped KEK and additional information needed for unwrapping, including a set of bit-flags.

KEK Flags

Name	Value	Description
KEK_FLAG_CORESTORAGE	0x00010000’0000000000	Key is a legacy CoreStorage `KEK`
KEK_FLAG_HARDWARE	0x00020000’0000000000	Key is hardware encrypted

If the KEK_FLAG_CORESTORAGE flag is set, then the wrapped KEK was migrated from a Core Storage encrypted HFS+ volume and used a 128-bit key to encrypt the KEK; otherwise, a 256-bit key is used.

Generate a key using the PBKDF2-HMAC-SHA256 algorithm, the user’s password, the provided salt, and the number of iterations.

// Calculate size of wrapping key (in bytes)
key_size := (flags & KEK_FLAG_CORESTORAGE) ? 16 : 32

// Generate unwrapping key from user's password
key := pbkdf2_hmac_sha256(password, salt, iterations, key_size)

// Unwrap the encrypted KEK
kek := rfc3394_unwrap(key, wrapped_key);

If the encrypted volume was migrated from Core Storage and the user changed their password afterward, it’s possible to have a non-Core-Storage wrapped KEK containing only a 128-bit key. In these instances, the last 128 bits of the unwrapped KEK will be zeros and should be ignored.

// Shorten the KEK if needed
if is_zeroed(kek[16:]) {
    kek = kek[:16];
}
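The shortening step above is simple enough to show in full. This is a minimal sketch with a hypothetical shorten_kek name, operating on an already unwrapped KEK held in a byte vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// If the last 128 bits of a 256-bit unwrapped KEK are all zero, keep only
// the first 128 bits. Otherwise return the KEK unchanged.
inline std::vector<uint8_t> shorten_kek(std::vector<uint8_t> kek) {
  assert(kek.size() == 16 || kek.size() == 32);

  const bool upper_half_zeroed =
      kek.size() == 32 &&
      std::all_of(kek.begin() + 16, kek.end(),
                  [](uint8_t b) { return b == 0; });

  if (upper_half_zeroed) {
    kek.resize(16);
  }
  return kek;
}
```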

Volume Encryption Key Blobs

Volume Encryption Key (VEK) blobs have a very similar structure to the KEK blobs that we just discussed. Depending on whether they were migrated from Core Storage, they can also be 128-bit or 256-bit keys.

VEKBLOB ::= SEQUENCE {
    unknown [0] INTEGER
    hmac    [1] OCTET STRING
    salt    [2] OCTET STRING
    keyblob [3] SEQUENCE {
        unknown     [0] INTEGER
        uuid        [1] OCTET STRING
        flags       [2] INTEGER
        wrapped_key [3] OCTET STRING
    }
}

VEK Flags

Name	Value	Description
VEK_FLAG_CORESTORAGE	0x00010000’0000000000	Key is a legacy CoreStorage `VEK`
VEK_FLAG_HARDWARE	0x00020000’0000000000	Key is hardware encrypted

Use the KEK to unwrap the VEK using the RFC3394 key wrapping algorithm. If the wrapped VEK is a 128-bit Core Storage VEK, then only the first 128-bits of the KEK are used.

// Calculate size of the VEK (in bytes)
vek_size = (flags & VEK_FLAG_CORESTORAGE) ? 16 : 32;

// 128-bit Core Storage VEKs are wrapped with only the first half of the KEK
if (vek_size == 16) {
    kek = kek[:16];
}

// Unwrap the VEK using the KEK
vek = rfc3394_unwrap(kek, wrapped_key)

128-bit Core Storage VEKs must be extended to 256-bit encryption keys. This is accomplished by using the first 128 bits of the SHA256 hash of the VEK and its UUID as the second half of the key.

// 128-bit VEKs need to be combined with the first 128 bits of a hash
if vek_size == 16 {
    vek = append(vek, SHA256(vek + uuid)[:16])
}

Conclusion

In this post, we discussed utilizing the wrapped keys stored in APFS key bags to gain access to the Volume Encryption Key that protects a user’s data in APFS. Tomorrow, we will conclude our discussion about APFS encryption by describing how to identify and decrypt protected information using these keys.

2022 APFS Advent Challenge Day 15 - Keybags

2022-12-21T00:00:00+00:00

APFS is designed with encryption in mind and removes the need for the Core Storage layer used to provide encryption in HFS+. When you enable encryption on a volume, the entire File System Tree and the contents of files within that volume are encrypted. The type of encryption depends on the capabilities of the hardware that it is running on. For example, hardware encryption is used for internal storage on devices that support it, such as macOS computers with T2, M1, or M2 security chips and all iOS devices. Software encryption is used for external and internal storage devices without hardware encryption support. It’s worth noting that when hardware encryption is used, the data cannot be decrypted on any other device. For our purposes, we will focus on the software encryption mechanisms used in macOS. The hardware encryption functions similarly, but the security chip must broker all decryption operations.

Encryption Keys

In macOS, APFS uses a single Volume Encryption Key (VEK) to access encrypted content on a volume. This VEK is stored on disk wrapped in several layers of encryption that allow any authorized user on the system to access the volume’s contents. In addition, several recovery keys can be used to access the VEK.

The VEK is stored encrypted on disk by a Key Encryption Key (KEK). Multiple copies of the KEK are stored on disk, each encrypted (wrapped) with a different key to allow indirect access to the VEK by various users on a system. The keys that are used to encrypt the KEK can be derived from the following:

Each user’s password
The drive’s Personal Recovery Key
An organization’s Institutional Recovery Key
Each user’s iCloud Recovery Key

These wrapped keys are stored securely on disk in encrypted objects known as Keybags.

Keybags

Once decrypted, each Keybag on disk is a media_keybag_t structure.

// A keybag object
typedef struct media_keybag {
    obj_phys_t mk_obj;     // 0x00
    kb_locker_t mk_locker; // 0x20
} media_keybag_t;

mk_obj: The object’s header
mk_locker: The keybag data

The main component of a Keybag is a kb_locker_t structure.

#define APFS_KEYBAG_VERSION 2

// A keybag
typedef struct kb_locker {
    uint16_t kl_version;         // 0x00
    uint16_t kl_nkeys;           // 0x02
    uint32_t kl_nbytes;          // 0x04
    uint8_t padding[8];          // 0x08
    keybag_entry_t kl_entries[]; // 0x10
} kb_locker_t;

kl_version: The keybag’s version (currently always 2)
kl_nkeys: Number of entries stored in the keybag
kl_nbytes: The size (in bytes) of the data stored in the kl_entries field
padding: reserved
kl_entries: The start of the entries

Immediately following the kb_locker_t structure is a keybag_entry_t structure for the first entry in the Keybag. After this structure is the data for the entry, followed by the structure for the next entry.

// An entry in a keybag
typedef struct keybag_entry {
    uuid_t ke_uuid;       // 0x00
    uint16_t ke_tag;      // 0x10
    uint16_t ke_keylen;   // 0x12
    uint8_t padding[4];   // 0x14
    uint8_t ke_keydata[]; // 0x18
} keybag_entry_t;

ke_uuid: A context-specific UUID that identifies the entry
ke_tag: A description of the kind of data stored in this keybag entry
ke_keylen: The length (in bytes) of the keybag entry’s data
padding: reserved
ke_keydata: ke_keylen bytes of entry data
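Walking the entries of a decrypted keybag might look like the sketch below. The KeybagEntry type and the parse_kb_locker name are hypothetical. The input is assumed to start at the kb_locker_t structure (that is, after the 0x20-byte object header) with little-endian fields. The sketch also assumes that each entry is padded so that the next one begins on a 16-byte boundary, which is my understanding of the format and is not stated above.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical decoded form of one keybag_entry_t and its data.
struct KeybagEntry {
  std::array<uint8_t, 16> uuid{};
  uint16_t tag = 0;
  std::vector<uint8_t> keydata;
};

inline uint16_t read_le16(const uint8_t* p) {
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

// Parse the entries that follow a kb_locker_t header. Parsing stops early
// if an entry would run past the end of the buffer.
inline std::vector<KeybagEntry> parse_kb_locker(
    const std::vector<uint8_t>& locker) {
  constexpr size_t kLockerHeaderSize = 0x10;  // offset of kl_entries
  constexpr size_t kEntryHeaderSize = 0x18;   // offset of ke_keydata

  std::vector<KeybagEntry> entries;
  if (locker.size() < kLockerHeaderSize) return entries;

  const uint16_t nkeys = read_le16(locker.data() + 0x02);  // kl_nkeys
  size_t off = kLockerHeaderSize;

  for (uint16_t i = 0; i < nkeys; ++i) {
    if (off > locker.size() || locker.size() - off < kEntryHeaderSize) break;

    const uint8_t* e = locker.data() + off;
    const size_t keylen = read_le16(e + 0x12);  // ke_keylen
    if (locker.size() - off - kEntryHeaderSize < keylen) break;

    KeybagEntry entry;
    std::copy(e, e + 16, entry.uuid.begin());  // ke_uuid
    entry.tag = read_le16(e + 0x10);           // ke_tag
    entry.keydata.assign(e + kEntryHeaderSize, e + kEntryHeaderSize + keylen);
    entries.push_back(std::move(entry));

    // Assumption: entries are padded to a 16-byte boundary.
    off += (kEntryHeaderSize + keylen + 15) & ~static_cast<size_t>(15);
    assert(off % 16 == 0);
  }
  return entries;
}
```

A caller would then filter the returned entries by tag and UUID, for example to pick out the unlock records for a particular volume.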

Keybag Tags

Name	Value	Description
KB_TAG_UNKNOWN	0	reserved (never found on disk)
KB_TAG_RESERVED_1	1	reserved
KB_TAG_VOLUME_KEY	2	The key data stores a wrapped VEK
KB_TAG_VOLUME_UNLOCK_RECORDS	3	In a container’s keybag, the key data stores the location of the volume’s keybag; in a volume keybag, the key data stores a wrapped KEK.
KB_TAG_VOLUME_PASSPHRASE_HINT	4	The key data stores a user’s password hint as plain text
KB_TAG_WRAPPING_M_KEY	5	The key data stores a key that’s used to wrap a media key
KB_TAG_VOLUME_M_KEY	6	The key data stores a key that’s used to wrap volume media keys
KB_TAG_RESERVED_F8	0xF8	reserved

Container Keybags

The nx_keylocker field of the container’s NX Superblock is used to locate the encrypted blocks on disk that store the Container Keybag. These blocks are encrypted with XTS-AES-128, which takes a 256-bit key derived from the container’s UUID: read the 128-bit UUID from the nx_uuid field of the NX Superblock and concatenate it with itself.

container_keybag_key = container_uuid + container_uuid
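
In C, the concatenation above might look like the following sketch. The function name keybag_key_from_uuid and the two size macros are hypothetical; the UUID is assumed to be the raw 16 bytes exactly as they appear in nx_uuid.

```c
#include <stdint.h>
#include <string.h>

#define APFS_UUID_SIZE  16
#define KEYBAG_KEY_SIZE (2 * APFS_UUID_SIZE)

/* Build the 256-bit keybag key: the 128-bit UUID followed by itself. */
void keybag_key_from_uuid(const uint8_t uuid[APFS_UUID_SIZE],
                          uint8_t key[KEYBAG_KEY_SIZE])
{
    memcpy(key, uuid, APFS_UUID_SIZE);
    memcpy(key + APFS_UUID_SIZE, uuid, APFS_UUID_SIZE);
}
```

Because XTS-AES-128 splits its 256-bit key into two 128-bit halves, both halves end up equal to the UUID. The same helper applies to volume keybags when given the volume’s UUID.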

Once decrypted, the container keybag stores the location of each encrypted volume’s keybag as well as the wrapped VEK for each.

Volume Unlock Records

Volume Unlock Records are stored in the Container Keybag with a ke_tag value of KB_TAG_VOLUME_UNLOCK_RECORDS. The ke_uuid field stores the same UUID as the apfs_vol_uuid field of the Volume Superblock. The ke_keydata is a prange_t structure that gives the location of the encrypted blocks for the volume’s keybag.
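
As a rough illustration, the ke_keydata of such a record can be decoded as follows. This sketch assumes the prange_t layout from Apple’s APFS reference (a 64-bit starting block address followed by a 64-bit block count, both little-endian), which this post does not spell out. The names keybag_location_t and keybag_location_parse are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded location of a volume keybag (hypothetical helper type). */
typedef struct {
    uint64_t start_block; /* pr_start_paddr: first physical block */
    uint64_t block_count; /* pr_block_count: number of blocks */
} keybag_location_t;

/* Read a little-endian 64-bit integer. */
static uint64_t rd64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) v = (v << 8) | p[i];
    return v;
}

/*
 * Interpret ke_keydata as a prange_t (ASSUMED layout: two 64-bit fields).
 * Returns false if the data is too short or describes an empty range.
 */
bool keybag_location_parse(const uint8_t *keydata, size_t keylen,
                           keybag_location_t *out)
{
    if (keylen < 16) return false;
    out->start_block = rd64(keydata);
    out->block_count = rd64(keydata + 8);
    return out->block_count != 0;
}
```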

Wrapped VEK

The wrapped VEK of a volume is stored in the Container Keybag with a ke_tag value of KB_TAG_VOLUME_KEY, with the ke_uuid also being the same as the volume’s UUID. This VEK must be unwrapped using the Key Encryption Key (KEK).

Volume Keybags

Each encrypted volume has its own keybag that stores the wrapped KEKs needed to access the VEK. For software encrypted APFS, these keybags are encrypted in the same fashion as the container keybags, using two copies of the volume’s UUID as a 256-bit XTS-AES-128 encryption key. Volume keybags can also store human-readable hints to remind users of their passphrases.

Volume Unlock Records

In the context of Volume Keybags, Volume Unlock Records store DER-encoded information about wrapped KEKs. The ke_tag value is always KB_TAG_VOLUME_UNLOCK_RECORDS and the ke_uuid is either a cryptographic user’s UUID or a hardcoded value to denote a recovery key. We will discuss more about the wrapped keys that are found in the ke_keydata field in tomorrow’s post.

Recovery Key UUIDs

Name	UUID
INSTITUTIONAL_RECOVERY_UUID	{C064EBC6-0000-11AA-AA11-00306543ECAC}
INSTITUTIONAL_USER_UUID	{2FA31400-BAFF-4DE7-AE2A-C3AA6E1FD340}
PERSONAL_RECOVERY_UUID	{EBC6C064-0000-11AA-AA11-00306543ECAC}
ICLOUD_RECOVERY_UUID	{64C0C6EB-0000-11AA-AA11-00306543ECAC}
ICLOUD_USER_UUID	{EC1C2AD9-B618-4ED6-BD8D-50F361C27507}

Passphrase Hints

Passphrase Hint Records are stored with the ke_tag value of KB_TAG_VOLUME_PASSPHRASE_HINT and a cryptographic user’s UUID. The ke_keydata field contains a null-terminated UTF-8 string with the user’s provided passphrase hint.

Conclusion

This post gave a general overview of APFS Keybags and their on-disk structures. In our next post, we will discuss methods of unwrapping and using the decryption keys.


2022 APFS Advent Challenge Day 14 - Sealed Volumes

2022-12-20T00:00:00+00:00

With the release of macOS 11, Apple added a security feature to APFS called sealed volumes. Sealed volumes can be used to cryptographically verify the contents of the read-only system volume as an additional layer of protection against rootkits and other malware that may attempt to replace critical components of the operating system. Sealed volumes differ in subtle ways from the file system structures that we’ve discussed so far.

Identifying a Sealed Volume

Sealed volumes can be identified by checking for the APFS_INCOMPAT_SEALED_VOLUME flag in the apfs_incompatible_features field of their Volume Superblock. In addition, the apfs_integrity_meta_oid and apfs_fext_tree_oid fields must have non-zero values.
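
The check described above is small enough to sketch in a few lines. The value 0x20 for APFS_INCOMPAT_SEALED_VOLUME is taken from Apple’s APFS reference and is not stated in this post, so treat it as an assumption. The function name apfs_volume_is_sealed is hypothetical, and its arguments are the already-parsed superblock fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: flag value as listed in Apple's APFS reference. */
#define APFS_INCOMPAT_SEALED_VOLUME 0x20ULL

/*
 * A volume is treated as sealed only if the flag is set AND both
 * object identifiers are non-zero.
 */
bool apfs_volume_is_sealed(uint64_t apfs_incompatible_features,
                           uint64_t apfs_integrity_meta_oid,
                           uint64_t apfs_fext_tree_oid)
{
    return (apfs_incompatible_features & APFS_INCOMPAT_SEALED_VOLUME) != 0 &&
           apfs_integrity_meta_oid != 0 &&
           apfs_fext_tree_oid != 0;
}
```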

An Integrity Metadata Object stores information about the sealed volume. This is a virtual object that is owned by the volume’s Object Map and whose object identifier can be found in the apfs_integrity_meta_oid field of the Volume Superblock. On disk, it is stored as an integrity_meta_phys_t structure.

typedef struct integrity_meta_phys {
    obj_phys_t im_o;               // 0x00
    uint32_t im_version;           // 0x20
    uint32_t im_flags;             // 0x24
    apfs_hash_type_t im_hash_type; // 0x28
    uint32_t im_root_hash_offset;  // 0x2C
    xid_t im_broken_xid;           // 0x30
    uint64_t im_reserved[9];       // 0x38
} integrity_meta_phys_t;           // 0x80

im_o: The object’s header
im_version: The version of the data structure
im_flags: The configuration flags
im_hash_type: The hash algorithm that is used
im_root_hash_offset: The offset (in bytes) of the root hash relative to the start of the object
im_broken_xid: The identifier of the transaction that unsealed the volume
im_reserved: reserved (only in version 2 or above)

Integrity Metadata Flags

Name	Value	Description
APFS_SEAL_BROKEN	0x00000001	The volume was modified after being sealed, breaking its seal

Hash Types

Name	Value	Description
APFS_HASH_INVALID	0	An invalid hash algorithm
APFS_HASH_SHA256	0x1	The SHA-256 variant of Secure Hash Algorithm 2
APFS_HASH_SHA512_256	0x2	The SHA-512/256 variant of Secure Hash Algorithm 2
APFS_HASH_SHA384	0x3	The SHA-384 variant of Secure Hash Algorithm 2
APFS_HASH_SHA512	0x4	The SHA-512 variant of Secure Hash Algorithm 2
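
Each hash type implies a fixed digest length, which is handy when reading the root hash at im_root_hash_offset. The sketch below maps the types in the table to their standard digest sizes. The helper names are hypothetical, and the assumption that the stored root hash is exactly one digest long is mine, not something stated in the post.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    APFS_HASH_INVALID    = 0,
    APFS_HASH_SHA256     = 0x1,
    APFS_HASH_SHA512_256 = 0x2,
    APFS_HASH_SHA384     = 0x3,
    APFS_HASH_SHA512     = 0x4
};

/* Digest length in bytes for a hash type, or 0 if the type is unknown. */
size_t apfs_hash_size(uint32_t hash_type)
{
    switch (hash_type) {
    case APFS_HASH_SHA256:     return 32;
    case APFS_HASH_SHA512_256: return 32; /* SHA-512 truncated to 256 bits */
    case APFS_HASH_SHA384:     return 48;
    case APFS_HASH_SHA512:     return 64;
    default:                   return 0;
    }
}

/*
 * Check that a root hash of the expected length fits inside the object.
 * ASSUMPTION: the root hash occupies exactly one digest starting at
 * im_root_hash_offset from the start of the object.
 */
bool integrity_root_hash_in_bounds(uint32_t im_root_hash_offset,
                                   uint32_t im_hash_type,
                                   size_t object_size)
{
    size_t n = apfs_hash_size(im_hash_type);
    if (n == 0 || im_root_hash_offset > object_size) return false;
    return n <= object_size - im_root_hash_offset;
}
```

The largest digest here is 64 bytes, which matches the BTREE_NODE_HASH_SIZE_MAX constant used by hashed B-Tree nodes.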

File System Tree

Sealed Volumes can ensure integrity by hashing the contents of their File System Trees. This hashing necessitates some slight differences to the B-Tree. These modified B-Trees can be identified by the BTREE_HASHED and BTREE_NOHEADER flags being set in their B-Tree Info.

In standard B-Trees, non-leaf nodes store the object identifier of their children in the value-half of their entries. “Hashed” B-Trees instead use btn_index_node_val_t structures for this purpose, which store the cryptographic hash of the child node’s contents along with its identifier. Hashed nodes are also stored as headerless objects, with their 32-byte header being zeroed out.

#define BTREE_NODE_HASH_SIZE_MAX 64

typedef struct btn_index_node_val {
    oid_t binv_child_oid;                              // 0x00
    uint8_t binv_child_hash[BTREE_NODE_HASH_SIZE_MAX]; // 0x08
} btn_index_node_val_t;                                // 0x48

binv_child_oid: The object identifier of the child node
binv_child_hash: The hash of the child node

Data Stream Extents

As we discussed yesterday, Data Streams store their extents as file system records in the File System Tree. Sealed Volumes instead store extents in a separate File Extent Tree, whose virtual object identifier is stored in the apfs_fext_tree_oid field of the Volume Superblock.

The key-half of the File Extent Tree entries are fext_tree_key_t structures and are sorted first by private_id and then by logical_addr.

typedef struct fext_tree_key {
    uint64_t private_id;   // 0x00
    uint64_t logical_addr; // 0x08
} fext_tree_key_t;         // 0x10

private_id: The object identifier of the file
logical_addr: The offset (in bytes) within the file’s data for the data stored in this extent
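
The sort order of the File Extent Tree keys can be expressed as a comparison function. This is a minimal sketch with a hypothetical function name; the structure is repeated so that the example stands alone, and the fields are assumed to have already been converted to host byte order.

```c
#include <stdint.h>

typedef struct fext_tree_key {
    uint64_t private_id;   // 0x00
    uint64_t logical_addr; // 0x08
} fext_tree_key_t;

/*
 * Order keys first by private_id, then by logical_addr.
 * Returns a negative, zero, or positive value like memcmp().
 */
int fext_tree_key_cmp(const fext_tree_key_t *a, const fext_tree_key_t *b)
{
    if (a->private_id != b->private_id)
        return a->private_id < b->private_id ? -1 : 1;
    if (a->logical_addr != b->logical_addr)
        return a->logical_addr < b->logical_addr ? -1 : 1;
    return 0;
}
```

Explicit comparisons are used instead of subtraction because the fields are 64-bit unsigned values and a difference would not fit in an int.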

The value-half takes the form of a fext_tree_val_t structure. Its fields are interpreted in the same way as the corresponding j_file_extent_val_t fields. There is no crypto_id because sealed system volumes are never encrypted.

typedef struct fext_tree_val {
    uint64_t len_and_flags;  // 0x00
    uint64_t phys_block_num; // 0x08
} fext_tree_val_t;           // 0x10

len_and_flags: A bit field that contains the length of the extent and its flags
phys_block_num: The starting physical block address of the extent

Conclusion

Sealed Volumes in APFS provide an extra layer of security by allowing macOS to verify its system volume cryptographically. This post described some of the subtle differences in analyzing sealed volumes.


2022 APFS Advent Challenge Day 13 - Data Streams

2022-12-19T00:00:00+00:00

Data in APFS that is too large to store within records is stored elsewhere on disk and referenced by data streams (dstreams). Similar to non-resident attributes in NTFS, APFS data streams manage a set of extents that reference the number and order of blocks on the disk which contain external data. In this post, we will discuss how data streams are used in APFS to manage one or more forks of data in inodes as well as their record structures in the File System Tree.

Inode Default Data Streams

Each file has a default data stream that stores what we typically refer to as the file’s data. This stream’s object identifier may or may not be different from the inode’s. It is stored in the private_id field of the inode’s j_inode_val_t structure. Metadata about the default data stream is stored as a j_dstream_t structure in an inode extended field with the type of INO_EXT_TYPE_DSTREAM.

typedef struct j_dstream {
    uint64_t size;                // 0x00
    uint64_t alloced_size;        // 0x08
    uint64_t default_crypto_id;   // 0x10
    uint64_t total_bytes_written; // 0x18
    uint64_t total_bytes_read;    // 0x20
} j_dstream_t;                    // 0x28

size: The size of the logical data (in bytes)
alloced_size: The total space allocated for the data stream (in bytes), including any unused space
default_crypto_id: The default encryption key or tweak used in this data stream
total_bytes_written: The total number of bytes that have been written to this data stream
total_bytes_read: The total number of bytes that have been read from this data stream

The logical size and allocated size of a dstream may differ. The allocated size is always a multiple of the container’s block size. If the file contents do not fill up the last block, then the allocated size may be larger than the logical size. APFS also allows dstreams to be sparsely allocated. Some extent ranges that logically contain all zero-bytes may not be stored on disk. In these instances, the allocated size may be smaller than the logical size of the stream.
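
To make the relationship between the two sizes concrete, here is a small sketch. Both function names are hypothetical. The sparse check is only a heuristic that follows from the reasoning above: if fewer bytes are allocated than a fully backed stream of that logical size would need, some ranges cannot be stored on disk.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Space a fully allocated stream of `size` bytes would occupy:
 * `size` rounded up to a whole number of blocks (block_size must be > 0).
 */
uint64_t dstream_full_alloc_size(uint64_t size, uint32_t block_size)
{
    uint64_t blocks = size / block_size + (size % block_size != 0);
    return blocks * block_size;
}

/*
 * Heuristic: the stream must contain sparse ranges if less space is
 * allocated than a fully backed stream would require.
 */
bool dstream_has_sparse_ranges(uint64_t size, uint64_t alloced_size,
                               uint32_t block_size)
{
    return alloced_size < dstream_full_alloc_size(size, block_size);
}
```

For example, with 4096-byte blocks, a 5000-byte stream that is fully allocated occupies 8192 bytes, while a 1 MiB stream with only 4096 bytes allocated must be mostly sparse.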

The default_crypto_id comes into play when we’re dealing with encrypted volumes. We will discuss more about APFS encryption in a future post.

The total_bytes_written and total_bytes_read fields are performance counters that indicate how much data has been written to or read from a data stream. They are only periodically updated, and more research is needed to determine what triggers these values to be flushed to disk. Both values are allowed to overflow and restart from zero, so their utility for forensic analysis is relatively limited.

Extended Attributes

Along with the default data stream, files in APFS can also contain other forks. As in HFS+, these additional data streams are called extended attributes, but they are similar in concept to alternate data streams in NTFS.

Extended attributes are stored in the File System Tree as records with a type identifier of APFS_TYPE_XATTR and the same object identifier as the inode record. The key half of an extended attribute record is a j_xattr_key_t structure.

typedef struct j_xattr_key {
    j_key_t hdr;       // 0x00
    uint16_t name_len; // 0x08
    uint8_t name[0];   // 0x0A
} j_xattr_key_t;

hdr: The record’s header
name_len: The length of the extended attribute’s name (in bytes), including the final null character.
name: The null-terminated, UTF-8 encoded name of the extended attribute

The value half of the extended attribute record is a j_xattr_val_t structure.

typedef struct j_xattr_val {
    uint16_t flags;     // 0x00
    uint16_t xdata_len; // 0x02
    uint8_t xdata[0];   // 0x04
} j_xattr_val_t;

flags: The extended attribute record’s flags
xdata_len: The length of the extended attribute’s inline data
xdata: The extended attribute data or the identifier of a data stream that contains the data

Extended Attribute Value Flags

Name	Value	Description
XATTR_DATA_STREAM	0x00000001	The attribute data is stored in a data stream
XATTR_DATA_EMBEDDED	0x00000002	The attribute data is stored directly in the record
XATTR_FILE_SYSTEM_OWNED	0x00000004	The extended attribute record is owned by the file system
XATTR_RESERVED_8	0x00000008	reserved

Like NTFS attributes, APFS extended attributes that are small enough can store their data directly in the attribute record itself. In these instances, the XATTR_DATA_EMBEDDED flag will be set and the stream’s data is stored in the xdata field.

Instead, when the XATTR_DATA_STREAM flag is set, xdata stores a j_xattr_dstream_t structure.

typedef struct j_xattr_dstream {
    uint64_t xattr_obj_id; // 0x00
    j_dstream_t dstream;   // 0x08
} j_xattr_dstream_t;       // 0x30

xattr_obj_id: The object identifier of the extended attribute’s data stream
dstream: The metadata of the extended attribute’s data stream (see above)
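
Putting the two cases together, a parser might decode the value half of an extended attribute record along these lines. This is a sketch under stated assumptions: xattr_value_t, xattr_kind_t, and xattr_decode are hypothetical names, fields are read as little-endian, and the embedded flag is checked first if both flags happen to be set.

```c
#include <stddef.h>
#include <stdint.h>

#define XATTR_DATA_STREAM   0x0001
#define XATTR_DATA_EMBEDDED 0x0002

typedef enum {
    XATTR_KIND_INVALID,
    XATTR_KIND_EMBEDDED,
    XATTR_KIND_STREAM
} xattr_kind_t;

/* Decoded view of a j_xattr_val_t (hypothetical helper type). */
typedef struct {
    xattr_kind_t kind;
    const uint8_t *data;  /* embedded: the inline attribute bytes */
    uint16_t data_len;    /* embedded: xdata_len */
    uint64_t stream_oid;  /* stream: xattr_obj_id */
    uint64_t stream_size; /* stream: dstream.size */
} xattr_value_t;

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint64_t rd64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) v = (v << 8) | p[i];
    return v;
}

/* Decode the value half of an APFS_TYPE_XATTR record. */
xattr_value_t xattr_decode(const uint8_t *val, size_t val_len)
{
    xattr_value_t out = { XATTR_KIND_INVALID, NULL, 0, 0, 0 };
    if (val_len < 4) return out;

    uint16_t flags = rd16(val);         /* flags     @ 0x00 */
    uint16_t xdata_len = rd16(val + 2); /* xdata_len @ 0x02 */
    if (xdata_len > val_len - 4) return out;
    const uint8_t *xdata = val + 4;     /* xdata     @ 0x04 */

    if (flags & XATTR_DATA_EMBEDDED) {
        out.kind = XATTR_KIND_EMBEDDED;
        out.data = xdata;
        out.data_len = xdata_len;
    } else if (flags & XATTR_DATA_STREAM) {
        if (xdata_len < 0x30) return out; /* sizeof(j_xattr_dstream_t) */
        out.kind = XATTR_KIND_STREAM;
        out.stream_oid = rd64(xdata);      /* xattr_obj_id */
        out.stream_size = rd64(xdata + 8); /* dstream.size */
    }
    return out;
}
```

For stream-backed attributes, stream_oid is the identifier to look up when collecting the stream’s extents.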

Data Stream Extents

Except for Sealed Volumes (which we will discuss in the future), the extents of a dstream are stored in the volume’s File System Tree as a set of records with the type APFS_TYPE_FILE_EXTENT. For streams with non-contiguous data, there will be more than one extent record.

The file extent record keys are of the type j_file_extent_key_t and encode the object identifier of the dstream in their record header, along with the logical offset of the extent in the stream.

typedef struct j_file_extent_key {
    j_key_t hdr;           // 0x00
    uint64_t logical_addr; // 0x08
} j_file_extent_key_t;     // 0x10

hdr: The record’s header
logical_addr: The offset within the file’s data (in bytes) for the data stored in this extent

The value half of a file extent record takes the form of a j_file_extent_val_t structure and is used to denote the physical location of the extent data on disk.

// length and flags masks
#define J_FILE_EXTENT_LEN_MASK 0x00ffffffffffffffULL
#define J_FILE_EXTENT_FLAG_MASK 0xff00000000000000ULL
#define J_FILE_EXTENT_FLAG_SHIFT 56

typedef struct j_file_extent_val {
    uint64_t len_and_flags;  // 0x00
    uint64_t phys_block_num; // 0x08
    uint64_t crypto_id;      // 0x10
} j_file_extent_val_t;       // 0x18

len_and_flags: A bit-field encoding the length (in bytes) of the extent in the 56 least significant bits and its flags in the most significant bits
phys_block_num: The physical block number of the first block in the extent
crypto_id: The encryption key or tweak used in this extent (or zero if not encrypted)

The eight most significant bits of the len_and_flags field are reserved for flags, but no flags are currently defined.

If the value of phys_block_num is zero, then the extent is sparse and should be interpreted as containing all zero bytes.
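
The masks defined above can be applied as in the following sketch. The helper names are hypothetical, and the byte offset calculation assumes that physical block numbers are relative to the start of the container.

```c
#include <stdbool.h>
#include <stdint.h>

#define J_FILE_EXTENT_LEN_MASK   0x00ffffffffffffffULL
#define J_FILE_EXTENT_FLAG_MASK  0xff00000000000000ULL
#define J_FILE_EXTENT_FLAG_SHIFT 56

/* Length of the extent in bytes (the low 56 bits). */
uint64_t file_extent_len(uint64_t len_and_flags)
{
    return len_and_flags & J_FILE_EXTENT_LEN_MASK;
}

/* Flags of the extent (the high 8 bits); none are currently defined. */
uint8_t file_extent_flags(uint64_t len_and_flags)
{
    return (uint8_t)((len_and_flags & J_FILE_EXTENT_FLAG_MASK)
                     >> J_FILE_EXTENT_FLAG_SHIFT);
}

/* A zero block number marks a sparse extent that reads as all zeros. */
bool file_extent_is_sparse(uint64_t phys_block_num)
{
    return phys_block_num == 0;
}

/* Byte offset of the extent's first block (ASSUMED container-relative). */
uint64_t file_extent_byte_offset(uint64_t phys_block_num, uint32_t block_size)
{
    return phys_block_num * block_size;
}
```

A reader would check file_extent_is_sparse first and synthesize zero bytes for sparse extents instead of seeking to block zero.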

The crypto_id field is specific to encrypted volumes and will be discussed in a future post.

Conclusion

Understanding data streams and their on-disk structures is essential to analyzing APFS. This post discussed the default data stream, extended attributes, and file extents. Later this week, we will discuss how parsing this information differs in both Sealed and Encrypted volumes.