feat: backfill lower cache tiers on read-through #172

Merged
worstell merged 2 commits into main from tiered-cache-backfill
Mar 11, 2026

@worstell
Contributor

Problem

After a pod restart, the disk cache is empty but S3 still has the snapshots. Tiered.Open hits S3 and streams to the client, but never writes to disk. Because the cache hit path doesn't trigger mirror creation or periodic job scheduling, disk stays empty permanently, so every request keeps hitting S3 (4 min for large repos vs 30s from disk).

Solution

When Tiered.Open finds data in a higher tier (S3) but the lowest tier (disk) missed, the returned reader now transparently tees the data to disk as the caller reads. Once the full stream has been consumed and the reader closed, the disk entry becomes available for future reads.

On write failure or partial read, the backfill is safely abandoned via context cancellation per the Cache contract; reads are never affected.

When a higher tier (e.g., S3) has data but a lower tier (e.g., disk)
does not, the returned reader now transparently writes to the lowest
tier as the caller reads. This ensures disk cache is populated on the
first S3 hit after a pod restart, avoiding repeated slow S3 reads.

On write failure or partial read, the backfill is safely abandoned via
context cancellation per the Cache contract.

Amp-Thread-ID: https://ampcode.com/threads/T-019cda52-ee36-738c-86cd-1fd410c47d7f
Co-authored-by: Amp <amp@ampcode.com>
@worstell worstell requested a review from a team as a code owner March 11, 2026 01:43
@worstell worstell requested review from alecthomas and removed request for a team March 11, 2026 01:43
@worstell worstell force-pushed the tiered-cache-backfill branch from 2e66222 to fdb369c Compare March 11, 2026 01:56
@worstell worstell enabled auto-merge (squash) March 11, 2026 01:57
@worstell worstell merged commit 0dca77f into main Mar 11, 2026
5 checks passed
@worstell worstell deleted the tiered-cache-backfill branch March 11, 2026 01:58
Collaborator

@alecthomas alecthomas left a comment

This is awesome, should have been this way from the start 🤦‍♂️

3 participants