Labels
- priority:medium — Should be done soon
- size:m — Medium, 4 to 8 hours
- status:ready — Refined and ready for sprint selection
- type:feature — New functionality
Part of #69
Depends on #90 (streaming spike)
Description
Implement a streaming CSV reader that feeds data to SQLite in chunks rather than loading the entire file into memory at once. Add `--stream` and `--chunk-size` flags.
Acceptance Criteria
- `--stream` flag enables chunked processing mode
- `--chunk-size <size>` configures chunk size (default: 64MB, e.g. `--chunk-size 128MB`)
- Simple queries (SELECT, WHERE, LIMIT) produce identical results to non-streaming mode
- Memory usage stays bounded by chunk size, not input file size
- Works correctly with piped stdin as well as file input
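Since `--chunk-size` accepts human-readable values like `64MB`, the flag needs a size parser. A minimal sketch of one possible approach, assuming the accepted units are B/KB/MB/GB and that the helper name `parse_chunk_size` is hypothetical (not part of any existing code):

```python
import re

# Hypothetical helper: parse human-readable sizes like "64MB" into bytes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

def parse_chunk_size(text: str, default: int = 64 * 1024 ** 2) -> int:
    """Return the chunk size in bytes; fall back to the 64MB default."""
    if not text:
        return default
    m = re.fullmatch(r"(\d+)\s*([KMG]?B)?", text.strip(), re.IGNORECASE)
    if not m:
        raise ValueError(f"invalid chunk size: {text!r}")
    value, unit = int(m.group(1)), (m.group(2) or "B").upper()
    return value * _UNITS[unit]
```

A bare number would be treated as bytes; whether that should instead be rejected is a design decision left open here.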
Notes
- Chunked reading means importing rows in batches and using SQLite transactions per chunk
- Result correctness for aggregates/GROUP BY depends on whether all data fits in temp storage (see Disk-backed large dataset support via SQLite temp storage #91)
- Start with a single-pass chunked insert, not a full virtual table implementation