Disk-backed large dataset support via SQLite temp storage #91

@vmvarela

Part of #69
Depends on #69-spike (evaluate streaming approaches)

Description

Implement the simpler streaming approach: configure SQLite to use disk-backed temp storage for large datasets. This allows all SQL operations to continue working on datasets larger than RAM, without changing query semantics.

Acceptance Criteria

  • --memory-limit <size> flag sets a hint for SQLite temp storage threshold (e.g. 256MB, 1GB)
  • SQLite configured with PRAGMA temp_store = FILE when memory limit is set
  • Temp files are cleaned up after execution
  • Queries that previously ran out of memory on 1GB+ files now complete successfully
  • Existing behavior unchanged when flag is not set
  • Error message if temp directory is not writable
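A minimal sketch of the flag-parsing half of the criteria above, assuming a Python implementation. The helper name `parse_memory_limit` and the accepted units are illustrative, not part of the project's API:

```python
import re

# Illustrative unit table for a human-readable size flag like "256MB" or "1GB".
_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}

def parse_memory_limit(value: str) -> int:
    """Parse a --memory-limit value such as '256MB' into a byte count.

    Raises ValueError with a clear message on malformed input, matching
    the acceptance criterion that bad configuration fails loudly.
    """
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB)", value.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"invalid --memory-limit value: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.upper()]
```

The parsed byte count would only serve as the hint mentioned above; SQLite itself decides when a given operation spills to temp storage.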

Notes

  • Use PRAGMA temp_store_directory to configure temp location
  • No changes to query parsing or output — only storage backend changes
  • Follow spike recommendation on exact PRAGMA configuration
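The PRAGMA configuration described in the notes could be sketched as follows, assuming Python's stdlib `sqlite3`. The function name and error wording are placeholders; note also that SQLite documents `temp_store_directory` as deprecated (though still honored in common builds), with the `SQLITE_TMPDIR` environment variable as a supported alternative:

```python
import os
import sqlite3

def configure_disk_temp(conn: sqlite3.Connection, temp_dir: str) -> None:
    """Point SQLite's temp storage at a writable directory on disk.

    PRAGMA temp_store = FILE (value 1) forces temporary tables and
    indices to disk instead of memory, which is what lets queries on
    larger-than-RAM datasets complete without changing semantics.
    """
    # Fail early with a clear message if the temp directory is unusable,
    # per the acceptance criteria.
    if not os.path.isdir(temp_dir) or not os.access(temp_dir, os.W_OK):
        raise RuntimeError(f"temp directory is not writable: {temp_dir}")
    conn.execute("PRAGMA temp_store = FILE")
    # Deprecated but widely supported; the spike should confirm whether
    # to rely on this pragma or on the SQLITE_TMPDIR environment variable.
    conn.execute(f"PRAGMA temp_store_directory = '{temp_dir}'")
```

SQLite deletes its own temp files when the connection closes, so explicit cleanup in the tool would mainly cover crash or interrupt paths.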

    Labels

    priority:medium - Should be done soon
    size:m - Medium (4 to 8 hours)
    status:ready - Refined and ready for sprint selection
    type:feature - New functionality
