External sort for ORDER BY in streaming mode #94

@vmvarela

Part of #69
Depends on #91 (disk-backed storage), #93 (streaming query validation)

Description

Implement external merge sort to support large ORDER BY operations in streaming mode without loading the full sorted dataset into memory.
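For reference, a minimal sketch of the external merge sort idea in Python. It is illustrative only and not this project's implementation: `external_sort`, `_write_run`, `_read_run`, and `max_rows_in_memory` are hypothetical names, and the in-memory budget is expressed as a row count rather than derived from `--memory-limit`. Rows are assumed to be lists of strings, as `csv.reader` produces.

```python
import csv
import heapq
import os
import tempfile
from itertools import islice


def _write_run(rows, tmp_dir):
    """Write one sorted run to a temp CSV file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run", dir=tmp_dir)
    with os.fdopen(fd, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


def _read_run(path):
    """Lazily yield rows (lists of strings) from a run file."""
    with open(path, newline="") as f:
        yield from csv.reader(f)


def external_sort(rows, key, max_rows_in_memory=100_000):
    """Yield rows in sorted order, buffering at most max_rows_in_memory
    rows at a time while the sorted runs are being created."""
    rows = iter(rows)
    with tempfile.TemporaryDirectory(prefix="extsort-") as tmp_dir:
        run_paths = []
        # Phase 1: read bounded chunks, sort each in memory, spill to disk.
        while True:
            chunk = list(islice(rows, max_rows_in_memory))
            if not chunk:
                break
            chunk.sort(key=key)
            run_paths.append(_write_run(chunk, tmp_dir))
        # Phase 2: k-way merge of the sorted runs, one row per run in memory.
        yield from heapq.merge(*(_read_run(p) for p in run_paths), key=key)
```

Usage would look like `external_sort(csv.reader(f), key=lambda r: int(r[0]))`. The sketch keeps one open file per run during the merge, so a real implementation would likely need a multi-pass merge when the number of runs is large.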

Acceptance Criteria

  • ORDER BY queries in --stream mode produce correctly sorted output for datasets larger than available memory
  • External sort spills to temp files and respects --memory-limit for the merge buffer size
  • Temp files are cleaned up after execution (including on error/interrupt)
  • Performance is reasonable: sorting a 1 GB CSV should complete in under 5 minutes on typical hardware
  • Streaming query validation (#93, detect full-table-scan requirements) no longer blocks ORDER BY when external sort is available
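The cleanup criterion (temp files removed on error or interrupt) could be approached along these lines. This is a hedged Python sketch, not the project's code: `sort_workspace` is a hypothetical helper, and it assumes all run files live under a single temp directory. Ctrl-C already surfaces as `KeyboardInterrupt`, so the `finally` block covers it; SIGTERM is converted into an exception so the same path runs. SIGKILL cannot be caught, so leftover files from a hard kill would need a separate sweep.

```python
import contextlib
import os
import shutil
import signal
import tempfile
import threading


@contextlib.contextmanager
def sort_workspace(prefix="extsort-"):
    """Create a temp directory for sorted runs and always remove it on exit."""
    tmp_dir = tempfile.mkdtemp(prefix=prefix)

    def _raise_on_sigterm(signum, frame):
        # Turn SIGTERM into an exception so the finally block below runs.
        raise SystemExit(128 + signum)

    # Signal handlers can only be installed from the main thread.
    in_main = threading.current_thread() is threading.main_thread()
    previous = signal.signal(signal.SIGTERM, _raise_on_sigterm) if in_main else None
    try:
        yield tmp_dir
    finally:
        if previous is not None:
            # Restore whatever handler was active before we started.
            signal.signal(signal.SIGTERM, previous)
        # Remove the directory and every run file in it, even after an error.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

Any code that writes run files would do so inside `with sort_workspace() as tmp_dir:` so that normal completion, exceptions, and interrupts all go through the same removal step.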

Labels

  • priority:low (Nice to have, do when possible)
  • size:m (Medium: 4 to 8 hours)
  • status:ready (Refined and ready for sprint selection)
  • type:feature (New functionality)
