feat: DataFusion query layer for parquet metrics #6268

Closed
alexanderbianchi wants to merge 1 commit into quickwit-oss:matthew.kim/metrics-wide-schema from alexanderbianchi:bianchi/wide-metrics-df-quickwit-clean

@alexanderbianchi

Adds a DataFusion-based query execution layer on top of the wide-schema parquet metrics pipeline from PR #6237.

  • New crate quickwit-datafusion with pluggable QuickwitDataSource trait, DataFusionSessionBuilder, QuickwitSchemaProvider, and distributed worker session setup via datafusion-distributed WorkerService
  • MetricsDataSource implements QuickwitDataSource for OSS parquet splits: metastore-backed split discovery, object-store caching, filter pushdown with CAST unwrapping fix, Substrait ReadRel consumption
  • DataFusionService gRPC (ExecuteSql + ExecuteSubstrait streaming) wired into quickwit-serve alongside the existing searcher and OTLP services
  • Distributed execution: DistributedPhysicalOptimizerRule produces PartitionIsolatorExec tasks (not shuffles) across multiple searcher nodes
  • Integration tests covering: pruning, aggregation, time range, GROUP BY, distributed tasks, NULL column fill for missing parquet fields, Substrait named-table queries, rollup from file, partial schema projection
  • dev/ local cluster setup with start-cluster, ingest-metrics, query-metrics
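
To illustrate the "filter pushdown with CAST unwrapping" item above, here is a minimal, standard-library-only sketch of the general idea. It is not the code from this PR: the `Expr` enum is a hypothetical stand-in for DataFusion's expression tree, and `unwrap_cast`, `normalize`, and `is_pushable` are illustrative names. It assumes the cast is lossless and order-preserving (for example, a widening integer cast), which is the only case where dropping it is safe.

```rust
// Hypothetical miniature expression tree (an assumption for illustration;
// the real implementation would operate on DataFusion expressions).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    // CAST(inner AS <type name>)
    Cast(Box<Expr>, String),
    // left > right
    Gt(Box<Expr>, Box<Expr>),
    // left = right
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

// Strip any number of CAST wrappers around a bare column reference.
// Casts around anything other than a column are kept as they are.
// Assumption: the cast does not change comparison results.
fn unwrap_cast(expr: &Expr) -> Expr {
    match expr {
        Expr::Cast(inner, ty) => match unwrap_cast(inner) {
            col @ Expr::Column(_) => col,
            other => Expr::Cast(Box::new(other), ty.clone()),
        },
        other => other.clone(),
    }
}

// Rewrite comparisons so that both operands have column casts removed,
// recursing through conjunctions.
fn normalize(expr: &Expr) -> Expr {
    match expr {
        Expr::Gt(l, r) => Expr::Gt(Box::new(unwrap_cast(l)), Box::new(unwrap_cast(r))),
        Expr::Eq(l, r) => Expr::Eq(Box::new(unwrap_cast(l)), Box::new(unwrap_cast(r))),
        Expr::And(l, r) => Expr::And(Box::new(normalize(l)), Box::new(normalize(r))),
        other => other.clone(),
    }
}

// In this sketch a filter can be pushed down if, after normalization, it is
// a `column <op> literal` comparison or a conjunction of such comparisons.
fn is_pushable(expr: &Expr) -> bool {
    match normalize(expr) {
        Expr::Gt(l, r) | Expr::Eq(l, r) => {
            matches!((l.as_ref(), r.as_ref()), (Expr::Column(_), Expr::Literal(_)))
        }
        Expr::And(l, r) => is_pushable(&l) && is_pushable(&r),
        _ => false,
    }
}
```

With these definitions, a filter shaped like `CAST(timestamp_secs AS Int64) > 1700000000` (the column name is made up for the example) normalizes to a plain `column > literal` comparison and is reported as pushable, whereas a column-to-column comparison is not.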


@alexanderbianchi force-pushed the bianchi/wide-metrics-df-quickwit-clean branch from 431c8d2 to d4e7bbe on April 6, 2026 19:36
@mattmkim deleted the branch quickwit-oss:matthew.kim/metrics-wide-schema on April 6, 2026 19:48
@mattmkim closed this on Apr 6, 2026