vectorlessflow/vectorless: Vectorless is a hierarchical, reasoning-native document intelligence engine.

⚠️ Early Development — API may change.

What is Vectorless?

Vectorless is a Rust library for querying structured documents using natural language — without vector databases or embedding models.

Instead of chunking documents into vectors, Vectorless preserves the document's tree structure and navigates it with a hybrid of deterministic algorithms and an LLM, much as a human reads a table of contents:

Algorithm handles "how to walk" — BM25 scoring, tree traversal (fast, deterministic)
Pilot (LLM) handles "where to go" — semantic understanding, ambiguity resolution
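
To make the "how to walk" half more concrete, here is a minimal sketch of the standard Okapi BM25 formula applied to section titles. It is an illustration only, not Vectorless's actual scorer: the function names `tokenize` and `bm25_scores` are hypothetical, and k1 = 1.2 and b = 0.75 are the conventional defaults, assumed here.

```rust
use std::collections::HashMap;

/// Lowercase the text and split it on non-alphanumeric characters.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Score each document against the query with Okapi BM25.
/// `k1` and `b` are the usual free parameters (hypothetical helper, not the library API).
fn bm25_scores(docs: &[&str], query: &str, k1: f64, b: f64) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    if tokenized.is_empty() {
        return Vec::new();
    }
    let n = tokenized.len() as f64;
    let avg_len = tokenized.iter().map(|d| d.len()).sum::<usize>() as f64 / n;
    if avg_len == 0.0 {
        // Every document is empty, so nothing can match.
        return vec![0.0; tokenized.len()];
    }

    // Document frequency: how many documents contain each term.
    let mut df: HashMap<&str, f64> = HashMap::new();
    for doc in &tokenized {
        let mut seen: Vec<&str> = doc.iter().map(|s| s.as_str()).collect();
        seen.sort_unstable();
        seen.dedup();
        for term in seen {
            *df.entry(term).or_insert(0.0) += 1.0;
        }
    }

    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            let len = doc.len() as f64;
            query_terms
                .iter()
                .map(|q| {
                    let tf = doc.iter().filter(|t| *t == q).count() as f64;
                    let n_q = df.get(q.as_str()).copied().unwrap_or(0.0);
                    // Smoothed IDF variant that never goes negative.
                    let idf = ((n - n_q + 0.5) / (n_q + 0.5) + 1.0).ln();
                    idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * len / avg_len))
                })
                .sum::<f64>()
        })
        .collect()
}

fn main() {
    let sections = ["Reset the device", "System requirements", "Network setup"];
    let scores = bm25_scores(&sections, "reset device", 1.2, 0.75);
    for (title, score) in sections.iter().zip(&scores) {
        println!("{score:.3}  {title}");
    }
}
```

Scoring like this is cheap and deterministic, which is why it suits the traversal side; the judgment calls that keyword overlap cannot settle are what the Pilot is for.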

Analogy: Traditional RAG is like searching every word in a book. Vectorless is like reading the table of contents, then going to the right chapter.

How It Works

1. Index: Build a Navigable Tree

Technical Manual (root)
├── Chapter 1: Introduction
├── Chapter 2: Architecture
│   ├── 2.1 System Design
│   └── 2.2 Implementation
└── Chapter 3: API Reference

Each node gets an AI-generated summary, enabling fast navigation.
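
As a rough picture of what such a tree might look like in memory, the sketch below models the outline above as a recursive struct. The type and field names (`Node`, `title`, `summary`, `children`) are assumptions made for illustration, not the crate's real types, and the summaries are placeholder strings rather than real AI output.

```rust
/// One section of the document tree (hypothetical type, for illustration only).
#[derive(Debug)]
struct Node {
    title: String,
    /// In Vectorless this would be AI-generated; here it is a placeholder.
    summary: String,
    children: Vec<Node>,
}

impl Node {
    fn new(title: &str, summary: &str, children: Vec<Node>) -> Self {
        Node {
            title: title.to_string(),
            summary: summary.to_string(),
            children,
        }
    }

    /// Append the subtree to `out` as an indented outline, one title per line.
    fn outline(&self, depth: usize, out: &mut String) {
        out.push_str(&"  ".repeat(depth));
        out.push_str(&self.title);
        out.push('\n');
        for child in &self.children {
            child.outline(depth + 1, out);
        }
    }

    /// Count this node plus all of its descendants.
    fn node_count(&self) -> usize {
        1 + self.children.iter().map(Node::node_count).sum::<usize>()
    }
}

/// Build the example tree from the outline above.
fn sample_tree() -> Node {
    Node::new(
        "Technical Manual",
        "<summary of the whole manual>",
        vec![
            Node::new("Chapter 1: Introduction", "<summary>", vec![]),
            Node::new(
                "Chapter 2: Architecture",
                "<summary>",
                vec![
                    Node::new("2.1 System Design", "<summary>", vec![]),
                    Node::new("2.2 Implementation", "<summary>", vec![]),
                ],
            ),
            Node::new("Chapter 3: API Reference", "<summary>", vec![]),
        ],
    )
}

fn main() {
    let tree = sample_tree();
    let mut out = String::new();
    tree.outline(0, &mut out);
    print!("{out}");
    println!("{} nodes, root summary: {}", tree.node_count(), tree.summary);
}
```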

2. Query: Navigate with LLM

When you ask "How do I reset the device?":

Analyze — Understand query intent and complexity
Navigate — LLM guides tree traversal (like reading a TOC)
Retrieve — Return the exact section with context
Verify — Check if more information is needed (backtracking)
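
The navigate, retrieve, and verify steps can be sketched as a depth-first walk with backtracking. This is a simplified illustration under stated assumptions, not the library's retrieval pipeline: the `Pilot` trait, `KeywordPilot`, `navigate`, and `demo_path` are hypothetical names, the summaries are invented sample text, and a keyword-overlap stub stands in for the LLM so the example runs offline.

```rust
struct Node {
    title: String,
    summary: String,
    children: Vec<Node>,
}

fn node(title: &str, summary: &str, children: Vec<Node>) -> Node {
    Node { title: title.into(), summary: summary.into(), children }
}

/// Decides where to go next. A real pilot would call an LLM here.
trait Pilot {
    /// Child indices ordered from most to least promising.
    fn rank(&self, query: &str, children: &[Node]) -> Vec<usize>;
    /// Whether a reached leaf section answers the query.
    fn accepts(&self, query: &str, node: &Node) -> bool;
}

/// Stand-in pilot: counts query words (longer than 3 bytes) found in title + summary.
struct KeywordPilot;

fn overlap(query: &str, node: &Node) -> usize {
    let text = format!("{} {}", node.title, node.summary).to_lowercase();
    query
        .to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| w.len() > 3 && text.contains(*w))
        .count()
}

impl Pilot for KeywordPilot {
    fn rank(&self, query: &str, children: &[Node]) -> Vec<usize> {
        let mut order: Vec<usize> = (0..children.len()).collect();
        // Stable sort: ties keep document order.
        order.sort_by_key(|&i| std::cmp::Reverse(overlap(query, &children[i])));
        order
    }

    fn accepts(&self, query: &str, node: &Node) -> bool {
        overlap(query, node) > 0
    }
}

/// Depth-first walk guided by the pilot. On success, `path` holds the titles
/// from the root to the accepted leaf; dead ends are popped again (backtracking).
fn navigate(pilot: &dyn Pilot, query: &str, node: &Node, path: &mut Vec<String>) -> bool {
    path.push(node.title.clone());
    if node.children.is_empty() {
        if pilot.accepts(query, node) {
            return true;
        }
    } else {
        for i in pilot.rank(query, &node.children) {
            if navigate(pilot, query, &node.children[i], path) {
                return true;
            }
        }
    }
    path.pop();
    false
}

/// Run a query against a small invented manual and return the breadcrumb path.
fn demo_path(query: &str) -> Option<String> {
    let manual = node("Technical Manual", "", vec![
        node("Chapter 2: Device Setup", "Installing and configuring the device", vec![
            node("2.1 Installation", "Mounting and wiring", vec![]),
        ]),
        node("Chapter 4: Maintenance", "Cleaning and reset procedures", vec![
            node("4.1 Cleaning", "Wiping the housing", vec![]),
            node("4.2 Reset Procedure", "Hold the power button to reset the device", vec![]),
        ]),
    ]);
    let mut path = Vec::new();
    navigate(&KeywordPilot, query, &manual, &mut path).then(|| path.join(" > "))
}

fn main() {
    match demo_path("How do I reset the device?") {
        Some(path) => println!("Source: {path}"),
        None => println!("No matching section found"),
    }
}
```

With this sample data the walk first tries Chapter 2 (it ties with Chapter 4 on keyword overlap and comes earlier in the document), finds nothing acceptable there, backs out, and then reaches the reset section under Chapter 4.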

Traditional RAG vs Vectorless

Aspect	Traditional RAG	Vectorless
Infrastructure	Vector DB + Embedding Model	Just LLM API
Document Structure	Lost in chunking	Preserved
Context	Fragment only	Section + surrounding context
Setup Time	Hours to Days	Minutes
Best For	Unstructured text	Structured documents

Example

Input:

Document: 100-page technical manual (PDF)
Query: "How do I reset the device?"

Output:

Answer: "To reset the device, hold the power button for 10 seconds 
until the LED flashes blue, then release..."

Source: Chapter 4 > Section 4.2 > Reset Procedure

When to Use

✅ Good fit:

Technical documentation
Manuals and guides
Structured reports
Policy documents
Any document with clear hierarchy

❌ Not ideal:

Unstructured text (tweets, chat logs)
Very short documents (< 1 page)
Pure Q&A datasets without structure

Quick Start

Installation

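[dependencies]
vectorless = "0.1"
# The usage example below relies on #[tokio::main], which needs the Tokio runtime.
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }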

Configuration

cp vectorless.example.toml ./vectorless.toml

Usage

use vectorless::Engine;

#[tokio::main]
async fn main() -> vectorless::Result<()> {
    // Create client
    let client = Engine::builder()
        .with_workspace("./workspace")
        .build()?;

    // Index a document (PDF, Markdown, DOCX, HTML)
    let doc_id = client.index("./document.pdf").await?;

    // Query with natural language
    let result = client.query(&doc_id, "What are the system requirements?").await?;

    println!("Answer: {}", result.content);
    println!("Source: {}", result.path); // e.g., "Chapter 2 > Section 2.1"

    Ok(())
}

Features

Feature	Description
Zero Infrastructure	No vector DB, no embedding model — just an LLM API
Multi-format Support	PDF, Markdown, DOCX, HTML out of the box
Incremental Updates	Add/remove documents without full re-index
Traceable Results	See the exact navigation path taken
Feedback Learning	Improves from user feedback over time
Multi-turn Queries	Handles complex questions with decomposition

Architecture

Core Components

Index Pipeline — Parses documents, builds tree, generates summaries
Retrieval Pipeline — Analyzes query, navigates tree, returns results
Pilot — LLM-powered navigator that guides retrieval decisions
Metrics Hub — Unified observability for LLM calls, retrieval, and feedback
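
To give a feel for what a unified metrics component could look like, here is a minimal sketch built on atomic counters. Everything in it is an assumption for illustration: the `MetricsHub` and `Snapshot` types, their fields, and the method names are hypothetical and do not describe the crate's actual observability API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Shared counters for LLM calls, retrievals, and feedback (hypothetical design).
#[derive(Default)]
struct MetricsHub {
    llm_calls: AtomicU64,
    llm_tokens: AtomicU64,
    retrievals: AtomicU64,
    positive_feedback: AtomicU64,
    negative_feedback: AtomicU64,
}

/// A point-in-time copy of the counters.
#[derive(Debug, PartialEq)]
struct Snapshot {
    llm_calls: u64,
    llm_tokens: u64,
    retrievals: u64,
    /// Positive minus negative feedback events.
    feedback_score: i64,
}

impl MetricsHub {
    fn record_llm_call(&self, tokens: u64) {
        self.llm_calls.fetch_add(1, Ordering::Relaxed);
        self.llm_tokens.fetch_add(tokens, Ordering::Relaxed);
    }

    fn record_retrieval(&self) {
        self.retrievals.fetch_add(1, Ordering::Relaxed);
    }

    fn record_feedback(&self, helpful: bool) {
        let counter = if helpful {
            &self.positive_feedback
        } else {
            &self.negative_feedback
        };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    fn snapshot(&self) -> Snapshot {
        let pos = self.positive_feedback.load(Ordering::Relaxed) as i64;
        let neg = self.negative_feedback.load(Ordering::Relaxed) as i64;
        Snapshot {
            llm_calls: self.llm_calls.load(Ordering::Relaxed),
            llm_tokens: self.llm_tokens.load(Ordering::Relaxed),
            retrievals: self.retrievals.load(Ordering::Relaxed),
            feedback_score: pos - neg,
        }
    }
}

fn main() {
    let hub = MetricsHub::default();
    hub.record_llm_call(120);
    hub.record_retrieval();
    hub.record_feedback(true);
    println!("{:?}", hub.snapshot());
}
```

Atomic counters let the index and retrieval pipelines report through a shared reference without taking a lock, which is one plausible reason to centralize metrics in a single hub.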

Examples

See the examples/ directory.

Contributing

Contributions welcome! If you find this useful, please ⭐ the repo — it helps others discover it.

License

Apache License 2.0

