vectorlessflow/vectorless

Vectorless

⚠️ Early Development: API may change.

What is Vectorless?

Vectorless is a Rust library for querying structured documents using natural language, without vector databases or embedding models.

Instead of chunking documents into vectors, Vectorless preserves the document's tree structure and navigates it with a hybrid of algorithm and LLM, the way a human reads a table of contents:

  • Algorithm handles "how to walk": BM25 scoring, tree traversal (fast, deterministic)
  • Pilot (LLM) handles "where to go": semantic understanding, ambiguity resolution
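
To make the algorithmic half concrete, here is a minimal sketch of Okapi BM25 scoring over section texts. It is an illustration under assumptions, not Vectorless's internal code: `tokenize` and `bm25_scores` are hypothetical names, and the tokenizer is deliberately naive (lowercase, split on non-alphanumeric characters).

```rust
use std::collections::HashMap;

/// Lowercase a text and split it into alphanumeric tokens.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Score every document against the query with Okapi BM25.
/// `k1` and `b` are the usual free parameters (1.2 and 0.75 are common defaults).
/// Hypothetical helper for illustration, not part of the Vectorless API.
fn bm25_scores(docs: &[&str], query: &str, k1: f64, b: f64) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    if tokenized.is_empty() {
        return Vec::new();
    }
    let n = tokenized.len() as f64;
    let avg_len = tokenized.iter().map(|d| d.len()).sum::<usize>() as f64 / n;

    // Document frequency: in how many documents each term appears.
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let mut seen: Vec<&str> = doc.iter().map(|s| s.as_str()).collect();
        seen.sort_unstable();
        seen.dedup();
        for term in seen {
            *df.entry(term).or_insert(0) += 1;
        }
    }

    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            let len = doc.len() as f64;
            query_terms
                .iter()
                .map(|term| {
                    let tf = doc.iter().filter(|t| *t == term).count() as f64;
                    if tf == 0.0 {
                        return 0.0; // term absent: contributes nothing
                    }
                    let n_t = *df.get(term.as_str()).unwrap_or(&0) as f64;
                    // Smoothed IDF variant that never goes negative.
                    let idf = ((n - n_t + 0.5) / (n_t + 0.5) + 1.0).ln();
                    idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * len / avg_len))
                })
                .sum::<f64>()
        })
        .collect()
}
```

Calling `bm25_scores(&sections, "reset device", 1.2, 0.75)` returns one score per section, in input order. A section that shares no term with the query scores exactly 0, which is what lets the deterministic pass prune branches before the LLM is consulted.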

Analogy: Traditional RAG is like searching every word in a book. Vectorless is like reading the table of contents, then going to the right chapter.

How It Works

1. Index: Build a Navigable Tree

Technical Manual (root)
├── Chapter 1: Introduction
├── Chapter 2: Architecture
│   ├── 2.1 System Design
│   └── 2.2 Implementation
└── Chapter 3: API Reference

Each node gets an AI-generated summary, so the navigator can judge a section's relevance without reading its full text.
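
One way to picture the indexing step is as folding a flat list of headings into a tree. The sketch below assumes the parser has already produced `(level, title)` pairs in document order. `Node`, `new_node`, `build_tree`, and `outline` are hypothetical names for illustration, not the Vectorless API, and the `summary` field is left empty where a real indexer would call an LLM.

```rust
/// One section of a document: a heading, an optional summary, and subsections.
#[derive(Debug)]
struct Node {
    title: String,
    summary: Option<String>, // would be filled in by an LLM during indexing
    children: Vec<Node>,
}

fn new_node(title: &str) -> Node {
    Node { title: title.to_string(), summary: None, children: Vec::new() }
}

/// Build a tree from headings given in document order as (level, title),
/// where level 1 is a chapter, level 2 a section, and so on.
fn build_tree(root_title: &str, headings: &[(usize, &str)]) -> Node {
    // Stack of currently open sections with their levels; the root is level 0.
    let mut stack: Vec<(usize, Node)> = vec![(0, new_node(root_title))];
    for &(level, title) in headings {
        // Close every open section at the same depth or deeper.
        while stack.len() > 1 && stack.last().unwrap().0 >= level {
            let (_, done) = stack.pop().unwrap();
            stack.last_mut().unwrap().1.children.push(done);
        }
        stack.push((level, new_node(title)));
    }
    // Fold the sections still open back into their parents.
    while stack.len() > 1 {
        let (_, done) = stack.pop().unwrap();
        stack.last_mut().unwrap().1.children.push(done);
    }
    stack.pop().unwrap().1
}

/// Render the tree as an indented outline, one title per line.
fn outline(node: &Node, depth: usize, out: &mut String) {
    out.push_str(&"  ".repeat(depth));
    out.push_str(&node.title);
    if let Some(summary) = &node.summary {
        out.push_str(" - ");
        out.push_str(summary);
    }
    out.push('\n');
    for child in &node.children {
        outline(child, depth + 1, out);
    }
}
```

Feeding it the headings of the manual above (three level-1 chapters, with two level-2 sections under Chapter 2) yields a root with three children, the second of which has two children of its own. The stack keeps construction to a single pass over the headings.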

2. Query: Navigate with LLM

When you ask "How do I reset the device?":

  1. Analyze: Understand query intent and complexity
  2. Navigate: LLM guides tree traversal (like reading a TOC)
  3. Retrieve: Return the exact section with context
  4. Verify: Check if more information is needed (backtracking)
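
The steps above can be sketched as a pilot-guided depth-first search with backtracking. Everything in this block is an assumption for illustration: the `Pilot` trait, `KeywordPilot`, and `navigate` are hypothetical names rather than the Vectorless API, and simple keyword overlap stands in for the LLM's judgment.

```rust
/// A section in the indexed tree. `content` holds the section's own text.
struct Node {
    title: String,
    summary: String,
    content: String,
    children: Vec<Node>,
}

fn node(title: &str, summary: &str, content: &str, children: Vec<Node>) -> Node {
    Node { title: title.into(), summary: summary.into(), content: content.into(), children }
}

/// The "where to go" decision maker (hypothetical; an LLM would sit behind it).
trait Pilot {
    /// Rank a node's children, most promising first.
    fn rank(&self, query: &str, children: &[Node]) -> Vec<usize>;
    /// Decide whether a section actually answers the query.
    fn answers(&self, query: &str, node: &Node) -> bool;
}

/// Count query words longer than three characters that occur in `text`.
fn overlap(query: &str, text: &str) -> usize {
    let query = query.to_lowercase();
    let text = text.to_lowercase();
    query
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| w.len() > 3 && text.contains(*w))
        .count()
}

/// Toy pilot: keyword overlap instead of semantic understanding.
struct KeywordPilot;

impl Pilot for KeywordPilot {
    fn rank(&self, query: &str, children: &[Node]) -> Vec<usize> {
        let mut order: Vec<usize> = (0..children.len()).collect();
        // Highest overlap first; the sort is stable, so ties keep document order.
        order.sort_by_key(|&i| {
            let c = &children[i];
            std::cmp::Reverse(overlap(query, &c.title) + overlap(query, &c.summary))
        });
        order
    }

    fn answers(&self, query: &str, node: &Node) -> bool {
        overlap(query, &node.content) > 0
    }
}

/// Descend from `node`, trying children in the pilot's order. Returns the first
/// leaf the pilot accepts; `path` then holds the titles leading to it.
fn navigate<'a>(
    pilot: &dyn Pilot,
    query: &str,
    node: &'a Node,
    path: &mut Vec<&'a str>,
) -> Option<&'a Node> {
    path.push(node.title.as_str());
    if node.children.is_empty() {
        // Verify: only accept a leaf the pilot says answers the query.
        if pilot.answers(query, node) {
            return Some(node);
        }
    } else {
        for i in pilot.rank(query, &node.children) {
            if let Some(hit) = navigate(pilot, query, &node.children[i], path) {
                return Some(hit);
            }
        }
    }
    path.pop(); // Backtrack: nothing under this node answered the query.
    None
}
```

Joining `path` with `" > "` after a successful call gives a source trail in the same shape as the `Source:` line in the Example section. The toy pilot can be misled by a chapter whose title merely mentions a query word; in that case the search descends, finds no answering leaf, pops the chapter off the path, and tries the next-ranked sibling.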

Traditional RAG vs Vectorless

Aspect             | Traditional RAG             | Vectorless
-------------------|-----------------------------|------------------------------
Infrastructure     | Vector DB + Embedding Model | Just LLM API
Document Structure | Lost in chunking            | Preserved
Context            | Fragment only               | Section + surrounding context
Setup Time         | Hours to Days               | Minutes
Best For           | Unstructured text           | Structured documents

Example

Input:

Document: 100-page technical manual (PDF)
Query: "How do I reset the device?"

Output:

Answer: "To reset the device, hold the power button for 10 seconds 
until the LED flashes blue, then release..."

Source: Chapter 4 > Section 4.2 > Reset Procedure

When to Use

✅ Good fit:

  • Technical documentation
  • Manuals and guides
  • Structured reports
  • Policy documents
  • Any document with clear hierarchy

❌ Not ideal:

  • Unstructured text (tweets, chat logs)
  • Very short documents (< 1 page)
  • Pure Q&A datasets without structure

Quick Start

Installation

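[dependencies]
vectorless = "0.1"
# The usage example below relies on the Tokio runtime for #[tokio::main].
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }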

Configuration

cp vectorless.example.toml ./vectorless.toml

Usage

use vectorless::Engine;

#[tokio::main]
async fn main() -> vectorless::Result<()> {
    // Create client
    let client = Engine::builder()
        .with_workspace("./workspace")
        .build()?;

    // Index a document (PDF, Markdown, DOCX, HTML)
    let doc_id = client.index("./document.pdf").await?;

    // Query with natural language
    let result = client.query(&doc_id, "What are the system requirements?").await?;

    println!("Answer: {}", result.content);
    println!("Source: {}", result.path); // e.g., "Chapter 2 > Section 2.1"

    Ok(())
}

Features

Feature              | Description
---------------------|--------------------------------------------------
Zero Infrastructure  | No vector DB, no embedding model; just an LLM API
Multi-format Support | PDF, Markdown, DOCX, HTML out of the box
Incremental Updates  | Add/remove documents without full re-index
Traceable Results    | See the exact navigation path taken
Feedback Learning    | Improves from user feedback over time
Multi-turn Queries   | Handles complex questions with decomposition

Architecture

Core Components

  • Index Pipeline: Parses documents, builds tree, generates summaries
  • Retrieval Pipeline: Analyzes query, navigates tree, returns results
  • Pilot: LLM-powered navigator that guides retrieval decisions
  • Metrics Hub: Unified observability for LLM calls, retrieval, and feedback

Examples

See the examples/ directory.

Contributing

Contributions welcome! If you find this useful, please ⭐ the repo; it helps others discover it.

License

Apache License 2.0