GandalFran/contextomizer

Contextomizer πŸ—œοΈβœ¨


Contextomizer is an ultra-fast, deterministic library for transforming bloated tool outputs, raw API responses, documents, and messy logs into perfectly optimized context for AI Agents. 🤖🚀

If you are building an AI agent, you know the struggle: tools return massive JSONs, error traces are hundreds of lines long, and HTML pages blow up your token budget instantly. Worst of all, you might be leaking API keys in the prompt! 😱

Contextomizer sits between your tools (like MCP servers, OpenAI functions, or Vercel AI SDK tools) and the LLM. It automatically:

  • 📉 Reduces tokens deterministically without extra LLM calls!
  • 🧹 Removes noise (HTML tags, generic log info).
  • 🔐 Redacts secrets securely before they hit the model.
  • 🧠 Preserves useful information intelligently (errors, structural bounds).
  • 🧩 Integrates seamlessly with AI frameworks!

🎭 Before & After (Example)

Input (Huge messy result with Secrets & Noise):

```json
{
  "user": "Alice",
  "apiKey": "sk-live-123456789",
  "logs": "INFO starting...\nINFO loading x...\nERROR Connection failed at db.js:42\nINFO retry..."
}
```

Output (Contextomized for LLM):

```json
{"user":"Alice","apiKey":"[REDACTED]","logs":"ERROR Connection failed at db.js:42\n...[LOGS TRUNCATED]"}
```

(Token cost reduced by 70%. Secrets secured. Noise removed. The AI gets exactly what it needs!)
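The transformation above can be pictured as a plain deterministic pass. The sketch below is an illustrative re-implementation, not Contextomizer's actual pipeline; the secret pattern and truncation marker are assumptions taken from the example:

```typescript
// Illustrative sketch of a deterministic redact-and-filter pass.
// The pattern and marker below are assumptions based on the example above.
const SECRET_PATTERN = /\bsk-live-[A-Za-z0-9]+\b/g;

function redactSecrets(value: string): string {
  return value.replace(SECRET_PATTERN, "[REDACTED]");
}

function keepErrorLines(logs: string): string {
  // Keep only ERROR lines; mark the dropped INFO spam with a placeholder.
  const kept = logs.split("\n").filter((line) => line.startsWith("ERROR"));
  return `${kept.join("\n")}\n...[LOGS TRUNCATED]`;
}

const input = {
  user: "Alice",
  apiKey: "sk-live-123456789",
  logs: "INFO starting...\nINFO loading x...\nERROR Connection failed at db.js:42\nINFO retry...",
};

const contextomized = {
  user: input.user,
  apiKey: redactSecrets(input.apiKey),
  logs: keepErrorLines(input.logs),
};

console.log(JSON.stringify(contextomized));
```

Because the pass is pure string manipulation with no LLM in the loop, the same input always produces the same output.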


📦 Installation

```shell
npm install contextomizer
```

🚀 Basic Usage

The core of the library is the contextomize function. Just pass it your raw data, set your constraints, and let it do the magic! ✨

```typescript
import { contextomize } from 'contextomizer';

const data = {
  veryImportantField: "Keep this, it's vital!",
  hugeArray: Array.from({length: 1000}).map((_, i) => ({ id: i, data: "bloat" })),
  secretToken: "Bearer sk-live-abc123def456.789"
};

const result = await contextomize(data, {
  maxTokens: 50, // Keep it tight!
  enableRedaction: true, // Hide those secrets!
  dropKeys: ['hugeArray'] // We don't need this bulk
});

console.log(result.forModel);
// 👉 Output is a clean, redacted string that fits perfectly in your prompt!

console.log(`Saved tokens: ${result.meta.estimatedSavedTokens} 💪`);
```
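To get a feel for where savings numbers like this come from, a common deterministic heuristic is roughly four characters per token for English text. This is an assumption for illustration; the library's internal estimator may differ:

```typescript
// Rough token estimate: assume ~4 characters per token (a common
// heuristic for English text; the library's estimator may differ).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const before = JSON.stringify({ logs: "INFO noise ".repeat(40) });
const after = '{"logs":"...[LOGS TRUNCATED]"}';
const estimatedSavedTokens = estimateTokens(before) - estimateTokens(after);
console.log(`Saved tokens: ${estimatedSavedTokens}`);
```

A character-based estimate keeps the pipeline deterministic and dependency-free, at the cost of being approximate compared to a real tokenizer.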

🔌 Advanced Integrations

Contextomizer shines when you plug it straight into your agent workflows! We provide ready-to-use adapters for the most popular ecosystems. 🌍

1. Model Context Protocol (MCP) Server Integration 🖥️

If you are building an MCP Server, your tools return a specific CallToolResult format. Contextomizer has an adapter that wraps your output into the exact format that MCP clients (like Claude Desktop) expect, while applying token budgets!

```typescript
import { MCPAdapter } from 'contextomizer/adapters/mcp';

const adapter = new MCPAdapter();

// Inside your MCP Server tool handler
// (`server` and `CallToolRequestSchema` come from the MCP SDK):
server.setRequestHandler(CallToolRequestSchema, async (request) => {
    try {
        const rawResult = await runMyHeavyDatabaseQuery(request.params.arguments);

        // Contextomizer formats it perfectly for MCP!
        return await adapter.decorateCallToolResult(rawResult, {
            maxTokens: 4000,
            enableRedaction: true
        });
    } catch (error) {
        // Formats errors beautifully too!
        return await adapter.decorateCallToolResult(error);
    }
});
```

2. Vercel AI SDK 🚀

Wrap your tool definitions effortlessly so the Vercel AI SDK Agent only receives context-optimized results.

```typescript
import { AISDKToolAdapter } from 'contextomizer/adapters/ai-sdk';
import { tool } from 'ai';
import { z } from 'zod';

const adapter = new AISDKToolAdapter();

const myHeavyTool = tool({
  description: 'Fetches huge system logs',
  parameters: z.object({ target: z.string() }),
  execute: async ({ target }) => {
    const hugeLogData = await fetchLogs(target);
    return hugeLogData; // Normally, this would crash your context window!
  }
});

// Wrap it!
export const optimizedTool = adapter.wrapTool(myHeavyTool, {
  maxTokens: 1000, // Now it will automatically truncate logs!
});
```

3. OpenAI Function Calling 🤖

If you are using the raw OpenAI SDK, you can wrap your function call results before appending them to the message history.

```typescript
import { OpenAIToolAdapter } from 'contextomizer/adapters/openai';

const adapter = new OpenAIToolAdapter();

const rawResult = await executeRawFunction(toolCall);
const safeString = await adapter.wrapToolResult(rawResult, { maxTokens: 500 });

messages.push({
    role: "tool",
    tool_call_id: toolCall.id,
    content: safeString
});
```

πŸ›‘οΈ Content Detection & Reducers

Contextomizer automatically detects what you throw at it and applies the reduction strategy best suited to each content type! 🧬

  • 📄 JSON: Drops keys, truncates deep nesting, prioritizes defined paths.
  • 🌐 HTML: Strips <script>, <style>, and <svg>, keeping only readable semantic text.
  • 📋 Logs: Keeps ERROR and FATAL lines, truncates generic INFO spam when over budget!
  • 🚨 Error Traces: Preserves the core error message and root cause, shedding useless stack frame bloat.
  • 📝 Plain Text: Intelligent token-aware string truncation.

🧠 Model Assist (Optional AI Overdrive)

While Contextomizer is proudly deterministic and pure by default, sometimes you really need to compress a 50,000-word document into 500 tokens without losing the semantic meaning.

For this, you can plug in any LLM via the Model Assist Provider! 🎩✨

Contextomizer ships with built-in, zero-dependency providers for OpenAI and Anthropic (Claude) that use native fetch under the hood.

```typescript
import { contextomize } from 'contextomizer';
import { OpenAIProvider, AnthropicProvider } from 'contextomizer/model-assist';

// Using OpenAI
const result = await contextomize(hugeDocument, {
  maxTokens: 500,
  enableModelAssist: true,
  modelAssistProvider: new OpenAIProvider({
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini' // Optional, defaults to gpt-4o-mini
  })
});

// Or using Anthropic (Claude)
const claudeResult = await contextomize(hugeDocument, {
  maxTokens: 500,
  enableModelAssist: true,
  modelAssistProvider: new AnthropicProvider({
    apiKey: process.env.ANTHROPIC_API_KEY,
    model: 'claude-3-5-haiku-latest' // Optional
  })
});
```

Example: Implementing a Simple Provider

The core library provides the abstract interface for ModelAssistProvider. You inject your preferred LLM client!

Here is how easily you can build your own ModelAssistProvider to call your company's internal model API, for example:

```typescript
import { AbstractModelAssistProvider, ModelAssistInput, ModelAssistOutput } from 'contextomizer';

export class MyInternalModelProvider extends AbstractModelAssistProvider {
  async summarize(input: ModelAssistInput): Promise<ModelAssistOutput> {
    // Call your own internal API or any other open-source model endpoint
    const response = await fetch('https://api.mycompany.internal/v1/summarize', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        text: input.text,
        maxTokens: input.targetTokens,
        contextType: input.detectedType
      })
    });

    const data = await response.json();
    return {
      text: data.summary,
      estimatedTokens: data.usedTokens
    };
  }
}
```

πŸ› οΈ Configuration Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `maxTokens` | `number` | `undefined` | The strict token budget for the result. |
| `enableRedaction` | `boolean` | `false` | Scans and masks secrets (API keys, tokens, etc.). |
| `dropKeys` | `string[]` | `[]` | JSON keys to blindly drop during reduction. |
| `keepKeys` | `string[]` | `[]` | JSON keys to forcefully keep at all costs. |
| `logger` | `ILogger` | `console` | Inject a custom logger to trace the inner reduction pipeline! |
| `enableModelAssist` | `boolean` | `false` | Falls back to an LLM provider if deterministic reduction fails. |
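The interaction between dropKeys and keepKeys can be pictured as a simple key filter over a flat object. This is a hypothetical sketch that assumes keepKeys wins when both lists name the same key; the library may resolve conflicts differently:

```typescript
// Hypothetical sketch of dropKeys/keepKeys semantics on a flat object.
// Assumption: keepKeys overrides dropKeys on conflict.
function pruneKeys(
  obj: Record<string, unknown>,
  dropKeys: string[],
  keepKeys: string[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    // A key survives if it is explicitly kept, or simply not dropped.
    if (keepKeys.includes(key) || !dropKeys.includes(key)) {
      out[key] = value;
    }
  }
  return out;
}
```

For example, `pruneKeys({ a: 1, b: 2, c: 3 }, ['b', 'c'], ['c'])` would keep `a` (never dropped) and `c` (explicitly kept) while removing `b`.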

🤝 Contributing

We love contributions! Feel free to open issues or PRs. Make sure you run tests:

```shell
npm run test
npm run lint
```

📜 License

MIT License. See LICENSE for more details. Build safely! 🏰✨
