Rawq: Open-Source Semantic Code Search Engine Cuts AI Agent Token Waste by 4x
Key Takeaways
- Reduces irrelevant file retrieval by ~90%: a search returns 5-10 relevant code chunks instead of the 50+ files an AI agent would otherwise read
- Fully offline hybrid search combining semantic embeddings (ONNX) with lexical BM25, optimized for codebases of 10k+ files
- Cross-platform support with automatic GPU acceleration (DirectML/CUDA/CoreML), a daemon mode that keeps the model loaded between queries, and agent-native output (JSON, streaming, token budgets)
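To make the "token budgets" idea concrete, here is a minimal sketch of how ranked chunks could be trimmed to fit a budget. It is not rawq's implementation: the `Chunk` type, the `pack_within_budget` function, and the 4-characters-per-token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    path: str
    text: str
    score: float  # relevance score from the search stage (higher is better)


def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return max(1, len(text) // 4)


def pack_within_budget(chunks: list[Chunk], budget: int) -> list[Chunk]:
    """Greedily keep the highest-scoring chunks that still fit in the budget."""
    selected = []
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        cost = estimate_tokens(chunk.text)
        if used + cost <= budget:
            selected.append(chunk)
            used += cost
    return selected
```

A greedy pass like this is simple and predictable, but it can skip a large high-scoring chunk in favor of smaller, lower-scoring ones; a real tool may choose a different trade-off.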
Summary
Rawq, a new open-source semantic code search tool, addresses a critical inefficiency in AI agent development: excessive token consumption from reading irrelevant code files. Built as a single Rust binary that runs offline, rawq uses hybrid semantic and lexical search to pinpoint only the relevant code chunks in a large codebase, so an agent reads 5-10 targeted chunks rather than 50+ files. Fewer tokens read means lower inference costs and faster agent responses.
The tool combines ONNX-based semantic embeddings with BM25 lexical search, supporting 16 programming languages through tree-sitter AST parsing. Rawq is fully offline after an initial model download, includes GPU acceleration (DirectML, CUDA, CoreML), and features agent-friendly output formats including JSON and NDJSON streaming. Installation takes a single curl command on macOS/Linux or a PowerShell command on Windows, with Cargo installation as an option for developers who already have the Rust toolchain.
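The article does not say how rawq merges its semantic and lexical results. One common way to combine two ranked lists is reciprocal rank fusion (RRF), sketched below as an illustration rather than as rawq's actual method. The function name, the file names, and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for one query from the two search stages.
semantic_hits = ["auth.rs", "session.rs", "db.rs"]   # embedding similarity
lexical_hits = ["auth.rs", "util.rs", "session.rs"]  # BM25 keyword match

fused = reciprocal_rank_fusion([semantic_hits, lexical_hits])
```

RRF works on ranks alone, which sidesteps the problem that BM25 scores and embedding similarities live on different scales.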
- Open-source Rust implementation with incremental indexing, git-aware change detection, and support for 16 languages plus a universal fallback for any other text file type
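Incremental indexing generally means re-processing only the files that changed since the last run. The sketch below shows the idea using content hashes; it is an assumption-laden illustration, not rawq's git-aware mechanism, and `plan_reindex` and its arguments are hypothetical names.

```python
import hashlib


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def plan_reindex(previous: dict[str, str], current_files: dict[str, bytes]):
    """Compare stored hashes with current file contents.

    Returns (to_index, to_remove, new_state): paths that are new or changed,
    paths that no longer exist, and the hash table to store for the next run.
    """
    new_state = {path: content_hash(data) for path, data in current_files.items()}
    to_index = sorted(p for p, h in new_state.items() if previous.get(p) != h)
    to_remove = sorted(p for p in previous if p not in new_state)
    return to_index, to_remove, new_state
```

Unchanged files are skipped entirely, which is what keeps re-indexing cheap on a large repository.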
Editorial Opinion
Rawq addresses a real pain point in AI agent engineering: inefficient context retrieval that bloats token counts and slows inference. By combining semantic and lexical search with agent-friendly output formats, it is a practical tool for developers building production AI systems. The fully offline design and open-source approach lower barriers to adoption, though its impact will ultimately depend on integration with popular AI frameworks and agent platforms.


