Data sources: ZENODO
MonaVec: A Training-Free Embedded Vector Search Kernel for Edge and Offline AI Systems

Authors: Yenen, Oğuzhan; mona;

Abstract

We present MonaVec, a training-free, deterministic embedded vector-search kernel for edge and offline AI. It delivers high recall at a 4-bit memory footprint (8x smaller than float32), reproduces the same top-K results on any device (byte-identical within a build), and is exposed through a CLI, a REST API, and a web UI; in short, it aims to be the SQLite of vector search. MonaVec combines a data-oblivious quantization pipeline (Randomized Hadamard Transform followed by Lloyd-Max scalar quantization) with three index backends (BruteForce, IvfFlat, HNSW), SIMD-accelerated scoring (AVX-512, AVX2, NEON), and a service layer with hybrid sparse-dense retrieval (BM25 + dense) and pluggable identity-based multi-tenancy. It requires zero training data, runs offline, and persists as a single .mvec file; it is written in pure Rust with Python bindings. On AG News (45K x 1024-dim, BGE-M3, cosine), 4-bit BruteForce reaches 0.960 Recall@10 in 27 MB and 4-bit HNSW reaches 0.954, leading float32 FAISS-IVF and 8-bit usearch on recall while trading peak throughput for byte-identical determinism. On glove-100 (1.18M x 100-dim), BruteForce (0.865) tops every graph index evaluated. On fashion-mnist (60K x 784-dim, L2), global standardization improves BruteForce Recall@10 from 0.41 to 0.62. We additionally validate portable determinism on aarch64 hardware.
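
To make the quantization pipeline named in the abstract more concrete, the following is a minimal Python sketch of a seeded Randomized Hadamard Transform followed by 4-bit Lloyd-Max scalar quantization. It is an illustration under stated assumptions, not MonaVec's actual implementation: every function name (`fwht`, `random_signs`, `rht`, `lloyd_max_gaussian`, `encode`, `decode`) is hypothetical, the codebook is assumed to be fitted to a standard Gaussian so that no training data is needed, and the vector length is assumed to be a power of two (other dimensions would need padding, which is not shown).

```python
import bisect
import math
import random


def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform; len(x) must be a power of two."""
    x = list(x)
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in x]


def random_signs(n, seed):
    """Seeded +/-1 diagonal; the same seed always yields the same signs."""
    rng = random.Random(seed)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(n)]


def rht(x, signs):
    """Randomized Hadamard Transform: flip signs, then apply the Hadamard matrix."""
    return fwht([v * s for v, s in zip(x, signs)])


def _pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def _cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def lloyd_max_gaussian(levels=16, iters=200):
    """Lloyd-Max centroids for N(0, 1), computed in closed form (no data needed)."""
    c = [-2.5 + 5.0 * (i + 0.5) / levels for i in range(levels)]
    for _ in range(iters):
        mids = [(c[i] + c[i + 1]) / 2.0 for i in range(levels - 1)]
        edges = [-math.inf] + mids + [math.inf]
        # Centroid of N(0, 1) restricted to each cell [edges[i], edges[i + 1]].
        c = [(_pdf(edges[i]) - _pdf(edges[i + 1])) / (_cdf(edges[i + 1]) - _cdf(edges[i]))
             for i in range(levels)]
    return c


def encode(x, signs, codebook):
    """Return (norm, packed) with two 4-bit codes stored per byte."""
    d = len(x)
    norm = math.sqrt(sum(v * v for v in x))
    # Rescale so rotated coordinates are roughly N(0, 1).
    scale = math.sqrt(d) / norm if norm > 0 else 0.0
    thresholds = [(codebook[i] + codebook[i + 1]) / 2.0 for i in range(len(codebook) - 1)]
    codes = [bisect.bisect_left(thresholds, v * scale) for v in rht(x, signs)]
    packed = bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, d, 2))
    return norm, packed


def decode(norm, packed, signs, codebook):
    """Approximate reconstruction: look up centroids, then invert the rotation."""
    codes = []
    for byte in packed:
        codes.extend((byte & 0x0F, byte >> 4))
    d = len(codes)
    y = [codebook[c] * norm / math.sqrt(d) for c in codes]
    # The orthonormal Hadamard matrix is its own inverse; the sign flip undoes itself.
    return [v * s for v, s in zip(fwht(y), signs)]
```

In this sketch a 1024-dimensional vector packs into 512 bytes plus one stored norm, against 4096 bytes for float32, which is where the roughly 8x reduction comes from. Because the signs are seeded and the codebook depends only on the assumed Gaussian, encoding the same vector twice yields the same bytes; whether that carries across platforms depends on floating-point details this sketch does not address.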
