
Combinatorially-Expressive Retrieval (CER) is a three-stage hybrid information retrieval system: BM25 for high-recall lexical matching, ColBERTv2 for late-interaction semantic reranking, and a cross-encoder for the final relevance judgment. The stage scores are combined by monotonic linear score fusion, which preserves any ordering on which the stages agree. The paper argues that this design sidesteps the rank and sign-rank limits that cap single-vector dense retrievers, giving effectively unbounded ranking capacity in theory and robust performance in practice. On the LIMIT benchmark, where strong dense models collapse (~10–15% Recall@10), CER reaches 97.4% Recall@100 and 96.4% Recall@2, and an optimized setup runs at ~0.37 s/query on a single Apple M4 Max, suggesting that high accuracy does not require heavy infrastructure. The approach reframes retrieval: architectural hybridity, rather than ever-larger embeddings, is the key to combinatorial queries and next-generation RAG.
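To make the fusion step concrete, here is a minimal sketch of monotonic linear score fusion over three stages. It is not the paper's implementation: the min-max normalization, the weights, the function names (`min_max`, `fuse`, `rank`), and the toy scores are all assumptions chosen for illustration. The only property taken from the abstract is that the combination is linear and monotonic, so a document that every stage prefers stays ahead after fusion.

```python
from typing import Dict, List

Scores = Dict[str, float]  # document id -> raw score from one stage


def min_max(scores: Scores) -> Scores:
    """Rescale one stage's scores to [0, 1]; this keeps the stage's ordering."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # A stage that scores every document identically carries no signal.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(stage_scores: List[Scores], weights: List[float]) -> Scores:
    """Weighted sum of normalized stage scores (weights are an assumption).

    Strictly positive weights make the fused score increasing in every
    stage score, so an ordering shared by all stages is preserved.
    """
    if len(stage_scores) != len(weights):
        raise ValueError("need exactly one weight per stage")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")
    normalized = [min_max(s) for s in stage_scores]
    # Only documents scored by every stage can be fused.
    docs = set.intersection(*(set(s) for s in stage_scores))
    return {
        doc: sum(w * n[doc] for w, n in zip(weights, normalized))
        for doc in docs
    }


def rank(fused: Scores) -> List[str]:
    """Document ids sorted from highest to lowest fused score."""
    return sorted(fused, key=fused.get, reverse=True)


# Toy scores (hypothetical): every stage puts d1 first, but the stages
# disagree about d2 versus d3.
bm25 = {"d1": 12.0, "d2": 9.5, "d3": 3.0}
colbert = {"d1": 0.82, "d2": 0.61, "d3": 0.70}
cross_encoder = {"d1": 4.1, "d2": 1.3, "d3": 2.2}

fused = fuse([bm25, colbert, cross_encoder], weights=[0.2, 0.3, 0.5])
ranking = rank(fused)
print(ranking)
```

With these toy numbers d1 stays first because all three stages agree on it, while the d2/d3 disagreement is settled by the weights, which here favor the two neural stages. In a real cascade each stage would score only the candidates passed on by the previous one, which is why the sketch fuses over the documents common to all stages.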
