AI Retrieval Startup SID Accuses Vector Database Chroma of Plagiarizing Research

According to monitoring by 1M AI News, Max Rumpf, CEO of AI retrieval research company SID.ai, has publicly accused the open-source vector database Chroma of borrowing heavily from SID’s SID-1 research, published in December 2025. In a lengthy post on X, Rumpf said that Chroma’s newly released Context-1 model offers no citation or acknowledgment of that work.

As evidence, Rumpf shared email correspondence with Chroma CEO Jeff Huber. In October 2025, Huber asked, unprompted, what model Rumpf was training. Rumpf replied that he was working on an ‘agent retrieval model, similar to Cognition’s SWE-grep but for general retrieval, which is already stronger than Sonnet 4.5 and Gemini 2.5 Pro.’ After the SID-1 technical report was released in December 2025, Rumpf sent Huber the link, and Huber responded with ‘Congratulations.’ Both companies are YC alumni and have offices next to each other.

SID-1 and Context-1 are both agentic retrieval models trained with reinforcement learning and positioned as retrieval sub-agents for frontier reasoning models. Both are trained on synthetic data, and both claim to be Pareto-optimal on cost and latency. The specific similarities Rumpf listed include a Figure 1 with the same toggle between speed and cost views, four-way parallel inference with results aggregated by RRF (Reciprocal Rank Fusion), and the overall structure of the charts, datasets, and methodology.

The Context-1 technical report cites related work such as WebExplorer, SWE-grep, and Search-R1, but it does not mention SID-1 in the text or include SID-1 in its benchmark comparisons. Rumpf stated that Chroma ‘knowingly claimed Pareto optimality while another model exists.’ He also pointed out that although Context-1’s weights are open, the inference framework needed to run the model has not yet been released, which prevents SID from benchmarking it.

Rumpf said the practice ‘completely destroys the motivation for us (and others) to share in-depth in technical reports’ and called it ‘a regrettable bad research practice in academia that is spreading to startups.’ As of publication, Chroma had not publicly responded.
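
For readers unfamiliar with the Reciprocal Rank Fusion step named among the similarities, the sketch below shows the general technique in Python: each document receives a score of 1/(k + rank) from every ranked list it appears in, and the scores are summed. It is a generic illustration only, not code from either SID-1 or Context-1. The function name, the document IDs, the four example lists, and the constant k = 60 (a commonly used default) are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Generic RRF: score(d) = sum over lists of 1 / (k + rank of d in that list),
    with ranks starting at 1. Documents missing from a list contribute nothing.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical output of four parallel retrieval runs (illustrative data only).
rollouts = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
    ["doc_a", "doc_b", "doc_e"],
]

fused = reciprocal_rank_fusion(rollouts)
print(fused)
```

Because RRF uses only rank positions, it can combine lists whose underlying relevance scores are not comparable, which is one reason it is a common choice for merging the results of several independent runs.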
