Ai4 2025: Tengyu Ma outlines the future of retrieval-augmented generation for enterprise AI

by CIJ News iDesk III

2025-08-12 04:41

The afternoon session of Ai4’s opening day turned to the technical frontier of enterprise AI, as Tengyu Ma, Chief AI Scientist at MongoDB and Assistant Professor of Computer Science and Statistics at Stanford University, delivered a keynote titled “RAG in 2025: State of the Art and the Road Forward.”

Ma began by framing the problem: while large language models (LLMs) and agentic systems have driven much of the recent AI wave, their effectiveness in enterprise settings is limited without access to proprietary data. “Off-the-shelf models are trained on public data,” he explained. “They’re brilliant, but they don’t know your data—and that’s where your competitive edge lies.”

He outlined three common approaches to integrating enterprise data with AI:
• Naïve concatenation, where all data is fed into a model’s prompt, offering completeness but at high computational cost.
• Fine-tuning, which “burns” knowledge into model parameters, useful for well-curated datasets but costly and inflexible.
• Retrieval-Augmented Generation (RAG), which retrieves only the relevant information and then passes it to the model, a method Ma advocates as reliable, modular, and cost-effective.
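
For readers who want to see the retrieve-then-generate idea in concrete terms, the following is a minimal sketch and not anything shown in the keynote. It assumes a toy word-count similarity in place of a real embedding model, and the names `retrieve`, `build_prompt`, and `DOCS` are hypothetical. It only builds the prompt; the call to a language model is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents that share vocabulary with the query.

    Assumption: word counts stand in for the embeddings a real system
    would use.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Put only the retrieved documents into the prompt, not the whole corpus."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical "enterprise" documents, invented for illustration.
DOCS = [
    "Refund requests are accepted within 30 days of purchase.",
    "The cafeteria opens at 8 am on weekdays.",
    "Refund payments are issued to the original payment method.",
]

if __name__ == "__main__":
    print(build_prompt("How do I request a refund?", DOCS))
```

The contrast with naïve concatenation is that the unrelated cafeteria document never reaches the prompt, which keeps the model's input short and on topic.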

MongoDB’s role, he said, is to act as both “the library and the librarian,” integrating database storage with high-quality AI-powered retrieval. By tightly coupling retrieval capabilities with the database layer, MongoDB enables LLMs to access only the most relevant data, improving accuracy and reducing hallucination.

Ma delved into the technical underpinnings, from domain-specific embeddings—specialized for areas like code, finance, or law—to hybrid search techniques that blend keyword and vector search. He highlighted MongoDB’s work on automating “chunk enrichment,” ensuring that contextual information is preserved when documents are split into smaller pieces for processing. This automation, he noted, removes a major bottleneck for developers and boosts retrieval accuracy.
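
As a rough illustration of two ideas mentioned here, the sketch below shows one simple form of chunk enrichment and one common way to blend keyword and vector results. Both are assumptions made for illustration: the article does not describe how MongoDB implements either, reciprocal rank fusion is a swapped-in, widely used fusion technique, and the function names and sample data are hypothetical.

```python
def enrich_chunks(title, text, chunk_size=8):
    """Split text into word chunks and prefix each with its document title.

    Assumption: the title is the only context carried over; a production
    system might add section headings, summaries, or metadata.
    """
    words = text.split()
    chunks = [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]
    return [f"[{title}] {chunk}" for chunk in chunks]


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in; the
    constant k=60 is a conventional default, not a value from the talk.
    Ties are broken alphabetically by id.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Hypothetical rankings from a keyword search and a vector search.
    keyword_ranking = ["a", "b", "c"]
    vector_ranking = ["b", "c", "a"]
    print(reciprocal_rank_fusion([keyword_ranking, vector_ranking]))
    print(enrich_chunks("Q3 Report", "one two three four five", chunk_size=2))
```

Fusing by rank rather than by raw score avoids having to put keyword scores and vector similarities on a common scale.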

He also stressed the importance of controllability. Using the example of a search query for “Jaguar” that returns both animal and automobile results, Ma argued that retrieval systems must incorporate user-defined preferences. MongoDB’s approach allows these preferences to be expressed in natural language, much like a system prompt for an LLM, without requiring complex fine-tuning.

Performance benchmarks, he said, show measurable improvements in accuracy and relevance across both public datasets and real-world customer scenarios.

Ma closed with a vision for the future of RAG: “The goal is to make retrieval simpler, more automated, and more controllable—so you can focus on your data and your domain, and let AI do the heavy lifting.”

The session reinforced RAG’s growing role as a bridge between powerful general-purpose AI models and the proprietary data that enterprises rely on, with MongoDB positioning itself as a central player in this evolving stack.

Robert Fletcher, CEO and Editor-in-Chief at CIJ EUROPE, is attending the event to cover the latest AI innovations, conduct interviews, and participate in panel discussions. His reports will appear in CIJ EUROPE’s August coverage and the Q3 issue of CIJ EUROPE magazine, bringing insights from Ai4 directly to the publication’s readership across the real estate and business sectors.
