RAG: Build or Buy
- Cognesy Team
- Applications
- June 9, 2024
I’m sometimes asked whether it’s better to buy a commercial RAG solution or build one using … (insert your preferred tech: LlamaIndex, LangChain, etc.).
“Build” means higher costs and longer delivery times, plus ongoing maintenance. “Buy” means the solution’s mechanisms and components will be generic, as will the results.
RAG Out of the Box?
The problem with RAG (or more generally information retrieval) is that it’s still difficult to do well in a general way.
Vendors targeting a broad market build for common use cases, data sources, formats, query types, routing, retrieval, generation, and feedback mechanisms.
The broader the problem, the more costly and impractical it becomes to achieve high performance across all domains.
A tailored “build” solution for your specific use case can vastly outperform a general one and actually deliver what the RAG hype promises.
What Problem Are You Solving?
Ultimately, the preferred choice depends on the value of the problem to be solved.
If the goal is something like “let’s build a RAG/search tool so employees can have a new way to get information,” you probably don’t understand the value clearly.
By contrast, a well-defined goal might read:
“Our integration bus team currently handles ~50 questions a day, consuming 30% of their capacity (~10 FTEs = $1.5M/year). This prevents them from working on the backlog, resulting in $6M of lost value from delayed product features for the programs relying on the bus. With a RAG solution meeting our spec, we will reduce the ticket-related workload by 60%, freeing ~$1M in development capacity and enabling ~$4M in value per year.”
In this case, building (or tailoring an existing solution) is worth considering. Your CEO / CFO will see the value in spending $1-2M on a project if it yields $5-20M in benefits.
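The arithmetic behind a business case like this can be sketched in a few lines. The snippet below is a rough illustration, not a financial model: it assumes that both the freed capacity and the recovered value scale linearly with the 60% workload reduction, and the class and method names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessCase:
    """Hypothetical container for the figures quoted in the example (USD per year)."""
    ticket_cost_per_year: int      # capacity consumed by answering questions
    lost_value_per_year: int       # value lost to delayed features
    workload_reduction_pct: int    # expected reduction in ticket workload

    def freed_capacity(self) -> int:
        # Assumption: freed capacity is proportional to the workload reduction.
        return self.ticket_cost_per_year * self.workload_reduction_pct // 100

    def enabled_value(self) -> int:
        # Assumption: recovered value is proportional to the workload reduction.
        return self.lost_value_per_year * self.workload_reduction_pct // 100

    def annual_benefit(self) -> int:
        return self.freed_capacity() + self.enabled_value()

    def benefit_to_cost(self, project_cost: int) -> float:
        # Single-year ratio; ignores maintenance and multi-year effects.
        return self.annual_benefit() / project_cost


case = BusinessCase(
    ticket_cost_per_year=1_500_000,
    lost_value_per_year=6_000_000,
    workload_reduction_pct=60,
)

for budget in (1_000_000, 2_000_000):
    print(f"budget ${budget:,}: benefit/cost = {case.benefit_to_cost(budget):.2f}")
```

Under these assumptions the exact figures are $0.9M in freed capacity and $3.6M in enabled value, which round to the ~$1M and ~$4M quoted above. Against a $1-2M budget, that is a first-year benefit-to-cost ratio between 4.5 and 2.25.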
You can afford to develop everything needed to do RAG as well as current technologies allow.
This includes use-case-specific data ingestion and processing, problem-specific query routing, search query transformation and enrichment that adds value, and results re-ranking that doesn’t just rely on the latest hyped API or model, with result validation working in concert with user feedback.
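To make the stages listed above concrete, here is a deliberately tiny sketch of how they might fit together. Everything in it is an assumption for illustration: the function names, the keyword-overlap scoring, and the sample documents are invented, and a real system would use proper ingestion, embeddings or a search engine, and a trained re-ranker in their place.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    source: str   # which ingested collection the text came from
    text: str


def tokenize(text: str) -> set[str]:
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def route(query: str, routes: dict[str, set[str]], default: str) -> str:
    """Query routing: pick the source whose trigger words overlap the query most."""
    words = tokenize(query)
    best = max(routes, key=lambda name: len(words & routes[name]))
    return best if words & routes[best] else default


def expand(query: str, synonyms: dict[str, list[str]]) -> set[str]:
    """Query transformation: enrich the query with domain-specific synonyms."""
    terms = tokenize(query)
    for word in list(terms):
        terms.update(synonyms.get(word, []))
    return terms


def retrieve(terms: set[str], docs: list[Doc], source: str) -> list[Doc]:
    return [d for d in docs if d.source == source and terms & tokenize(d.text)]


def rerank(terms: set[str], candidates: list[Doc], feedback: dict[Doc, int]) -> list[Doc]:
    """Re-ranking: term overlap plus a bonus from past user feedback."""
    def score(d: Doc) -> int:
        return len(terms & tokenize(d.text)) + feedback.get(d, 0)
    return sorted(candidates, key=score, reverse=True)


def validate(terms: set[str], ranked: list[Doc], min_overlap: int) -> list[Doc]:
    """Result validation: drop results that barely match the query."""
    return [d for d in ranked if len(terms & tokenize(d.text)) >= min_overlap]


docs = [
    Doc("runbook", "Restart the bus adapter when the queue stalls"),
    Doc("runbook", "Rotate adapter credentials every quarter"),
    Doc("wiki", "The bus team holds office hours on Fridays"),
]
routes = {"runbook": {"restart", "stalls", "error", "queue"}, "wiki": {"team", "hours"}}
synonyms = {"stuck": ["stalls"], "broker": ["bus"]}

query = "Queue stuck, how do I restart the broker adapter?"
source = route(query, routes, default="wiki")
terms = expand(query, synonyms)
results = validate(terms, rerank(terms, retrieve(terms, docs, source), feedback={}), min_overlap=2)
print(source, [d.text for d in results])
```

The point of the sketch is the shape, not the scoring: each stage is a separate, replaceable function, so a tailored build can swap in use-case-specific logic at exactly the step where a generic product falls short.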
Conclusion
Building a dedicated solution that works takes time and money.
However, if your use case is “wouldn’t it be nice to talk to company internal docs,” you likely don’t need anything fancy.
“Buy” might be a better choice if your RAG initiative lacks clearly defined goals, or if you are just experimenting. It may be less attractive when your use case is specialized and you have access to the resources needed to build for it.
I expect that as the technology matures (and the market saturates), specialized vendors will start offering RAG products that deliver tailor-made-quality results for narrow niches.
This is what has happened in other categories of software. Until then, the right question is not whether to build or buy, but what problem you are solving.
The answer will follow logically from that.