Q: When does RAG fail?

When the source corpus is poor. Unstructured PDFs with poor OCR (optical character recognition), documents with no metadata, superseded content that was never removed, and chunks that split mid-sentence all degrade retrieval quality. The AI layer cannot compensate for a bad corpus. Fixing the documents is usually the largest part of the project.

RAG vs Fine-Tuning: RAG for facts, fine-tuning for style – and almost always start with retrieval

Q: Can I use RAG and fine-tuning together?

Yes – fine-tune for voice and output format, RAG for facts. Mother Earth AI uses this combination. Fine-tuning gives the assistant a consistent persona; retrieval handles specific factual queries at runtime. Most production systems only need one or the other – start with RAG.

Q: How long does RAG take to set up compared to fine-tuning?

A RAG prototype over your own documents can typically be made production-ready within weeks – that was the Kompetenzz timeline. Fine-tuning requires labelled training data, training runs and validation cycles. It takes longer to get right and is harder to iterate on after deployment.

Q: What is the compliance risk with fine-tuning?

Two main issues. First, you cannot prove where a specific answer came from – citations are impossible with fine-tuning. Second, you cannot remove a specific fact without retraining – GDPR right to erasure cannot be satisfied. Both matter in regulated industries, which is why our production knowledge assistants all use RAG.

Q: Does RAG work with German documents?

Yes. Our production systems process German regulatory text – GDV insurance policy documents, BNetzA regulatory filings, EU Taxonomy classifications – with cited answers. Embedding model choice matters for non-English text; we test and disclose what we use.

Q: When does fine-tuning go wrong?

When developers treat it as a way to teach the model facts. It is not – it teaches patterns. Facts absorbed in fine-tuning can be reproduced inaccurately, cannot be audited and cannot be retracted. Use fine-tuning for format, style and classification. Use RAG for facts.

RAG vs Fine-Tuning: Which One Should You Use?

The short answer

RAG and fine-tuning solve different problems. RAG retrieves the relevant facts from your data at query time and asks the LLM to answer using them, with citations. Fine-tuning rewrites the model itself to absorb a style, format or behaviour – the facts become part of the weights.

For most "we want AI over our own documents" projects, RAG is the right call: it is cheaper, easier to update, easier to audit and easier to make compliant. We ship RAG as the default and reach for fine-tuning only when style or format truly cannot be solved with a prompt.
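
To make the retrieval step concrete, here is a minimal sketch in Python. It is not how any of the production systems mentioned on this page are built: the corpus, the file names and the `retrieve` and `build_prompt` helpers are hypothetical, and a simple bag-of-words cosine similarity stands in for a real embedding model and vector database. It only shows the shape of the flow: find the most relevant passages, then hand them to the LLM with source labels it can cite.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: document id -> text.
# A real system would index chunks in a vector database instead.
DOCS = {
    "policy-2024.txt": "Household insurance covers water damage caused by burst pipes.",
    "faq.txt": "Claims must be reported within fourteen days of the incident.",
    "travel.txt": "Travel insurance covers cancelled flights and lost luggage.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Return up to k document ids, most similar first, skipping zero scores."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), doc_id) for doc_id, text in docs.items()]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(query, docs, k=2):
    """Assemble the prompt for the LLM and return it with the cited sources."""
    sources = retrieve(query, docs, k)
    context = "\n".join(
        f"[{i}] ({doc_id}) {docs[doc_id]}" for i, doc_id in enumerate(sources, 1)
    )
    prompt = (
        "Answer using only the sources below and cite them by number.\n"
        f"{context}\n\nQuestion: {query}"
    )
    return prompt, sources
```

Calling `build_prompt("When must claims be reported?", DOCS)` returns the prompt together with the list of source ids, so the application can show the citation next to the answer regardless of how the model phrases it.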

The honest comparison

RAG

Updating facts – edit the document, reindex – done in minutes
Citing sources – native; citations come straight from the retrieval step
Cost – vector database + LLM inference; predictable
Compliance posture – easy; source-of-truth is your database, auditable, deletable on request
Style and format – limited; RAG does not train the model, so output format and voice rely on prompting

Fine-tuning

Updating facts – retrain the model, validate, redeploy – hours to days, every time
Citing sources – not possible; the model cannot point at a source for a fact it absorbed in training
Cost – training compute up front, plus inference; higher floor, harder to estimate
Compliance posture – hard; removing a fact from a fine-tuned model is essentially impossible
Style and format – this is where it shines; consistent output format or voice that prompting cannot achieve

When to choose which

Choose RAG when:

Your source documents change more often than monthly – regulations, contracts, product specifications
You need to cite where each answer came from – for compliance teams or end users who need to verify
GDPR right to erasure applies – you may need to remove specific information on request; with RAG, you delete the document and reindex
Your corpus is large and evolving – the same architecture scales from hundreds to tens of thousands of documents without retraining
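
The "delete the document and reindex" point can be sketched as follows. This is an illustration under stated assumptions, not a real vector database client: `ToyIndex` is a hypothetical in-memory stand-in, the document ids are invented, and a plain substring match replaces vector search. What it shows is that every chunk stays linked to its source document, so removing or replacing that document removes everything derived from it.

```python
class ToyIndex:
    """Hypothetical in-memory stand-in for a vector database collection."""

    def __init__(self):
        self._chunks = {}  # chunk id -> (owning document id, chunk text)

    def index_document(self, doc_id, chunks):
        """Index a document, replacing any chunks stored for it earlier."""
        self.delete_document(doc_id)
        for n, text in enumerate(chunks):
            self._chunks[f"{doc_id}#{n}"] = (doc_id, text)

    def delete_document(self, doc_id):
        """Remove every chunk of a document and return how many were removed."""
        stale = [cid for cid, (owner, _) in self._chunks.items() if owner == doc_id]
        for cid in stale:
            del self._chunks[cid]
        return len(stale)

    def search(self, term):
        """Substring match standing in for similarity search (assumption)."""
        term = term.lower()
        return sorted(
            cid for cid, (_, text) in self._chunks.items() if term in text.lower()
        )


index = ToyIndex()
index.index_document(
    "contract-17", ["Jane Doe signed on 1 March.", "Notice period is three months."]
)
index.index_document("handbook", ["Notice period rules apply to all staff."])

# Erasure request: drop the document, and its chunks can no longer be retrieved.
removed = index.delete_document("contract-17")
```

After the call, `index.search("jane doe")` returns an empty list while the handbook chunk is untouched. A real erasure process would also need to cover the source file itself and any caches, which this sketch ignores.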

Choose fine-tuning when:

You need a consistent output format that prompting alone cannot deliver – structured extraction schemas, fixed-length summaries, constrained JSON
You are building a persona with a specific voice, not a fact retrieval tool
The task is binary classification with hundreds of labelled examples and the base model underperforms even with good prompting
Your domain vocabulary is so specialist (highly technical, proprietary jargon) that the base model has no useful priors

Use both when:

Fine-tune for voice and format; RAG for the facts. The Mother Earth AI voice agent does this: a fine-tuned persona so the assistant sounds like itself, with retrieval handling the specific factual content it draws on.

What we do in practice

All of our production knowledge assistants – GDV, Kompetenzz, a leading German association and the chatbots inside Multilang Socialmap – run on RAG, not fine-tuning. The reasons are always the same: the source documents change, accuracy must be auditable, and compliance teams need to see where each answer came from.

The one place we lean towards fine-tuning is voice and persona work. Mother Earth AI, a voice agent built around the Allgemeine Erklärung der Rechte von Mutter Erde (Universal Declaration of the Rights of Mother Earth), uses a fine-tuned model so the assistant has a consistent voice and perspective, with the indigenous oral traditions baked into the model itself rather than retrieved on the fly.

Why N3XTCODER

We bring a decade of impact-tech experience and over 160 AI projects since 2019. Every production RAG system we have shipped uses the same architectural principle: the LLM reasons, the database stores, and every answer is traceable back to a source document.

GDV (German Insurers Association) – RAG over tens of thousands of insurance policy documents for 400+ member companies. Azure AI Search + GPT-4o via Microsoft AI Foundry. Research time halved, shadow AI use dropped.
Kompetenzz – RAG chatbot on n8n + Qdrant + GPT-4 via Microsoft EU, operated by a non-developer team for 1,000+ HumHub members.
Multilang Socialmap – multilingual RAG over the Paritätischer Berlin Socialmap, with full Leichte Sprache support, built to BITV 2.0 accessibility standards.
Mother Earth AI – fine-tuned open-source voice agent on a Raspberry Pi; the production example of fine-tuning for persona rather than facts.
Default stack: n8n in Berlin, Qdrant in the EU, Azure OpenAI via Microsoft EU Sovereignty. Open-source alternatives (Mistral, Milvus, Ollama) on request.

Honest constraints

RAG is only as good as your source documents. Unstructured PDFs with poor OCR, missing metadata, or superseded content left in the corpus produce irrelevant or wrong answers – and the citations make the errors look authoritative. Fixing the corpus is often the largest part of the project, not the AI layer.
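
A corpus check of this kind can be sketched before anything is indexed. The `Doc` fields (`title`, `valid_from`, `superseded_by`), the `audit` helper and the 200-character threshold are all assumptions made for illustration; a very short extracted text is only a rough hint that OCR may have failed, not a reliable test.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Doc:
    """Hypothetical document record; the field names are illustrative."""
    doc_id: str
    text: str
    title: Optional[str] = None
    valid_from: Optional[date] = None
    superseded_by: Optional[str] = None


def audit(docs, min_chars=200):
    """Return {doc_id: [issues]} for documents to fix before indexing."""
    report = {}
    for d in docs:
        issues = []
        if not d.title or d.valid_from is None:
            issues.append("missing metadata")
        if d.superseded_by:
            issues.append(f"superseded by {d.superseded_by}")
        if len(d.text.strip()) < min_chars:
            # Heuristic only: little text may mean the OCR step failed.
            issues.append("too little extracted text (possible OCR failure)")
        if issues:
            report[d.doc_id] = issues
    return report
```

Documents that appear in the report would be fixed or excluded first, so that retrieval never gets the chance to cite them.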

Retrieval quality needs tuning. Chunk size, overlap, embedding model choice and metadata filters all affect what gets retrieved. A RAG prototype can feel impressive in a demo and degrade in production if these choices are not validated against real queries.
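
Chunk size and overlap are the easiest of these settings to show in code. The sketch below is one possible approach, not a recommended configuration: `chunk_sentences`, its `max_chars` and `overlap` parameters and the naive sentence splitter are assumptions for illustration. It groups whole sentences so that no chunk splits mid-sentence, and repeats the last sentence of each chunk at the start of the next one.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_sentences(text, max_chars=200, overlap=1):
    """Group whole sentences into chunks of roughly max_chars characters.

    The last `overlap` sentences of a chunk are repeated at the start of
    the next one. A chunk can exceed max_chars when a single sentence, or
    the overlap plus one sentence, is longer than the limit.
    """
    chunks, current = [], []
    for sentence in split_sentences(text):
        if current and len(" ".join(current + [sentence])) > max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
chunks = chunk_sentences(text, max_chars=40, overlap=1)
# chunks[0] == "One two three. Four five six."
# chunks[1] == "Four five six. Seven eight nine."
# chunks[2] == "Seven eight nine. Ten eleven twelve."
```

Whatever values are chosen, they still need to be checked against real user queries, because a setting that works on one corpus can retrieve the wrong passages on another.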

Fine-tuned models cannot cite sources. If a compliance requirement says "show me where this answer came from", fine-tuning cannot satisfy it. There is no retrieval step to point at. This disqualifies it for most regulated-industry knowledge assistant use cases.

Removing a fact from a fine-tuned model means retraining. GDPR right to erasure is effectively impossible on a fine-tuned model. If your use case involves personal data or information that may need to be deleted on request, fine-tuning is the wrong architecture.

Fine-tuning can produce confident wrong answers. The model absorbs patterns from training data and can generate plausible-sounding but invented responses in-domain. These are harder to catch than obvious failures because they feel grounded.

Want to talk it through? Book a call – free of charge.

Talk through your AI project

Tell us what you are trying to ship. We will reply with a proposal and a date, usually within a working day.

Simon Stegemann
Co-Founder and CEO
