How to Build an AI Chatbot: A working chatbot in four sprints, grounded in your own data


To build an AI chatbot that actually works: pick one concrete user need, ground it in your own data using retrieval-augmented generation (RAG), build the smallest useful version on a low-code EU-hosted stack, and put it in front of real users fast.

Cite sources. Keep a human in the loop where mistakes are expensive. Iterate. Most chatbot projects fail not because the technology is hard but because they are not grounded in real data and real user needs.

What this means in practice

The clearest internal chatbot guide is a worked example. A leading member network came to us with an unreliable knowledge chatbot. Off-the-shelf OpenAI Assistants with file search had not delivered the accuracy they needed. The chatbot had to live inside HumHub – their existing social network – and be operated by a non-technical team.

We delivered Version 1 in four short sprints: system architecture, RAG implementation with semantic search, HumHub integration, and full documentation. Total estimated effort: 10 working days. Stack: n8n in Berlin for workflow orchestration, Qdrant in the EU for vector search, and GPT-4 served via Microsoft EU sovereignty as the LLM. Optional fully open-source EU alternatives: Mistral Medium 3 as the model, Milvus as the vector database. Version 1 is now in production, serving more than 1,000 network members, time-aware, and operated by a non-developer team.

The same RAG architecture supports GDV (German Insurers Association) across tens of thousands of policy documents for 400+ member companies, and an AI Member Platform for a leading German association combining chat-based discovery with traditional category filters.

Key components


RAG, not fine-tuning

  • Retrieval-augmented generation grounds answers in your own documents
  • Almost always the right pattern – fine-tuning is rarely needed and expensive to maintain
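As a minimal sketch of the retrieval step in the RAG pattern: embed the user's question, find the most similar document chunks, and pass only those to the model. The documents, vectors, and function names below are illustrative stand-ins; in the production stack this step is handled by an embedding model plus Qdrant, not an in-memory list.

```python
import math

# Toy document chunks with pre-computed embeddings. In production these
# vectors come from an embedding model and live in a vector store (Qdrant).
CHUNKS = [
    {"text": "Membership fees are due on 1 January.", "source": "fees.pdf", "vec": [0.9, 0.1, 0.0]},
    {"text": "Events are announced in the HumHub calendar.", "source": "events.pdf", "vec": [0.1, 0.9, 0.1]},
    {"text": "Password resets go through the IT helpdesk.", "source": "it.pdf", "vec": [0.0, 0.2, 0.9]},
]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, top_k=2):
    """Return the top_k chunks most similar to the query embedding."""
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:top_k]

# A query embedding close to the "fees" chunk retrieves that chunk first.
hits = retrieve([0.8, 0.2, 0.0])
print(hits[0]["source"])  # fees.pdf
```

The retrieved chunks, not the whole document corpus, become the context the LLM is allowed to answer from; that is what grounds the answer in your own data.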


EU-compliant stack

  • n8n in Berlin, Qdrant in the EU, GPT-4 via Microsoft EU sovereignty
  • Mistral or Milvus where you want fully open-source EU alternatives


Cited and verifiable

  • Every answer links back to the source document
  • Conversational refinement so users can narrow their query in natural language
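A sketch of how a cited answer can be assembled once the grounding chunks are known. The chunk fields and link format here are illustrative assumptions, not the exact production schema:

```python
def answer_with_citations(answer_text, chunks):
    """Append numbered, clickable source links to a generated answer.

    `chunks` are the retrieved passages the answer was grounded in; each
    carries the title and URL of its source document.
    """
    lines = [answer_text, "", "Sources:"]
    for i, chunk in enumerate(chunks, start=1):
        lines.append(f"[{i}] {chunk['title']} – {chunk['url']}")
    return "\n".join(lines)

reply = answer_with_citations(
    "Membership fees are due on 1 January [1].",
    [{"title": "Fee schedule", "url": "https://example.org/fees.pdf"}],
)
print(reply)
```

Because every chunk keeps a pointer back to its source document, the citation list falls out of the retrieval step for free – no extra lookup is needed at answer time.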

Outcomes


Production accuracy

A leading member network chose RAG after off-the-shelf OpenAI Assistants proved unreliable; ours runs in production for 1,000+ members


Time to value

A production chatbot in four short sprints – the way the RAG chatbot for a leading member network was delivered


Maintained by your team

A low-code architecture so non-developers can operate and extend it after handover

Hallucinations under control

Answers grounded in your real documents, with sources cited and clickable, and human-in-the-loop review where it matters

EU AI Act ready

Risk-classified and GDPR-aligned, with audit trails and citations built into the architecture

Want to talk it through? Book a call – free of charge, full of value.

How it works

1. Architecture and scope

  • Pick the data sources, the integrations and the model approach
  • Plan the four sprints – the way we scoped a leading member network

2. Build and iterate

  • Working software at the end of every sprint
  • Real users in front of it as soon as possible
  • Citations and audit trails as default

3. Hand over

  • Documentation a non-technical owner can use
  • Training so your team can extend the system without us
  • Optional ongoing support

Why N3XTCODER

We bring a decade of impact-tech experience and more than 160 AI projects since 2019. Through our free AI for Impact course, more than 100,000 people have learned how to use AI for the common good. We do not run inspiration days. We run scoping sessions and build engagements that ship, the way we have delivered AI for the organisations below:

  • A leading member network – production retrieval-augmented generation (RAG) chatbot serving 1,000+ HumHub members on n8n + Qdrant + GPT-4 via Microsoft EU, delivered in four sprints
  • GDV (German Insurers Association) – AI Knowledge Assistant over tens of thousands of policy documents for 400+ member companies
  • A leading German association – AI Member Platform ("Association GPT") combining chat-based discovery with traditional category filters, on Microsoft AI Foundry + pgvector
  • A leading donation platform – AI email agent classifying enquiries and drafting replies with mandatory human review, currently in pilot, on n8n and Azure OpenAI
  • Tannenhof Berlin-Brandenburg – Civic Coding-funded AI transcription pilot for therapy sessions on EU-hosted infrastructure, with output formatted for German Pension Insurance reporting
  • Civic Coding – AI consultation across 100 social-impact projects under Germany's federal initiative
  • Default stack: n8n in Berlin, Qdrant in the EU, Azure OpenAI via Microsoft EU sovereignty, plus open-source EU alternatives like Mistral and Milvus on request.

Honest constraints

Off-the-shelf assistants with file search are not enough for production. A leading member network tried this first. It was unreliable. RAG with proper grounding and citations is what makes the difference.

Fine-tuning is almost never the right answer. It is expensive, hard to maintain, and does not solve the hallucination problem. Use RAG against your real documents instead.

Hallucinations cannot be fully eliminated. Mitigate them with grounding, citations, restricted prompts, and human review on consequential outputs. Accept that 100 percent accuracy is not the goal; defensible accuracy is.
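One common form a restricted prompt takes is confining the model to the retrieved context and telling it to decline when the context does not cover the question. A minimal sketch under those assumptions (the threshold, field names, and wording are illustrative, not the production prompt):

```python
def build_grounded_prompt(question, chunks, min_score=0.5):
    """Build a prompt that restricts answers to retrieved context.

    If no chunk clears the relevance threshold, return None so the caller
    can send a fixed "I don't know" reply instead of letting the model guess.
    """
    relevant = [c for c in chunks if c["score"] >= min_score]
    if not relevant:
        return None
    context = "\n\n".join(f"[{i}] {c['text']}" for i, c in enumerate(relevant, 1))
    return (
        "Answer ONLY from the numbered context below. "
        "Cite the passages you used as [n]. "
        "If the context does not answer the question, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

prompt = build_grounded_prompt(
    "When are fees due?",
    [{"text": "Fees are due on 1 January.", "score": 0.9}],
)
```

The early `None` return is the cheap half of the mitigation: when retrieval finds nothing relevant, the model never sees the question, so it cannot hallucinate an answer to it.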


Build your AI chatbot with N3XTCODER

Tell us about your documents and the questions your team or members keep asking. We will reply with a proposal and a date.

Simon Stegemann
Co-Founder and CEO
