From Reference Architecture to Production — How Enterprise AI Chat Actually Gets Built

I've built enterprise AI chat twice. Once clean. Once in the real world.

The clean version is what I show at conferences. Tight architecture, readable code, works every time I demo it.

The real-world version is what I build for clients. Same architecture underneath. Completely different wrapper around it.

That gap between the two is where most AI initiatives get into trouble — and where the real work lives.

The reference architecture

Enterprise AI chat has four components regardless of domain or client.

An ingestion layer that takes raw data and turns it into something queryable. A retrieval layer that finds relevant content when a user asks a question. A synthesis layer where the language model turns retrieved content into a coherent answer. And an interface layer where the user actually interacts with the system.

That's it. Every enterprise AI chat system I've built maps to those four components. The technology choices change. The pattern doesn't.
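
To make the pattern concrete, here is a minimal sketch of the four components in plain Python. Every name in it (Chunk, ingest, retrieve, synthesize, chat) is hypothetical. The word-overlap ranking stands in for whatever search or vector index you would actually use, and synthesize is a placeholder where the language model call would go.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # where the text came from
    text: str    # the queryable unit


def ingest(raw_docs: dict[str, str]) -> list[Chunk]:
    """Ingestion: turn raw documents into queryable chunks, one per paragraph."""
    chunks = []
    for source, body in raw_docs.items():
        for para in body.split("\n\n"):
            if para.strip():
                chunks.append(Chunk(source, para.strip()))
    return chunks


def retrieve(chunks: list[Chunk], question: str, k: int = 2) -> list[Chunk]:
    """Retrieval: rank chunks by words shared with the question.

    Assumption: a stand-in for a real search or vector index.
    """
    terms = set(question.lower().split())
    scored = [(len(terms & set(c.text.lower().split())), c) for c in chunks]
    scored = [(score, c) for score, c in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def synthesize(question: str, context: list[Chunk]) -> str:
    """Synthesis: placeholder for the language model call."""
    if not context:
        return "No relevant content found."
    cited = "; ".join(f"{c.text} [{c.source}]" for c in context)
    return f"Q: {question} | Based on: {cited}"


def chat(chunks: list[Chunk], question: str) -> str:
    """Interface: the single entry point a UI or API would call."""
    return synthesize(question, retrieve(chunks, question))
```

Swapping the body of any one function for a different technology leaves the other three untouched, which is the point of the pattern.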

What changes in production

The four components never change. What changes is everything specific to your organization: your data, your environment, your requirements. That's where the work actually is.

Data is the first thing. In the reference architecture the data is clean, well-structured, and cooperative. In production it's messy, inconsistently formatted, and spread across systems that were never designed to talk to each other.

That's not a timeline problem. That's a decision problem. Do you build around what you have or fix the data first? Do you scope the initiative to the data that's actually reliable or surface the gaps to the business and let them decide what's worth cleaning? Those decisions have real consequences — scope, cost, and what the system can and can't do on day one.

Most teams don't surface those decisions early enough. They discover the data problem mid-build when the cost of changing direction is highest.
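
One way to surface the data decision before the build starts is a quick completeness profile per source. The sketch below is illustrative only: profile_source, triage, the field names, and the 0.9 threshold are all assumptions, and a real assessment would look at far more than missing fields.

```python
def profile_source(records, required_fields):
    """Return the fraction of records with every required field populated."""
    if not records:
        return 0.0
    complete = sum(
        1
        for r in records
        if all(r.get(f) not in (None, "") for f in required_fields)
    )
    return complete / len(records)


def triage(sources, required_fields, threshold=0.9):
    """Split sources into 'in_scope' and 'needs_decision' by completeness.

    Assumption: the threshold is arbitrary and would be agreed with the business.
    """
    report = {"in_scope": [], "needs_decision": []}
    for name, records in sources.items():
        score = profile_source(records, required_fields)
        bucket = "in_scope" if score >= threshold else "needs_decision"
        report[bucket].append((name, round(score, 2)))
    return report
```

The output is not a verdict. It is a list of sources the business needs to make a call on while changing direction is still cheap.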

Environment is the second thing. The reference architecture runs on a standard cloud environment. Production might be GovCloud with compliance requirements, air-gapped systems, or legacy infrastructure that doesn't support the tooling you'd normally reach for. The architecture adapts. The constraints don't negotiate.

Requirements are the third thing. In the demo there are no stakeholders, no security reviews, no access control policies, no audit trails. In production there are all of those things plus organizational dynamics you didn't know existed until you were three weeks in.
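
As a rough illustration of what that wrapper looks like in code, the sketch below adds an access check and an audit record around a simple retrieval step. Doc, AuditLog, and retrieve_with_policy are hypothetical names, and the group-based policy is an assumption. Real deployments usually delegate both concerns to existing identity and logging systems.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Doc:
    text: str
    allowed_groups: frozenset  # groups permitted to see this document


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, user, question, returned):
        """Append one audit entry with a UTC timestamp."""
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "question": question,
            "docs_returned": returned,
        })


def retrieve_with_policy(docs, user, user_groups, question, audit):
    """Return only documents the user may see, and log the access."""
    terms = set(question.lower().split())
    # Access control first: filter before any matching happens.
    visible = [d for d in docs if d.allowed_groups & set(user_groups)]
    hits = [d for d in visible if terms & set(d.text.lower().split())]
    audit.record(user, question, len(hits))
    return hits
```

The retrieval logic is the same few lines it was in the demo. Everything else in the function exists because of a requirement.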

Same four components. Completely different engagement.

What this means for you

The demo will always work. That's not the question.

The question is whether the person building it has done the real-world version: with real data that didn't cooperate, in a real environment with real constraints, for real stakeholders who had opinions.

I've built the clean version. I've built the production version. I know exactly where the gap is and what it costs to close it. Every engagement starts with understanding your data, your environment, and your requirements before a single line of code gets written.

That's not a discovery tax. That's the work that makes everything else land.

Clarity through the chaos.

Ready to talk about what your production version looks like? Reach out at logiclens.io.

Arjun Krishnamoorthi is the founder of LogicLens LLC, a fractional data architecture and AI consulting practice. If you have a data infrastructure problem or an AI project that needs senior hands — let's talk.