Engineering at Jagatab.UK

Architecture diagrams. Code snippets. Real production patterns.

Most software you buy from a small agency is templated, brittle, and built without the patterns that make production systems survive contact with real users. This page is the opposite: the actual reference architecture we build on, the deployment pipeline we ship through, the observability we wire in, the AI patterns we use, and the code snippets that ship in real engagements. If you're trying to assess whether we're a real engineering practice or just another freelancer with a template, this page is for you.

Reference architecture

This is the default architecture for a production application — a typed back-end, a server-rendered front-end, a relational store with vector search, an AI layer with confidence routing, and background workers for anything long-running. Not every project uses every tier; this is the menu, not the prescription.

Reference architecture

┌──────────────────────────────────────────────────────────┐
│  Browser / Mobile / Email / SMS / WhatsApp clients       │
└──────────────────────┬───────────────────────────────────┘
                       │ HTTPS, HTTP/2, TLS 1.3
                       ▼
┌──────────────────────────────────────────────────────────┐
│  Edge layer — Cloudflare or Vercel CDN                   │
│  • Static caching        • WAF + bot challenge           │
│  • DDoS shield           • Image optimisation            │
│  • TLS termination       • Geo / IP rate-limit           │
└──────────────────────┬───────────────────────────────────┘
                       ▼
┌──────────────────────────────────────────────────────────┐
│  Application tier                                        │
│  ┌────────────────────┐    ┌────────────────────────┐    │
│  │  Next.js front     │◄──►│  FastAPI back-end      │    │
│  │  SSR / ISR / RSC   │    │  Python 3.12, async    │    │
│  │  Tailwind / shadcn │    │  Pydantic v2 schemas   │    │
│  └────────────────────┘    └─────────┬──────────────┘    │
│                                      │                   │
└──────────────────────────────────────┼───────────────────┘
                                       │
        ┌──────────────────────────────┼──────────────────┐
        ▼                              ▼                  ▼
┌────────────────┐         ┌──────────────────┐  ┌────────────────┐
│  PostgreSQL    │         │  AI Layer        │  │  Background    │
│  + pgvector    │◄────────┤  GPT-4o / Claude │  │  workers       │
│  Audited rows  │         │  Local Llama 3   │  │  Celery / RQ   │
└────────┬───────┘         │  Confidence-     │  └────────┬───────┘
         │                 │  scored, RAG     │           │
         ▼                 └──────────────────┘           ▼
┌────────────────┐                                ┌────────────────┐
│  S3 / R2       │                                │  Stripe /      │
│  file store    │                                │  Twilio /      │
│  encrypted     │                                │  Postmark      │
└────────────────┘                                └────────────────┘

Why this shape: it scales from a £5k single-service automation to a £50k multi-tenant SaaS without architectural rewrites. The same audit logging, idempotency, schema validation, and observability patterns apply at every size.

Deployment pipeline

Every commit goes through the same pipeline. No manual ops, no "ship to prod from my laptop". Failed gates block; failed deploys auto-rollback.

CI/CD pipeline

┌────────┐   ┌──────────┐   ┌────────┐   ┌──────────┐   ┌─────────┐   ┌────────┐
│  git   │──►│  CI:     │──►│  Build │──►│ Staging  │──►│  Smoke  │──►│  Prod  │
│  push  │   │  ruff,   │   │ next + │   │ deploy   │   │  tests  │   │ deploy │
│        │   │  pytest, │   │ docker │   │ + seed   │   │  + LH   │   │ + ISR  │
│        │   │  mypy    │   │        │   │ data     │   │  perf   │   │ purge  │
└────────┘   └────┬─────┘   └───┬────┘   └────┬─────┘   └────┬────┘   └────┬───┘
                  │             │             │              │             │
              failed=         failed=        failed=        failed=      failed=
              block PR        block PR     auto-rollback   block prom   auto-rollback
                                                                       + page on-call

What runs in CI

ruff — Python lint (fast, opinionated)
pytest — unit + integration tests with coverage gate (≥80% for changed lines)
mypy — static type-check on new code
Playwright — end-to-end tests on critical flows (login, checkout, lead-capture)
Lighthouse CI — performance budget enforcement (LCP, INP, CLS)

What runs at deploy time

Staging deploy with seeded data, soak for 2-5 minutes
Smoke tests against staging (auth, key endpoints, sample workflows)
Prod deploy with traffic gradually shifted (canary 5% → 25% → 100%) where the infra supports it
Auto-rollback if error rate or latency exceeds threshold for 60 seconds

Observability — three signals into one pane

You can't fix what you can't see. Every production service ships with logs, metrics, and error tracking from day one — not bolted on after the first incident.

Logs / metrics / errors

┌──────────────┐        logs           ┌────────────────────┐
│  Application │──────────────────────►│  CloudWatch /      │
│   process    │                        │  Better Stack      │
│   (FastAPI,  │        metrics        ├────────────────────┤
│   Next.js,   │──────────────────────►│  Prometheus /      │
│   workers)   │                        │  Datadog dashboard │
│              │        errors         ├────────────────────┤
│              │──────────────────────►│  Sentry            │
└──────────────┘                        └─────────┬──────────┘
                                                  │ webhook
                                                  ▼
                                        ┌────────────────────┐
                                        │  Slack #alerts     │
                                        │  PagerDuty on-call │
                                        └────────────────────┘

What we track by default

SLI metrics — request latency (p50, p95, p99), error rate, throughput
Business KPIs — sign-ups, enquiries, conversions, AI calls per tenant
Cost signals — OpenAI / Claude token spend per tenant, DB CPU, S3 egress
Security events — failed auth, rate-limit hits, suspicious patterns

Care plans (see care plans) include the dashboards, alerting, and the on-call response when something fires.

AI engineering — how we engineer around hallucination

The hard part of shipping AI isn't calling the model. It's engineering for the times the model is wrong — and it will be wrong. We use retrieval-augmented generation (RAG) with confidence scoring and explicit human escalation. Demos hide this; production needs it.

RAG flow with confidence routing

┌──────────────┐
│ User query   │
└──────┬───────┘
       ▼
┌──────────────┐       ┌─────────────────┐
│ Embed query  │──────►│ pgvector ANN    │
│ (OpenAI emb) │       │ k=5 retrieve    │
└──────────────┘       └────────┬────────┘
                                ▼
                       ┌─────────────────┐
                       │ Cross-encoder   │
                       │ re-rank (top-3) │
                       └────────┬────────┘
                                ▼
                       ┌─────────────────────────────┐
                       │ LLM call with system prompt:│
                       │   "Answer ONLY from         │
                       │    context. Cite chunk_id.  │
                       │    Say 'I don't know' if    │
                       │    context insufficient."   │
                       └────────┬────────────────────┘
                                ▼
                       ┌─────────────────┐
                       │ Confidence      │     <0.6 ──► human escalation
                       │ score answer    │
                       └────────┬────────┘
                                ▼ ≥0.6
                       ┌─────────────────┐
                       │ Return answer   │
                       │ + cited sources │
                       └─────────────────┘

Why this works

Grounded answers only. The model is instructed to use only the retrieved context. If context is insufficient, it says "I don't know" instead of inventing.
Citations enforced. Every answer points to the chunk it came from. Users can verify. Auditors can replay.
Confidence routing. Low-confidence answers escalate to a human with full context. No silent failures.
Tenant-scoped retrieval. Vector search is filtered by tenant — never leaks data across customers.

Security model — defence in depth

Every layer assumes the layer above it was compromised. Perimeter alone is not enough; identity alone is not enough; data alone is not enough.

Defence-in-depth

                       Perimeter ──┐
   ┌─────────────────────────────────┼──────────────────────────────────┐
   │  • Cloudflare WAF, bot challenge, DDoS, rate-limit                 │
   │  • TLS 1.3 required, HSTS preload, HTTPS-only cookies              │
   └─────────────────────────────────┬──────────────────────────────────┘
                                     ▼
                       Identity ────┐
   ┌─────────────────────────────────┼──────────────────────────────────┐
   │  • OAuth / OIDC (Google, Microsoft, magic-link)                    │
   │  • Per-session CSRF tokens, SameSite=Lax cookies                   │
   │  • Roles + per-endpoint scope checks (least-privilege)             │
   └─────────────────────────────────┬──────────────────────────────────┘
                                     ▼
                       Data ────────┐
   ┌─────────────────────────────────┼──────────────────────────────────┐
   │  • Encrypted at rest (KMS), in transit (TLS), in backup            │
   │  • PII column-level encryption, GDPR retention windows             │
   │  • Tenant isolation in DB + S3 prefix scoping                      │
   └─────────────────────────────────┬──────────────────────────────────┘
                                     ▼
                       Audit ───────┐
   ┌─────────────────────────────────┼──────────────────────────────────┐
   │  • Append-only audit log, 13-month retention                       │
   │  • Sensitive actions double-logged + Slack alert                   │
   │  • Quarterly access review, annual pen-test (where contracted)     │
   └────────────────────────────────────────────────────────────────────┘

For UK GDPR-sensitive engagements we additionally provide: DPA documentation, data inventory mapping, retention policy enforcement, subject access request endpoints, and erasure workflows. See security & compliance for the full posture.

Code we actually ship

Four patterns that show up in nearly every engagement. These aren't pasted from blog posts — they're sanitised versions of code that runs in production.

1. Validated, audited API endpoint

Every endpoint validates input with Pydantic, requires authentication, writes an audit log entry, and queues background work where appropriate. No raw dictionaries, no silent failures, no missing audit trail.

python

from fastapi import FastAPI, Depends, HTTPException
from pydantic import BaseModel, Field, EmailStr

app = FastAPI()

class EnquiryIn(BaseModel):
    name: str = Field(min_length=2, max_length=100)
    email: EmailStr
    phone: str | None = Field(default=None, max_length=20)
    message: str = Field(min_length=10, max_length=2000)

class EnquiryOut(BaseModel):
    id: int
    status: str

class="n">@app.post(class="s">"/api/enquiries", response_model=EnquiryOut, status_code=201)
async def create_enquiry(
    payload: EnquiryIn,
    user=Depends(require_auth),
    db=Depends(get_db),
):
    class="s">""class="s">"Validated, audited, idempotent endpoint."class="s">""
    enquiry = await db.enquiries.create(
        tenant_id=user.tenant_id,
        actor_id=user.id,
        **payload.model_dump(),
    )
    await audit_log(db, action=class="s">"enquiry.created", entity_id=enquiry.id, actor=user)
    await enqueue_lead_qualification(enquiry.id)
    return EnquiryOut(id=enquiry.id, status=class="s">"received")

2. RAG with confidence routing

Vector search, tenant-scoped retrieval, grounded LLM call, confidence scoring, human escalation. The whole flow in one async function.

python

async def answer_with_grounding(question: str, tenant_id: int) -> RagAnswer:
    class="s">""class="s">"RAG with confidence routing and escalation."class="s">""
    embedding = await openai.embeddings.create(
        model=class="s">"text-embedding-3-large",
        input=question,
    )

    class="c"># Vector search scoped to the tenant — never leak cross-tenant data
    chunks = await db.fetch(class="s">"""
        SELECT id, content, source_url
          FROM knowledge_chunks
         WHERE tenant_id = $1
      ORDER BY embedding <-> $2
         LIMIT 5
    class="s">""class="s">", tenant_id, embedding.data[0].embedding)

    if not chunks:
        return RagAnswer(text=None, citations=[], confidence=0.0,
                         escalated=True, reason="no_contextclass="s">")

    response = await openai.chat.completions.create(
        model="gpt-4oclass="s">",
        messages=[
            {"roleclass="s">": "systemclass="s">", "contentclass="s">": GROUNDED_PROMPT},
            {"roleclass="s">": "userclass="s">", "content": format_with_context(question, chunks)},
        ],
        temperature=0.1,
    )

    answer = parse_answer(response)            class="c"># extracts text + cited chunk_ids
    score = score_confidence(answer, chunks)   class="c"># heuristic + cross-encoder

    if score < 0.6:
        await escalate_to_human(question, answer, chunks)
        return RagAnswer(text=None, citations=[], confidence=score,
                         escalated=True, reason=class="s">"low_confidence")

    return RagAnswer(
        text=answer.text,
        citations=answer.cited_chunks,
        confidence=score,
        escalated=False,
    )

3. Idempotent background worker

Retried safely. Locked while processing. Audit-logged on completion. The pattern that prevents customers being charged twice when a webhook retries.

python

from celery import Celery
from tenacity import retry, stop_after_attempt, wait_exponential

celery = Celery(class="s">"workers", broker=class="s">"redis://localhost:6379/0")

class="n">@celery.task(bind=True, max_retries=5)
class="n">@retry(stop=stop_after_attempt(3),
       wait=wait_exponential(multiplier=1, min=2, max=30))
async def qualify_lead(self, enquiry_id: int) -> None:
    class="s">""class="s">"Idempotent: safe to retry. Writes audit-logged outcome."class="s">""
    enquiry = await db.enquiries.get_or_lock(enquiry_id)
    if enquiry.status != class="s">"received":
        return  class="c"># already processed — idempotency

    enriched = await enrich_with_clearbit(enquiry.email)
    score = await score_with_llm(enquiry.message, enriched)
    routing = decide_route(score, enriched.company_size)

    async with db.transaction():
        await db.enquiries.update(enquiry_id, status=class="s">"qualified",
                                  score=score, route=routing)
        await audit_log(action=class="s">"lead.qualified", entity_id=enquiry_id,
                        score=score, route=routing)
        await notify_team(routing.team_id, enquiry, score)

4. Append-only audit log schema

The SQL that makes the audit trail untamperable. UPDATE and DELETE permissions revoked at the database level — not enforced in application code where a future bug could slip past.

sql

CREATE TABLE audit_log (
  id            class="k">BIGSERIAL PRIMARY KEY,
  ts            TIMESTAMPTZ NOT class="k">NULL DEFAULT now(),
  tenant_id     class="k">BIGINT NOT class="k">NULL,
  actor_id      class="k">BIGINT,
  actor_kind    TEXT NOT class="k">NULL,              class="c">-- class="s">"user" | class="s">"service" | class="s">"system"
  action        TEXT NOT class="k">NULL,              class="c">-- e.g. class="s">"enquiry.created"
  entity_kind   TEXT,
  entity_id     class="k">BIGINT,
  payload       JSONB NOT class="k">NULL DEFAULT class="s">'{}',
  ip            INET,
  request_id    UUID,
  CONSTRAINT audit_log_no_update CHECK (true) NO class="k">INHERIT
);

class="c">-- Append-only enforced at DB level: no UPDATE or DELETE permissions granted.
REVOKE UPDATE, DELETE ON audit_log FROM PUBLIC;
CREATE INDEX audit_log_tenant_ts_idx ON audit_log (tenant_id, ts DESC);
CREATE INDEX audit_log_action_idx    ON audit_log (action);

Where this differs from a WordPress + plugins build

	Typical WP + plugins	Engineered application
Schema validation	Often missing — plugins accept whatever	Pydantic / Zod on every input
Audit trail	Plugin-dependent, often empty	Append-only, DB-enforced
Idempotency	Rare — webhooks retry → duplicates	Default — safe retries
Observability	Server logs only, no business metrics	Logs + metrics + errors + KPIs
Deploy pipeline	FTP / file manager / manual	Git-driven, gated, auto-rollback
Security patches	Manual, often skipped	Dependabot + monthly review
Custom workflow	Heavy plugin or workaround	First-class code, owned by you
AI integration	Third-party widget, no grounding	Grounded RAG, confidence-routed
5-year TCO	Hosting + plugins + fix-it bills	Hosting + flat care plan

WordPress is not bad. WordPress + 30 plugins held together with shortcodes is bad, and it's what most SMEs end up with when their site grows beyond a brochure. Engineered applications cost more upfront and less over the lifetime.

Frequently asked questions

Do you work this way for £5k projects too, or only enterprise?

The same patterns scale down. A £5k automation might skip the worker tier and ship as a single FastAPI service, but it still has typed schemas, audit logging, deploy pipeline, and observability. The discipline is the same; the surface area is smaller.

What if our existing stack doesn't fit your reference architecture?

The reference is a default, not a religion. We've integrated with PHP / Laravel back-ends, .NET on Azure, Ruby on Rails, Node legacy systems. The patterns (audit log, idempotent workers, grounded AI, observability) port to any stack.

Why Python instead of Node for the back-end?

Python because: better AI/ML ecosystem (the new SDKs ship Python-first), better data tooling (pandas, polars), better PostgreSQL bindings (asyncpg), and Pydantic v2 is the cleanest schema layer in any language. Node where the rest of the team is Node, or where Next.js full-stack is the right answer.

Do you use AI to write the code?

Selectively. AI is fast at boilerplate (CRUD endpoints, migration files, simple tests), and we use it for those. Architecture, security-sensitive code, the prompts themselves, anything with risk — written by Sree directly, reviewed against the patterns above.

What's the most common engineering mistake you see in client code?

Missing idempotency. A worker fails halfway, retries, and now the customer is charged twice / emailed twice / billed twice. The fix is cheap if you design for it; expensive if you don't. We design for it from day 1.

Can we see real production code?

We're publishing 2-3 sanitised demo repos on github.com/sreejagatab — see /credentials for the current public footprint. Live client code is under NDA; the patterns above are the same patterns those repos ship.

Want to walk through an architecture?

Share the workflow or problem you're thinking about. We'll sketch the right architecture honestly, with trade-offs spelled out.

WhatsApp Sree 07864 880790

Engineering at Jagatab.UK

Reference architecture

Deployment pipeline

What runs in CI

What runs at deploy time

Observability — three signals into one pane

What we track by default

AI engineering — how we engineer around hallucination

Why this works

Security model — defence in depth

Code we actually ship

1. Validated, audited API endpoint

2. RAG with confidence routing

3. Idempotent background worker

4. Append-only audit log schema

Where this differs from a WordPress + plugins build

Frequently asked questions

Related reading

Want to walk through an architecture?