Engineering at Jagatab.UK
Architecture diagrams. Code snippets. Real production patterns.
Most software you buy from a small agency is templated, brittle, and built without the patterns that make production systems survive contact with real users. This page is the opposite: the actual reference architecture we build on, the deployment pipeline we ship through, the observability we wire in, the AI patterns we use, and the code snippets that ship in real engagements. If you're trying to assess whether we're a real engineering practice or just another freelancer with a template, this page is for you.
Reference architecture
This is the default architecture for a production application — a typed back-end, a server-rendered front-end, a relational store with vector search, an AI layer with confidence routing, and background workers for anything long-running. Not every project uses every tier; this is the menu, not the prescription.
Reference architecture
┌──────────────────────────────────────────────────────────┐
│ Browser / Mobile / Email / SMS / WhatsApp clients │
└──────────────────────┬───────────────────────────────────┘
│ HTTPS, HTTP/2, TLS 1.3
▼
┌──────────────────────────────────────────────────────────┐
│ Edge layer — Cloudflare or Vercel CDN │
│ • Static caching • WAF + bot challenge │
│ • DDoS shield • Image optimisation │
│ • TLS termination • Geo / IP rate-limit │
└──────────────────────┬───────────────────────────────────┘
▼
┌──────────────────────────────────────────────────────────┐
│ Application tier │
│ ┌────────────────────┐ ┌────────────────────────┐ │
│ │ Next.js front │◄──►│ FastAPI back-end │ │
│ │ SSR / ISR / RSC │ │ Python 3.12, async │ │
│ │ Tailwind / shadcn │ │ Pydantic v2 schemas │ │
│ └────────────────────┘ └─────────┬──────────────┘ │
│ │ │
└──────────────────────────────────────┼───────────────────┘
│
┌──────────────────────────────┼──────────────────┐
▼ ▼ ▼
┌────────────────┐ ┌──────────────────┐ ┌────────────────┐
│ PostgreSQL │ │ AI Layer │ │ Background │
│ + pgvector │◄────────┤ GPT-4o / Claude │ │ workers │
│ Audited rows │ │ Local Llama 3 │ │ Celery / RQ │
└────────┬───────┘ │ Confidence- │ └────────┬───────┘
│ │ scored, RAG │ │
▼ └──────────────────┘ ▼
┌────────────────┐ ┌────────────────┐
│ S3 / R2 │ │ Stripe / │
│ file store │ │ Twilio / │
│ encrypted │ │ Postmark │
└────────────────┘ └────────────────┘
Why this shape: it scales from a £5k single-service automation to a £50k multi-tenant SaaS without architectural rewrites. The same audit logging, idempotency, schema validation, and observability patterns apply at every size.
Deployment pipeline
Every commit goes through the same pipeline. No manual ops, no "ship to prod from my laptop". Failed gates block; failed deploys auto-rollback.
CI/CD pipeline
┌────────┐ ┌──────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐ ┌────────┐
│ git │──►│ CI: │──►│ Build │──►│ Staging │──►│ Smoke │──►│ Prod │
│ push │ │ ruff, │ │ next + │ │ deploy │ │ tests │ │ deploy │
│ │ │ pytest, │ │ docker │ │ + seed │ │ + LH │ │ + ISR │
│ │ │ mypy │ │ │ │ data │ │ perf │ │ purge │
└────────┘ └────┬─────┘ └───┬────┘ └────┬─────┘ └────┬────┘ └────┬───┘
│ │ │ │ │
failed= failed= failed= failed= failed=
block PR block PR auto-rollback block prom auto-rollback
+ page on-call
What runs in CI
- ruff — Python lint (fast, opinionated)
- pytest — unit + integration tests with coverage gate (≥80% for changed lines)
- mypy — static type-check on new code
- Playwright — end-to-end tests on critical flows (login, checkout, lead-capture)
- Lighthouse CI — performance budget enforcement (LCP, INP, CLS)
What runs at deploy time
- Staging deploy with seeded data, soak for 2-5 minutes
- Smoke tests against staging (auth, key endpoints, sample workflows)
- Prod deploy with traffic gradually shifted (canary 5% → 25% → 100%) where the infra supports it
- Auto-rollback if error rate or latency exceeds threshold for 60 seconds
Observability — three signals into one pane
You can't fix what you can't see. Every production service ships with logs, metrics, and error tracking from day one — not bolted on after the first incident.
Logs / metrics / errors
┌──────────────┐ logs ┌────────────────────┐
│ Application │──────────────────────►│ CloudWatch / │
│ process │ │ Better Stack │
│ (FastAPI, │ metrics ├────────────────────┤
│ Next.js, │──────────────────────►│ Prometheus / │
│ workers) │ │ Datadog dashboard │
│ │ errors ├────────────────────┤
│ │──────────────────────►│ Sentry │
└──────────────┘ └─────────┬──────────┘
│ webhook
▼
┌────────────────────┐
│ Slack #alerts │
│ PagerDuty on-call │
└────────────────────┘
What we track by default
- SLI metrics — request latency (p50, p95, p99), error rate, throughput
- Business KPIs — sign-ups, enquiries, conversions, AI calls per tenant
- Cost signals — OpenAI / Claude token spend per tenant, DB CPU, S3 egress
- Security events — failed auth, rate-limit hits, suspicious patterns
Care plans (see care plans) include the dashboards, alerting, and the on-call response when something fires.
AI engineering — how we engineer around hallucination
The hard part of shipping AI isn't calling the model. It's engineering for the times the model is wrong — and it will be wrong. We use retrieval-augmented generation (RAG) with confidence scoring and explicit human escalation. Demos hide this; production needs it.
RAG flow with confidence routing
┌──────────────┐
│ User query │
└──────┬───────┘
▼
┌──────────────┐ ┌─────────────────┐
│ Embed query │──────►│ pgvector ANN │
│ (OpenAI emb) │ │ k=5 retrieve │
└──────────────┘ └────────┬────────┘
▼
┌─────────────────┐
│ Cross-encoder │
│ re-rank (top-3) │
└────────┬────────┘
▼
┌─────────────────────────────┐
│ LLM call with system prompt:│
│ "Answer ONLY from │
│ context. Cite chunk_id. │
│ Say 'I don't know' if │
│ context insufficient." │
└────────┬────────────────────┘
▼
┌─────────────────┐
│ Confidence │ <0.6 ──► human escalation
│ score answer │
└────────┬────────┘
▼ ≥0.6
┌─────────────────┐
│ Return answer │
│ + cited sources │
└─────────────────┘
Why this works
- Grounded answers only. The model is instructed to use only the retrieved context. If context is insufficient, it says "I don't know" instead of inventing.
- Citations enforced. Every answer points to the chunk it came from. Users can verify. Auditors can replay.
- Confidence routing. Low-confidence answers escalate to a human with full context. No silent failures.
- Tenant-scoped retrieval. Vector search is filtered by tenant — never leaks data across customers.
Security model — defence in depth
Every layer assumes the layer above it was compromised. Perimeter alone is not enough; identity alone is not enough; data alone is not enough.
Defence-in-depth
Perimeter ──┐
┌─────────────────────────────────┼──────────────────────────────────┐
│ • Cloudflare WAF, bot challenge, DDoS, rate-limit │
│ • TLS 1.3 required, HSTS preload, HTTPS-only cookies │
└─────────────────────────────────┬──────────────────────────────────┘
▼
Identity ────┐
┌─────────────────────────────────┼──────────────────────────────────┐
│ • OAuth / OIDC (Google, Microsoft, magic-link) │
│ • Per-session CSRF tokens, SameSite=Lax cookies │
│ • Roles + per-endpoint scope checks (least-privilege) │
└─────────────────────────────────┬──────────────────────────────────┘
▼
Data ────────┐
┌─────────────────────────────────┼──────────────────────────────────┐
│ • Encrypted at rest (KMS), in transit (TLS), in backup │
│ • PII column-level encryption, GDPR retention windows │
│ • Tenant isolation in DB + S3 prefix scoping │
└─────────────────────────────────┬──────────────────────────────────┘
▼
Audit ───────┐
┌─────────────────────────────────┼──────────────────────────────────┐
│ • Append-only audit log, 13-month retention │
│ • Sensitive actions double-logged + Slack alert │
│ • Quarterly access review, annual pen-test (where contracted) │
└────────────────────────────────────────────────────────────────────┘
For UK GDPR-sensitive engagements we additionally provide: DPA documentation, data inventory mapping, retention policy enforcement, subject access request endpoints, and erasure workflows. See security & compliance for the full posture.
Code we actually ship
Four patterns that show up in nearly every engagement. These aren't pasted from blog posts — they're sanitised versions of code that runs in production.
1. Validated, audited API endpoint
Every endpoint validates input with Pydantic, requires authentication, writes an audit log entry, and queues background work where appropriate. No raw dictionaries, no silent failures, no missing audit trail.
python
from fastapi import FastAPI, Depends, HTTPException from pydantic import BaseModel, Field, EmailStr app = FastAPI() class EnquiryIn(BaseModel): name: str = Field(min_length=2, max_length=100) email: EmailStr phone: str | None = Field(default=None, max_length=20) message: str = Field(min_length=10, max_length=2000) class EnquiryOut(BaseModel): id: int status: str class="n">@app.post(class="s">"/api/enquiries", response_model=EnquiryOut, status_code=201) async def create_enquiry( payload: EnquiryIn, user=Depends(require_auth), db=Depends(get_db), ): class="s">""class="s">"Validated, audited, idempotent endpoint."class="s">"" enquiry = await db.enquiries.create( tenant_id=user.tenant_id, actor_id=user.id, **payload.model_dump(), ) await audit_log(db, action=class="s">"enquiry.created", entity_id=enquiry.id, actor=user) await enqueue_lead_qualification(enquiry.id) return EnquiryOut(id=enquiry.id, status=class="s">"received")
2. RAG with confidence routing
Vector search, tenant-scoped retrieval, grounded LLM call, confidence scoring, human escalation. The whole flow in one async function.
python
async def answer_with_grounding(question: str, tenant_id: int) -> RagAnswer: class="s">""class="s">"RAG with confidence routing and escalation."class="s">"" embedding = await openai.embeddings.create( model=class="s">"text-embedding-3-large", input=question, ) class="c"># Vector search scoped to the tenant — never leak cross-tenant data chunks = await db.fetch(class="s">""" SELECT id, content, source_url FROM knowledge_chunks WHERE tenant_id = $1 ORDER BY embedding <-> $2 LIMIT 5 class="s">""class="s">", tenant_id, embedding.data[0].embedding) if not chunks: return RagAnswer(text=None, citations=[], confidence=0.0, escalated=True, reason="no_contextclass="s">") response = await openai.chat.completions.create( model="gpt-4oclass="s">", messages=[ {"roleclass="s">": "systemclass="s">", "contentclass="s">": GROUNDED_PROMPT}, {"roleclass="s">": "userclass="s">", "content": format_with_context(question, chunks)}, ], temperature=0.1, ) answer = parse_answer(response) class="c"># extracts text + cited chunk_ids score = score_confidence(answer, chunks) class="c"># heuristic + cross-encoder if score < 0.6: await escalate_to_human(question, answer, chunks) return RagAnswer(text=None, citations=[], confidence=score, escalated=True, reason=class="s">"low_confidence") return RagAnswer( text=answer.text, citations=answer.cited_chunks, confidence=score, escalated=False, )
3. Idempotent background worker
Retried safely. Locked while processing. Audit-logged on completion. The pattern that prevents customers being charged twice when a webhook retries.
python
from celery import Celery from tenacity import retry, stop_after_attempt, wait_exponential celery = Celery(class="s">"workers", broker=class="s">"redis://localhost:6379/0") class="n">@celery.task(bind=True, max_retries=5) class="n">@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=2, max=30)) async def qualify_lead(self, enquiry_id: int) -> None: class="s">""class="s">"Idempotent: safe to retry. Writes audit-logged outcome."class="s">"" enquiry = await db.enquiries.get_or_lock(enquiry_id) if enquiry.status != class="s">"received": return class="c"># already processed — idempotency enriched = await enrich_with_clearbit(enquiry.email) score = await score_with_llm(enquiry.message, enriched) routing = decide_route(score, enriched.company_size) async with db.transaction(): await db.enquiries.update(enquiry_id, status=class="s">"qualified", score=score, route=routing) await audit_log(action=class="s">"lead.qualified", entity_id=enquiry_id, score=score, route=routing) await notify_team(routing.team_id, enquiry, score)
4. Append-only audit log schema
The SQL that makes the audit trail untamperable. UPDATE and DELETE permissions revoked at the database level — not enforced in application code where a future bug could slip past.
sql
CREATE TABLE audit_log ( id class="k">BIGSERIAL PRIMARY KEY, ts TIMESTAMPTZ NOT class="k">NULL DEFAULT now(), tenant_id class="k">BIGINT NOT class="k">NULL, actor_id class="k">BIGINT, actor_kind TEXT NOT class="k">NULL, class="c">-- class="s">"user" | class="s">"service" | class="s">"system" action TEXT NOT class="k">NULL, class="c">-- e.g. class="s">"enquiry.created" entity_kind TEXT, entity_id class="k">BIGINT, payload JSONB NOT class="k">NULL DEFAULT class="s">'{}', ip INET, request_id UUID, CONSTRAINT audit_log_no_update CHECK (true) NO class="k">INHERIT ); class="c">-- Append-only enforced at DB level: no UPDATE or DELETE permissions granted. REVOKE UPDATE, DELETE ON audit_log FROM PUBLIC; CREATE INDEX audit_log_tenant_ts_idx ON audit_log (tenant_id, ts DESC); CREATE INDEX audit_log_action_idx ON audit_log (action);
Where this differs from a WordPress + plugins build
| Typical WP + plugins | Engineered application | |
|---|---|---|
| Schema validation | Often missing — plugins accept whatever | Pydantic / Zod on every input |
| Audit trail | Plugin-dependent, often empty | Append-only, DB-enforced |
| Idempotency | Rare — webhooks retry → duplicates | Default — safe retries |
| Observability | Server logs only, no business metrics | Logs + metrics + errors + KPIs |
| Deploy pipeline | FTP / file manager / manual | Git-driven, gated, auto-rollback |
| Security patches | Manual, often skipped | Dependabot + monthly review |
| Custom workflow | Heavy plugin or workaround | First-class code, owned by you |
| AI integration | Third-party widget, no grounding | Grounded RAG, confidence-routed |
| 5-year TCO | Hosting + plugins + fix-it bills | Hosting + flat care plan |
WordPress is not bad. WordPress + 30 plugins held together with shortcodes is bad, and it's what most SMEs end up with when their site grows beyond a brochure. Engineered applications cost more upfront and less over the lifetime.
Frequently asked questions
Do you work this way for £5k projects too, or only enterprise?
The same patterns scale down. A £5k automation might skip the worker tier and ship as a single FastAPI service, but it still has typed schemas, audit logging, deploy pipeline, and observability. The discipline is the same; the surface area is smaller.
What if our existing stack doesn't fit your reference architecture?
The reference is a default, not a religion. We've integrated with PHP / Laravel back-ends, .NET on Azure, Ruby on Rails, Node legacy systems. The patterns (audit log, idempotent workers, grounded AI, observability) port to any stack.
Why Python instead of Node for the back-end?
Python because: better AI/ML ecosystem (the new SDKs ship Python-first), better data tooling (pandas, polars), better PostgreSQL bindings (asyncpg), and Pydantic v2 is the cleanest schema layer in any language. Node where the rest of the team is Node, or where Next.js full-stack is the right answer.
Do you use AI to write the code?
Selectively. AI is fast at boilerplate (CRUD endpoints, migration files, simple tests), and we use it for those. Architecture, security-sensitive code, the prompts themselves, anything with risk — written by Sree directly, reviewed against the patterns above.
What's the most common engineering mistake you see in client code?
Missing idempotency. A worker fails halfway, retries, and now the customer is charged twice / emailed twice / billed twice. The fix is cheap if you design for it; expensive if you don't. We design for it from day 1.
Can we see real production code?
We're publishing 2-3 sanitised demo repos on github.com/sreejagatab — see /credentials for the current public footprint. Live client code is under NDA; the patterns above are the same patterns those repos ship.
Related reading
Want to walk through an architecture?
Share the workflow or problem you're thinking about. We'll sketch the right architecture honestly, with trade-offs spelled out.