SHAPE · 02 · RAG
RAG pipelines.
Retrieval-augmented generation that holds up when the corpus changes, when the questions get adversarial, and when the team stops watching. Eval harness from day one. No vibes.
- ship: 4–6 weeks
- best for: docs, contracts, knowledge bases
- engine: Claude · embeddings · reranker
- handoff: repo · keys · evals · ingest pipeline
the spec
What we actually ship.
01 / 04
01 · RAG
Ingest that scales
Chunking, embeddings, metadata — we design for your corpus shape, not a generic PDF pipeline.
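A minimal sketch of the chunking step, assuming plain text and fixed-size character windows with overlap. The `Chunk` fields, `chunk_text`, and the default sizes are hypothetical names and values for illustration; a real corpus may call for splitting on headings, clauses, or tickets instead.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str  # hypothetical identifier for the source document
    index: int   # position of the chunk within its document
    start: int   # character offset where the chunk begins
    text: str


def chunk_text(doc_id: str, text: str, size: int = 800, overlap: int = 100) -> list[Chunk]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    chunks = []
    step = size - overlap
    for index, start in enumerate(range(0, len(text), step)):
        chunks.append(Chunk(doc_id, index, start, text[start:start + size]))
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

Keeping `doc_id` and `start` on every chunk is what lets an answer point back to the exact place in the source later.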
02 · RAG
Retrieval that's tested
Golden-set evals before the first demo. You see recall and precision numbers, not adjectives.
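For a sense of what those numbers look like, here is a sketch of precision@k and recall@k over a golden set. The golden-set shape (a `question` plus a set of `relevant` document ids) and the `retrieve` callable are assumptions for illustration, not a fixed format.

```python
from typing import Callable


def precision_recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> tuple[float, float]:
    """Precision and recall over the top-k retrieved ids (ids assumed unique)."""
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate(golden: list[dict], retrieve: Callable[[str], list[str]], k: int = 5) -> dict[str, float]:
    """Average precision@k and recall@k across every golden-set case."""
    if not golden:
        return {"precision": 0.0, "recall": 0.0}
    pairs = [
        precision_recall_at_k(retrieve(case["question"]), case["relevant"], k)
        for case in golden
    ]
    return {
        "precision": sum(p for p, _ in pairs) / len(pairs),
        "recall": sum(r for _, r in pairs) / len(pairs),
    }
```

Run it against a stub retriever first; the harness should work before the retriever does.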
03 · RAG
Grounded answers
Claude answers with citations back to the source. When it doesn't know, it says so.
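One way to enforce that, sketched without any API call: number the retrieved passages in the prompt, ask for `[n]` citations, and check the answer before it reaches a user. `build_prompt`, `check_citations`, the `[n]` marker format, and the `NO_ANSWER` string are all hypothetical choices for this example.

```python
import re

# Hypothetical abstention string the model is asked to return verbatim.
NO_ANSWER = "I don't know based on the provided sources."


def build_prompt(question: str, passages: list[str]) -> str:
    """Number each passage so the answer can cite it as [n]."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below. Cite each claim as [n]. "
        f"If the sources do not contain the answer, reply exactly: {NO_ANSWER}\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


def check_citations(answer: str, num_sources: int) -> bool:
    """True if the answer abstains, or cites at least one source and only valid ones."""
    if answer.strip() == NO_ANSWER:
        return True
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return bool(cited) and all(1 <= n <= num_sources for n in cited)
```

The check only confirms that citations exist and point at real sources. Whether the cited passage actually supports the claim is a job for the eval harness.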
04 · RAG
Re-ingest on a schedule
Corpora change. The pipeline knows it. Scheduled re-indexing, delta evals, alerts on drift.
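A sketch of the delta step, assuming each document is tracked by a content hash stored at the last index run. `plan_reingest`, `drift_alert`, and the 0.05 tolerance are hypothetical names and values; the right threshold depends on how noisy your golden set is.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reingest(indexed: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare stored hashes (doc_id -> hash) with current texts (doc_id -> text)."""
    current_hashes = {doc_id: content_hash(text) for doc_id, text in current.items()}
    return {
        "add": sorted(d for d in current_hashes if d not in indexed),
        "update": sorted(
            d for d in current_hashes if d in indexed and indexed[d] != current_hashes[d]
        ),
        "delete": sorted(d for d in indexed if d not in current_hashes),
    }


def drift_alert(baseline_recall: float, latest_recall: float, tolerance: float = 0.05) -> bool:
    """Flag a re-index whose recall dropped by more than `tolerance`."""
    return baseline_recall - latest_recall > tolerance
```

Only the documents in `add` and `update` need re-embedding, which keeps scheduled runs cheap on a mostly stable corpus.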
architecture
How it fits together.
The skeleton we reach for first. We bend it to your stack; we don't bend your stack to it.
02 / 04
┌───────────┐            ┌─────────────┐
│  CORPUS   │   ────▶    │ CHUNK+EMBED │
│ (s3/docs) │            │  pipeline   │
└───────────┘            └──────┬──────┘
                                ▼
                         ┌────────────┐
                         │ VECTOR DB  │
                         │ + keyword  │
                         └──────┬─────┘
[ user query ] ──▶ [ RERANK ] ──┘
                       │
                       ▼
                ┌──────────────┐      ┌──────────────┐
                │  CLAUDE API  │ ◀──▶ │ EVAL HARNESS │
                │ cite+answer  │      │  golden-set  │
                └──────────────┘      └──────────────┘
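The diagram pairs vector search with keyword search ahead of the reranker. One common way to merge the two result lists is reciprocal rank fusion, sketched here as an assumption about how that stage could work rather than a description of the shipped pipeline. The function name and the constant `k = 60` (a conventional default) are illustrative.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked id lists; each appearance contributes 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]   # hypothetical ids from the vector index
keyword_hits = ["b", "d", "a"]  # hypothetical ids from keyword search
candidates = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

The fused `candidates` list is what would go to the reranker, which then picks the handful of passages that reach the model.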
fit
Who this is for.
03 / 04
FIT · YES
You are —
- Sitting on a real corpus (docs, contracts, tickets, case law) with people asking it questions all day.
- Unable to tolerate hallucinations. Citations are table stakes.
- Set on keeping the eval harness in your repo, not ours.
FIT · NO
You are not —
- A team that wants a chatbot sitting on top of a wiki with no eval plan.
- Working with a corpus smaller than ~100 documents. Just put them in the context window.
- Fine with 'close enough'. RAG is not the right shape.
→ open a line
Book a call.
30 minutes. We'll tell you if RAG is the right shape, or if it isn't.
04 / 04