Architecture deep-dive
How Orectic and Penumbra each wire vector stores, knowledge graphs, and agents together. They reach the same destination (typed structure under an LLM) from opposite directions. ASCII diagrams below.
Two bets on where structure comes from
Both companies agree that stuffing chunks into an LLM prompt does not scale — you need typed structure. They disagree on where that structure is born.
Orectic — machine extracts the schema
Ingest 17 source types → an extraction engine derives entities and relationships → builds a knowledge graph (748 relationships from one client, per their site) + vector store → an autonomous Oracle agent runs on top. Shipped product, starting at $1,500/mo.
Penumbra — humans declare the schema
Your team writes the ontology — objects, rules, workflows, standards — in plain language. Penumbra emits domain objects, agent tools, APIs, extraction, memory, guardrails, provenance, review. Your agents consume it. Research preview, 2026.
Orectic — extract, then act
Many input types funnel into a single extraction engine, which builds the structured layer underneath an autonomous Oracle agent. The graph schema is inferred, not declared.
ORECTIC — BOTTOM-UP EXTRACTION
┌──────────────────────────────────────────────────────────────────────────┐
│ FILES & DATA │
│ │
│ calls docs video contracts slack email spreadsheets │
│ PDFs CRM tickets meetings notes forms recordings │
│ ... 17 source types in total │
└────────────────────────────────────┬─────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────────┐
│ EXTRACTION ENGINE │
│ │
│ audio/video → transcription → NER → relation extraction │
│ text → chunking → embedding → entity resolution → linking │
│ tabular → schema inference → typed records │
└────────────────────────────────────┬─────────────────────────────────────┘
│ derives
┌──────────────────────┴──────────────────────┐
▼ ▼
┌─────────────────────────────┐ ┌─────────────────────────────┐
│ KNOWLEDGE GRAPH │ │ VECTOR STORE │
│ │ │ │
│ entities + typed edges │ │ embedded chunks │
│ "748 relationships from a │ │ for fuzzy retrieval │
│ single client" │ │ alongside the graph │
└─────────────────┬───────────┘ └───────────────┬─────────────┘
│ │
└─────────────────────┬─────────────────────┘
▼
┌──────────────────────────────────────────────────────────────────────────┐
│ ORACLE AGENT │
│ │
│ answers questions · makes decisions · takes actions │
│ uses both the graph (precision) and the vector store (recall) │
└──────────────────────────────────────────────────────────────────────────┘
│
▼
autonomous worker, ready to deploy
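To make "inferred, not declared" concrete, here is a minimal sketch of deriving a schema from extracted facts. It is not Orectic's implementation, which is not public; the triple format, the sample data, and the `infer_schema` helper are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical extracted facts: (subject, subject_type, relation, object, object_type).
# In a real pipeline these would come from NER and relation extraction.
TRIPLES = [
    ("Acme", "Company", "signed", "MSA-2024", "Contract"),
    ("Acme", "Company", "employs", "Dana", "Person"),
    ("Dana", "Person", "attended", "Kickoff call", "Meeting"),
    ("Globex", "Company", "signed", "NDA-7", "Contract"),
]


def infer_schema(triples):
    """Derive entity types and edge types from the data alone.

    Nothing is declared up front: whatever types show up in the
    extracted facts become the schema.
    """
    entity_types = Counter()  # entity type -> number of distinct entities
    edge_types = Counter()    # (subject type, relation, object type) -> count
    seen_entities = set()
    for subj, subj_type, rel, obj, obj_type in triples:
        for name, etype in ((subj, subj_type), (obj, obj_type)):
            if (name, etype) not in seen_entities:
                seen_entities.add((name, etype))
                entity_types[etype] += 1
        edge_types[(subj_type, rel, obj_type)] += 1
    return entity_types, edge_types


entity_types, edge_types = infer_schema(TRIPLES)
for edge, count in edge_types.items():
    print(edge, count)
```

The schema here is only as good as the facts that were extracted: a rule that never appears in a file never becomes an edge type.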
Key design choices
- Schema is inferred, not declared — extraction picks entity types and edge types from the data. Trade-off: faster to start, harder to enforce business invariants.
- One agent, one product surface — the Oracle is the customer-facing thing. You don't compose your own agents on top.
- Both retrieval styles built in — graph for precision, vector for recall. The Oracle decides which to use per query.
- Priced as a finished product: the $1,500/mo starting price suggests a productized, packaged offering.
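The "graph for precision, vector for recall" split can be sketched as a small router. How the Oracle actually chooses between the two is not documented, so the routing rule below (try an exact graph match first, fall back to nearest-chunk search) is an assumption, as are the toy data and the bag-of-words `embed` function standing in for a real embedding model.

```python
import math
from collections import Counter

# Toy knowledge graph: (entity, relation) -> target. Hypothetical data.
GRAPH = {
    ("acme", "signed"): "MSA-2024",
    ("dana", "works_at"): "Acme",
}

# Toy vector store contents. Hypothetical data.
CHUNKS = [
    "Acme asked for a discount during the renewal call",
    "Dana raised concerns about the onboarding timeline",
]


def embed(text):
    """Stand-in for a real embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def answer(query):
    tokens = query.lower().split()
    # Graph path: exact match on a known (entity, relation) pair.
    for (entity, relation), target in GRAPH.items():
        if entity in tokens and relation in tokens:
            return ("graph", target)
    # Vector path: nearest chunk by cosine similarity.
    q = embed(query)
    best = max(CHUNKS, key=lambda chunk: cosine(q, embed(chunk)))
    return ("vector", best)


print(answer("what has acme signed"))
print(answer("who was worried about onboarding"))
```

The first query names an entity and a relation the graph knows, so it gets an exact answer. The second has no such match and falls back to the fuzzy path.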
Penumbra — declare, then build
One declared domain model fans out into the operational components agents need. The schema is authored, then the rest of the substrate is generated from it.
PENUMBRA — TOP-DOWN ONTOLOGY
┌──────────────────────────────────────────────────────────────────────────┐
│ DOMAIN MODEL (your team writes this in plain language) │
│ │
│ objects: Customer, Engagement, Deliverable, Decision, ... │
│ rules: "RFP responses must cite prior decisions" │
│ workflows: intake → triage → expert review → response │
│ standards: what "done" means for each deliverable │
└────────────────────────────────────┬─────────────────────────────────────┘
│ generates
▼
┌──────────────────────────────────────────────────────────────────────────┐
│ GENERATED COMPONENTS │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ objects │ │ tools │ │ APIs │ │ extraction │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ memory │ │guardrails│ │provenance│ │ review │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
└────────────────────────────────────┬─────────────────────────────────────┘
│ consumed by
▼
┌──────────────────────────────────────────────────────────────────────────┐
│ YOUR AGENTS & APPS │
│ │
│ built on the typed substrate · act on real business objects │
│ not your problem to build memory, provenance, guardrails from scratch │
└──────────────────────────────────────────────────────────────────────────┘
│
▼
substrate for what you build
Key design choices
- Schema is authored, not extracted — your experts describe how the business actually works. Trade-off: slower to start, captures tacit knowledge that no document holds.
- Platform, not product — Penumbra doesn't ship an agent for you. You ship agents on top of it.
- Two surfaces — Use Penumbra (solve one workflow) for service firms; Build on Penumbra (platform) for product builders.
- Consultative GTM — "book a working session" rather than self-serve pricing. Research preview framing.
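The fan-out from one declared model to several generated components can be sketched as follows. Penumbra's real input format is plain language and its generator is not public, so the `ObjectType` and `Rule` classes, the `generate_tool` and `generate_guardrail` helpers, and the simplification of applying the RFP rule to every `Deliverable` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ObjectType:
    name: str
    fields: dict  # field name -> Python type


@dataclass
class Rule:
    applies_to: str
    description: str
    check: Callable[[dict], bool]


def generate_tool(obj: ObjectType) -> dict:
    """Emit an agent tool description from the declared object."""
    return {
        "name": f"create_{obj.name.lower()}",
        "parameters": {fname: ftype.__name__ for fname, ftype in obj.fields.items()},
    }


def generate_guardrail(obj: ObjectType, rules: list) -> Callable[[dict], list]:
    """Emit a validator that enforces field types and declared rules."""
    relevant = [r for r in rules if r.applies_to == obj.name]

    def validate(record: dict) -> list:
        violations = []
        for fname, ftype in obj.fields.items():
            if not isinstance(record.get(fname), ftype):
                violations.append(f"{fname} must be {ftype.__name__}")
        violations += [r.description for r in relevant if not r.check(record)]
        return violations

    return validate


# The declared model: one object and one rule, written once.
DELIVERABLE = ObjectType("Deliverable", {"title": str, "cited_decisions": list})
RULES = [
    Rule(
        applies_to="Deliverable",
        description="RFP responses must cite prior decisions",
        check=lambda rec: len(rec.get("cited_decisions") or []) > 0,
    )
]

# Two components generated from the same declaration.
tool = generate_tool(DELIVERABLE)
validate = generate_guardrail(DELIVERABLE, RULES)

print(tool)
print(validate({"title": "Q3 RFP", "cited_decisions": []}))
```

Because both the tool and the guardrail come from the same declaration, changing a field or a rule in one place updates every component that depends on it.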
Side-by-side
Both diagrams flow top-to-bottom on the page, but the semantic direction is reversed. Orectic is wide-at-top (many inputs) narrowing to one agent. Penumbra is narrow-at-top (one ontology) fanning out to many components.
ORECTIC PENUMBRA
────── ────────
┌──────────────────────┐ ┌──────────────────────┐
│ 17 file types │ │ domain model │
│ calls,docs,video... │ │ objects,rules,... │
└──────────┬───────────┘ └──────────┬───────────┘
│ │ generates
▼ ▼
┌──────────────────────┐ ┌─────────────────────────┐
│ extraction engine │ │ ┌─────┐ ┌─────┐ ┌─────┐ │
│ infers structure │ │ │objs │ │tools│ │APIs │ │
└──────────┬───────────┘ │ └─────┘ └─────┘ └─────┘ │
│ │ ┌─────┐ ┌─────┐ ┌─────┐ │
┌──────────┴───────────┐ │ │mem │ │guard│ │prov │ │
▼ ▼ │ └─────┘ └─────┘ └─────┘ │
┌────┐ ┌────────┐ └──────────┬──────────────┘
│ KG │ │ vector │ │
└──┬─┘ └────┬───┘ ▼
└──────────┬───────────┘ ┌──────────────────────┐
▼ │ your agents/apps │
┌──────────────────────┐ │ ground in real domain│
│ Oracle agent │ └──────────────────────┘
│ answers + acts │
└──────────────────────┘
many → one one → many
"we deliver a finished agent" "we deliver the substrate
you build on"
The actual difference
| Dimension | Orectic | Penumbra |
|---|---|---|
| Schema source | inferred by extraction engine | authored by your team |
| Flow direction | files → structure → agent | ontology → components → agents |
| Shape | funnel (many → one) | fan-out (one → many) |
| Product shape | finished Oracle agent | substrate for your own agents |
| Where value lives | in files already on disk | in expert judgment, tacit rules |
| Pricing | from $1,500/mo, public | not public, working sessions |
| Maturity signal | priced + productized | research preview, 2026 |
| Fails when | important truth never written down | nobody has time to model |
Where each one fits in your AI stack
Both products sit between your raw data and your application surfaces. They occupy roughly the same slot — the difference is what they hand off.
THE AI STACK (with both products)
┌──────────────────────────────────────────────────────────────────────────┐
│ application surfaces │
│ chat UI · approval flows · dashboards · internal tools │
└────────────────────────────────────▲─────────────────────────────────────┘
│
consume from below
│
┌────────────────────────────────────┴─────────────────────────────────────┐
│ ORECTIC: an Oracle agent │ PENUMBRA: typed domain substrate │
│ ───────────────────────── │ ────────────────────────────── │
│ one agent endpoint │ objects, tools, APIs, memory, │
│ answers + acts │ guardrails, provenance │
└────────────────────────────────────▲─────────────────────────────────────┘
│
both build on
│
┌────────────────────────────────────┴─────────────────────────────────────┐
│ retrieval primitives │
│ vector store · knowledge graph · semantic search │
└────────────────────────────────────▲─────────────────────────────────────┘
│
┌────────────────────────────────────┴─────────────────────────────────────┐
│ raw data sources │
│ files · calls · email · slack · spreadsheets · contracts · video │
└──────────────────────────────────────────────────────────────────────────┘
A mature org will eventually want both — extraction to capture what's already written, declared ontology to capture what experts know. Today they're separate companies betting on which half is the bottleneck.
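One way the two halves might meet is to check extracted edges against a declared ontology and route anything unrecognized to human review. Neither company describes such an integration, so this is purely an illustrative sketch; the edge tuples and the `reconcile` function are hypothetical.

```python
# Edge types the team has declared: (subject type, relation, object type).
DECLARED_EDGES = {
    ("Customer", "owns", "Engagement"),
    ("Engagement", "produces", "Deliverable"),
}

# Edge types an extraction engine proposed from files.
EXTRACTED_EDGES = [
    ("Customer", "owns", "Engagement"),
    ("Engagement", "produces", "Deliverable"),
    ("Customer", "complained_about", "Deliverable"),
]


def reconcile(extracted, declared):
    """Split extracted edges into those the ontology already covers
    and those an expert should review (and perhaps add to the model)."""
    accepted = [edge for edge in extracted if edge in declared]
    needs_review = [edge for edge in extracted if edge not in declared]
    return accepted, needs_review


accepted, needs_review = reconcile(EXTRACTED_EDGES, DECLARED_EDGES)
print("accepted:", accepted)
print("needs review:", needs_review)
```

Extraction surfaces what is written down, and the review queue is where expert judgment decides whether it belongs in the declared model.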
Stack & conventions
This site uses the same single-file HTML pattern as the other three. Common CSS variables, ASCII diagrams with highlight spans, sticky sidebar nav.
Deploy & run
CLOUDFLARE_ACCOUNT_ID=691fe25d377abac03627d6a88d3eeac9 \
  wrangler pages project create orectic-penumbra-arch \
  --production-branch main 2>/dev/null || true
cd docs/arch
CLOUDFLARE_ACCOUNT_ID=691fe25d377abac03627d6a88d3eeac9 \
  wrangler pages deploy . \
  --project-name orectic-penumbra-arch \
  --branch main \
  --commit-dirty=true