A schema-stable substrate that 27 browser demos and 259 AINumbers tools already export to. Every tool emits AP2 v1.0 mandates and ships an MCP tool definition — drop them into a runtime and the policy boundary is mostly done. The agent reads the mandate, the runtime enforces it, the audit log writes itself.
Built for Anthropic · Google AP2 · OpenAI function-calling · Vellum · LangChain · reviewable by anyone shipping an agent that touches payments
Cards are workflow-ordered, not alphabetical. The four core demos build a complete runtime policy stack; the two cross-links anchor it to the regulatory and asset surface the agent will eventually touch.
Visualise any AP2 v1.0 policy mandate, surface its intent envelope, compile to runtime guards, emit MCP tool definitions — entirely client-side. This is the design surface for the runtime team that owns the policy boundary.
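One way to picture "compile to runtime guards" is the minimal sketch below, which turns a mandate-like dictionary into a list of deterministic predicates. The field names (`max_amount`, `allowed_currencies`) and the action shape are hypothetical placeholders for illustration, not the AP2 v1.0 schema.

```python
from typing import Callable

# Hypothetical mandate shape for illustration only; real AP2 v1.0 fields may differ.
mandate = {
    "id": "demo-mandate-1",
    "max_amount": 500.0,
    "allowed_currencies": ["EUR", "USD"],
}

# A guard takes a candidate action and returns (passed, description of the rule).
Guard = Callable[[dict], tuple[bool, str]]


def compile_guards(mandate: dict) -> list[Guard]:
    """Turn each mandate constraint into a deterministic predicate over an action."""
    guards: list[Guard] = []
    if "max_amount" in mandate:
        limit = mandate["max_amount"]
        guards.append(lambda a: (a["amount"] <= limit, f"amount <= {limit}"))
    if "allowed_currencies" in mandate:
        allowed = set(mandate["allowed_currencies"])
        guards.append(
            lambda a: (a["currency"] in allowed, f"currency in {sorted(allowed)}")
        )
    return guards


def is_allowed(action: dict, guards: list[Guard]) -> bool:
    """An action passes only if every compiled guard passes."""
    return all(guard(action)[0] for guard in guards)


guards = compile_guards(mandate)
print(is_allowed({"amount": 120.0, "currency": "EUR"}, guards))
```

Because each guard is a pure function of the action and the mandate, the same input always yields the same decision, which is the property a runtime needs to enforce the mandate deterministically.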
Load any AP2 mandate the showcase emits (AML / BaaS / DORA / Fraud / Stablecoin), queue a candidate agent action, and see the per-step execution trace with each decision and its reasoning. The trace exports as @ainumbers.co/sandbox-trace-v1, the audit artifact that the pre-production EU-AI-Act / DORA Art. 25–26 testing record needs.
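The sandbox flow described above can be sketched roughly as follows. Only the schema identifier `@ainumbers.co/sandbox-trace-v1` comes from the text; the trace fields (`steps`, `check`, `decision`, `reasoning`) and the `limits` section of the mandate are assumptions made for this example.

```python
import json

# Identifier taken from the text; the trace fields below are assumed, not the real schema.
SCHEMA_ID = "@ainumbers.co/sandbox-trace-v1"


def run_sandbox(mandate: dict, action: dict) -> dict:
    """Evaluate one candidate action check by check and record each decision."""
    steps = []
    for name, limit in mandate.get("limits", {}).items():
        value = action.get(name)
        if value is None:
            decision, reason = "deny", f"action has no '{name}' field"
        elif value <= limit:
            decision, reason = "allow", f"{name}={value} within limit {limit}"
        else:
            decision, reason = "deny", f"{name}={value} exceeds limit {limit}"
        steps.append(
            {
                "step": len(steps) + 1,
                "check": name,
                "decision": decision,
                "reasoning": reason,
            }
        )
    # The action is allowed only if every individual step allowed it.
    overall = "allow" if all(s["decision"] == "allow" for s in steps) else "deny"
    return {
        "schema": SCHEMA_ID,
        "mandate_id": mandate.get("id"),
        "action": action,
        "steps": steps,
        "decision": overall,
    }


trace = run_sandbox(
    {"id": "demo-aml-1", "limits": {"amount": 1000, "daily_count": 5}},
    {"amount": 250, "daily_count": 7},
)
print(json.dumps(trace, indent=2))
```

Keeping the reasoning string next to each decision means the exported JSON can be read by a reviewer without re-running the sandbox.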
Compose AP2 mandates for agentic AML: customer risk tiers, transaction-monitoring (TM) rule parameters, sanctions-screening cadence, SAR triggers, and the FATF R.16 travel rule. The composer validates against @ainumbers.co/aml-mandate-v1 and emits an MCP tool definition the runtime can bind to.
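A minimal sketch of the validate-then-emit step might look like this. The required sections in `REQUIRED` and the tool name are hypothetical stand-ins for whatever @ainumbers.co/aml-mandate-v1 actually specifies; the `name` / `description` / `inputSchema` layout follows the general shape of an MCP tool definition.

```python
# Hypothetical required sections, loosely mirroring the topics named above.
REQUIRED = {
    "risk_tiers": list,
    "tm_rules": dict,
    "sanctions_screening_hours": int,
    "sar_triggers": list,
    "travel_rule_threshold": (int, float),
}


def validate_aml_mandate(mandate: dict) -> list[str]:
    """Return a list of problems; an empty list means the mandate passed."""
    errors = []
    for field, expected in REQUIRED.items():
        if field not in mandate:
            errors.append(f"missing field: {field}")
        elif not isinstance(mandate[field], expected):
            errors.append(f"wrong type for {field}")
    return errors


def to_mcp_tool(mandate: dict) -> dict:
    """Wrap a validated mandate as an MCP-style tool definition."""
    errors = validate_aml_mandate(mandate)
    if errors:
        raise ValueError("; ".join(errors))
    return {
        "name": "aml_check_transaction",  # hypothetical tool name
        "description": "Check a transaction against the AML mandate.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                # Only tiers declared in the mandate are accepted as input.
                "customer_tier": {"type": "string", "enum": mandate["risk_tiers"]},
            },
            "required": ["amount", "customer_tier"],
        },
    }
```

Refusing to emit a tool definition for an invalid mandate keeps malformed policy from ever reaching the runtime.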
Programme-policy slots, sponsor-bank approval gates, BIN-tier issuance ceilings, per-transaction caps for each payment rail, and handoff workflows, all composed into a machine-readable AP2 mandate that the runtime enforces and the sponsor bank can audit. Aligned with OCC Bulletin 2023-17.
Maps financial-services AI use cases (which cover most agent runtime products) to the EU AI Act Annex III risk classes. Maximum penalty exposure under the Act is €35M or 7% of global annual turnover, whichever is higher. This is the regulatory boundary the runtime sits inside.
Agents that route tokenised real-world-asset payments need MiCA + Howey classification, pathway verdict, and custody-disclosure visibility before they sign the mandate. This is that check — 8 asset classes × 7 jurisdictions.
Heuristic guardrails fail audit. Bespoke per-tenant policy doesn't scale. An off-the-shelf compliance vendor adds a service call to every agent action — and its output isn't schema-stable enough for the runtime to enforce deterministically.
The AINumbers + Post Oak Labs surface ships ~259 stateless, browser-side tools that already emit AP2 v1.0 mandates, with an MCP tool definition on each. The runtime imports the manifest, the agent reads the mandate, the runtime enforces it. The audit log is the same JSON the agent received. No vendor call. No serialization gap. No bespoke policy DSL.
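To illustrate the claim that the audit log is the same JSON the agent received, here is a small sketch under stated assumptions: the record layout and the `mandate_sha256` field are invented for this example, and the canonical serialization (sorted keys, fixed separators) is one common choice rather than anything the tools are known to use.

```python
import hashlib
import json


def canonical(obj: dict) -> str:
    """Serialize with sorted keys and fixed separators so equal objects give equal text."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def audit_entry(mandate: dict, action: dict, decision: str) -> dict:
    """Build a log record that carries the mandate itself, not a re-encoded summary."""
    digest = hashlib.sha256(canonical(mandate).encode("utf-8")).hexdigest()
    return {
        "mandate": mandate,  # the same object the agent was given
        "mandate_sha256": digest,  # lets an auditor confirm nothing was altered
        "action": action,
        "decision": decision,
    }


entry = audit_entry(
    {"id": "demo-mandate-1", "max_amount": 500.0},
    {"amount": 120.0, "currency": "EUR"},
    "allow",
)
print(canonical(entry))
```

Storing the mandate verbatim alongside a digest means the log needs no translation layer between what the agent read and what the auditor later sees.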
The six demos in this hub walk an agent team through that adoption: design the policy surface, sandbox the mandate, then apply it across the two most-litigated fintech use cases (AML and BaaS). The cross-links pin the policy boundary to its actual regulatory and asset context.
Each demo is a packaged surface; the underlying AINumbers tools are where the runtime team will go to dig in. AP2 schemas and MCP manifests are exposed per-tool.
The Tool Chain Composer renders the AINumbers catalog as a DAG with the 33 showcase demos highlighted. The agentic-runtime cluster is one connected sub-graph inside it.
We design the deterministic AP2 / MCP policy layer that runtimes like this one depend on. If you're putting agents anywhere near money, let's pressure-test your mandate architecture.
Talk to our team →