Native DRE

01 · residual

Live

Residual Injection (self-hosted)

For models where DRE owns the inference path, a compiled steering vector is added to the residual stream at a chosen layer, scaled by a coefficient alpha. The vector is not a universal abstraction: it requires model-specific evals, layer selection, alpha tuning, and a contrast-pair corpus per policy. The injection is feasible via PyTorch forward hooks or llama.cpp hidden-state access.
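As a rough illustration of the vector-add described above, the sketch below uses a toy stack of layers over plain Python lists rather than a real model. The names `make_steering_hook` and `forward`, the toy layers, and the example values are all assumptions made for illustration. In a real deployment the same arithmetic would sit inside a framework hook on the chosen layer.

```python
from typing import Callable, List, Optional

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_steering_hook(steering: Vector, alpha: float) -> Callable[[Vector], Vector]:
    """Return a hook that adds alpha * steering to a hidden state (hypothetical helper)."""
    def hook(hidden: Vector) -> Vector:
        if len(hidden) != len(steering):
            raise ValueError("steering vector must match the hidden size")
        return [h + alpha * s for h, s in zip(hidden, steering)]
    return hook


def forward(layers: List[Layer], x: Vector,
            hook_layer: Optional[int] = None,
            hook: Optional[Callable[[Vector], Vector]] = None) -> Vector:
    """Run the toy layers in order, applying the hook to the output of one layer."""
    hidden = x
    for i, layer in enumerate(layers):
        hidden = layer(hidden)
        if hook is not None and i == hook_layer:
            hidden = hook(hidden)  # the single vector-add per forward pass
    return hidden


# Toy "model": three layers that each double the hidden state.
layers = [lambda h: [2.0 * v for v in h] for _ in range(3)]
hook = make_steering_hook(steering=[1.0, 0.0], alpha=0.5)

baseline = forward(layers, [1.0, 1.0])
steered = forward(layers, [1.0, 1.0], hook_layer=1, hook=hook)
print(baseline, steered)
```

The hook only shifts the hidden state at one layer; everything downstream of that layer sees the shifted value, which is why layer choice and alpha need per-model tuning.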

Enforcement surface

Hidden-state hook at a target layer

Per request

One vector-add inside the forward pass

Strengths

Low marginal per-request overhead above base inference
Runs entirely inside the enterprise VPC
Deterministic given a fixed model, policy text, and compile config

Tradeoffs

Requires an open-weight model and a self-hosted inference path
Adapter is model-family specific (arch × tokenizer)
Vector quality depends on per-model evals, not on policy.md alone

Models

Llama 3.x (llama.cpp)
Qwen 2.5
Mistral
Gemma

02 · vertex

Live

Vertex Gemini Reasoner

For customers running Gemini on Vertex, DRE compiles policy.md into a Vertex guardrail config plus a structured reasoning contract the model is asked to satisfy. The DRE judge checks the contract output before any tool call is admitted.
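One way to picture the judge step is as a gate that inspects the contract output and only then admits the tool call. The sketch below is a minimal illustration under assumptions: the field names in `REQUIRED_FIELDS`, the `Verdict` type, and the clause identifiers are hypothetical and not taken from DRE or Vertex.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical contract fields the model is asked to return with a tool call.
REQUIRED_FIELDS = ("claim", "policy_clause", "justification")


@dataclass
class Verdict:
    admitted: bool
    reasons: List[str]


def judge_contract(contract: Dict[str, str], known_clauses: Set[str]) -> Verdict:
    """Admit the tool call only if every field is present and the clause is known."""
    reasons: List[str] = []
    for name in REQUIRED_FIELDS:
        if not str(contract.get(name, "")).strip():
            reasons.append(f"missing field: {name}")
    clause = contract.get("policy_clause")
    if clause and clause not in known_clauses:
        reasons.append(f"unknown policy clause: {clause}")
    return Verdict(admitted=not reasons, reasons=reasons)


clauses = {"refunds.small"}
ok = judge_contract(
    {"claim": "refund under limit",
     "policy_clause": "refunds.small",
     "justification": "order total is below the limit"},
    clauses,
)
bad = judge_contract({"claim": "refund", "policy_clause": "refunds.large"}, clauses)
print(ok.admitted, bad.admitted, bad.reasons)
```

A real judge would also check the content of the claim against the policy text; this sketch only checks the shape of the contract.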

Enforcement surface

Vertex guardrail + structured reasoning contract

Per request

Guardrail eval during a single inference

Strengths

Low latency inside Google Cloud deployments
Native data residency per region
Works without model surgery

Tradeoffs

Bounded by Vertex guardrail tier pricing
Contract enforcement is softer than a hidden-state hook

Models

Gemini 2.x Pro
Gemini 2.x Flash

03 · bedrock

Beta

Bedrock Guardrail

DRE compiles policy.md into a Bedrock guardrail config plus a system-prompt digest. The guardrail evaluates input and output; the digest anchors the model's behavior. The audit chain links action → claim → policy text + digest hash, not a residual vector.
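The audit chain described above can be sketched as a record that ties an action and its claim to hashes of the policy text and the system-prompt digest. This is an illustrative sketch only: the record layout, the field names, and the choice of SHA-256 are assumptions, not DRE's documented format.

```python
import hashlib
import json


def sha256_hex(text: str) -> str:
    """Hex SHA-256 of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def audit_record(action: str, claim: str, policy_text: str, digest_text: str) -> dict:
    """Link action -> claim -> policy text + digest hash (hypothetical layout)."""
    record = {
        "action": action,
        "claim": claim,
        "policy_sha256": sha256_hex(policy_text),
        "digest_sha256": sha256_hex(digest_text),
    }
    # Hash the canonical JSON form so any change to the record is detectable.
    record["record_sha256"] = sha256_hex(json.dumps(record, sort_keys=True))
    return record


rec_a = audit_record("send_email", "recipient is an existing customer",
                     "Only email existing customers.", "digest v1")
rec_b = audit_record("send_email", "recipient is an existing customer",
                     "Only email existing customers.", "digest v1")
rec_c = audit_record("send_email", "recipient is an existing customer",
                     "Email anyone.", "digest v1")
print(rec_a["record_sha256"] == rec_b["record_sha256"])
```

Identical inputs yield an identical record, and editing the policy text changes both the policy hash and the record hash.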

Enforcement surface

Managed guardrail + system-prompt digest

Per request

One guardrail evaluation + one inference

Strengths

Reaches Claude via Bedrock and Llama via Bedrock without model surgery
Inherits cloud provider content safety baseline

Tradeoffs

Per-request cost bounded by guardrail tier pricing
Less expressive than a self-hosted hook for nuanced judgment
Enforcement is at the claim boundary, not inside the model

Models

Claude on Bedrock
Llama on Bedrock

04 · openai

Beta

OpenAI Responses + Judge

The closed API generates the response under a DRE-compiled system preamble. A small open-weight model (governed by the self-hosted adapter) runs as a judge over risky claim classes before a tool call is admitted. The judge is how DRE enforces policy when the frontier model exposes no hook.

Enforcement surface

Compiled system preamble + open-weight judge sidecar

Per request

Preamble tokens + judge sidecar call on risky classes

Strengths

Keeps reasoning quality of the closed frontier model
Judge sidecar carries the enforceable part of the policy
Judge sampling rate is tunable per risk class
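A per-class sampling rate could look like the sketch below, which decides whether a given request goes to the judge sidecar. The risk-class names, the rates in `JUDGE_RATES`, and the hash-based bucketing are assumptions chosen for illustration. Hashing the request ID (instead of drawing a random number) makes the decision reproducible for audits.

```python
import hashlib
from typing import Dict

# Hypothetical per-class judge sampling rates in [0.0, 1.0].
JUDGE_RATES: Dict[str, float] = {
    "payment": 1.0,          # always judged
    "outbound_email": 0.25,  # judged for roughly a quarter of requests
    "read_only": 0.0,        # never judged
}


def should_judge(risk_class: str, request_id: str,
                 rates: Dict[str, float] = JUDGE_RATES) -> bool:
    """Return True if this request should be sent to the judge sidecar."""
    rate = rates.get(risk_class, 1.0)  # unknown classes fail closed: always judged
    digest = hashlib.sha256(f"{risk_class}:{request_id}".encode("utf-8")).digest()
    # Map the first 4 bytes to a bucket in [0, 1); the division is exact in floats.
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < rate


ids = [f"req-{i}" for i in range(2000)]
email_fraction = sum(should_judge("outbound_email", r) for r in ids) / len(ids)
print(round(email_fraction, 2))
```

Defaulting unknown classes to a rate of 1.0 is a deliberate choice in this sketch: a class the policy has not rated is treated as risky.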

Tradeoffs

Two inference calls for risky action classes
Judge behavior must be monitored separately
No hidden-state hook — enforcement is at the claim boundary

Models

gpt-4.1
gpt-4o
o4-mini

05 · anthropic

Planned

Anthropic Direct + Judge

Planned. Compiled policy would emit a Claude tool-use schema whose shape enforces disclosure tokens and claim fields. The judge sidecar would verify claims against the policy text before the tool executes. No shipping code yet.
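Since this adapter has no shipping code, the following is only a guess at what a compiled schema might look like. It uses the general shape of a tool definition (a `name`, a `description`, and a JSON Schema under `input_schema`); the disclosure token, field names, and the small `check_tool_input` checker are hypothetical. The checker covers required fields, unexpected fields, and enum values, and is not a full JSON Schema validator.

```python
from typing import List

DISCLOSURE_TOKEN = "AI_ASSISTED"  # hypothetical disclosure token


def compile_tool_schema(tool_name: str, claim_classes: List[str]) -> dict:
    """Build a tool definition whose shape forces a disclosure token and claim fields."""
    return {
        "name": tool_name,
        "description": "Perform the action; a disclosure and a claim are mandatory.",
        "input_schema": {
            "type": "object",
            "properties": {
                "disclosure": {"type": "string", "enum": [DISCLOSURE_TOKEN]},
                "claim_class": {"type": "string", "enum": list(claim_classes)},
                "claim": {"type": "string"},
            },
            "required": ["disclosure", "claim_class", "claim"],
            "additionalProperties": False,
        },
    }


def check_tool_input(schema: dict, tool_input: dict) -> List[str]:
    """Minimal shape check: required keys, no unexpected keys, enum membership."""
    spec = schema["input_schema"]
    errors = [f"missing: {key}" for key in spec["required"] if key not in tool_input]
    for key, value in tool_input.items():
        prop = spec["properties"].get(key)
        if prop is None:
            errors.append(f"unexpected: {key}")
        elif "enum" in prop and value not in prop["enum"]:
            errors.append(f"not allowed: {key}={value!r}")
    return errors


schema = compile_tool_schema("issue_refund", ["refund", "credit"])
good = {"disclosure": "AI_ASSISTED", "claim_class": "refund", "claim": "within window"}
print(check_tool_input(schema, good))
```

Because the allowed claim classes are baked into the schema, a policy change that adds or removes a class means regenerating the schema, which matches the tradeoff listed for this adapter.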

Enforcement surface

Tool-use schema + open-weight judge sidecar

Per request

Schema-enforced tool call + judge sidecar

Strengths

Leverages Claude's structured tool-use surface
Claim boundary is the enforcement boundary

Tradeoffs

Schema must be regenerated on policy change
Two calls for higher-risk action classes
Not shipping today

Models

Claude 3.5 Sonnet
Claude 3.5 Haiku

One policy.
Multiple enforcement surfaces.

Residual Injection (self-hosted)

Vertex Gemini Reasoner

Bedrock Guardrail

OpenAI Responses + Judge

Anthropic Direct + Judge

The policy text is the source of truth. Every adapter links action → claim → policy text a human authored.
