drzero
otc-awesome-llm is the Optum LLM library providing version-controlled prompts, chatmodes, instructions, and agent modes for infrastructure operations via native IDE integrations
By Thomas Hudak
Installation
Install this plugin using the Claude Code CLI:
claude plugin install drzero@otc-awesome-llm
Verification
After installation, verify the plugin is loaded:
claude plugin list
Documentation
Dr. Zero - Autonomous Repository Improvement Plugin
Status: Experimental Alpha (Internal Use Only)
Version: 10.17.1
Dr. Zero is a Claude Code plugin providing autonomous repository improvement through dual-scoring curriculum learning and multi-agent swarm coordination. Scoring is based on the GRPO/HRPO framework (arXiv:2601.07055):
- HRPO scores the proposer (format compliance + difficulty calibration; reward (0.5 * format + difficulty) / 1.5, rescaled to [0, 1] from paper Eq. 4)
- GRPO scores the solver (binary acceptance-test reward)
Full user documentation lives under docs/plugins/drzero/ (see Documentation below for the quadrant map).
Quick Start
Installation
# Add otc-awesome-llm marketplace
claude plugin marketplace add /path/to/otc-awesome-llm
# Install Dr. Zero plugin
claude plugin install drzero@otc-awesome-llm
# Verify installation
claude plugin list
Basic Usage
# Run autonomous improvement (default: 3 iterations, 3 tasks/iteration)
/drzero:drzero
# Health check (confirm plugin is loaded)
/drzero:drzero-ping
# Check session status
/drzero:drzero-status
# View/edit configuration
/drzero:drzero-config
Swarm Coordination Commands
# Hierarchical coordination (Orchestrator + 16 domain agents)
/drzero:drzero-swarm "Implement user authentication"
# Democratic debate for architecture decisions
/drzero:drzero-council "Should we use microservices or monolith?"
# Centralized governance with mandatory quality gates
/drzero:drzero-citadel "Deploy payment processing changes"
# Peer-to-peer parallel work (no central orchestrator)
/drzero:drzero-unity "Fix all linting errors"
# Simplified execution (or ruthless optimization with --evil)
/drzero:drzero-morty "Update README"
# Parallel variant implementations for A/B comparison
/drzero:drzero-cronenberg "Try 3 different approaches to caching"
# Cross-repo coordination
/drzero:drzero-portal-gun "Update authentication across all microservices"
# Minimal viable solutions under extreme constraints
/drzero:drzero-pickle "Minimal changes to pass CI"
Swarm Modes at a Glance
The 8 coordination modes, each shipped as its own command (full contracts in the command reference):
| Mode | Command | Coordination model | Best for |
|---|---|---|---|
| Swarm | /drzero:drzero-swarm | Hierarchical (orchestrator + 16 specialists) | Clear tasks needing parallel domain expertise |
| Council | /drzero:drzero-council | Democratic debate among orchestrator variants | Architecture decisions, design trade-offs |
| Citadel | /drzero:drzero-citadel | Centralized governance, mandatory quality gates | Production deploys, compliance-bound changes |
| Unity | /drzero:drzero-unity | Peer-to-peer, no central orchestrator | Embarrassingly parallel work (lint, renames) |
| Morty | /drzero:drzero-morty | Single agent, no routing (--evil for aggressive optimization) | Obvious or mechanical changes |
| Cronenberg | /drzero:drzero-cronenberg | N parallel variant implementations, compare and pick | A/B testing competing approaches |
| Portal Gun | /drzero:drzero-portal-gun | Cross-repo coordinator | Multi-repository changes, dependency rollouts |
| Pickle | /drzero:drzero-pickle | Single constrained implementer | Minimal solutions in locked-down environments |
Two-Phase Architecture
flowchart TD
user([User / autonomous scan]) -->|prompt or repo findings| proposer
subgraph phase1 ["Phase 1: Dual-Scoring Curriculum Refinement"]
proposer[Proposer agent] -->|WorkItems| solver[Solver agent]
solver -->|"attempt facts (exit codes, diffs)"| stopHook[SubagentStop hook]
stopHook -->|"dr0.scoring: HRPO proposer + GRPO solver"| session[("/tmp/drzero_session.json")]
session -->|"success rate vs ~50% target"| proposer
end
preHook[PreToolUse hook] -.->|"validates scope_boundary + acceptance_test"| solver
session -->|refined, domain-tagged prompts| orchestrator
subgraph phase2 ["Phase 2: Orchestrator Agent Swarm"]
orchestrator[Orchestration dispatcher] -->|domain routing| specialists["16 domain specialist agents"]
specialists --> gates[Quality gates / security review]
gates --> results[Tested changes + PR]
end
Scoring in both phases is computed exclusively by the dr0 Python package via the
SubagentStop hook; agents never self-report scores. Deep-dive diagrams (HRPO loop,
swarm sequence, checkpoint lifecycle) live in dr0/docs/architecture-diagrams.md.
Phase 1: Dual-Scoring Curriculum Learning (arXiv:2601.07055)
- Proposer analyzes CI failures, lint errors, type errors, and documentation gaps to generate WorkItems (improvement tasks).
- Solver attempts each WorkItem, producing code patches and running acceptance tests.
- HRPO scores the proposer: evaluates format compliance and difficulty calibration of the generated WorkItems using the reward (0.5 * format + difficulty) / 1.5 (rescaled to [0, 1] from paper Eq. 4).
- GRPO scores the solver: binary acceptance-test success, the repository-task proxy for exact-match reward.
- Difficulty auto-adjusts to target a 50% success rate.
- Output: Refined, high-quality prompts for Phase 2.
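The two rewards described above can be sketched in a few lines of Python. This is a minimal illustration, not the dr0 implementation: the function names are hypothetical, and it assumes both the format and difficulty components are already normalized to [0, 1], which is what makes the division by 1.5 land in [0, 1].

```python
def hrpo_proposer_reward(format_score: float, difficulty_score: float) -> float:
    """Rescaled proposer reward: (0.5 * format + difficulty) / 1.5.

    Assumption: both inputs are in [0, 1], so the result is in [0, 1].
    """
    for name, value in (("format_score", format_score),
                        ("difficulty_score", difficulty_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (0.5 * format_score + difficulty_score) / 1.5


def grpo_solver_reward(acceptance_exit_code: int) -> float:
    """Binary solver reward: 1.0 if the acceptance test exited 0, else 0.0."""
    return 1.0 if acceptance_exit_code == 0 else 0.0
```

Note the weighting: a perfectly formatted WorkItem with zero difficulty credit earns only one third of the maximum, so difficulty calibration dominates the proposer reward.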
Anti-hallucination rule: All GRPO and HRPO scores are computed exclusively by the
dr0 Python package. Scores are never self-reported by agents.
Phase 2: Agent Swarm Execution
- Orchestrator (Rick Sanchez) coordinates 16 domain specialist agents via Task tool.
- Domain agents execute domain-specific tasks (potentially in parallel).
- Security/Cerberus reviews code through three lenses: security, quality, savage.
- Output: Production-ready changes with tests passing.
Session State
- ${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json -- per-session transient state, written by hooks during execution. <id> is $DRZERO_SESSION_ID if set, otherwise the hook process PID. The drzero/ directory is per-user, mode 0700, and ownership-checked (CR-1).
- .drzero-state.json -- persistent validated state, survives across sessions.
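As a rough sketch of how the transient path above resolves, the following Python mirrors the shell expansion and the directory checks. The function names are hypothetical and this is not the hook's actual code; it assumes a POSIX system (it relies on os.getuid) and treats an empty variable the same as an unset one, as the `:-` expansion does.

```python
import os
import stat
from pathlib import Path
from typing import Mapping


def session_state_path(env: Mapping[str, str], pid: int) -> Path:
    """Mirror ${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json."""
    base = env.get("XDG_RUNTIME_DIR") or os.path.join(env["HOME"], ".cache")
    # <id> is $DRZERO_SESSION_ID if set, otherwise the hook process PID.
    session_id = env.get("DRZERO_SESSION_ID") or str(pid)
    return Path(base) / "drzero" / f"session-{session_id}.json"


def ensure_private_dir(directory: Path) -> None:
    """Create the per-user directory, verify ownership, and force mode 0700."""
    directory.mkdir(mode=0o700, parents=True, exist_ok=True)
    info = directory.stat()
    if info.st_uid != os.getuid():
        raise PermissionError(f"{directory} is not owned by the current user")
    if stat.S_IMODE(info.st_mode) != 0o700:
        os.chmod(directory, 0o700)  # safe: ownership was checked above
```

A hook would typically call `ensure_private_dir(path.parent)` before writing, using `os.environ` and `os.getpid()` as the inputs.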
Configuration
Create drzero.yml in project root or ~/.claude/drzero.yml for user defaults.
The drzero.yml format is shared between Claude and Codex, but several
values are runtime-sensitive. The example below uses Claude-oriented values
(backend: claude, orchestrator rick, quality reviewer cerberus). Codex
runs use backend: codex with the orchestration/security domain
identifiers (see plugins/drzero/assets/drzero.yml.example).
Config precedence (both runtimes):
1. ./drzero.yml
2. ~/.claude/drzero.yml
3. built-in defaults
version: '1.0'
dr_zero:
max_iterations: 5
tasks_per_iteration: 3
proposer:
persona: healthcare # conservative, patient-safety focused
solver:
backend: claude # Claude runtime; Codex examples use `codex`
temperature: 0.7
terminal:
all_tests_pass: true
lint_clean: true
coverage_threshold: 80
agent_swarm:
orchestrator:
agent: rick # Claude orchestrator; Codex uses `orchestration`
quality_reviewer: cerberus # Claude reviewer; Codex uses `security`
definition_of_done:
- tests_pass
- lint_clean
- docs_updated
- security_cleared
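The config precedence described above (project file, then user file, then built-in defaults) can be sketched as a first-match lookup. The helper name is hypothetical, and the sketch assumes the first file found wins outright; whether the plugin instead merges the two files is not stated here.

```python
from pathlib import Path
from typing import Optional


def resolve_config_path(project_root: Path, home: Path) -> Optional[Path]:
    """Return the highest-precedence drzero.yml, or None for built-in defaults."""
    candidates = [
        project_root / "drzero.yml",        # 1. ./drzero.yml
        home / ".claude" / "drzero.yml",    # 2. ~/.claude/drzero.yml
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # 3. caller falls back to built-in defaults
```

For example, `resolve_config_path(Path.cwd(), Path.home())` reports which file a run started from the current directory would pick up under this assumption.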
All 20 Agents (16 Domain Specialists + 4 Meta/Bridge Agents)
The plugin ships exactly 20 agent files: 16 domain specialists, the orchestration coordinator, the two Phase 1 curriculum-learning agents, and one bridge agent. The contract is defined in agents/CLAUDE.md (a directory-conventions document, not an agent) and enforced by tests; installation bundles are declared in agents-manifest.json.
Dr. Zero uses a domain-filename convention: each agent is named by its domain
({domain}.md), enabling SDK-native precedence resolution without custom discovery logic.
Meta and Bridge Agents (4)
| Domain | Agent File | Role |
|---|---|---|
| orchestration | orchestration.md | Orchestrator (Rick Sanchez) -- coordinates Phase 2 agent swarm; never a work domain |
| proposer | proposer.md | Generates WorkItems in Phase 1 (scored by HRPO) |
| solver | solver.md | Attempts WorkItems in Phase 1 (scored by GRPO) |
| security-remediation | security-remediation.md | Bridge agent: secplat findings -> portal-gun cross-repo remediation |
Domain Specialists (16)
| Domain | Agent File | Focus Area |
|---|---|---|
| architecture | architecture.md | System design, domain boundaries, interfaces, rollout strategy |
| backend | backend.md | REST/GraphQL APIs, microservices, message queues, caching |
| compliance | compliance.md | HIPAA, SOC2, PCI, policy-as-code |
| database | database.md | PostgreSQL, MongoDB, query optimization |
| devops | devops.md | CI/CD pipelines, automation, tooling |
| documentation | documentation.md | MkDocs, Diataxis framework |
| frontend | frontend.md | React, Vue, Angular, Storybook |
| gitops | gitops.md | Semantic-release, workflow distribution across 130+ repos |
| implementation | implementation.md | Production code (Ansible, Terraform, GitHub Actions, Python) |
| infrastructure | infrastructure.md | IaC module structure, environment baselines, deployment topology |
| monitoring | monitoring.md | Dynatrace, Splunk, Azure Monitor |
| networking | networking.md | VPC/VNET, DNS, load balancers, CDN, security groups, routing |
| performance | performance.md | Load testing, caching strategies, profiling, bottleneck analysis |
| secrets | secrets.md | Consul Vault, CyberArk, Venafi certificates, credential rotation |
| security | security.md | Three-headed review (security, quality, savage) -- Cerberus |
| testing | testing.md | Molecule, terratest, pytest, GitHub Actions validation |
Key insight: Filename = domain = agent name. The SDK resolves precedence by filename match alone -- no domain registry or custom discovery needed.
The canonical 16-domain taxonomy (scopes, artifacts, AI-DLC stage mapping) is defined in
drzero-domain-mapping.md §1. The plugin deliberately ships no reviewer agents; review tiers
are plugin-external and composed by consumers via quality_gates (see agents/CLAUDE.md).
Skills
Five skills ship with the plugin under skills/, each loadable on demand:
| Skill | Purpose |
|---|---|
| domain-agent-routing | Understand the 16 domain specializations, geometric priority matrix theory, and runtime agent discovery mechanism |
| drzero-curriculum-learning | Understand dual-scoring curriculum learning (HRPO proposer + GRPO solver, arXiv:2601.07055) and the proposer-solver architecture behind Phase 1 |
| pr-ci-monitoring | GitHub PR/CI monitoring strategies, auto-merge workflows, and integration with gh CLI or GitHub MCP server |
| rick-swarm-integration | Orchestrator swarm coordination patterns, Phase 2 work execution, and multi-agent parallel task distribution |
| security-review-protocol | Security review quality gates, the three-headed review system (security, quality, savage), and definition-of-done criteria |
Hook System
Two hooks implement the plugin's fail-closed validation and anti-hallucination scoring. Full contract: docs/plugins/drzero/reference/ref-hooks.md; dr0-side background: dr0/docs/hook-architecture.md.
- PreToolUse -- validates Dr. Zero agent inputs before the Task tool spawns them: required WorkItem fields, scope-boundary defenses (empty/wildcard/absolute/traversal paths rejected), and acceptance-test command whitelisting with shell-metacharacter blocking. Validation failure blocks the invocation (exit 1) before any tokens are spent.
- SubagentStop -- captures proposer/solver output when a Task completes and computes deterministic scores via dr0.scoring: HRPO proposer reward (format + difficulty calibration) and GRPO solver reward (binary acceptance-test outcome). Results are written to /tmp/drzero_session.json under file locking, with provenance (scored_by, scored_at). Model-supplied scores are ignored; scoring failures write null, never a fabricated number.
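The SubagentStop behavior above (discard model-supplied scores, record provenance, write null on failure) can be illustrated with a small sketch. It is not the hook's real code: the function name, the `score` and `error` keys, and the `scorer` callable are assumptions for illustration; only `scored_by` and `scored_at` come from the description above.

```python
import json
from datetime import datetime, timezone
from typing import Callable


def build_score_record(agent_output: dict,
                       scorer: Callable[[dict], float],
                       scored_by: str = "dr0.scoring") -> dict:
    """Score from attempt facts only; never trust a score the model supplied."""
    # Drop any self-reported score before the scorer sees the payload.
    facts = {k: v for k, v in agent_output.items() if k != "score"}
    record = {
        "scored_by": scored_by,
        "scored_at": datetime.now(timezone.utc).isoformat(),
        "score": None,   # serialized as JSON null unless scoring succeeds
        "error": None,
    }
    try:
        record["score"] = float(scorer(facts))
    except Exception as exc:  # a failure is recorded, never replaced by a guess
        record["error"] = f"{type(exc).__name__}: {exc}"
    return record


if __name__ == "__main__":
    output = {"exit_code": 1, "score": 0.99}  # the model claims a high score
    record = build_score_record(output, lambda f: 1.0 if f["exit_code"] == 0 else 0.0)
    print(json.dumps(record, indent=2))
```

The design point is that the scorer only ever receives observable facts such as exit codes, so a fabricated number in the agent output cannot reach the session file.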
A standalone validator, hooks/validate-state-file.py,
checks .drzero-state.json before Phase 2 execution.
Hook registration
Hooks are registered via hooks/hooks.json, which is
auto-loaded by Claude Code through the "hooks" key in
.claude-plugin/plugin.json. The lib/setup-dr0.sh step only installs the
dr0 pip package; hook wiring requires no manual action. After installing the
plugin, run /reload-plugins and confirm the status line reports hooks > 0;
if it reports 0, the scoring pipeline will not run and the Issue #304
anti-hallucination guard will HALT every session.
Loader note: Claude Code resolves plugin.json from the marketplace source (~/.claude/plugins/marketplaces/.../), not the per-version cache (~/.claude/plugins/cache/.../). Editing the cache copy will not change /reload-plugins output; re-sync the marketplace or reinstall the plugin.
Integration with AI-DLC
AI-DLC plans; Dr. Zero executes. The AI-DLC plugin's inception/effort workflow
produces units of work tagged with canonical domain slugs, and /ai-dlc:effort
hands them to Dr. Zero as a domain_routing: YAML payload (per-unit
primary_domain, parallel domains, dependencies, quality_gates,
drzero_review_covers, artifact_paths). The orchestration dispatcher consumes
that payload via a six-step procedure (H1-H6): detect handoff context, parse the
payload, resolve dependency order, dispatch each unit to its domain specialists,
report effort-level completion, and stay idempotent on resume.
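The "resolve dependency order" step can be sketched with the standard library's topological sorter. This is an illustration only: it assumes units are keyed by name and each carries a `dependencies` list of other unit names, which is a simplification of the domain_routing payload, and `dispatch_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def dispatch_order(units: dict[str, dict]) -> list[str]:
    """Order units so every unit comes after the units it depends on.

    Raises ValueError for unknown dependencies; graphlib.CycleError
    (a ValueError subclass) propagates for circular dependencies.
    """
    graph = {name: set(unit.get("dependencies", [])) for name, unit in units.items()}
    unknown = {dep for deps in graph.values() for dep in deps} - graph.keys()
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    units = {
        "api": {"primary_domain": "backend", "dependencies": ["schema"]},
        "schema": {"primary_domain": "database", "dependencies": []},
        "docs": {"primary_domain": "documentation", "dependencies": ["api"]},
    }
    print(dispatch_order(units))
```

Units with no ordering constraint between them may appear in any relative order, which is where parallel dispatch becomes possible.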
On return, the dispatcher emits Reviewer Status updates and, when a unit claims
review coverage, a predicate_verdicts: block, closing the loop back into
AI-DLC's effort state. Canonical contracts:
drzero-domain-mapping.md and
drzero-orchestrator-procedure.md;
conceptual overview:
docs/plugins/ai-dlc/explanation/drzero-integration.md.
Agent Override System (3-Level Precedence)
Dr. Zero leverages the Claude Code SDK's native precedence mechanism:
.claude/agents/ (repo) > ~/.claude/agents/ (user) > plugin/agents/ (bundled)
The SDK automatically resolves which agent to use based on filename matching.
How It Works
When Orchestrator routes a WorkItem with domain: "testing", it invokes:
Task(agent="testing") # Just the domain name
The SDK automatically checks:
1. .claude/agents/testing.md (repo-specific) -- use if exists
2. ~/.claude/agents/testing.md (user-wide) -- use if exists
3. plugin/agents/testing.md (bundled) -- fallback
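The lookup order above can be mimicked with a short Python function, which is handy for debugging which file should win. This only illustrates the documented precedence; it is not the SDK's resolver, and `resolve_agent` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional, Tuple


def resolve_agent(domain: str, repo_root: Path, home: Path,
                  plugin_root: Path) -> Optional[Tuple[str, Path]]:
    """Return (source, path) for the first {domain}.md found, highest precedence first."""
    search_order = [
        ("project", repo_root / ".claude" / "agents"),
        ("user", home / ".claude" / "agents"),
        ("plugin", plugin_root / "agents"),
    ]
    for source, directory in search_order:
        candidate = directory / f"{domain}.md"
        if candidate.is_file():
            return source, candidate
    return None  # no agent file exists for this domain
```

Because matching is by filename alone, a file named anything other than `{domain}.md` is never considered at any level.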
Override Example: Custom Testing Agent
Create .claude/agents/testing.md in your project:
---
name: testing
description: 'Custom test agent for our project-specific validation'
---
# Custom Testing Agent
You are the testing specialist for this project.
## Project-Specific Test Patterns
- Use our custom pytest fixtures in tests/conftest.py
- Follow our test naming convention: test_{feature}_{scenario}
- Always include integration tests for API endpoints
When Dr. Zero routes domain: "testing", your custom agent is used automatically.
Security Features
- Scope boundary validation: The PreToolUse hook rejects path traversal (../../../etc/passwd), absolute paths, and CAT attacks (empty or glob/wildcard scopes) before any solver runs
- Command whitelisting: Validates acceptance_test commands against a whitelist of 28 safe tools (pytest, ruff, terraform plan, etc.), blocks shell metacharacters, and prevents mutating subcommands (e.g., terraform apply)
- Untrusted-input fencing: /drzero validates user input before processing, applying mode-specific length limits, prompt-injection pattern blocking, and shell-metacharacter escaping, with the sanitized copy persisted atomically to /tmp/drzero_input_validation.json
- Anti-hallucination scoring: All GRPO/HRPO scores are computed by the dr0 Python package via the SubagentStop hook -- agents never self-report scores, model-supplied scores are discarded, and scoring failures write null with error provenance (see Issue #304)
- Trusted-path installer: lib/setup-dr0.sh only executes the dr0 installer from a trusted checkout ($OTC_AWESOME_LLM_ROOT or plugin-relative), never from the current working directory, so an untrusted repo cannot ship a crafted installer (see lib/README.md)
- Security review protocol: The security-review-protocol skill defines the three-headed review gates (security, quality, savage) and definition-of-done criteria applied by the security agent
- Cross-repo remediation bridge: The security-remediation agent turns secplat findings into coordinated portal-gun remediation across affected repositories
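To make the first two defenses concrete, here is a minimal sketch of scope and acceptance-test validation. It is not the PreToolUse hook itself: the function names are hypothetical, `ALLOWED_TOOLS` is a three-entry stand-in for the real 28-tool whitelist, and the exact metacharacter set is an assumption.

```python
import shlex
from pathlib import PurePosixPath

ALLOWED_TOOLS = {"pytest", "ruff", "terraform"}   # hypothetical subset of the whitelist
BLOCKED_SUBCOMMANDS = {("terraform", "apply")}    # mutating subcommands are rejected
SHELL_METACHARACTERS = set(";&|`$<>(){}\\\n")     # assumed set, for illustration
WILDCARDS = set("*?[")


def validate_scope(path: str) -> None:
    """Reject empty, wildcard, absolute, and traversal scopes."""
    if not path.strip():
        raise ValueError("empty scope")
    if any(ch in WILDCARDS for ch in path):
        raise ValueError("wildcard scope")
    pure = PurePosixPath(path)
    if pure.is_absolute():
        raise ValueError("absolute path")
    if ".." in pure.parts:
        raise ValueError("path traversal")


def validate_acceptance_test(command: str) -> list[str]:
    """Return argv if the command is whitelisted and free of shell metacharacters."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        raise ValueError("shell metacharacter")
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_TOOLS:
        raise ValueError("tool not whitelisted")
    if len(argv) > 1 and (argv[0], argv[1]) in BLOCKED_SUBCOMMANDS:
        raise ValueError("mutating subcommand")
    return argv
```

Both checks fail closed: anything that is not explicitly recognized as safe raises, which matches the block-before-spawn behavior described for the hook.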
Requirements
- Python 3.11+
- Git repository
- Claude Code CLI
- dr0 Python package (provides GRPO/HRPO scoring math)
- Local CI tools: pytest, ruff, mypy, bandit
Troubleshooting
"Proposer agent not found"
Ensure agents are in one of the three discovery paths:
- .claude/agents/proposer.md (project-level override)
- ~/.claude/agents/proposer.md (user-level override)
- plugin/agents/proposer.md (bundled fallback)
"Phase 1 not converging"
Adjust target success rate in drzero.yml:
dr_zero:
target_success_rate: 0.4 # Lower threshold (default: 0.5)
max_iterations: 5 # More iterations
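To see why the target matters, consider a simple proportional update. The actual adjustment rule used by dr0 is not described here, so everything in this sketch (the function, the `step` size, the [0, 1] difficulty scale) is an assumption chosen only to show the feedback direction: difficulty rises when the solver succeeds more often than the target and falls when it succeeds less often.

```python
def adjust_difficulty(difficulty: float, successes: int, attempts: int,
                      target_success_rate: float = 0.5, step: float = 0.2) -> float:
    """Nudge difficulty toward the level where the solver hits the target rate."""
    if attempts <= 0:
        return difficulty  # nothing observed this iteration, leave unchanged
    observed = successes / attempts
    updated = difficulty + step * (observed - target_success_rate)
    return min(1.0, max(0.0, updated))  # clamp to the assumed [0, 1] scale
```

Under this model, a solver that keeps failing everything drives difficulty down each iteration, and a lower target_success_rate makes the loop settle at harder tasks.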
"Orchestrator not found"
Falls back to sequential execution without Orchestrator. To use Orchestrator:
- Check /help for available agents
- Verify the orchestration.md agent exists in a discovery path
- Confirm name: orchestration in the agent frontmatter
Agent Override Not Working
If your custom agent is not being picked up:
- Filename must match the domain exactly:
  ~/.claude/agents/testing.md -- correct for "testing" domain
  ~/.claude/agents/my-tester.md -- wrong, SDK will not find this
- Frontmatter name must match the filename (without .md):
  ---
  name: testing
  ---
- Check the /agents command to confirm which source wins:
  /agents
  # testing (project) [overrides plugin]
  # testing (user) [overrides plugin]
  # testing (plugin) [bundled default]
- Verify precedence order:
  - Project (.claude/agents/) overrides user (~/.claude/agents/)
  - User overrides plugin (bundled agents)
  - If agent appears in multiple locations, highest precedence wins
Common mistakes:
- Agent file not named after the domain (e.g., koji.md instead of testing.md)
- Frontmatter name does not match filename
- Directory typo: .claude/agent/ (missing trailing s)
Handling Conflicts (Swarm Agents Modifying Same Files)
When multiple agents modify overlapping files in Phase 2:
Resolution strategies:
- Sequential execution (safest):
  # drzero.yml
  agent_swarm:
    max_parallel: 1
- File-based locking (automatic): Orchestrator automatically sequences agents working on the same files.
- Manual resolution:
  git status
  # edit conflicted files
  git add <resolved-file>
  /drzero:drzero --resume
Prevention:
- Use smaller, focused WorkItems (fewer file overlaps)
- Prefer domain specialists with clear boundaries
- Use /drzero:drzero-council for architecture decisions before implementation
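One way to picture the automatic sequencing is as a batching problem: WorkItems whose file sets do not overlap can share a parallel batch, while overlapping ones are pushed to a later batch. The greedy sketch below is an assumption about the general idea, not the Orchestrator's algorithm, and `batch_by_file_overlap` is a hypothetical helper.

```python
def batch_by_file_overlap(work_items: dict[str, set[str]]) -> list[list[str]]:
    """Greedily group WorkItems so no two items in a batch touch the same file.

    Batches run one after another; items inside a batch can run in parallel.
    """
    batches: list[tuple[list[str], set[str]]] = []
    for name, files in work_items.items():
        for members, claimed in batches:
            if claimed.isdisjoint(files):
                members.append(name)
                claimed |= files   # in-place update of this batch's claimed files
                break
        else:
            # Every existing batch already touches one of these files.
            batches.append(([name], set(files)))
    return [members for members, _ in batches]


if __name__ == "__main__":
    items = {
        "fix-lint": {"src/app.py"},
        "update-docs": {"README.md"},
        "refactor-app": {"src/app.py", "src/util.py"},
    }
    print(batch_by_file_overlap(items))
```

This also shows why smaller, focused WorkItems help: fewer files per item means fewer overlaps and therefore fewer sequential batches.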
Rolling Back Changes
Dr. Zero creates git stashes before each phase:
git stash list | grep drzero
Rollback Phase 2 only (keep Phase 1 refinements):
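# Discards the most recent commit and any uncommitted changes
git reset --hard HEAD~1
# Restores the newest stash entry; confirm it is a drzero stash with the list command above
git stash pop stash@{0}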
Full session revert:
git reset --hard <commit-before-drzero>
# Warning: removes every stash in the repository, not only those created by Dr. Zero
git stash clear
Checkpoint configuration:
# drzero.yml
dr_zero:
checkpoints:
enabled: true
frequency: per-phase # or: per-iteration, per-task
auto_stash: true
SDK Precedence Not Working As Expected
# Verify which agent source wins
/agents testing
# Shows: testing (plugin) OR testing (user) OR testing (project)
# Check Claude Code version
claude --version
# Verify plugin installation
claude plugin list
If precedence is still broken, file an issue with:
- Claude Code version
- Agent file locations and their frontmatter
- /agents command output
- Dr. Zero session logs
Documentation
User documentation follows the Diataxis quadrants under docs/plugins/drzero/:
- Tutorial: Your first autonomous session
- How-to: Create a custom domain agent
- Reference: Command reference, Hook reference
- Explanation: Domain routing, HRPO/GRPO curriculum learning
Plugin and dr0 internals:
- Paper Alignment: references/drzero-paper-2601.07055.md (arXiv:2601.07055)
- Scoring runtime: dr0/README.md (canonical dr0.scoring surface)
- Architecture: dr0/docs/architecture.md and dr0/docs/architecture-diagrams.md
- Configuration Schema: dr0/docs/configuration-schema.md
- Hook internals: docs/drzero-hooks-architecture.md
Notes
- All agent invocations use the Task tool (no subprocess calls)
- Context is preserved throughout both phases
- HRPO scores the proposer; GRPO scores the solver -- never the reverse (arXiv:2601.07055 Figure 2)
- Scores are computed by the dr0 Python package, never self-reported
- Session state: ${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json (per-session, transient) and .drzero-state.json (persistent)
- Checkpoints use git stash for safe rollback
Contributing
See the main repository CONTRIBUTING.md and CLAUDE.md.
License
Internal Use Only - Optum Tech Compute

