Optum Responsible AI (RAI) compliance
Responsible AI compliance requirements for Optum AI/ML development, covering AIRB submission, shadow mode pilots, RAI risk tiers, and governance processes.
This instruction file provides guidance for developing AI/ML solutions that comply with Optum's Responsible AI (RAI) framework and AIRB (AI Review Board) requirements.
Overview
All AI/ML applications at Optum must follow the RAI Development Guide v3.0 and obtain AIRB approval before production deployment. This includes:
- LLM-based applications (chatbots, copilots, agents)
- Traditional ML models (classification, regression, clustering)
- Decision support systems with AI components
- Automated data processing with AI/ML
RAI Risk Tiers
Tier 1: Low Risk
- Definition: No direct impact on member care or business decisions
- Examples: Internal productivity tools, code generation assistants, documentation generators
- Review: Lightweight AIRB review, self-certification allowed
- Timeline: 1-2 weeks
Tier 2: Medium Risk
- Definition: Indirect impact on operations or member experience
- Examples: Care coordination assistants, claims processing aids, provider search enhancements
- Review: Standard AIRB review with testing requirements
- Timeline: 4-6 weeks
Tier 3: High Risk
- Definition: Direct impact on member health, coverage, or financial decisions
- Examples: Clinical decision support, coverage determination, fraud detection
- Review: Full AIRB review with ongoing monitoring
- Timeline: 8-12 weeks
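The tier definitions above can be read as a simple lookup from impact level to review path. The following is a minimal sketch, not part of any Optum SDK: the `classify_risk_tier` helper and the `impact` labels (`"none"`, `"indirect"`, `"direct"`) are hypothetical names introduced for illustration, and the AIRB still confirms the final tier at intake.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierInfo:
    tier: str
    review: str
    timeline_weeks: tuple  # (minimum, maximum)


# Values copied from the tier descriptions above; the keys are hypothetical labels.
TIERS = {
    "none": TierInfo("tier-1", "Lightweight AIRB review, self-certification allowed", (1, 2)),
    "indirect": TierInfo("tier-2", "Standard AIRB review with testing requirements", (4, 6)),
    "direct": TierInfo("tier-3", "Full AIRB review with ongoing monitoring", (8, 12)),
}


def classify_risk_tier(impact: str) -> TierInfo:
    """Map an impact label to the tier it most likely falls under."""
    try:
        return TIERS[impact]
    except KeyError:
        raise ValueError(f"unknown impact level: {impact!r}") from None


info = classify_risk_tier("indirect")
print(info.tier, info.timeline_weeks)
```

A self-assessment like this is only a starting point for the submission; it does not replace the risk tier confirmation that happens during AIRB intake.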
AIRB Submission Process
1. Pre-submission checklist
Before submitting to AIRB:
# Include this metadata in your project
airb_submission:
  risk_tier: 'tier-2' # tier-1 | tier-2 | tier-3
  use_case: 'Terraform plan risk analysis for infrastructure changes'
  data_types:
    - configuration_files
    - infrastructure_state
  phi_handling: 'none' # none | de-identified | full
  pii_risk: 'low' # low | medium | high
  decision_type: 'advisory' # advisory | automated | human-in-loop
  shadow_mode_eligible: true
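Checking the metadata before submission can catch typos in the enumerated fields. The sketch below assumes the metadata has already been loaded into a plain dictionary; `validate_submission` is a hypothetical helper, and the allowed values are taken only from the inline comments in the metadata above, not from a published AIRB schema.

```python
# Allowed values copied from the inline comments in the metadata example.
ALLOWED_VALUES = {
    "risk_tier": {"tier-1", "tier-2", "tier-3"},
    "phi_handling": {"none", "de-identified", "full"},
    "pii_risk": {"low", "medium", "high"},
    "decision_type": {"advisory", "automated", "human-in-loop"},
}
REQUIRED_FIELDS = {
    "risk_tier", "use_case", "data_types", "phi_handling",
    "pii_risk", "decision_type", "shadow_mode_eligible",
}


def validate_submission(meta: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = [f"missing field: {name}" for name in sorted(REQUIRED_FIELDS - set(meta))]
    for name, allowed in ALLOWED_VALUES.items():
        if name in meta and meta[name] not in allowed:
            errors.append(f"{name}: {meta[name]!r} is not one of {sorted(allowed)}")
    if "shadow_mode_eligible" in meta and not isinstance(meta["shadow_mode_eligible"], bool):
        errors.append("shadow_mode_eligible: expected true or false")
    return errors


metadata = {
    "risk_tier": "tier-2",
    "use_case": "Terraform plan risk analysis for infrastructure changes",
    "data_types": ["configuration_files", "infrastructure_state"],
    "phi_handling": "none",
    "pii_risk": "low",
    "decision_type": "advisory",
    "shadow_mode_eligible": True,
}
print(validate_submission(metadata))
```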
2. Required documentation
Technical documentation:
- System architecture diagram
- Data flow diagram (inputs, processing, outputs)
- Model card (for ML models) or LLM specification
- Integration points and dependencies
- Failure modes and fallback mechanisms
Governance documentation:
- Privacy Impact Assessment (PIA)
- Bias and fairness analysis
- Transparency and explainability plan
- Monitoring and alerting strategy
- Incident response plan
3. Submit via UAIS
# Submit through United AI Studio portal
https://app.unitedaistudio.uhg.com/projects
# Or via API (if available)
curl -X POST https://api.unitedaistudio.uhg.com/airb/submit \
  -H "Authorization: Bearer $UAIS_TOKEN" \
  -H "Content-Type: application/json" \
  -d @airb-submission.json
4. AIRB review timeline
| Phase | Duration | Activities |
|---|---|---|
| Intake | 1-2 days | Initial review, risk tier confirmation |
| Technical review | 1-3 weeks | Architecture, security, privacy assessment |
| Bias & fairness | 1-2 weeks | Fairness testing, bias mitigation review |
| Final approval | 3-5 days | Executive review, decision |
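Adding up the phases gives a rough end-to-end estimate. This sketch assumes the phases run strictly one after another and that a week means seven calendar days; the table does not state either, so treat the result as an approximation.

```python
# (phase, minimum days, maximum days), with weeks converted to 7 calendar days each.
PHASES = [
    ("Intake", 1, 2),
    ("Technical review", 7, 21),
    ("Bias & fairness", 7, 14),
    ("Final approval", 3, 5),
]


def total_review_days(phases):
    """Sum the minimum and maximum durations across sequential phases."""
    shortest = sum(low for _, low, _ in phases)
    longest = sum(high for _, _, high in phases)
    return shortest, longest


print(total_review_days(PHASES))  # (18, 42)
```

Under these assumptions the itemized phases span roughly 2.5 to 6 weeks.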
Shadow Mode Pilot
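For Tier 2 and Tier 3 applications, run a shadow mode pilot before full deployment. In shadow mode the system generates and logs predictions on real inputs, but its output is not acted on.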
Shadow mode requirements
# Shadow mode implementation pattern
from datetime import datetime, timezone

class AIAssistant:
    def __init__(self, model, shadow_mode: bool = False):
        self.model = model  # any object exposing predict()
        self.shadow_mode = shadow_mode
        # get_airb_logger() is assumed to come from your AIRB logging setup
        self.logger = get_airb_logger()

    def make_recommendation(self, input_data):
        # Generate AI recommendation
        ai_result = self.model.predict(input_data)
        if self.shadow_mode:
            # Log recommendation but don't use it
            # (de-identify input_data first if it can contain member data)
            self.logger.log_shadow_prediction(
                input=input_data,
                prediction=ai_result,
                timestamp=datetime.now(timezone.utc),
            )
            # Return None so callers fall back to the existing non-AI path
            return None
        # Use AI recommendation in production
        return ai_result
Shadow mode duration
- Tier 2: at least 30 days and at least 1,000 predictions
- Tier 3: at least 90 days and at least 5,000 predictions
Success criteria
- Accuracy within 5% of baseline
- No bias detected in protected groups
- Explainability score > 0.7
- Incident count = 0
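The duration minimums and success criteria above can be combined into a single go/no-go check. This is a sketch under stated assumptions: `PilotResult` and `pilot_failures` are hypothetical names, and "within 5% of baseline" is interpreted here as an absolute accuracy difference of at most 0.05, which the source does not spell out.

```python
from dataclasses import dataclass

# (minimum days, minimum predictions) per tier, from the shadow mode duration list.
MIN_PILOT = {"tier-2": (30, 1000), "tier-3": (90, 5000)}


@dataclass
class PilotResult:
    tier: str
    days_run: int
    predictions: int
    accuracy: float
    baseline_accuracy: float
    bias_detected: bool
    explainability_score: float
    incident_count: int


def pilot_failures(result: PilotResult) -> list:
    """Return the unmet criteria; an empty list means the pilot passes this check."""
    min_days, min_predictions = MIN_PILOT[result.tier]
    failures = []
    if result.days_run < min_days:
        failures.append(f"ran {result.days_run} days, need {min_days}")
    if result.predictions < min_predictions:
        failures.append(f"{result.predictions} predictions, need {min_predictions}")
    # Assumption: "within 5%" means an absolute difference of 0.05 or less.
    if abs(result.accuracy - result.baseline_accuracy) > 0.05:
        failures.append("accuracy differs from baseline by more than 5%")
    if result.bias_detected:
        failures.append("bias detected in a protected group")
    if result.explainability_score <= 0.7:
        failures.append("explainability score is not above 0.7")
    if result.incident_count != 0:
        failures.append(f"{result.incident_count} incident(s) recorded")
    return failures


pilot = PilotResult("tier-2", 45, 1500, 0.91, 0.94, False, 0.82, 0)
print(pilot_failures(pilot) or "go")
```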
Bias Detection and Mitigation
Protected attributes
Monitor fairness across these dimensions:
PROTECTED_ATTRIBUTES = [
    "age",
    "gender",
    "race",
    "ethnicity",
    "disability_status",
    "language",
    "geography",  # Rural vs urban
    "socioeconomic_status"
]
Fairness metrics
from optum.rai import FairnessAnalyzer

analyzer = FairnessAnalyzer(
    model=my_model,
    protected_attributes=PROTECTED_ATTRIBUTES
)
# Calculate fairness metrics
results = analyzer.analyze(test_data)
# Must meet these thresholds
assert results.demographic_parity < 0.1  # <10% disparity
assert results.equal_opportunity > 0.9  # >90% equal opportunity
assert results.disparate_impact > 0.8  # >80% DI ratio
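To make two of these thresholds concrete, the sketch below computes demographic parity difference and disparate impact ratio by hand, using their commonly cited definitions (the gap and the ratio between the lowest and highest group selection rates). It does not use or describe the internals of `FairnessAnalyzer`, whose exact formulas may differ; all function names here are illustrative.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Fraction of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for prediction, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(prediction == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(rates):
    """Gap between the highest and lowest selection rate (0 means parity)."""
    return max(rates.values()) - min(rates.values())


def disparate_impact_ratio(rates):
    """Lowest selection rate divided by the highest (1 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group receives positive predictions, so there is no disparity
    return min(rates.values()) / highest


predictions = [1, 1, 0, 1, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = selection_rates(predictions, groups)
print(rates)
print(demographic_parity_difference(rates), disparate_impact_ratio(rates))
```

In this toy data group `a` is selected 75% of the time and group `b` 50% of the time, so the parity difference is 0.25 and the impact ratio is about 0.67; both would miss the thresholds listed above.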
Mitigation strategies
- Pre-processing: Reweight training data to balance protected groups
- In-processing: Use fairness constraints during training
- Post-processing: Adjust decision thresholds per group
- Human-in-loop: Require human review for borderline cases
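As one illustration of the pre-processing strategy, the sketch below follows the reweighing scheme described by Kamiran and Calders: each training example gets the weight P(group) x P(label) / P(group, label), so that group and label look statistically independent after weighting. The source does not prescribe this particular method, and `reweighing_weights` is a hypothetical helper.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-example weights that balance label rates across groups."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[group] * label_counts[label]) / (n * joint_counts[(group, label)])
        for group, label in zip(groups, labels)
    ]


groups = ["a", "a", "a", "b", "b", "b"]
labels = [1, 1, 0, 1, 0, 0]
weights = reweighing_weights(groups, labels)
print(weights)  # [0.75, 0.75, 1.5, 1.5, 0.75, 0.75]
```

Over-represented combinations (positive labels in group `a`) are down-weighted and under-represented ones are up-weighted; with these weights both groups have a weighted positive rate of 0.5. The weights would typically be passed to a training routine that accepts per-sample weights.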
Privacy and Security
Data handling rules
import logging

logger = logging.getLogger(__name__)

# Do NOT log or store
PROHIBITED_DATA = [
    "member_name",
    "social_security_number",
    "date_of_birth",
    "address",
    "phone_number",
    "email",
    "medical_record_number"
]

# De-identify before logging
def log_inference(input_data, output_data):
    # deidentify() is assumed to be supplied by your project or the RAI SDK
    deidentified = deidentify(input_data, PROHIBITED_DATA)
    logger.info("Inference: %s -> %s", deidentified, output_data)
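The `deidentify` call above is not defined in this file. One possible shape for it, assuming inputs are nested dictionaries and lists keyed by field name, is sketched below. This is an illustrative stand-in rather than the RAI SDK implementation, and it only redacts by key: it will not catch identifiers embedded in free text.

```python
PROHIBITED_DATA = [
    "member_name",
    "social_security_number",
    "date_of_birth",
    "address",
    "phone_number",
    "email",
    "medical_record_number",
]
REDACTED = "[REDACTED]"


def deidentify(record, prohibited_fields):
    """Return a copy of record with prohibited keys redacted at any nesting depth."""
    blocked = set(prohibited_fields)
    if isinstance(record, dict):
        return {
            key: REDACTED if key in blocked else deidentify(value, blocked)
            for key, value in record.items()
        }
    if isinstance(record, list):
        return [deidentify(item, blocked) for item in record]
    return record  # scalars pass through unchanged


request = {
    "member_name": "Jane Doe",
    "plan": "gold",
    "contacts": [{"email": "jane@example.com", "preferred": True}],
}
print(deidentify(request, PROHIBITED_DATA))
```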
Encryption requirements
- At rest: All model artifacts and training data must be encrypted (AES-256)
- In transit: TLS 1.3 for all API calls
- In memory: Use secure enclaves for sensitive inference
Access control
# Role-based access control
rbac:
  model_developer:
    - read_training_data
    - write_model
    - deploy_to_dev
  data_scientist:
    - read_training_data
    - write_model
  production_deployer:
    - deploy_to_prod
    - manage_monitoring
  auditor:
    - read_logs
    - read_metrics
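A permission check against this role map could look like the sketch below. The roles and permissions mirror the configuration above; the `is_allowed` helper and the idea of a user holding a list of roles are assumptions made for illustration, since the source does not say how the configuration is enforced.

```python
# Role-to-permission map mirroring the RBAC configuration above.
RBAC = {
    "model_developer": {"read_training_data", "write_model", "deploy_to_dev"},
    "data_scientist": {"read_training_data", "write_model"},
    "production_deployer": {"deploy_to_prod", "manage_monitoring"},
    "auditor": {"read_logs", "read_metrics"},
}


def is_allowed(roles, permission):
    """True if any of the user's roles grants the permission; unknown roles grant nothing."""
    return any(permission in RBAC.get(role, set()) for role in roles)


print(is_allowed(["data_scientist"], "write_model"))
print(is_allowed(["data_scientist"], "deploy_to_prod"))
```

Note that in this map no single role can both write a model and deploy it to production, which keeps development and release duties separate.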
Transparency and Explainability
Model cards
Every model must have a model card:
# Model Card: Terraform Risk Analyzer
## Model Details
- **Model type**: GPT-4-based risk analysis agent
- **Training data**: 10,000 anonymized Terraform plans
- **Version**: 1.2.0
- **Last updated**: 2025-12-11
## Intended Use
- **Primary use**: Identify high-risk changes in Terraform plans
- **Out of scope**: Automated approval/rejection of changes
- **Target users**: Platform engineers, SREs
## Performance
- **Accuracy**: 94% on validation set
- **Precision**: 92%
- **Recall**: 91%
- **F1 Score**: 0.915
## Fairness
- No protected attributes in scope
- Geographic analysis: No significant regional bias
## Limitations
- May miss novel attack patterns not in training data
- Requires human review for high-risk changes
- Sensitive to input formatting
Explainability in code
# Provide explanations for all AI decisions
def explain_decision(input_data, model_output):
    """
    Generate human-readable explanation for AI decision.
    Required for AIRB compliance.
    """
    # Use SHAP, LIME, or attention weights
    explanation = model.explain(input_data)
    return {
        "decision": model_output,
        "confidence": explanation.confidence,
        "top_factors": explanation.top_factors[:5],
        "counterfactuals": explanation.counterfactuals,
        "human_readable": f"Decision based on {explanation.summary}"
    }
Monitoring and Alerting
Required metrics
# Monitor these metrics in production
REQUIRED_METRICS = {
    "inference_latency_p95": 500,  # ms
    "error_rate": 0.01,  # 1%
    "bias_drift": 0.05,  # 5% max drift
    "accuracy_drift": 0.05,  # 5% max drift
    "explainability_score": 0.7  # >70%
}

# Alert thresholds
ALERT_THRESHOLDS = {
    "critical": {
        "error_rate": 0.05,  # 5%
        "bias_drift": 0.10,  # 10%
    },
    "warning": {
        "inference_latency_p95": 1000,  # ms
        "accuracy_drift": 0.08,  # 8%
    }
}
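The alert thresholds can be applied to a snapshot of observed metrics as sketched below. The `classify_alerts` helper is hypothetical, and it assumes an alert fires only when the observed value strictly exceeds the threshold; the source does not say whether the comparison is inclusive.

```python
ALERT_THRESHOLDS = {
    "critical": {"error_rate": 0.05, "bias_drift": 0.10},
    "warning": {"inference_latency_p95": 1000, "accuracy_drift": 0.08},
}


def classify_alerts(observed):
    """Map each breached metric to its severity; metrics without a reading are skipped."""
    alerts = {}
    for severity, limits in ALERT_THRESHOLDS.items():
        for metric, limit in limits.items():
            # Assumption: an alert fires only when the value is strictly above the limit.
            if metric in observed and observed[metric] > limit:
                alerts[metric] = severity
    return alerts


snapshot = {"error_rate": 0.06, "inference_latency_p95": 800, "accuracy_drift": 0.09}
print(classify_alerts(snapshot))
```

In this snapshot the error rate triggers a critical alert and the accuracy drift a warning, while latency stays under its limit.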
Incident response
When RAI violations are detected:
- Immediate: Trigger kill switch via Agent Gateway
- Within 1 hour: Notify AIRB and product owner
- Within 4 hours: Root cause analysis
- Within 24 hours: Remediation plan
- Within 1 week: Post-mortem and prevention measures
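The response windows translate directly into deadlines once the detection time is known. A minimal sketch, assuming a timezone-aware detection timestamp; `response_deadlines` is a hypothetical helper and does not call the Agent Gateway or any notification system.

```python
from datetime import datetime, timedelta, timezone

# Steps and windows copied from the incident response list above.
RESPONSE_WINDOWS = [
    ("Trigger kill switch via Agent Gateway", timedelta(0)),
    ("Notify AIRB and product owner", timedelta(hours=1)),
    ("Root cause analysis", timedelta(hours=4)),
    ("Remediation plan", timedelta(hours=24)),
    ("Post-mortem and prevention measures", timedelta(weeks=1)),
]


def response_deadlines(detected_at):
    """Return (step, deadline) pairs measured from the detection time."""
    return [(step, detected_at + window) for step, window in RESPONSE_WINDOWS]


detected = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
for step, deadline in response_deadlines(detected):
    print(f"{deadline.isoformat()}  {step}")
```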
Code Examples
Compliant LLM application
from optum.rai import RAIFramework
from optum.monitoring import AIMonitor

class CompliantLLMApp:
    def __init__(self, llm):
        self.llm = llm  # LLM client exposing generate(), supplied by the caller
        self.rai = RAIFramework(
            risk_tier="tier-2",
            airb_ticket="AIRB-2025-1234"
        )
        self.monitor = AIMonitor(app_name="terraform-assistant")

    def process_request(self, user_input):
        # 1. Validate input
        if not self.rai.validate_input(user_input):
            return {"error": "Invalid input"}
        # 2. Check for PII/PHI
        if self.rai.contains_sensitive_data(user_input):
            return {"error": "Sensitive data detected"}
        # 3. Generate response
        response = self.llm.generate(user_input)
        # 4. Explain decision (self.explain must be implemented by the application)
        explanation = self.explain(user_input, response)
        # 5. Log for monitoring
        self.monitor.log_inference(
            input=self.rai.deidentify(user_input),
            output=response,
            explanation=explanation
        )
        # 6. Return with explanation
        return {
            "response": response,
            "explanation": explanation,
            "confidence": explanation.confidence
        }
Testing for bias
import pytest
from optum.rai.testing import BiasTestSuite

class TestModelFairness:
    def test_demographic_parity(self):
        """Ensure model treats all demographic groups fairly."""
        suite = BiasTestSuite(model=my_model)
        results = suite.test_demographic_parity(
            test_data,
            protected_attr="age"
        )
        assert results.disparity < 0.1

    def test_equal_opportunity(self):
        """Ensure equal true positive rates across groups."""
        suite = BiasTestSuite(model=my_model)
        results = suite.test_equal_opportunity(
            test_data,
            protected_attr="gender"
        )
        assert results.tpr_disparity < 0.05
Resources
Internal documentation
- RAI Development Guide v3.0: V3_0 RAI Development Guide_published.pdf
- AIRB process: https://docs.hcp.uhg.com/united-ai-studio/submit-a-review-to-the-mlrb
- RAI SharePoint: https://uhgazure.sharepoint.com/sites/ResponsibleUseofAI
Training and support
- RAI Office Hours: Weekly, see SharePoint for schedule
- UAIS Support: https://docs.hcp.uhg.com/united-ai-studio/support-faq
- Generative AI CoE: https://genaicoe.goto.optum.com
Tools and libraries
- Optum RAI SDK: pip install optum-rai
- Fairness toolkit: https://github.com/optum-labs/fairness-toolkit
- Model card generator: https://github.com/optum-labs/model-cards
Compliance checklist
Before deploying to production:
- AIRB ticket created and approved
- Risk tier assessment completed
- PIA (Privacy Impact Assessment) submitted
- Bias and fairness testing completed
- Model card published
- Monitoring and alerting configured
- Incident response plan documented
- Shadow mode pilot completed (if required)
- Security review passed
- Kill switch integrated via Agent Gateway
- Documentation published to UAIS
- Training provided to end users
Version history
- v3.0 (2025-12-11): Initial instruction file based on RAI Development Guide v3.0

