Clinical RAG & SQL Agent

An AI assistant for clinical data with a security-first architecture. The LLM never accesses patient data directly; it can only request data through validated tool interfaces.

Security Model

Traditional SQL agents are dangerous:
- The LLM can construct arbitrary queries
- Inputs are not validated
- Raw patient data ends up in the LLM context

Our approach: Parameterized tools with strict validation.

# Bad: Direct SQL access
agent.run("SELECT * FROM patients")  # Security hole!

# Good: Validated tool interface
tools.get_cohort_stats(diagnosis_keyword="cirrhosis")  # Returns aggregates only
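A minimal sketch of how a tool like `get_cohort_stats` might work internally. It assumes a SQLite table named `patients` with `diagnosis` and `age` columns; the schema, the allowlist contents, the explicit `conn` argument, and the returned field names are illustrative assumptions, not the project's actual implementation. The keyword is checked against an allowlist, bound as a query parameter rather than interpolated into the SQL, and only aggregates leave the function.

```python
import sqlite3

# Hypothetical allowlist; the real project defines its own ALLOWED_DIAGNOSES.
ALLOWED_DIAGNOSES = {"cirrhosis", "hepatitis", "hcc"}


def get_cohort_stats(conn: sqlite3.Connection, diagnosis_keyword: str) -> dict:
    """Return aggregate statistics only; no row-level patient data."""
    keyword = diagnosis_keyword.lower()
    if keyword not in ALLOWED_DIAGNOSES:
        raise ValueError(f"Unsupported diagnosis keyword: {diagnosis_keyword!r}")

    # The keyword is bound as a parameter, never interpolated into the SQL text.
    row = conn.execute(
        "SELECT COUNT(*), AVG(age) FROM patients WHERE LOWER(diagnosis) LIKE ?",
        (f"%{keyword}%",),
    ).fetchone()
    return {"diagnosis": keyword, "patient_count": row[0], "mean_age": row[1]}


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory table with synthetic rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients (patient_id TEXT, diagnosis TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?, ?)",
        [
            ("P-0001", "Cirrhosis", 60),
            ("P-0002", "Hepatitis B", 40),
            ("P-0003", "Alcoholic cirrhosis", 50),
        ],
    )
    return conn
```

Calling `get_cohort_stats(make_demo_db(), "cirrhosis")` on the synthetic rows returns a count and a mean age, while a keyword outside the allowlist raises `ValueError` before any query runs. The LLM only ever sees the returned dictionary.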

Tool Architecture

The agent has access to specialized tools:

Tool	Purpose	Returns
`get_cohort_stats`	Population overview	Counts, averages
`count_by_diagnosis`	Disease prevalence	Aggregate counts
`get_lab_distribution`	Lab value ranges	Min/max/mean/std
`compare_cohorts`	Group comparisons	Statistical summaries
`search_clinical_notes`	RAG over notes	Relevant excerpts
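One way to expose a fixed tool set like this is a registry that maps tool names to functions and refuses anything else. The sketch below is an assumption about how such dispatch could look: `TOOL_REGISTRY`, `register_tool`, and `dispatch` are hypothetical names, and the `count_by_diagnosis` body is a stub standing in for a real aggregate query.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: tool names from the table mapped to callables.
ToolFn = Callable[..., Dict[str, Any]]
TOOL_REGISTRY: Dict[str, ToolFn] = {}


def register_tool(name: str) -> Callable[[ToolFn], ToolFn]:
    """Decorator that records a function under a fixed tool name."""
    def decorator(fn: ToolFn) -> ToolFn:
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator


@register_tool("count_by_diagnosis")
def count_by_diagnosis(diagnosis_keyword: str) -> Dict[str, Any]:
    # Stub: a real tool would run an aggregate query; the count is a placeholder.
    return {"diagnosis": diagnosis_keyword, "count": 0}


def dispatch(tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Run a tool requested by the LLM, refusing anything not registered."""
    if tool_name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {tool_name!r}")
    return TOOL_REGISTRY[tool_name](**arguments)
```

With this shape, the model's output is reduced to a tool name plus keyword arguments, so a request for an unregistered name (for example a hypothetical `run_sql`) fails at `dispatch` instead of reaching the database.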

Validation Layer

All inputs pass through validation:

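import re

def validate_patient_id(patient_id: str) -> bool:
    # fullmatch rejects a trailing newline, which `$` would allow; [0-9] excludes non-ASCII digits
    return bool(re.fullmatch(r'P-[0-9]{4}', patient_id))

def validate_diagnosis(keyword: str) -> bool:
    # ALLOWED_DIAGNOSES: allowlist of lowercase diagnosis keywords, assumed to be defined elsewhere
    return keyword.lower() in ALLOWED_DIAGNOSES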
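Validators like these are easy to exercise with a handful of accept/reject cases. The sketch below declares a strict patient-ID check under the hypothetical name `is_valid_patient_id`, using `re.fullmatch` and an explicit `[0-9]` class. This matters because, in Python, `$` also matches just before a trailing newline and `\d` matches non-ASCII digits in `str` patterns. The test inputs are made up for illustration.

```python
import re

PATIENT_ID_PATTERN = re.compile(r"P-[0-9]{4}")


def is_valid_patient_id(patient_id: str) -> bool:
    """Strict check: the whole string must be 'P-' plus four ASCII digits."""
    return PATIENT_ID_PATTERN.fullmatch(patient_id) is not None


# Illustrative inputs mapped to the expected verdict.
CASES = {
    "P-1234": True,
    "P-1234\n": False,                       # trailing newline
    "P-12345": False,                        # too many digits
    "p-1234": False,                         # wrong case
    "P-1234; DROP TABLE patients": False,    # injected suffix
    "P-\u0661\u0662\u0663\u0664": False,     # Arabic-Indic digits
}

results = {value: is_valid_patient_id(value) for value in CASES}
```

Comparing `results` with `CASES` gives a quick regression check; the newline and Arabic-Indic cases are the ones a pattern written as `^P-\d{4}$` with `re.match` would accept.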

Live Demo

Try the agent at Clinical RAG Agent Demo. Ask questions like:
- "How many patients have cirrhosis?"
- "Compare lab values between hepatitis and HCC patients"
- "What's the average AFP in elevated cases?"