XpioHealth

Case Study

Migrating 18 Million Data Points from a Legacy EHR

How Xpio mapped a legacy ECHO database, built a complete virtual health record, and safely extracted every patient record for HopeSparks Family Services during their EHR transition.

Patient records recovered
~6,000
Data fields per patient
280+
Source tables mapped
50+
Total data points extracted
18M+

The Challenge

Complex migration. 6,000 patients. Zero data loss required.

01

Legacy ECHO/Revman EHR database (SQL Server Express) requiring custom extraction tooling for a comprehensive data migration

02

Client transitioning to a new EHR needed complete patient data extracted before decommissioning the old system

03

50+ database tables spanning clinical, billing, and administrative domains, requiring comprehensive schema mapping and relationship resolution

04

Scanned documents and digital signatures stored in specialized formats requiring custom extraction tooling

05

Clinical assessments (PHQ-9, GAD-7, CPSS) embedded across wide tables with 150+ columns requiring detailed field mapping

What Xpio Built

A complete extraction system, from raw SQL to AI-powered patient summaries.

Complete Schema Mapping

Xpio mapped the entire ECHO database schema: 50+ tables, thousands of columns, and complex foreign key relationships. We built a complete data dictionary and verified it against real patient data across every table.

25-Tab Virtual Health Record

Built a complete virtual health record replicating every ECHO tab: clinical summary, demographics, diagnoses, medications, assessments, progress notes, care plans, billing, signatures, and scanned documents. 100% tab coverage verified.

AI-Powered Clinical Summaries

The Claude API generates patient summaries from the extracted structured data: treatment history, assessment trends, medication timelines, and care plan status. Intelligent caching reduced AI response times from 24 seconds to under 50ms.

Intelligent Caching & Performance

Hash-based cache versioning tied to patient data modifications. Cache-first architecture reduced query times from 30+ seconds to under 50ms. 95%+ cache hit rate with automatic invalidation when source data changes.
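As a rough illustration of hash-based cache versioning, the sketch below hashes a patient's structured data and reuses a stored summary only while that hash is unchanged. It is a minimal in-memory example, not Xpio's implementation: `data_version`, `SummaryCache`, and the `generate` callable are hypothetical names, and SHA-256 over canonical JSON is an assumed choice of hash.

```python
import hashlib
import json


def data_version(patient_record: dict) -> str:
    """Return a stable hash of the patient's structured data."""
    # sort_keys makes the hash independent of dictionary key order.
    canonical = json.dumps(patient_record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SummaryCache:
    """In-memory cache: each entry remembers the data hash it was built from."""

    def __init__(self, generate):
        self._generate = generate  # hypothetical callable: record -> summary text
        self._entries = {}         # patient_id -> (version, summary)

    def get(self, patient_id: str, patient_record: dict) -> str:
        version = data_version(patient_record)
        cached = self._entries.get(patient_id)
        if cached is not None and cached[0] == version:
            return cached[1]  # cache hit: the source data has not changed
        # Missing or stale entry: regenerate and store under the new version.
        summary = self._generate(patient_record)
        self._entries[patient_id] = (version, summary)
        return summary
```

Because the version is derived from the data itself, no separate invalidation step is needed: any change to the record produces a different hash, and the next access regenerates the summary.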

Data Quality & PHI Protection

Automated data quality scoring per patient, field completeness tracking, validation, and issue identification. SSN masking, date standardization, and empty field filtering. Every record cleaned and verified before export.
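The cleaning steps named here (SSN masking, date standardization, empty field filtering, completeness tracking) could look something like the sketch below. Field names, the accepted date formats, and the scoring rule are all assumptions made for illustration; the case study does not describe how the actual quality score is calculated.

```python
from datetime import datetime

DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%m/%d/%y")  # assumed source formats


def mask_ssn(value: str) -> str:
    """Keep only the last four digits; return '' unless there are nine digits."""
    digits = "".join(ch for ch in value if ch.isdigit())
    if len(digits) != 9:
        return ""
    return "***-**-" + digits[-4:]


def standardize_date(value: str) -> str:
    """Return an ISO 8601 date, or '' if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return ""


def clean_record(raw: dict, date_fields=("dob",), ssn_fields=("ssn",)) -> dict:
    """Trim strings, mask SSNs, standardize dates, and drop empty fields."""
    cleaned = {}
    for key, value in raw.items():
        if isinstance(value, str):
            value = value.strip()
            if key in ssn_fields:
                value = mask_ssn(value)
            elif key in date_fields:
                value = standardize_date(value)
        if value in ("", None):
            continue  # empty field filtering
        cleaned[key] = value
    return cleaned


def completeness_score(cleaned: dict, expected_fields) -> int:
    """Percentage (0-100) of expected fields present after cleaning."""
    expected = list(expected_fields)
    if not expected:
        return 0
    present = sum(1 for field in expected if field in cleaned)
    return round(100 * present / len(expected))
```

In this sketch an unparseable date or malformed SSN becomes an empty value and is filtered out, so it lowers the completeness score instead of passing through unverified.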

Document & Signature Recovery

Recovered scanned clinical documents and digital signatures from proprietary ECHO storage (cd.images linked through episode chains). Binary formats decoded and exported as standard files with full audit trail.

Data Coverage

Every clinical domain. Every record.

Clinical Assessments

PHQ-9, GAD-7, CPSS, intake forms

Progress Notes

Session notes, cancellation records

Diagnoses

ICD-10 codes with date ranges

Medications

Prescriptions, dosages, providers

Billing & Financial

Claims, transactions, receipts

Episodes & Enrollments

Admission, discharge, program data

Care Plans

Goals, objectives, interventions

Demographics

Identity, address, communications

Signatures

Digital authorizations and consents

Documents

Scanned images, clinical attachments

Results

Zero data loss. Complete migration.

~6,000

Patient records fully extracted with complete data integrity verification

280+

Data fields per patient mapped across 50+ legacy tables

100%

ECHO tab coverage: all 25 tabs replicated in the virtual health record

50ms

Query response time with intelligent caching (down from 30+ seconds)

85/100

Average data quality score with automated field completeness tracking

Zero

Data loss. Every record verified against the source before decommissioning

Under the Hood

How the extraction pipeline works.

Data Processing Pipeline

1. Raw extraction from SQL Server Express (ECHO/Revman schema)
2. Data quality helper: trimming, standardization, validation
3. Field mapping across 50+ tables with relationship resolution
4. Master JSON generation per patient (280+ fields)
5. Quality scoring and completeness verification
6. AI-powered clinical summaries via Claude API with caching
7. Virtual health record with full 25-tab ECHO fidelity
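The field-mapping and master-JSON steps can be pictured as grouping rows from many tables under the patient they belong to. The sketch below is a simplified illustration under stated assumptions: it takes rows already read into Python dictionaries, assumes every table carries a `patient_id` column, and uses made-up table and column names. The real ECHO schema links some records indirectly (for example through episode chains), which this example does not attempt to model.

```python
import json
from collections import defaultdict


def build_master_records(tables: dict, key: str = "patient_id") -> dict:
    """Group rows from every table under the patient they belong to."""
    masters = defaultdict(lambda: defaultdict(list))
    for table_name, rows in tables.items():
        for row in rows:
            patient_id = row.get(key)
            if patient_id is None:
                continue  # a real pipeline would log orphaned rows for review
            fields = {k: v for k, v in row.items() if k != key}
            masters[patient_id][table_name].append(fields)
    return {pid: dict(domains) for pid, domains in masters.items()}


def to_master_json(patient_id, domains: dict) -> str:
    """Serialize one patient's grouped data as a single JSON document."""
    return json.dumps({"patient_id": patient_id, **domains}, sort_keys=True, default=str)


# Hypothetical sample rows standing in for extracted tables.
tables = {
    "diagnoses": [{"patient_id": 1, "icd10": "F41.1"}],
    "medications": [{"patient_id": 1, "drug": "sertraline"}],
}
masters = build_master_records(tables)
print(to_master_json(1, masters[1]))
```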

AI Cache Architecture

Every patient summary is generated once, then cached with hash-based versioning tied to the underlying data. When patient data changes, the cache auto-invalidates and the summary regenerates on next access. Cost tracking shows the exact savings per request: from ~$0.06 per AI call to $0.00 for a cache hit. The hit rate in production is 95%+, with response times dropping from 24 seconds to under 50 milliseconds.
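To show how per-request cost tracking might work, the sketch below counts hits and misses and prices each miss at the approximate $0.06 per AI call quoted above. `CacheStats` and its fields are hypothetical names, and a flat per-call cost is a simplifying assumption, since real API cost varies with the size of each request.

```python
from dataclasses import dataclass

COST_PER_AI_CALL = 0.06  # approximate per-call cost quoted above, in USD


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0

    def record(self, hit: bool) -> None:
        """Count one summary request as a cache hit or a miss."""
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    @property
    def spent(self) -> float:
        """Estimated spend: only misses trigger an AI call."""
        return self.misses * COST_PER_AI_CALL

    @property
    def saved(self) -> float:
        """Estimated savings: each hit avoids one AI call."""
        return self.hits * COST_PER_AI_CALL
```

Under these assumptions, 100 requests at a 95% hit rate would cost about $0.30 in AI calls instead of about $6.00.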

Stuck on a legacy EHR?

Whether it's ECHO, Revman, Credible, or any other legacy system, Xpio has the tools and the behavioral health expertise to get your data out safely. No data loss, no shortcuts.

Talk to Our Migration Team