Versioned Patient Repository (VPR)
A patient-owned record with professionally curated truth.
Welcome to the Versioned Patient Repository (VPR) documentation!
The VPR uses files rather than traditional databases to store patient data. On top of this, version control via Git is used to track changes to patient records over time, providing a robust and auditable history of all modifications.
For the detailed reasoning behind the design and build of the VPR, see the VPR overview.
Overview
Introduction
In today’s healthcare landscape, Electronic Patient Records (EPRs) are typically stored in centralised databases that serve the needs of organisations such as hospitals, GP practices, and social care settings. While this approach is familiar and widely adopted, it can make it harder for patients to own, directly access, and understand their data, and to flag errors.
There has been progress in this space: for example, patients in the UK can access parts of their record through the NHS App (NHS 2025) or local patient portals. However, these systems are still organisation-owned, fragmented, often limited in scope, and lack interoperability.
The Versioned Patient Repository (VPR) introduces a shift in this paradigm by placing patients at the centre of their health and care record.
The VPR is a file-based health record architecture where each patient’s data is stored as structured, human-readable documents. Instead of overwriting records, each change creates a new version, managed through Git-like version control. This produces an immutable audit trail while maintaining portability and interoperability.
At the heart of VPR is a combined keystone principle: the patient comes first, and the canonical record is kept as human-readable files. Every design choice should reinforce patient agency while preserving an auditable, legible, file-based record that patients and clinicians alike can inspect and carry with them.
Patients first
When treating a patient, we put the patient at the heart of every decision. Their needs, preferences, and rights guide our actions. The same patient-first principle should extend to health data. Current EPR implementations, however, are built around organisational needs rather than those of the individual. We need to step back and reimagine the health record from the patient’s perspective. In fact, we need to make the patient’s data portable and accessible wherever they go. This is where the file shows its strength.
The VPR is a file-based data storage structure. To ensure data integrity and traceability, data entered into a VPR record is stored and signed off under version control. Git, combined with cryptographic signing, is the underlying technology that manages versioning.
Using VPR, the patient holds a complete, versioned, and portable record that reflects their health and care journey across settings. Instead of organisations needing to broker complex integrations, the VPR offers a single, patient-held data layer – a consistent source of truth available wherever care is delivered.
Benefits of the VPR design
Placing the patient first unlocks multiple benefits. Wherever a patient can go, their record should follow – and the VPR makes this possible by using standard data structures to support interoperability by default.
From a safety perspective, patients can more easily spot errors or inconsistencies, adding an extra layer of assurance. From a financial and operational standpoint, the lightweight, open-source model of the VPR reduces infrastructure burden and supports cost-effective deployment in both small and large settings.
Technical Details
The Versioned Patient Repository (VPR) is built using a modular, open-source architecture that combines the reliability of file-based storage with the assurance of cryptographic version control. Each patient’s record consists of structured files that are stored and tracked in a Git-based system, ensuring traceability and data integrity. Instead of overwriting data, each change creates a new version that can be reviewed, audited, or rolled back if required.
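The "new version instead of overwrite" rule can be illustrated with a minimal in-memory sketch. It is not the VPR implementation and does not use Git: `History`, `Version`, and `roll_back_to` are hypothetical names, and the sketch only assumes that a rollback is itself recorded as a new version so that nothing is ever removed from the history.

```rust
/// One immutable version of a document (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Version {
    pub number: usize,
    pub author: String,
    pub content: String,
}

/// Append-only history: versions are added, never edited or removed.
#[derive(Debug, Default)]
pub struct History {
    versions: Vec<Version>,
}

impl History {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record a change as a new version and return its 1-based number.
    pub fn commit(&mut self, author: &str, content: &str) -> usize {
        let number = self.versions.len() + 1;
        self.versions.push(Version {
            number,
            author: author.to_string(),
            content: content.to_string(),
        });
        number
    }

    /// The most recent version, if any.
    pub fn latest(&self) -> Option<&Version> {
        self.versions.last()
    }

    /// Look up a version by its 1-based number.
    pub fn get(&self, number: usize) -> Option<&Version> {
        number.checked_sub(1).and_then(|i| self.versions.get(i))
    }

    /// Roll back by committing the old content again as a new version,
    /// so the versions in between stay available for audit.
    pub fn roll_back_to(&mut self, number: usize, author: &str) -> Option<usize> {
        let content = self.get(number)?.content.clone();
        Some(self.commit(author, &content))
    }

    pub fn len(&self) -> usize {
        self.versions.len()
    }

    pub fn is_empty(&self) -> bool {
        self.versions.is_empty()
    }
}
```

In this sketch, rolling back to version 1 after two commits yields a third version whose content matches the first, while version 2 remains readable.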
The VPR is written in Rust, a systems programming language known for its safety, speed, and memory efficiency. The codebase is organised as a collection of independent Rust crates with clearly defined interfaces. This modular approach allows developers to adapt, extend, or replace components without altering the overall structure. For instance, separate crates can handle data storage, versioning logic, cryptographic signing, and API delivery.
The system supports multiple build configurations, enabling the same core codebase to serve both patient-facing and organisation-facing use cases. Compile-time flags determine which functionality is included in each build. A patient-side build contains only the features needed to view and manage one’s own record, keeping it lightweight and secure. An organisation-side build may include additional modules for managing multiple patients, enforcing access controls, and supporting integration with other clinical systems. This approach ensures that both variants remain aligned to the same specification while optimising performance for their respective roles.
Deployment is flexible. The VPR can be embedded within standalone desktop or mobile applications, distributed as an encrypted patient-held package, or hosted on secure institutional servers. Because data is file-based rather than database-bound, deployment does not rely on heavy infrastructure or proprietary database engines. Each record remains portable and can be reconstructed on any compatible instance of the system.
While files act as the canonical data source, efficient access for clinicians and applications requires high-performance querying. To support this, the VPR introduces database-based projections: pre-computed views of the file data that are optimised for specific operations such as patient summaries, correspondence lists, or message threads. Projections can be refreshed automatically whenever a new commit is made, or generated on demand for less frequently accessed data. This design provides the responsiveness of a traditional database while retaining the transparency and auditability of file storage.
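A projection that refreshes when a new commit arrives might look roughly like the following. This is an in-memory sketch only: the source describes database-backed projections, whereas `Canonical`, `LetterList`, the `letters/` path prefix, and the integer commit counter are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Stand-in for the canonical file store: path -> content, plus a
/// counter that increases with every commit (hypothetical type).
#[derive(Debug, Default)]
pub struct Canonical {
    files: BTreeMap<String, String>,
    commit_id: u64,
}

impl Canonical {
    pub fn new() -> Self {
        Self::default()
    }

    /// Write a file and bump the commit counter.
    pub fn commit(&mut self, path: &str, content: &str) -> u64 {
        self.files.insert(path.to_string(), content.to_string());
        self.commit_id += 1;
        self.commit_id
    }
}

/// A pre-computed view: the list of correspondence files.
#[derive(Debug, Default)]
pub struct LetterList {
    built_at: Option<u64>,
    paths: Vec<String>,
    rebuilds: usize,
}

impl LetterList {
    pub fn new() -> Self {
        Self::default()
    }

    /// Return the view, rebuilding it only if the canonical store has
    /// moved on since the view was last computed.
    pub fn read(&mut self, source: &Canonical) -> &[String] {
        if self.built_at != Some(source.commit_id) {
            self.paths = source
                .files
                .keys()
                .filter(|p| p.starts_with("letters/"))
                .cloned()
                .collect();
            self.built_at = Some(source.commit_id);
            self.rebuilds += 1;
        }
        &self.paths
    }

    /// How many times the view has been recomputed.
    pub fn rebuilds(&self) -> usize {
        self.rebuilds
    }
}
```

The projection holds no data of its own: it can be discarded and recomputed from the files at any time, which is what lets the files remain the canonical source.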
Security is embedded at every layer. All files are cryptographically signed and checksummed to prevent tampering. Access is controlled through authenticated APIs, and sensitive data can be encrypted both at rest and in transit. The combination of version control, immutable history, and cryptographic verification ensures that every change is attributable and recoverable, which is essential for clinical safety and regulatory compliance.
In summary, the VPR merges the rigour of modern software engineering with the principles of safe clinical record-keeping. By treating files as the canonical source and databases as transient projections, it achieves both transparency and speed. The result is a system that is secure, flexible, and designed to evolve alongside the healthcare organisations and patients it serves.
Files as canonical, projections for performance, patient as the atomic unit.
Data Structure and Standards
The underlying data format follows openEHR models for clinical content and FHIR standards for demographics and coordination data. Files are stored as markdown and JSON-compatible structures. These act like non-relational documents – self-contained, structured, and readable – making them easy to process in a wide range of applications.
Patient data is organised into three separate repositories:
- Clinical repository: openEHR-based clinical content (observations, diagnoses, clinical letters)
- Demographics repository: FHIR-based patient demographics (name, date of birth, identifiers)
- Coordination repository (Care Coordination Repository): care coordination data (encounters, appointments, episodes, referrals) – format to be determined, may adopt FHIR conventions
This structure recreates the layered design of openEHR – a clear distinction between data content, clinical models, and terminology – while adding administrative coordination as a separate concern. None of these require a centralised relational database.
Versioning and Audit Trail
Every change to the VPR is committed using Git. Nothing is deleted or lost – a full cryptographic audit trail is preserved. This immutability is fundamental to patient safety, clinical governance, and legal compliance.
The Four Commit Actions
VPR uses a controlled vocabulary for all changes:
- Create: Adding new content (new letter, observation, diagnosis, or record initialization)
- Update: Modifying existing content (corrections, amendments, demographic updates)
- Superseded: When newer clinical information replaces previous content (revised diagnoses, updated care plans)
- Redact: The only action that removes data from view - used when data is entered into the wrong patient’s repository (clinical, demographics, or coordination)
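One way to keep this vocabulary controlled in code is a closed enum. The sketch below is illustrative rather than the project's actual type: the name `CommitAction`, the lowercase string forms, and the `removes_from_view` helper are assumptions.

```rust
use std::fmt;
use std::str::FromStr;

/// The four permitted commit actions (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CommitAction {
    Create,
    Update,
    Superseded,
    Redact,
}

impl CommitAction {
    /// Lowercase label; the exact spelling used in commits is an assumption.
    pub fn as_str(self) -> &'static str {
        match self {
            CommitAction::Create => "create",
            CommitAction::Update => "update",
            CommitAction::Superseded => "superseded",
            CommitAction::Redact => "redact",
        }
    }

    /// Only redaction removes content from the active view.
    pub fn removes_from_view(self) -> bool {
        matches!(self, CommitAction::Redact)
    }
}

impl fmt::Display for CommitAction {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for CommitAction {
    type Err = String;

    /// Parse a label, rejecting anything outside the vocabulary.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "create" => Ok(CommitAction::Create),
            "update" => Ok(CommitAction::Update),
            "superseded" => Ok(CommitAction::Superseded),
            "redact" => Ok(CommitAction::Redact),
            other => Err(format!("unknown commit action: {other}")),
        }
    }
}
```

Because the enum is closed, a label such as "delete" fails to parse instead of silently becoming a fifth action.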
How Redaction Works
When data is mistakenly entered into the wrong patient’s repository:
- The data is removed from the active view
- It is encrypted and stored securely in the Redaction Retention Repository
- A non-human-readable tombstone/pointer remains in the original Git history
- The commit records the redaction action with full audit metadata
This process maintains complete traceability without exposing sensitive data, ensuring both patient privacy and audit compliance.
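The redaction steps above can be sketched as a single function. This is a structural illustration under stated assumptions, not the VPR implementation: `Repository`, `RetentionVault`, `Entry`, and `redact` are hypothetical names, the pointer is supplied by the caller rather than derived here, and encryption is passed in as a closure because the sketch does not implement any cryptography itself.

```rust
use std::collections::HashMap;

/// What a path in the patient's repository currently holds.
#[derive(Debug, Clone, PartialEq)]
pub enum Entry {
    Active(String),
    /// Opaque pointer into the retention vault; carries no content.
    Tombstone { pointer: String },
}

/// Stand-in for one patient repository plus its audit metadata.
#[derive(Debug, Default)]
pub struct Repository {
    entries: HashMap<String, Entry>,
    audit_log: Vec<String>,
}

impl Repository {
    pub fn add(&mut self, path: &str, content: &str) {
        self.entries
            .insert(path.to_string(), Entry::Active(content.to_string()));
    }

    pub fn get(&self, path: &str) -> Option<&Entry> {
        self.entries.get(path)
    }

    /// Paths still visible in the active view, sorted for stable output.
    pub fn active_paths(&self) -> Vec<String> {
        let mut paths: Vec<String> = self
            .entries
            .iter()
            .filter(|(_, e)| matches!(e, Entry::Active(_)))
            .map(|(p, _)| p.clone())
            .collect();
        paths.sort();
        paths
    }

    pub fn audit_log(&self) -> &[String] {
        &self.audit_log
    }
}

/// Stand-in for the Redaction Retention Repository.
#[derive(Debug, Default)]
pub struct RetentionVault {
    sealed: HashMap<String, Vec<u8>>,
}

impl RetentionVault {
    pub fn contains(&self, pointer: &str) -> bool {
        self.sealed.contains_key(pointer)
    }
}

/// Move content out of the active view, store it sealed in the vault,
/// leave a tombstone behind, and record the action for audit.
pub fn redact(
    repo: &mut Repository,
    vault: &mut RetentionVault,
    path: &str,
    pointer: &str,
    author: &str,
    encrypt: impl Fn(&[u8]) -> Vec<u8>,
) -> Result<(), String> {
    let content = match repo.entries.get(path) {
        Some(Entry::Active(text)) => text.clone(),
        Some(Entry::Tombstone { .. }) => return Err(format!("{path} is already redacted")),
        None => return Err(format!("{path} not found")),
    };
    vault
        .sealed
        .insert(pointer.to_string(), encrypt(content.as_bytes()));
    repo.entries.insert(
        path.to_string(),
        Entry::Tombstone { pointer: pointer.to_string() },
    );
    repo.audit_log.push(format!("redact {path} by {author}"));
    Ok(())
}
```

A real deployment would pass an authenticated encryption routine as `encrypt` and would record the action as a Git commit; both are outside the scope of this sketch.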
What VPR Never Does
VPR never deletes data from version control history. Even redacted data is moved to secure storage rather than destroyed. This guarantees:
- Patient Safety: All changes are traceable to specific authors at specific times
- Legal Compliance: Complete audit trail meets regulatory requirements
- Clinical Governance: Full accountability for all modifications
- Research Value: Historical data remains available for authorized use
This immutability ensures auditability, safety, and trust – even in the face of human error.
Export and Portability
Patients can download their patient record as a bundle of files - on a USB stick, as an encrypted archive, or even loaded into a standalone reader app. These files remain functional offline and can be interpreted by lightweight applications without needing a local database engine. This simplicity ensures the records remain portable, long-lived, and system-agnostic.
Natural Progression
The VPR is the natural progression of the patient record, starting with the work of Dr Lawrence Weed in the 1960s.
“There are residents and staff-men who maintain that the content of their records is their own business. In reality, however, it is the patient’s business and the business of those who, in the future, will have to depend on that record for the patient’s care, or for medical research” (Weed 1964).
Lawrence Weed’s Problem-Oriented Medical Record (POMR) reframed medical documentation around the patient’s problems, rather than the clinician’s specialty or the hospital’s structure. His approach established a patient-centred logic for clinical reasoning, in which each problem linked observations, assessments, and plans in a transparent and auditable way.
Building on Dr Weed’s foundation, openEHR formalised Weed’s ideas into a computable data model. Its archetypes and templates capture the clinician’s reasoning processes and the structure of clinical encounters, allowing problem-oriented documentation to be represented in interoperable, machine-readable form.
The VPR extends the above principles further. The VPR provides a longitudinal, versioned record that preserves data integrity across institutions and regions, giving both patients and clinicians access to a single evolving source of truth. Where Weed’s POMR unified thought, and openEHR unified meaning, the VPR unifies time and ownership.
References
NHS (2025). ‘Personal health records’. Available at: https://www.nhs.uk/nhs-app/nhs-app-help-and-support/health-records-in-the-nhs-app/personal-health-records/ (Accessed: 5 Nov. 2025).
Weed, L.L. (1964). ‘Medical records, patient care, and medical education’, Ir. J. Med. Sc., 462, pp. 271-82.
Literature Review
The following review outlines the key developments, standards, and technologies that inform the design of the Versioned Patient Repository (VPR), particularly with regard to data storage, patient ownership, and open-source architecture.
Data Storage Models for patient records
Database-centric models
Most traditional EPR systems are built using centralised relational databases. These models are well-established in clinical informatics and can scale effectively within single organisations. However, they often pose significant challenges when records need to move between systems, and are typically tightly coupled to the organisation’s software stack.
File-based and version-controlled models
In contrast to centralised databases, file-based systems allow for portability, transparency, and simplified version control. Several notable efforts that have explored this approach are reviewed below.
Burstein (2020a & 2020b) describes a proof-of-concept system for medical record-keeping based entirely on plain-text files and Git, developed for rural health centres in Rwanda where internet connectivity is unreliable. Instead of using a traditional database, the system stores patient data in human-readable YAML files and uses Git to manage version control, replication, and audit trails. This architecture prioritises offline resilience, transparency, and long-term accessibility, avoiding vendor lock-in and enabling data portability across devices. While not suitable for all settings, the project demonstrates that file-based, version-controlled health records can meet real clinical needs, especially in environments where simplicity, traceability, and decentralisation are key.
Adams (2020a & 2020b) presents a lightweight system called Hugo Clinic Notes, designed for smaller clinics and written in Markdown. The tool organises patient notes by name, date, and appointment time, supports multiple note types (such as assessments and follow-ups with embedded media), and includes a printable view so records can be easily saved or shared. While patient data itself is not version controlled, Git is used to manage the form templates and archetypes, allowing clinical structures to evolve safely over time. Notes are edited manually as Markdown files outside the system, and Hugo is then used to regenerate the site as a set of static HTML pages. Emphasising portability, simplicity, and clinician or patient control over data location, the project demonstrates how static site generation and file-based structures can support clinical documentation when traditional EPR systems may be unnecessarily complex. Although primarily used and maintained by its creator, it remains a useful example of how low-dependency, open tooling can be adapted for healthcare use.
Wack et al. (2025) describe the gitOmmix approach for clinical omics data, which integrates version‑control systems (specifically Git and git‑annex) with provenance knowledge‑graphs (based on PROV‑O) to enhance clinical data warehouses. The authors argue that traditional CDWs (clinical data warehouses) lack robust support for large data files and longitudinal provenance tracking. In response, gitOmmix uses Git to version and track large files (via git‑annex) and aligns version history with a provenance graph so that each data analysis, decision, and patient sample can be traced back comprehensively. The system supports querying the relationships between raw files, analyses, and clinical outcomes by combining versioning metadata and provenance semantics. Although the work is tailored particularly to omics (genomics, pathology, radiology) rather than general EPRs, it provides a compelling file‑based, version‑aware model for health‑data systems and thus offers a useful precedent for the VPR’s versioned and patient‑centric architecture.
Blockchain
Reen et al. (2019) propose a decentralised e‑health record management system that combines blockchain technology with the InterPlanetary File System (IPFS) to give patients control of their health‑data flows. The architecture stores encrypted patient records on IPFS and uses smart contracts on a blockchain to manage access authorisations, thereby enabling patient‑centric sharing, auditability and privacy. While the system emphasises distributed storage and peer‑to‑peer data exchange rather than a central database, the authors note trade‑offs in terms of scalability and the maturity of supporting infrastructure. The work provides an instructive example of how versioning, audit trails and patient‑owned data constructs can be applied in health‑care settings and hence offers relevant insight for the design of the VPR.
Shi et al. (2020) conduct a systematic literature review of blockchain applications in electronic health‑record (EHR) systems, specifically assessing how such architectures tackle security and privacy challenges. They identify that while blockchain introduces transparency, immutability and decentralised control, its implementation in healthcare faces major hurdles in scalability, interoperability, and compliance with regulatory requirements. The study thereby underscores both the promise and the limitations of distributed‑ledger approaches for patient data management and highlights the viability of hybrid or alternate version‑controlled architectures — making it a relevant reference point when considering the design of the VPR.
Antwi et al. (2021) explore how Hyperledger Fabric, a private blockchain system, could be used to manage electronic health records securely. They set up a series of test cases that mimic real clinical use, including patient and clinician access permissions, data privacy controls and how different types of files such as X-rays are handled. The study found that Hyperledger Fabric worked well for keeping data confidential and traceable, but struggled with large-scale storage and the legal requirement to delete data completely. The authors suggest that while it is not a perfect solution, private blockchains like Fabric could form part of future systems that let patients control access to their records while maintaining a strong audit trail.
Kumari et al. (2024) describe HealthRec-Chain, a system designed to give patients greater control over their health data while keeping it secure and shareable. The approach combines two technologies: blockchain, to record who accesses information, and IPFS, a distributed file system used to store the medical files themselves. Each record is automatically encrypted before being stored, and patients can grant or remove access through simple permissions. The authors test the system’s performance and find that this hybrid model could offer a practical balance between security, transparency, and scalability—avoiding some of the heavy costs of traditional blockchain-only designs.
Patient focused systems
Fasten-OnPrem is an open-source, self-hosted application for personal or family electronic medical records that aims to bring disparate data from many clinics, labs, and insurers into one place under the individual’s control. The record system was built by Kulantuga and Szilagyi (2025) and sponsored by Fasten Health. Fasten-OnPrem supports key standards such as FHIR and OAuth2 so users can link their existing records rather than manually scanning everything. The system is designed for non-clinical settings (families rather than hospitals), but demonstrates how file-based, patient-owned aggregation of health records can work in practice, emphasising portability, transparency and user control rather than heavy institutional infrastructure.
Healthcare Data Standards
This section surveys structural and exchange standards commonly used to represent and move electronic patient records.
Structural models
openEHR
A specification for modelling clinical content using archetypes and templates. It separates clinical knowledge from data persistence and can be serialised as JSON or XML. It includes constructs for composition, context and audit history.
HL7 CDA
A document-centric model for clinical correspondence and reports. CDA defines a structured container with narrative text and coded elements, typically exchanged as XML.
Exchange and APIs
HL7 v2
A widely deployed messaging standard for admissions, transfers, results and orders. It is compact and event-driven, and remains prevalent in secondary care integrations.
HL7 FHIR
A resource-based standard designed for web APIs. It serialises naturally to JSON or XML, supports profiles to constrain use, and provides resources for provenance, consent and audit. FHIR is now the dominant choice for modern interfaces and patient-facing apps.
IHE Profiles
Integration profiles such as XDS and MHD specify how documents and resources are published, discovered and retrieved across organisations, building on CDA and FHIR.
Open Source in Healthcare
The literature describes open source as a credible approach in digital health when paired with clear governance and resourcing. Reported benefits include transparency of code and data models, which supports independent security assessment, clinical safety review and reproducibility. Reuse is a second theme: open components can be adapted to local workflows, shortening time to deliver standard capabilities such as FHIR APIs, document rendering and integration gateways. Several studies note that open interfaces create incentives for interoperability by lowering switching costs across vendors and sites. Cost is presented more cautiously. Licence fees may fall, particularly for infrastructure, yet staffing, integration and long-term support remain material and require realistic budgets.
Sustainability depends on governance. Successful programmes set explicit licensing strategies, contribution guidelines and release cadences, and treat clinical safety artefacts as first-class versioned assets alongside source code. Open projects do not remove the need for security engineering. Threat modelling, coordinated vulnerability disclosure, continuous testing and dependency management are still required, and are often easier to scrutinise when build pipelines are public.
Case studies illustrate these points in practice. OpenEyes shows that a specialty EPR can be developed in the open and operated across several NHS Trusts with formal safety processes. OpenSAFELY demonstrates that transparent code and specifications can coexist with strict controls on patient data, enabling reproducible analytics at scale. OpenPrescribing provides public methods and code for prescribing analyses, supporting peer scrutiny and iterative improvement. Internationally, OpenMRS and GNU Health show long-running community models for longitudinal records and public health, while the openEHR community maintains shared archetypes and templates that allow vendors to converge on common clinical content.
Risks are also highlighted. Fragmentation can occur if forks diverge without stewardship, hidden costs can surface during integration and migration, and security can be misunderstood if openness is taken as a substitute for active assurance. The reported mitigations are straightforward but non-trivial: maintainers who curate contributions, product ownership with clinical sponsorship, published roadmaps, funded support arrangements and independent security testing. Overall, the evidence supports open source as a practical route to transparency, reuse and safer interoperability, provided it is treated as a long-term programme with disciplined governance rather than a short-term cost-saving exercise.
References
Adams, J. (2020a). ‘Hugo Clinic Notes Theme’. Available at: https://jmablog.com/post/hugo-clinic-notes/ (Accessed: 5 Nov. 2025).
Adams, J. (2020b). ‘Hugo Clinic Notes’. GitHub repository. Available at: https://github.com/jmablog/hugo-clinic-notes (Accessed: 5 Nov. 2025).
Antwi, M., Adnane, A., Ahmad, F., Hussain, R., Habib ur Rehman, M. and Kerrache, C.A. (2021). ‘The case of HyperLedger Fabric as a blockchain solution for healthcare applications’, Blockchain: Research and Applications, 2 (1), pp. 1-15, doi: https://doi.org/10.1016/j.bcra.2021.100012.
Burstein, A. (2020a). ‘Improving Health Care with Plain-Text Medical Records and Git’. Available at: https://www.gizra.com/content/plain-text-medical-records/ (Accessed: 5 Nov. 2025).
Burstein, A. (2020b). ‘mdr-git’. GitHub repository. Available at: https://github.com/amitaibu/mdr-git (Accessed: 5 Nov. 2025).
Kulantuga, J. and Szilagyi, A. (2025). ‘fasten-onprem’. GitHub repository. Available at: https://github.com/fastenhealth/fasten-onprem (Accessed: 10 Nov. 2025).
Kumari, D., Parmar, A.S., Goyal, H.S., Mishra, K. and Panda, S. (2024). ‘HealthRec-Chain: Patient-centric blockchain enabled IPFS for privacy preserving scalable health data’, Computer Networks, 241, p. 110223, doi: https://doi.org/10.1016/j.comnet.2024.110223.
Reen, G.S., Mohandas, M. and Venkatesan, S. (2019). ‘Decentralized Patient Centric e-Health Record Management System using Blockchain and IPFS’, IEEE. Available at: https://arxiv.org/pdf/2009.14285 (Accessed: 6 Nov. 2025).
Shi, S., He, D., Li, L., Khan, N., Khan, M.K. and Choo, K-K.R. (2020). ‘Applications of blockchain in ensuring the security and privacy of electronic health record systems: A survey’, Computers & Security, 97, pp. 1-20, doi: https://doi.org/10.1016/j.cose.2020.101966.
Wack, M., Coulet, A., Burgun, A. and Rance, B. (2025). ‘Enhancing clinical data warehousing with provenance data to support longitudinal analyses and large file management: The gitOmmix approach for genomic and image data’, Journal of Biomedical Informatics, 193, p. 104788, doi: https://doi.org/10.1016/j.jbi.2025.104788 (Accessed: 5 Nov. 2025).
Development Tools
This project includes comprehensive Rust formatting and linting tools to maintain code quality.
Quick Commands
# Format code
./scripts/fmt.sh
# or
cargo fmt --all
# Run linter
./scripts/lint.sh
# or
cargo clippy --all-targets --all-features -- -D warnings
# Run all quality checks
./scripts/check-all.sh
Pre-commit Hooks
The project uses pre-commit hooks to automatically check code quality before commits:
# Install pre-commit hooks
pre-commit install
# Run hooks manually on all files
pre-commit run --all-files
# Run specific hook
pre-commit run cargo-clippy
# Auto-fix formatting (manual stage)
pre-commit run cargo-fmt-fix --hook-stage manual
Configuration Files
- `rustfmt.toml` - Code formatting configuration
- `clippy.toml` - Linting rules and thresholds
- `.vscode/settings.json` - VS Code settings for Rust development
Available Hooks
- cspell - Spell checking for code and comments
- cargo-fmt-check - Formatting validation (runs on commit)
- cargo-clippy - Linting and code analysis (runs on commit)
- cargo-check - Compilation check (runs on commit)
- cargo-fmt-fix - Auto-format code (manual stage only)
- cargo-test - Run tests (manual stage only)
VS Code Integration
The project includes VS Code settings that:
- Enable format-on-save with rustfmt
- Run clippy on save for real-time linting
- Configure proper Rust file associations
- Set up PATH for cargo/rustc tools
VPR – Versioned Patient Repository
Note: This document provides a high-level overview. For detailed technical specifications, see LLM Specification.
Purpose
- Store patient records in a version-controlled manner, using Git.
- Serve those records fast to clinicians, admins, or patients.
- Keep everything accurate, secure, and auditable.
Technology Choices
- Rust for everything (fast, safe, compiled to a single binary).
- gRPC and REST APIs for system integration (fast, typed communication between systems).
- Git as the underlying truth for documents (every version saved, nothing silently overwritten).
- File-based storage with sharded directory structure for scalability.
- Future: database projections (Postgres) and caching (Redis) for performance optimisation (planned).
Data Model
- Records are stored as YAML and Markdown files inside Git repositories, versioned automatically.
- Each patient has three separate Git repositories:
  - Clinical repository: openEHR-based clinical content (observations, diagnoses, clinical letters)
  - Demographics repository: FHIR-based patient demographics (name, date of birth, identifiers)
  - Coordination repository (Care Coordination Repository): care coordination data (encounters, appointments, episodes, referrals) – format to be determined, may adopt FHIR conventions
- Patient data is sharded: `patient_data/{clinical,demographics,coordination}/<s1>/<s2>/<uuid>/`, where s1/s2 are the first 4 hex characters of the UUID.
- Every new change makes a new Git commit, never overwriting the old one.
- Commits can be cryptographically signed (ECDSA P-256) for authorship verification.
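The sharded layout can be sketched as a small path-building function. The source states only that s1/s2 come from the first 4 hex characters of the UUID, so this sketch makes two labelled assumptions: s1 is characters 1-2 and s2 is characters 3-4, and the final directory uses the lowercase, hyphen-free form of the UUID. `RepoKind` and `shard_path` are hypothetical names.

```rust
use std::path::PathBuf;

/// The three per-patient repositories.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RepoKind {
    Clinical,
    Demographics,
    Coordination,
}

impl RepoKind {
    fn dir_name(self) -> &'static str {
        match self {
            RepoKind::Clinical => "clinical",
            RepoKind::Demographics => "demographics",
            RepoKind::Coordination => "coordination",
        }
    }
}

/// Build `<root>/<kind>/<s1>/<s2>/<uuid>` for a patient.
/// Assumption: s1 = hex chars 1-2, s2 = hex chars 3-4.
pub fn shard_path(root: &str, kind: RepoKind, uuid: &str) -> Result<PathBuf, String> {
    // Normalise: drop hyphens and lowercase the hex digits.
    let hex = uuid
        .chars()
        .filter(|c| *c != '-')
        .collect::<String>()
        .to_ascii_lowercase();

    // Validate before touching the filesystem (fail fast on bad input).
    if hex.len() != 32 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("not a UUID: {uuid}"));
    }

    let mut path = PathBuf::from(root);
    path.push(kind.dir_name());
    path.push(&hex[0..2]);
    path.push(&hex[2..4]);
    path.push(&hex);
    Ok(path)
}
```

Two hex characters per level cap each shard directory at 256 children, which keeps directory listings predictable as the number of patients grows.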
API
- Dual transport: gRPC (tonic) and REST (axum/utoipa).
- Create patient – initialise new patient with demographics and clinical template.
- List patients – retrieve patient list from sharded directory structure.
- Health endpoints – confirm service availability.
- API authentication via API keys (gRPC and REST when enabled).
- OpenAPI/Swagger documentation for REST endpoints.
Security
- All communication uses encryption (TLS).
- API key authentication for gRPC; REST authentication configurable.
- Optional mTLS support planned.
- Data on disk can be encrypted if required.
- Commit signing with X.509 certificates for authorship verification.
- PHI redaction in logs and metrics.
Corrections & Deletions
- Normal use is append-only (you don’t delete history).
- If wrong patient data is added:
  - Prefer redaction (mark as wrong but leave audit trail).
  - If legally required, remove with a special process (cryptographic erase or repo rewrite).
Performance Approach
- Sharded directory structure to maintain predictable filesystem performance.
- Clinical template seeded from validated template directory at patient creation.
- Future: database projections and caching layer for API reads (planned).
- Git operations per-patient ensure isolation and manageable repository sizes.
Reliability
- Every change tracked in Git with complete audit trail.
- Provenance (who did what and when) captured in Git commit metadata.
- Commit signatures provide cryptographic proof of authorship where configured.
- Defensive programming: validate inputs before side effects, fail fast on invalid config.
Operations
- Runs as dual-service binary (`vpr-run`) or standalone gRPC/REST services.
- Configured by environment variables (patient data dir, clinical template dir, RM system version, namespace, API keys, bind addresses).
- CLI tool (`vpr-cli`) for administrative tasks.
- Docker development environment with live reload.
- Quality checks: `./scripts/check-all.sh` (fmt, clippy, check, test).
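Environment-based configuration with fail-fast validation could be sketched as follows. The variable names (`VPR_PATIENT_DATA_DIR`, `VPR_CLINICAL_TEMPLATE_DIR`, `VPR_REST_ADDR`, `VPR_API_KEY`) and the `Config` type are hypothetical and chosen only for illustration; the actual names used by VPR are not given here. The lookup is injected so the validation can be exercised without touching the real process environment.

```rust
use std::net::SocketAddr;
use std::path::PathBuf;

/// A subset of service settings (hypothetical type).
/// Debug is deliberately not derived so the API key cannot be
/// printed by accident.
pub struct Config {
    pub patient_data_dir: PathBuf,
    pub clinical_template_dir: PathBuf,
    pub rest_addr: SocketAddr,
    pub api_key: String,
}

impl Config {
    /// Build a config from any name -> value lookup, returning an
    /// error on the first missing, empty, or malformed setting.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        let required = |name: &str| -> Result<String, String> {
            match lookup(name) {
                Some(value) if !value.trim().is_empty() => Ok(value),
                _ => Err(format!("missing or empty environment variable {name}")),
            }
        };

        let rest_addr = required("VPR_REST_ADDR")?
            .parse::<SocketAddr>()
            .map_err(|e| format!("VPR_REST_ADDR: {e}"))?;

        Ok(Config {
            patient_data_dir: PathBuf::from(required("VPR_PATIENT_DATA_DIR")?),
            clinical_template_dir: PathBuf::from(required("VPR_CLINICAL_TEMPLATE_DIR")?),
            rest_addr,
            api_key: required("VPR_API_KEY")?,
        })
    }

    /// Read the settings from the process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::from_lookup(|name| std::env::var(name).ok())
    }
}
```

Calling `Config::from_env()` once at startup and exiting on `Err` means an invalid configuration is rejected before any patient data is read or written.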
Cargo features
- Cargo feature flags select which functionality is compiled into each build.
Features needed for a patient to view and edit their own records:
cargo build --features patient
Features needed for clinicians and admins to manage records in a multi-patient environment:
cargo build --features org
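Inside the code, feature-gated functionality might be expressed with Rust's `cfg` mechanism, roughly as below. The `patient` and `org` feature names come from the commands above; the capability labels and the function names `capabilities` and `list_patients` are hypothetical. For Cargo to accept `--features patient`, the features would also need to be declared in the `[features]` table of the crate's `Cargo.toml`.

```rust
/// Capabilities compiled into this build (labels are illustrative).
pub fn capabilities() -> Vec<&'static str> {
    // Assumed to be available in every build.
    let mut caps = vec!["health"];

    // `cfg!` evaluates to a compile-time boolean for the given feature.
    if cfg!(feature = "patient") {
        caps.push("view-own-record");
        caps.push("edit-own-record");
    }
    if cfg!(feature = "org") {
        caps.push("list-patients");
        caps.push("manage-access");
    }
    caps
}

/// Exists only in organisation-side builds; a patient-side binary
/// does not contain this function at all.
#[cfg(feature = "org")]
pub fn list_patients() -> Vec<String> {
    Vec::new() // placeholder body for the sketch
}
```

The `#[cfg(...)]` attribute removes code from the binary entirely, while the `cfg!(...)` macro keeps both branches type-checked and selects one at compile time; a patient-side build benefits from the former because multi-patient code is never shipped.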
Architecture Boundaries
- `crates/core` – Pure data operations: file/folder management, Git versioning, patient data CRUD. No API concerns.
- `crates/api-shared` – Shared utilities: Protobuf types, HealthService, authentication.
- `crates/api-grpc` – gRPC-specific implementation: VprService, interceptors.
- `crates/api-rest` – REST-specific implementation: HTTP endpoints, OpenAPI.
- `crates/certificates` – X.509 certificate generation for authentication and commit signing.
- `crates/cli` – Command-line interface for administrative operations.
Wrong patient
- Redact
- Stub
- Preserve cryptographic proof of what was removed
- Hash-based Message Authentication Code (HMAC): a keyed mathematical fingerprint of the original data
- Quarantine vault
- Quarantine bytes
In short: tombstone locally, escrow the content in a restricted space, and leave a non-revealing hash pointer for audit.
Healthcare Standards
VPR is built on two foundational healthcare standards: OpenEHR and FHIR. These standards provide complementary capabilities for clinical data management and interoperability.
Overview
OpenEHR
OpenEHR provides a vendor-independent architecture for storing and managing clinical data with built-in versioning, semantic interoperability, and clinical knowledge separation.
VPR uses OpenEHR for:
- Clinical record structure and composition model
- EHR status tracking and identity linkage
- Version-controlled clinical data management
- Archetype-based semantic definitions
Read more about OpenEHR in VPR →
FHIR
Fast Healthcare Interoperability Resources (FHIR) is a modern standard for exchanging healthcare data via RESTful APIs, with emphasis on ease of implementation and web-friendly formats.
VPR uses FHIR for:
- Coordination repository wire formats
- Messaging thread semantics (Communication resource)
- Future API projections and integrations
- Interoperability with external systems
Complementary Roles
OpenEHR and FHIR serve different but complementary purposes in VPR:
| Aspect | OpenEHR | FHIR |
|---|---|---|
| Primary focus | Long-term clinical data storage | Real-time data exchange |
| Architecture | Repository-based, versioned | API-first, resource-based |
| Granularity | Document-level (Compositions) | Element-level (Resources) |
| Versioning | Built-in, audit-focused | Optional, implementation-specific |
| Clinical modeling | Archetypes + Templates | Profiles + Implementation Guides |
| Best for | EHR systems, clinical archives | HIE, mobile apps, integrations |
VPR’s Hybrid Approach
VPR combines the strengths of both standards:
OpenEHR for Clinical Records:
- Compositions stored in clinical repository
- Full version history via Git
- Archetype-based semantic structure
- Long-term clinical archive
FHIR for Coordination:
- Communication semantics for messaging
- RESTful API patterns for future integration
- Resource-based wire formats
- Interoperability with external systems
This hybrid approach provides:
- Best-in-class storage: OpenEHR’s robust clinical data model
- Best-in-class exchange: FHIR’s practical API standards
- Future flexibility: Can project either standard externally
- Standards alignment: Both use standard terminologies (SNOMED, LOINC)
Design Principles
Semantic Preservation
VPR maintains the meaning of both standards:
- OpenEHR composition structure is preserved
- FHIR resource semantics are followed
- Mappings between standards are explicit
- No information loss in either direction
Implementation Pragmatism
VPR adapts standards for version-controlled storage:
- YAML instead of JSON/XML for human readability
- Git instead of database for version control
- File-based storage for simplicity and auditability
- Cryptographic signing for integrity
Progressive Enhancement
VPR can add standard APIs incrementally:
- Core storage model is standards-aligned
- APIs can be added without changing storage
- Multiple projections possible (OpenEHR API, FHIR API, GraphQL)
- Storage remains authoritative source
Standards Governance
OpenEHR Foundation
- Develops and maintains OpenEHR specifications
- Curates archetype repositories (Clinical Knowledge Manager)
- Provides conformance testing
- International community of users
VPR Compliance:
- Uses OpenEHR Reference Model structures
- Declares RM version in all files
- Follows composition and versioning semantics
- Compatible with OpenEHR tooling (parsers, validators)
HL7 International
- Develops and maintains FHIR specifications
- Manages terminology and code systems
- Provides implementation guides and profiles
- Large ecosystem of vendors and implementers
VPR Compliance:
- Uses FHIR resource semantics (conceptual alignment)
- Wire formats map to FHIR resources
- Can project to FHIR REST API
- Compatible with FHIR tooling (validators, servers)
Further Reading
- OpenEHR in VPR - Detailed coverage of OpenEHR usage
- FHIR in VPR - Detailed coverage of FHIR integration
- Clinical Repository Design
- Coordination Repository Design
- Technical Architecture
External Resources
OpenEHR
FHIR
OpenEHR Standard
Overview
OpenEHR is an open standard specification for electronic health records (EHR) that provides a vendor-independent, future-proof architecture for storing and managing clinical data. Developed by the OpenEHR Foundation, it separates clinical knowledge (archetypes) from technical implementation, enabling healthcare systems to evolve without requiring system rewrites.
Core Concepts
Reference Model (RM):
The Reference Model defines the stable information structures for representing EHR data. It includes:
- Compositions: Documents or clinical encounters (e.g., discharge summaries, lab reports)
- Entries: Individual clinical statements (observations, evaluations, instructions, actions)
- Data structures: Elements, items, clusters for organizing clinical data
- Version control: Built-in versioning for all clinical data
Archetypes:
Archetypes are reusable, computable definitions of clinical concepts (e.g., “blood pressure”, “medication order”). They:
- Define the structure and constraints for specific clinical concepts
- Are vendor-neutral and language-independent
- Can be shared across systems and jurisdictions
- Are maintained in centralized repositories (Clinical Knowledge Manager)
Templates:
Templates combine multiple archetypes into specific clinical documents (e.g., “Emergency Department Admission”, “Diabetes Review”). They:
- Constrain archetypes further for specific use cases
- Define which archetypes are mandatory or optional
- Specify terminology bindings
- Configure the data collection interface
Terminology Integration:
OpenEHR supports binding to standard terminologies:
- SNOMED CT (clinical terms)
- LOINC (laboratory and clinical observations)
- ICD-10/ICD-11 (diagnoses)
- Local terminologies as needed
Problems OpenEHR Solves
1. Semantic Interoperability
Problem: Different EHR systems represent the same clinical concept in incompatible ways, making data exchange difficult and error-prone.
Solution: Archetypes provide standardized, computable definitions of clinical concepts that work across systems. A “blood pressure” archetype means the same thing regardless of vendor.
2. Vendor Lock-in
Problem: Healthcare organizations become dependent on proprietary EHR systems, making migration expensive and risky.
Solution: OpenEHR’s vendor-neutral data model allows data to be stored in a portable format. Organizations can switch systems without data conversion.
3. Clinical Knowledge Evolution
Problem: Medical knowledge evolves faster than software development cycles. Adding new clinical concepts requires expensive system updates.
Solution: Archetypes can be created, modified, and deployed independently of the underlying software. Clinicians and informaticians can define new concepts without programmer intervention.
4. Data Quality and Validation
Problem: EHR systems often allow inconsistent or invalid data entry, compromising clinical safety.
Solution: Archetypes define constraints and validation rules at the clinical knowledge level, ensuring data quality at the point of entry.
5. Longitudinal Health Records
Problem: Patient data is fragmented across multiple systems, time periods, and care settings.
Solution: OpenEHR’s version-controlled composition model maintains complete audit trails and supports lifelong health records across organizational boundaries.
6. Research and Analytics
Problem: Clinical data locked in proprietary formats is difficult to query for research and quality improvement.
Solution: OpenEHR’s structured, semantically-defined data supports sophisticated querying (via AQL - Archetype Query Language) and data extraction.
How OpenEHR is Normally Used in Digital Health
1. National EHR Programs
OpenEHR is used for national-scale EHR deployments:
- Norway: National EHR platform (Helse Vest)
- Slovenia: National EHR infrastructure
- Brazil: Public health information systems
- Russia: National digital health initiatives
These implementations provide unified clinical data repositories serving entire populations.
2. Hospital Information Systems
OpenEHR-based clinical data repositories (CDRs) serve as:
- Central clinical data stores for hospital groups
- Integration hubs connecting departmental systems
- Long-term clinical archives replacing legacy systems
3. Clinical Decision Support
OpenEHR’s structured data enables:
- Rules-based clinical decision support
- Guideline execution engines
- Drug interaction checking
- Clinical pathways automation
4. Research Data Platforms
OpenEHR supports:
- Cohort identification for clinical trials
- Observational research databases
- Quality improvement analytics
- Population health monitoring
5. Citizen Health Records
OpenEHR powers patient portals and personal health records:
- Patient-accessible health data
- Patient-entered observations (blood pressure, glucose)
- Shared decision-making tools
- Care plan tracking
6. Specialized Clinical Systems
OpenEHR is used in domain-specific applications:
- Intensive care monitoring systems
- Oncology treatment records
- Maternal and child health tracking
- Chronic disease management
How OpenEHR is Used in VPR
1. Clinical Record Structure
VPR uses OpenEHR Reference Model structures for clinical compositions:
EHR Status:
Every patient has an ehr_status.yaml file following OpenEHR’s EHR_STATUS specification:
_type: EHR_STATUS
subject:
  _type: PARTY_SELF
is_queryable: true
is_modifiable: true
uid:
  _type: HIER_OBJECT_ID
  value: "a4f91c6d-3b2e-4c5f-9d7a-1e8b6c0a9f12"
This provides:
- Patient identity linkage
- Record queryability flags
- Modification permissions
- External references to demographics
Compositions:
Clinical documents (letters, observations) use OpenEHR COMPOSITION structure:
- _type: COMPOSITION declares the document type
- name provides human-readable document title
- archetype_node_id identifies the template/archetype used
- uid provides version-controlled unique identifier
- context captures care setting metadata
- content contains the clinical data entries
Example composition.yaml for a clinical letter:
_type: COMPOSITION
name:
  _type: DV_TEXT
  value: Clinical Letter
archetype_node_id: openEHR-EHR-COMPOSITION.correspondence.v0
uid:
  _type: HIER_OBJECT_ID
  value: "20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000"
language:
  _type: CODE_PHRASE
  terminology_id:
    _type: TERMINOLOGY_ID
    value: ISO_639-1
  code_string: en
2. Version-Controlled Repository Model
VPR adopts OpenEHR’s versioning philosophy:
Immutability:
- Clinical compositions are immutable once committed
- Changes create new versions with full audit trail
- Git provides the versioning infrastructure
- Every composition has a unique timestamp-prefixed ID
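The timestamp-prefixed identifier can be taken apart with a few string operations. The sketch below assumes the layout seen in the examples on this page (`YYYYMMDDTHHMMSS.mmmZ`, a hyphen, then a 36-character UUID); the `TimestampedId` type and `parse_timestamped_id` function are hypothetical names, not part of VPR.

```rust
/// The two parts of an identifier such as
/// `20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000`.
#[derive(Debug, PartialEq)]
pub struct TimestampedId<'a> {
    pub timestamp: &'a str,
    pub uuid: &'a str,
}

/// Split at the first hyphen and run light shape checks on both halves.
/// Assumed layout: 20-character UTC timestamp, '-', 36-character UUID.
pub fn parse_timestamped_id(id: &str) -> Option<TimestampedId<'_>> {
    // The timestamp itself contains no hyphen, so the first one is the separator.
    let (timestamp, uuid) = id.split_once('-')?;

    let ts = timestamp.as_bytes();
    let ts_ok = ts.len() == 20
        && ts[8] == b'T'
        && ts[15] == b'.'
        && ts[19] == b'Z'
        && ts
            .iter()
            .enumerate()
            .all(|(i, b)| matches!(i, 8 | 15 | 19) || b.is_ascii_digit());

    // 8-4-4-4-12 hexadecimal groups separated by hyphens.
    let uuid_ok = uuid.len() == 36
        && uuid.bytes().enumerate().all(|(i, b)| match i {
            8 | 13 | 18 | 23 => b == b'-',
            _ => b.is_ascii_hexdigit(),
        });

    (ts_ok && uuid_ok).then_some(TimestampedId { timestamp, uuid })
}
```

Because the timestamp is fixed-width and zero-padded, sorting such identifiers as plain strings also orders them chronologically.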
Contribution Model:
Each Git commit represents an OpenEHR CONTRIBUTION:
- Contains one or more VERSION objects
- Records who made the change (commit author)
- Records when the change occurred (commit timestamp)
- Records why the change was made (commit message)
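As a rough illustration of this mapping, the sketch below turns commit metadata into a simplified contribution record. Both structs are hypothetical stand-ins: `CommitMeta` is not a real Git library type, and `Contribution` flattens the OpenEHR CONTRIBUTION and its audit details into four fields.

```rust
/// Metadata read from a single Git commit (field names are illustrative).
#[derive(Debug, Clone)]
pub struct CommitMeta {
    pub author: String,
    pub timestamp: String,
    pub message: String,
    pub changed_files: Vec<String>,
}

/// Simplified stand-in for an OpenEHR CONTRIBUTION and its audit details.
#[derive(Debug, PartialEq)]
pub struct Contribution {
    pub committer: String,      // who made the change
    pub time_committed: String, // when it occurred
    pub description: String,    // why it was made
    pub versions: Vec<String>,  // one entry per changed file
}

/// Map a commit onto a contribution. A CONTRIBUTION contains one or more
/// VERSION objects, so a commit that changes no files yields `None`.
pub fn contribution_from_commit(commit: &CommitMeta) -> Option<Contribution> {
    if commit.changed_files.is_empty() {
        return None;
    }
    Some(Contribution {
        committer: commit.author.clone(),
        time_committed: commit.timestamp.clone(),
        description: commit.message.trim().to_string(),
        versions: commit.changed_files.clone(),
    })
}
```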
3. Semantic Interoperability
VPR uses OpenEHR conventions for:
Reference Model Version:
All files declare their RM version for compatibility:
_rm_version: "1.1.0"
This ensures:
- Parsers know which specification to apply
- Forward/backward compatibility can be managed
- Systems can validate against the correct schema
Type Annotations:
Every complex object declares its _type for unambiguous parsing:
- _type: COMPOSITION
- _type: DV_TEXT
- _type: DV_CODED_TEXT
- _type: PARTY_SELF
4. Clinical Data Query Support
VPR’s structured data enables OpenEHR-style querying:
Archetype paths:
Data elements are addressable via standardized paths:
/content[openEHR-EHR-OBSERVATION.blood_pressure.v2]/data/events[at0006]/data/items[at0004]/value
This allows:
- Precise data extraction
- Cross-system queries
- Research cohort identification
- Quality improvement analytics
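To show how such a path breaks down, here is a small parser sketch that splits it into named segments with optional predicates. It is a simplification under stated assumptions: predicates containing `/` (which fuller path syntax permits) are not handled, and `PathSegment` and `parse_archetype_path` are hypothetical names.

```rust
/// One step of an archetype path, e.g. `events[at0006]`.
#[derive(Debug, PartialEq)]
pub struct PathSegment {
    pub name: String,
    pub predicate: Option<String>,
}

/// Parse an absolute path such as `/content[...]/data/events[at0006]`.
/// Simplification: a predicate must not itself contain '/'.
pub fn parse_archetype_path(path: &str) -> Result<Vec<PathSegment>, String> {
    let rest = path
        .strip_prefix('/')
        .ok_or("path must start with '/'")?;

    rest.split('/')
        .map(|raw| match raw.split_once('[') {
            // Plain attribute name without a predicate.
            None if !raw.is_empty() => Ok(PathSegment {
                name: raw.to_string(),
                predicate: None,
            }),
            // Attribute name followed by a non-empty `[predicate]`.
            Some((name, tail))
                if !name.is_empty() && tail.len() > 1 && tail.ends_with(']') =>
            {
                Ok(PathSegment {
                    name: name.to_string(),
                    predicate: Some(tail[..tail.len() - 1].to_string()),
                })
            }
            _ => Err(format!("malformed segment: {raw:?}")),
        })
        .collect()
}
```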
5. Template-Based Data Collection
VPR uses OpenEHR templates for:
Clinical Document Templates:
Templates stored in crates/core/templates/clinical/ define:
- Which archetypes are included
- Mandatory vs. optional elements
- Terminology bindings
- Default values and constraints
Initialization from Templates:
When creating a new clinical record, VPR:
- Validates the template directory exists
- Copies template files to the patient’s repository
- Initializes ehr_status.yaml with proper structure
- Commits the initial state to Git
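The first two steps (validate, then copy) can be sketched with the standard library alone. This is an illustrative outline rather than VPR's implementation: `seed_from_template` is a hypothetical name, and writing `ehr_status.yaml` and creating the Git commit are left to the caller.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Copy every file under `template_dir` into `patient_dir`, creating
/// directories as needed, and return the number of files copied.
/// Fails before writing anything if the template directory is missing.
pub fn seed_from_template(template_dir: &Path, patient_dir: &Path) -> io::Result<usize> {
    // Validate first, so an invalid template causes no side effects.
    if !template_dir.is_dir() {
        return Err(io::Error::new(
            io::ErrorKind::NotFound,
            format!("template directory not found: {}", template_dir.display()),
        ));
    }

    fs::create_dir_all(patient_dir)?;
    let mut copied = 0;
    for entry in fs::read_dir(template_dir)? {
        let entry = entry?;
        let target = patient_dir.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            // Recurse into subdirectories of the template.
            copied += seed_from_template(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
            copied += 1;
        }
    }
    Ok(copied)
}
```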
6. Deviations from Standard OpenEHR
VPR adapts OpenEHR for a version-controlled repository model:
Storage Format:
- Uses YAML instead of JSON or XML for human readability
- One composition per file for Git-friendly diffs
- Markdown for narrative content (e.g., letter body)
Server Architecture:
- No OpenEHR REST API server
- No query engine (yet)
- File-based storage instead of database
- Git instead of versioning database
Rationale:
This provides:
- Human-readable audit trails
- Standard version control tooling
- Cryptographic signing and verification
- Distribution and replication via Git
- No runtime database dependencies
7. Future OpenEHR Integration
VPR is designed to support future OpenEHR capabilities:
Archetype Query Language (AQL):
The structured data format will support AQL queries:
SELECT
    c/uid/value,
    c/context/start_time,
    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value
FROM
    EHR e
    CONTAINS COMPOSITION c
    CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v2]
WHERE
    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude > 140
API Projections:
VPR compositions can be projected to:
- OpenEHR REST API responses
- RM-compliant JSON
- Canonical XML format
- FHIR resources (via mappings)
Template Server:
Future template management:
- Operational Template (OPT) import
- Template validation
- Web-based template designer integration
- Archetype repository synchronization
References
- OpenEHR Specification
- Clinical Knowledge Manager
- OpenEHR Foundation
- Archetype Query Language (AQL)
- VPR Clinical Repository Design
FHIR Standard
Overview
Fast Healthcare Interoperability Resources (FHIR) is a modern healthcare data exchange standard developed by HL7 International. Released in 2014, FHIR combines the best features of HL7 v2, v3, and CDA while leveraging web technologies (REST, JSON, OAuth) to provide a practical, implementer-friendly approach to health data interoperability.
Core Concepts
Resources:
FHIR defines ~150 modular “resources” representing healthcare concepts:
- Clinical: Patient, Observation, Condition, Procedure, MedicationStatement
- Administrative: Encounter, Practitioner, Organization, Location
- Financial: Claim, Coverage, PaymentNotice
- Workflow: Task, Appointment, ServiceRequest
- Infrastructure: Bundle, OperationOutcome, CapabilityStatement
Each resource:
- Has a defined structure (elements and data types)
- Can be represented as JSON, XML, or RDF
- Includes human-readable narrative
- Supports extensibility via extensions
- Has a defined lifecycle and versioning model
RESTful API:
FHIR uses HTTP for all interactions:
- GET /Patient/123 - Read a patient
- POST /Observation - Create an observation
- PUT /Condition/456 - Update a condition
- DELETE /MedicationStatement/789 - Remove (or mark inactive)
- GET /Patient?name=Smith - Search for patients
Profiles and Implementation Guides:
FHIR can be constrained for specific use cases:
- Profiles: Constrain resources for particular jurisdictions or domains
- Implementation Guides: Collections of profiles, value sets, and documentation
- Examples: US Core, UK Core, International Patient Summary (IPS)
Terminology Integration:
FHIR supports standard terminologies:
- CodeableConcept data type for coded values
- ValueSets for allowed codes
- ConceptMaps for code translation
- Built-in support for SNOMED CT, LOINC, RxNorm, ICD-10, etc.
Extensions:
FHIR allows extending resources without breaking compatibility:
- Standard extensions (e.g., patient ethnicity, race)
- Local extensions for organization-specific needs
- Extensions can be profiled and constrained
Problems FHIR Solves
1. API-First Health Data Exchange
Problem: Legacy standards (HL7 v2, CDA) weren’t designed for modern web APIs, making integration complex and expensive.
Solution: FHIR uses RESTful HTTP APIs that web developers understand. OAuth 2.0 for security, JSON for data format, and standard HTTP verbs make integration straightforward.
2. Implementation Complexity
Problem: HL7 v3 and CDA were powerful but extremely complex, leading to inconsistent implementations and high development costs.
Solution: FHIR prioritizes the “80% use case” with simple, practical designs. Complex scenarios are supported but don’t burden simple implementations.
3. Granular Data Access
Problem: Document-based standards (CDA) require exchanging entire documents when only specific data elements are needed.
Solution: FHIR resources are granular (e.g., single Observation for one vital sign). Systems retrieve only what they need, reducing bandwidth and processing overhead.
4. Mobile and Consumer Health
Problem: Legacy standards weren’t designed for patient-facing applications or mobile devices.
Solution: FHIR’s lightweight JSON format, RESTful APIs, and OAuth security work naturally with mobile apps and patient portals. SMART on FHIR enables app ecosystems.
5. Real-Time Clinical Decision Support
Problem: Batch-oriented standards delay clinical decision support until data is processed and stored.
Solution: FHIR’s API model supports real-time CDS Hooks—contextual cards that appear during clinical workflow without disrupting the EHR.
6. Data Heterogeneity
Problem: Healthcare data comes in many forms (structured, narrative, images, documents), and legacy standards handle some better than others.
Solution: FHIR resources accommodate:
- Structured coded data (Observation with LOINC codes)
- Narrative text (DomainResource.text)
- Binary data (DocumentReference, Media)
- Mixed content (DiagnosticReport with narrative + structured results)
7. International Adoption
Problem: Different countries have different healthcare models, terminologies, and regulations, making global standards difficult.
Solution: FHIR’s profiling mechanism allows local adaptation while maintaining core compatibility. US Core, UK Core, Australian Base, and others all build on the same foundation.
How FHIR is Normally Used in Digital Health
1. Health Information Exchange (HIE)
FHIR enables data sharing across organizations:
- Query-based exchange: Pull patient data from other systems when needed
- Subscription-based exchange: Get notified when patient data changes
- Bulk data export: Extract large datasets for research or migration
- National networks: CommonWell, Carequality (US), Summary Care Record (UK)
2. Patient Access to Health Records
FHIR powers patient-facing applications:
- Patient portals: View records, request appointments, message providers
- Mobile health apps: Apple Health, Google Fit integration
- SMART on FHIR apps: Patient selects apps that access their EHR data
- Blue Button 2.0: US Medicare beneficiaries download their claims data
3. Provider Access to External Data
FHIR brings outside data into clinical workflow:
- CDS Hooks: Real-time clinical decision support during ordering
- SMART on FHIR: Clinician-facing apps launch from within EHR
- Payer data exchange: Claims history informs clinical care
- Social determinants: Community resource directories, housing, food access
4. Clinical Research and Registries
FHIR supports research data collection:
- HL7 FHIR Bulk Data: Extract cohorts for research studies
- REDCap on FHIR: Capture study data in FHIR format
- Quality registries: Automated reporting to cancer, cardiac registries
- Phenotyping: Identify eligible patients for trials
5. Population Health and Value-Based Care
FHIR enables population-level analytics:
- Risk stratification: Identify high-risk patients for intervention
- Gap closure: Find patients missing preventive care
- Care coordination: Track care plan execution across providers
- Quality measurement: Automated HEDIS, CQM reporting
6. Public Health Reporting
FHIR modernizes public health surveillance:
- Electronic case reporting (eCR): Automated notifiable disease reporting
- Immunization forecasting: Calculate due/overdue vaccines
- Lab result reporting: ELR via FHIR Observation
- COVID-19 reporting: Vaccine administration, case reports, lab results
7. Payer-Provider Data Exchange
FHIR improves administrative efficiency:
- Prior authorization: Check coverage and submit auth requests via FHIR
- Formulary checking: Real-time medication coverage lookup
- Claims attachments: Send supporting documentation with claims
- Coverage discovery: Find patient’s insurance coverage
8. Clinical Decision Support
FHIR enables evidence-based care:
- CDS Hooks: Cards appear at the right time (e.g., “Consider diabetes screening”)
- Order sets: FHIR RequestGroup for protocol-driven ordering
- Care plans: FHIR CarePlan for chronic disease management
- Drug interaction checking: CDS Hooks order-select and order-sign hooks for real-time prescription review
How FHIR is Used in VPR
1. Wire Format for Coordination Data
VPR uses FHIR-aligned wire formats for coordination repository metadata:
Conceptual Alignment, Not Implementation:
VPR does not implement:
- FHIR REST APIs
- FHIR JSON or XML formats
- FHIR resource validation
- FHIR server capabilities
Instead, VPR uses FHIR semantics in YAML wire formats:
COORDINATION_STATUS.yaml:
Tracks coordination repository lifecycle:
coordination_id: "7f4c2e9d-4b0a-4f3a-9a2c-0e9a6b5d1c88"
clinical_id: "a4f91c6d-3b2e-4c5f-9d7a-1e8b6c0a9f12"
status:
  lifecycle_state: active
  record_open: true
  record_queryable: true
  record_modifiable: true
This corresponds conceptually to resource status tracking in FHIR.
Thread ledger.yaml:
Messaging thread metadata uses FHIR Communication resource semantics:
communication_id: 20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000
status: open # Maps to Communication.status
participants:
  - participant_id: 4f8c2a1d-9e3b-4a7c-8f1e-6b0d-2c5a9f12
    role: clinician # Maps to Communication.recipient
    display_name: Dr Jane Smith
Key mappings:
- communication_id → Communication.identifier
- status → Communication.status (open=in-progress, closed=completed, archived=stopped)
- participants → Communication.recipient array
- created_at → Communication.sent
- visibility.sensitivity → Communication.meta.security
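The status mapping listed above is small enough to express as a single match. The sketch below is a standalone illustration: the enum mirrors the thread states named on this page, but the method names `from_ledger` and `to_fhir_status` are hypothetical and may not match the `fhir` crate.

```rust
/// Thread lifecycle as recorded in a ledger file.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ThreadStatus {
    Open,
    Closed,
    Archived,
}

impl ThreadStatus {
    /// Parse the `status` value found in a ledger file.
    pub fn from_ledger(value: &str) -> Option<Self> {
        match value {
            "open" => Some(Self::Open),
            "closed" => Some(Self::Closed),
            "archived" => Some(Self::Archived),
            _ => None,
        }
    }

    /// Map to the corresponding FHIR `Communication.status` code.
    pub fn to_fhir_status(self) -> &'static str {
        match self {
            Self::Open => "in-progress",
            Self::Closed => "completed",
            Self::Archived => "stopped",
        }
    }
}
```

Keeping the mapping in one exhaustive match means the compiler flags any thread state added later without a FHIR equivalent.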
2. FHIR Module in VPR Core
The fhir crate provides wire format handling:
Module: fhir::CoordinationStatus
// Inside a function that returns a Result; `yaml_text` holds the file contents.
// Parse COORDINATION_STATUS.yaml
let status_data = fhir::CoordinationStatus::parse(yaml_text)?;
// Render to YAML
let yaml = fhir::CoordinationStatus::render(&status_data)?;
Domain types:
- CoordinationStatusData - Top-level structure
- StatusInfo - Status details
- LifecycleState - Active, Suspended, Closed
Module: fhir::Messaging
// Inside a function that returns a Result; `yaml_text` holds the file contents.
// Parse thread ledger.yaml
let ledger_data = fhir::Messaging::ledger_parse(yaml_text)?;
// Render to YAML
let yaml = fhir::Messaging::ledger_render(&ledger_data)?;
Domain types:
- LedgerData - Thread metadata
- ThreadStatus - Open, Closed, Archived
- LedgerParticipant - Participant with role
- ParticipantRole - Clinician, Patient, CareTeam, System
3. Semantic Preservation for Future Projections
VPR’s FHIR-aligned design enables future conversions:
FHIR Communication Projection:
VPR messaging threads can be projected to FHIR Communication resources:
{
  "resourceType": "Communication",
  "id": "20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000",
  "status": "in-progress",
  "sent": "2026-01-11T14:35:22.045Z",
  "recipient": [
    {
      "reference": "Practitioner/4f8c2a1d-9e3b-4a7c-8f1e-6b0d-2c5a9f12",
      "display": "Dr Jane Smith"
    }
  ],
  "payload": [
    {
      "contentString": "Patient has reported increasing shortness of breath..."
    }
  ]
}
FHIR Task Projection:
Future coordination tasks could map to FHIR Task resources:
- Task.status - requested, accepted, in-progress, completed
- Task.intent - order, plan, proposal
- Task.code - Type of task
- Task.for - Patient reference
- Task.owner - Responsible practitioner
- Task.requester - Who requested the task
FHIR DocumentReference:
OpenEHR compositions could be exposed as FHIR DocumentReference:
{
  "resourceType": "DocumentReference",
  "status": "current",
  "type": {
    "coding": [
      {
        "system": "http://loinc.org",
        "code": "34133-9",
        "display": "Summary of episode note"
      }
    ]
  },
  "content": [
    {
      "attachment": {
        "contentType": "application/yaml",
        "url": "/clinical/a4/f9/a4f91c6d.../composition.yaml"
      }
    }
  ]
}
4. API Gateway Projection
VPR can expose FHIR APIs via an API gateway:
REST API (future):
GET /fhir/Communication?subject=Patient/123
GET /fhir/Patient/123
POST /fhir/Communication
PUT /fhir/Communication/456
The API gateway would:
- Receive FHIR REST requests
- Translate to VPR operations
- Execute on Git-based repository
- Project results to FHIR format
- Return FHIR responses
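The "translate to VPR operations" step might look like the routing sketch below, which covers the four example requests. Everything here is hypothetical: the `VprOperation` variants and the `route` function are invented for illustration, and query strings are matched literally without URL decoding.

```rust
/// Hypothetical internal operations a gateway could dispatch to.
#[derive(Debug, PartialEq)]
pub enum VprOperation {
    ListThreads { patient_id: String },
    ReadPatient { patient_id: String },
    CreateMessage,
    UpdateThread { thread_id: String },
}

/// Translate an HTTP method and request target into a VPR operation.
/// Returns `None` for anything the gateway does not support.
pub fn route(method: &str, target: &str) -> Option<VprOperation> {
    // Separate the path from an optional query string.
    let (path, query) = target.split_once('?').unwrap_or((target, ""));
    let segments: Vec<&str> = path.trim_matches('/').split('/').collect();

    match (method, segments.as_slice()) {
        ("GET", ["fhir", "Communication"]) => {
            // Only the `subject=Patient/<id>` search form is recognised here.
            let patient_id = query
                .split('&')
                .find_map(|pair| pair.strip_prefix("subject=Patient/"))?;
            Some(VprOperation::ListThreads { patient_id: patient_id.to_string() })
        }
        ("GET", ["fhir", "Patient", id]) => {
            Some(VprOperation::ReadPatient { patient_id: id.to_string() })
        }
        ("POST", ["fhir", "Communication"]) => Some(VprOperation::CreateMessage),
        ("PUT", ["fhir", "Communication", id]) => {
            Some(VprOperation::UpdateThread { thread_id: id.to_string() })
        }
        _ => None,
    }
}
```

Returning a plain enum keeps the translation layer free of storage concerns, so the Git-backed services stay the only code that touches patient repositories.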
GraphQL API (future):
query {
  patient(id: "123") {
    name
    communications {
      sent
      sender
      payload
    }
  }
}
5. Terminology Binding
VPR uses FHIR’s approach to coded data:
Participant Roles:
role: clinician # Maps to FHIR ParticipantRole value set
Future binding to standard terminologies:
- SNOMED CT for clinical concepts
- LOINC for observations and documents
- Local code systems for organization-specific concepts
Visibility/Sensitivity:
sensitivity: confidential # Maps to FHIR security labels
Alignment with:
- http://terminology.hl7.org/CodeSystem/v3-Confidentiality
- Values: N (normal), R (restricted), V (very restricted)
6. FHIR Bulk Data Export
VPR’s Git-based storage supports bulk data patterns:
Patient-level export:
GET /fhir/Patient/$export?_type=Communication,Observation,Condition
Would generate:
- NDJSON files with FHIR resources
- Parallel processing of patient repositories
- Streaming output via polling pattern
Group-level export:
GET /fhir/Group/high-risk-patients/$export
Cohort definition → Git repository query → FHIR resource generation
7. SMART on FHIR Integration
VPR can support SMART app launches:
Standalone Launch:
- App redirects to VPR authorization endpoint
- User authenticates and authorizes scopes
- App receives access token
- App queries VPR FHIR API
EHR Launch:
- EHR launches SMART app with context (patient, encounter)
- App exchanges launch token for access token
- App queries VPR for contextual data
Scopes:
- patient/Communication.read - Read patient’s messages
- patient/Observation.read - Read patient’s observations
- user/Practitioner.read - Read clinician’s profile
- launch/patient - Patient context available
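Scope strings in this form are straightforward to parse before checking them against a request. The sketch below assumes SMART v1-style syntax (`context/Resource.permission` plus `launch` scopes); the `Scope` enum and `parse_scope` function are hypothetical names.

```rust
/// A parsed SMART scope (v1-style syntax assumed).
#[derive(Debug, PartialEq)]
pub enum Scope {
    /// e.g. `patient/Communication.read`
    Resource { context: String, resource: String, permission: String },
    /// `launch` or `launch/patient`
    Launch { context: Option<String> },
}

/// Parse a single scope string; returns `None` if it is not recognised.
pub fn parse_scope(scope: &str) -> Option<Scope> {
    if scope == "launch" {
        return Some(Scope::Launch { context: None });
    }
    if let Some(ctx) = scope.strip_prefix("launch/") {
        return (!ctx.is_empty()).then(|| Scope::Launch { context: Some(ctx.to_string()) });
    }

    // Resource scopes: <context>/<ResourceType>.<permission>
    let (context, rest) = scope.split_once('/')?;
    if !matches!(context, "patient" | "user" | "system") {
        return None;
    }
    let (resource, permission) = rest.split_once('.')?;
    if resource.is_empty() || !matches!(permission, "read" | "write" | "*") {
        return None;
    }
    Some(Scope::Resource {
        context: context.to_string(),
        resource: resource.to_string(),
        permission: permission.to_string(),
    })
}
```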
8. CDS Hooks Integration
VPR could provide clinical decision support:
Hook: patient-view
Triggered when clinician opens patient chart:
{
  "hookInstance": "abc123",
  "hook": "patient-view",
  "context": {
    "patientId": "123",
    "userId": "Practitioner/456"
  }
}
VPR could return cards suggesting:
- Unread messages in coordination threads
- Overdue care plan activities
- Missing documentation
9. FHIR Subscriptions
VPR could support change notifications:
Subscription creation:
{
  "resourceType": "Subscription",
  "status": "requested",
  "criteria": "Communication?subject=Patient/123",
  "channel": {
    "type": "rest-hook",
    "endpoint": "https://example.org/webhook",
    "payload": "application/fhir+json"
  }
}
Git post-receive hooks could trigger subscription notifications.
10. Deviations from Standard FHIR
VPR adapts FHIR concepts for version-controlled storage:
Storage:
- Git repositories, not FHIR server databases
- YAML wire formats, not JSON/XML
- File-based, not API-first
Versioning:
- Git commits, not FHIR resource versions
- Immutable files, not REST versioning
- Complete history always available
Search:
- File system traversal, not FHIR search parameters (yet)
- Git log queries, not database queries
- Future: AQL or FHIR search translation layer
Transactions:
- Git atomic commits, not FHIR Bundle transactions
- Repository-level consistency, not resource-level
Rationale:
This provides:
- Human-readable audit trails
- Cryptographic signing and verification
- Distributed version control
- No runtime database dependencies
- Standard tooling (Git, text editors)
Future FHIR Integration
VPR’s FHIR-aligned design supports progressive enhancement:
Near-term (Phase 1)
- REST API gateway: Expose FHIR resources via HTTP
- Read-only operations: GET for Communication, Patient, Practitioner
- Basic search: ?subject, ?date, ?status parameters
- SMART on FHIR: OAuth 2.0 authorization for app access
Medium-term (Phase 2)
- Write operations: POST, PUT for creating/updating resources
- Bulk data export: System-level and patient-level export
- FHIR Subscriptions: Webhook notifications for changes
- Advanced search: Full FHIR search parameter support
Long-term (Phase 3)
- CDS Hooks: Real-time clinical decision support integration
- FHIR Questionnaire: Structured data collection forms
- GraphQL API: Flexible querying alternative to REST
- FHIR Mapping Language: Automated OpenEHR ↔ FHIR translation
References
- FHIR Specification
- FHIR Resource List
- SMART on FHIR
- CDS Hooks
- FHIR Bulk Data Access
- US Core Implementation Guide
- VPR FHIR Integration
- VPR Coordination Repository
Technical
See Design Decisions for more information on architecture and design choices.
Containers
Docker
Language
Rust
APIs
VPR provides two API interfaces for accessing patient records:
gRPC API
High-performance, type-safe API using Protocol Buffers and tonic.
- Port: 50051
- Protocol: HTTP/2 + Protocol Buffers
- Authentication: API key via x-api-key header
- See gRPC API Documentation
To start the grpcui viewer:
j g
REST API
HTTP/JSON API with OpenAPI documentation and Swagger UI.
- Port: 3000
- Protocol: HTTP/JSON
- Interactive documentation: http://localhost:3000/swagger-ui/
- See REST API Documentation
Linting
Rust Clippy
markdownlint
Spelling
cspell
Pre-commit
pre-commit
Crate Separation
The VPR project uses a modular crate structure to maintain clear separation of concerns and enforce architectural boundaries:
Core Crates
- crates/core (vpr-core): Contains pure data operations only. Handles file/folder management, patient repositories (clinical, demographics, coordination), and Git-based versioning. No API concerns. Provides the foundational services: ClinicalService, DemographicsService, CoordinationService, PatientService.
- crates/files (vpr-files): Content-addressed file storage for binary attachments. Implements SHA-256-based immutable file storage with two-level sharding. Used by clinical repository for letter attachments.
- crates/uuid (vpr-uuid): UUID generation and sharding utilities. Provides ShardableUuid for creating two-level sharded directory structures.
- crates/fhir: FHIR-aligned data types and enums. Provides MessageAuthor, AuthorRole, ThreadStatus, SensitivityLevel, LifecycleState for care coordination.
- crates/openehr: OpenEHR data structures and validation. Used for clinical content modeling.
- crates/certificates (vpr-certificates): X.509 certificate generation and validation for professional registrations. Supports ECDSA P-256 cryptographic signing.
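Two-level sharding, as mentioned for the uuid and files crates, spreads records across nested directories so that no single directory grows too large. The sketch below is an assumption-laden illustration and not the `ShardableUuid` API: the `sharded_dir` function is hypothetical, and naming the leaf directory after the hyphen-less UUID is a guess, since the paths shown in this documentation are elided.

```rust
use std::path::PathBuf;

/// Derive a two-level sharded directory for a UUID: hex characters 0-1 and
/// 2-3 become the parent directories. The leaf name (hyphen-less, lowercase
/// UUID) is an assumption made for this sketch.
pub fn sharded_dir(uuid: &str) -> Option<PathBuf> {
    // Drop hyphens, then require exactly 32 hexadecimal characters.
    let hex: String = uuid.chars().filter(|c| *c != '-').collect();
    if hex.len() != 32 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    let hex = hex.to_ascii_lowercase();
    Some(PathBuf::from(&hex[0..2]).join(&hex[2..4]).join(&hex))
}
```

With two hex characters per level, each level has at most 256 entries, which keeps directory listings manageable even with many patients.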
API Crates
- crates/api-shared (api-shared): Shared utilities and definitions for both APIs. Includes Protocol Buffer definitions (vpr.proto), message types, and common authentication utilities.
- crates/api-grpc (api-grpc): gRPC-specific implementation. Uses VprService with authentication interceptors and tonic integration. All RPCs delegate to services from vpr-core.
- crates/api-rest (api-rest): REST-specific implementation. Provides HTTP endpoints with OpenAPI/Swagger UI via axum and utoipa. All handlers delegate to services from vpr-core.
CLI and Deployment
- crates/cli (vpr-cli): Command-line interface. Provides comprehensive CLI commands for all patient record operations. Directly uses services from vpr-core.
- src/main.rs (vpr-run): Deployment binary that runs both gRPC and REST servers concurrently.
This separation ensures that data logic remains isolated from API specifics, making the codebase maintainable, testable, and allowing multiple deployment configurations from the same core.
Design decisions
This document captures the key architectural and governance decisions behind VPR, and the reasoning for each. The emphasis throughout is on auditability, clinical accountability, privacy, and long-term robustness.
A standing keystone for every decision is the pairing of patient-first intent with human-readable files as the canonical record. VPR should make patient agency primary while keeping the record legible, portable, and auditable as plain files.
File layouts can be seen at openEHR file structure.
Separation of demographics, clinical, and coordination data
VPR stores patient demographics, clinical data, and coordination data in separate repositories.
- The demographics repository (equivalent to a Master Patient Index) contains personal identifiers such as name, date of birth, and national identifiers.
- The clinical repository contains all medical content, including observations, diagnoses, clinical letters, and results.
- The coordination repository (Care Coordination Repository) contains administrative and care coordination information such as encounters, episodes of care, appointments, and referrals.
The demographics repository is linked to the clinical repository via a reference stored in ehr_status.subject.external_ref. The coordination repository references both demographics (for patient identity) and clinical records (for clinical context).
This design follows established openEHR principles and provides several benefits:
- Clinical data can be shared, versioned, and audited independently of personally identifiable information.
- Coordination data (appointments, referrals, encounters) can be managed separately from clinical content, allowing administrative workflows to evolve independently.
- Privacy risks are reduced by minimising the spread of identifiers.
- Systems remain modular, allowing demographics, clinical, and coordination services to evolve separately.
In practice:
- FHIR is used for demographics.
- openEHR is used for structured clinical data.
- The coordination data format is to be determined (it may adopt FHIR concepts for encounters, appointments, and episodes).
Reference:
https://specifications.openehr.org/releases/1.0.1/html/architecture/overview/Output/design_of_ehr.html
Sharded directory structure
VPR uses sharded directory layouts to maintain predictable filesystem performance as the number of patient repositories grows.
Rather than placing all repositories in a single directory, repositories are distributed across subdirectories derived from a UUID prefix or hash. This avoids filesystem bottlenecks, improves lookup performance, and keeps Git operations efficient at scale.
Sharding ensures that the system remains performant and manageable even with very large numbers of patient records.
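As a rough illustration of two-level sharding, the sketch below maps a 32-character hex UUID to a nested directory path. The function name sharded_path and the exact layout (first two hex characters, then the next two) are assumptions made for this example; the text above only states that shards are derived from a UUID prefix or hash.

```rust
use std::path::{Path, PathBuf};

/// Builds a two-level sharded path for a repository identified by a
/// 32-character lowercase hex UUID (no hyphens).
///
/// Assumed layout (illustrative only): <root>/<chars 0-1>/<chars 2-3>/<full uuid>
pub fn sharded_path(root: &Path, uuid_hex: &str) -> Option<PathBuf> {
    // Validate before building a path: only 32 lowercase hex digits are accepted.
    let valid = uuid_hex.len() == 32
        && uuid_hex
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !valid {
        return None;
    }
    // Slicing by byte index is safe here because the input is known to be ASCII.
    Some(
        root.join(&uuid_hex[0..2])
            .join(&uuid_hex[2..4])
            .join(uuid_hex),
    )
}
```

With two hex characters per level, each directory holds at most 256 children at the first two levels, which keeps individual directory listings small as the number of repositories grows.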
Testing strategy
VPR’s core functionality depends on real filesystem behaviour. As a result, tests are designed to interact with actual temporary directories, not mocked filesystems.
Using crates such as tempfile, tests create isolated, automatically cleaned-up directories that closely mirror production behaviour. This allows tests to validate:
- directory creation and layout,
- Git repository initialisation,
- file permissions and naming,
- serialisation and cleanup behaviour.
This approach keeps tests realistic while remaining safe, reproducible, and free from side effects on the developer’s machine.
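To show the idea without pulling in a dependency, here is a minimal standard-library stand-in for the kind of guard that tempfile provides: a real directory that is removed when the guard goes out of scope. ScratchDir is a hypothetical name for this sketch and is not part of VPR or of the tempfile crate; real tests would normally use tempfile itself.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Process-wide counter so guards created in the same instant get different names.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// A real directory under the OS temp location, deleted when the guard is dropped.
/// Hypothetical stand-in for `tempfile::TempDir`, using only the standard library.
pub struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    pub fn new(prefix: &str) -> io::Result<Self> {
        let nanos = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_nanos())
            .unwrap_or(0);
        let seq = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let name = format!("{}-{}-{}-{}", prefix, std::process::id(), nanos, seq);
        let path = std::env::temp_dir().join(name);
        // `create_dir` fails if the directory already exists, so a name
        // collision surfaces as an error instead of reusing stale state.
        fs::create_dir(&path)?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup: a test helper has nowhere to report a failure from `drop`.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

A test creates a ScratchDir, exercises directory creation or repository initialisation beneath dir.path(), and relies on the guard to remove everything at the end of the test.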
Error handling: bespoke enums over anyhow
VPR uses bespoke error enums (for example PatientError in the core crate) rather than using anyhow::Result throughout.
This is a deliberate choice. In a clinical record system, failures are not just “an error message”: they often need to be handled consistently, audited, and mapped to user-facing outcomes.
Why bespoke enums
- Stable failure contract: A named enum defines the set of failure modes VPR considers meaningful (for example invalid input, YAML parse failure, Git initialisation failure). This makes behaviour predictable as the code evolves.
- Structured handling at boundaries: API layers (gRPC/REST) can map specific error variants to appropriate status codes and responses without relying on string matching.
- Better testability: Tests can assert specific variants rather than brittle message strings, which improves confidence during refactors.
- Separates domain intent from library detail: An enum can express domain-relevant failures while still carrying underlying errors where useful.
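The sketch below illustrates how a typed error can be mapped to gRPC and HTTP status codes without string matching. The variant names and the mapping are assumptions for illustration only; they are not taken from VPR's actual PatientError definition.

```rust
use std::fmt;

/// Illustrative error enum. Variant names are assumptions, not VPR's real `PatientError`.
#[derive(Debug, PartialEq, Eq)]
pub enum PatientError {
    InvalidInput(String),
    NotFound(String),
    YamlParse(String),
    GitInit(String),
}

impl fmt::Display for PatientError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PatientError::InvalidInput(msg) => write!(f, "invalid input: {}", msg),
            PatientError::NotFound(what) => write!(f, "not found: {}", what),
            PatientError::YamlParse(msg) => write!(f, "YAML parse failure: {}", msg),
            PatientError::GitInit(msg) => write!(f, "Git initialisation failure: {}", msg),
        }
    }
}

impl std::error::Error for PatientError {}

/// Maps a variant to a gRPC status code name (assumed mapping).
pub fn grpc_status(err: &PatientError) -> &'static str {
    match err {
        PatientError::InvalidInput(_) => "INVALID_ARGUMENT",
        PatientError::NotFound(_) => "NOT_FOUND",
        PatientError::YamlParse(_) | PatientError::GitInit(_) => "INTERNAL",
    }
}

/// Maps a variant to an HTTP status code (assumed mapping).
pub fn http_status(err: &PatientError) -> u16 {
    match err {
        PatientError::InvalidInput(_) => 400,
        PatientError::NotFound(_) => 404,
        PatientError::YamlParse(_) | PatientError::GitInit(_) => 500,
    }
}
```

Because both match expressions are exhaustive, adding a new variant forces every boundary mapping to be revisited at compile time, which is the "stable failure contract" in practice.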
What we lose by not using anyhow everywhere
- Less convenience: anyhow is excellent for rapid development and rich, contextual error chains with minimal boilerplate.
- More plumbing: Explicit enums require writing variants and conversion/mapping code.
Where anyhow can still be appropriate
At application entrypoints (for example a CLI binary), anyhow can still be a good fit for turning errors into high-quality diagnostics and an exit code. VPR keeps this style out of the core library surface so that upstream layers can make deterministic decisions based on typed errors.
Defensive programming as a clinical safety requirement
VPR treats defensive programming as a baseline requirement for clinical-safe systems.
Clinical record software must behave predictably under bad inputs, misconfiguration, partial filesystem failures, or unexpected environmental state. In this context, “defensive” means prioritising safe failure and auditability over convenience.
In practice, VPR follows these principles:
- Validate before side effects: inputs and configuration are checked up-front wherever feasible, before creating directories, writing files, or initialising repositories.
- Bounded work: operations that could otherwise become unbounded (for example retries, directory traversal, or template copying) are explicitly limited to prevent pathological behaviour.
- No silent fallbacks: invalid configuration or malformed inputs return a typed error rather than being coerced into a “best guess”.
- Explicit error contracts: failures are represented as named enum variants (for example PatientError) to support consistent handling at API boundaries and reliable testing.
- Best-effort rollback with surfaced failures: when partial work has been done, VPR attempts to clean up, and treats cleanup failures as meaningful (not something to quietly ignore).
Concrete examples of defensive measures include:
- Limiting UUID allocation retries when allocating a new patient directory.
- Performing preflight checks (for example template resolution and safety checks) before creating patient directories.
- Rejecting unsafe filesystem entries (such as symlinks) and applying size/depth limits when copying templates to avoid accidental “copy the world” behaviours.
- Returning a distinct error when initialisation fails and cleanup also fails, so operators can detect and investigate residual on-disk state.
These practices reduce the likelihood of corrupted or ambiguous record state, improve operational visibility when something goes wrong, and keep clinical behaviour deterministic.
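The bounded-retry idea can be sketched as follows. The names allocate_id and AllocError are hypothetical, and the candidate generator and existence check are passed in as closures so the example stays independent of any real UUID or filesystem code.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum AllocError {
    /// The caller asked for zero attempts, which can never succeed.
    ZeroAttempts,
    /// Every candidate collided with an existing entry.
    RetriesExhausted { attempts: u32 },
}

/// Tries at most `max_attempts` candidates and returns the first one that
/// does not already exist. The work is bounded: it never loops forever.
pub fn allocate_id<G, E>(
    max_attempts: u32,
    mut next_candidate: G,
    mut exists: E,
) -> Result<String, AllocError>
where
    G: FnMut() -> String,
    E: FnMut(&str) -> bool,
{
    // Validate before doing any work.
    if max_attempts == 0 {
        return Err(AllocError::ZeroAttempts);
    }
    for _ in 0..max_attempts {
        let candidate = next_candidate();
        if !exists(&candidate) {
            return Ok(candidate);
        }
    }
    // No silent fallback: exhaustion is reported as a distinct, typed error.
    Err(AllocError::RetriesExhausted {
        attempts: max_attempts,
    })
}
```

In a real allocator the exists closure would check the sharded directory on disk, and the generator would produce fresh UUIDs.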
Signed Git commits in VPR (summary)
VPR uses cryptographically signed Git commits to provide immutable, auditable authorship of clinical records.
For signed commits, VPR embeds a self-contained cryptographic payload directly in the commit object, not as files in the repository. This payload includes:
- an ECDSA P-256 signature over the canonical commit content,
- the author’s public signing key,
- an optional X.509 certificate issued by a trusted authority (for example a professional regulator).
The private key is generated and held by the author and is never shared or stored in the repository.
Because all verification material is attached to the commit itself, signed VPR commits can be verified offline, years later, without reliance on external services. Each commit therefore acts as a sealed attestation linking the clinical change to a named, accountable professional identity.
Why X.509 certificates
VPR mandates the use of X.509 certificates for commit signing.
X.509 is the same widely adopted standard used for:
- secure web traffic (Transport Layer Security),
- encrypted email,
- enterprise public key infrastructure,
- regulated identity systems.
Each certificate binds a public key to a verified identity and supports expiry and revocation, making it suitable for regulated healthcare environments.
Other signing mechanisms were deliberately rejected:
- SSH keys lack identity assurance, expiry, and revocation.
- GPG relies on a decentralised web-of-trust model that does not align with formal clinical governance.
X.509 provides a hierarchical, auditable trust model that fits naturally with healthcare regulation and organisational identity management.
X.509 in the NHS (context)
In the NHS, X.509 certificates are primarily used for identity and authentication, not for signing individual clinical entries.
The trust anchor is the NHS Public Key Infrastructure, operated nationally.
Key uses include:
- NHS smartcards, which authenticate clinicians as known individuals.
- Role-based access control, where identity is established first and permissions applied separately.
- Access to national services such as demographic services and summary care records.
- System-to-system communication using mutual Transport Layer Security.
- Formal electronic signatures for legal or regulatory workflows.
VPR builds on this familiar model but applies X.509 certificates to authorship of clinical record changes, rather than to login or transport security.
The patient’s voice
VPR supports both professional clinical entries and patient contributions within the same repository, using distinct artefact paths:
- /clinical/ contains authoritative, professionally authored and signed records.
- /patient/ contains patient-contributed material such as reported outcomes, symptom logs, or uploaded documents.
Patient input may inform clinical care, but it never overwrites clinical records without explicit professional review and a new signed commit.
This preserves patient voice while maintaining clinical accountability.
Single-branch repository policy
Each VPR repository uses a single authoritative branch: refs/heads/main.
While Git itself allows multiple branches, VPR enforces a single-branch policy at the system level. Branches may exist transiently during local operations, but only main is accepted as authoritative.
This ensures a single, linear clinical history and avoids ambiguity about competing versions of truth.
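A system-level check for this policy can be very small. The sketch below is an assumption about how such a guard might look; check_ref_update and RefPolicyError are hypothetical names, and only the ref name refs/heads/main comes from the policy itself.

```rust
/// The only ref accepted as authoritative under the single-branch policy.
pub const AUTHORITATIVE_REF: &str = "refs/heads/main";

#[derive(Debug, PartialEq, Eq)]
pub enum RefPolicyError {
    /// An update targeted a ref other than the authoritative branch.
    NonAuthoritativeRef(String),
}

/// Accepts an update only if it targets `refs/heads/main`.
pub fn check_ref_update(ref_name: &str) -> Result<(), RefPolicyError> {
    if ref_name == AUTHORITATIVE_REF {
        Ok(())
    } else {
        Err(RefPolicyError::NonAuthoritativeRef(ref_name.to_string()))
    }
}
```

The comparison is exact on purpose: a short name such as main or a similarly named ref like refs/heads/main-backup is rejected rather than guessed at.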
File format conventions in VPR
VPR uses different on-disk file formats depending on the nature of the clinical information, not based on technical fashion. The guiding principle is to optimise for human readability, auditability, and safe review, while remaining fully interoperable via APIs.
Rule of thumb
Choose the file format based on how the information is used and reviewed.
- Narrative clinical content
  (for example medical histories, clinic letters, discharge summaries, clinical reasoning)
  → Markdown with YAML front matter
- Structured clinical measurements
  (for example observations, blood tests, vital signs, scores)
  → YAML
- Machine-dense or high-volume data
  (for example large panels, waveforms, derived analytics outputs)
  → YAML by default; JSON only if interoperability tooling absolutely requires it
- APIs and external integrations (REST/gRPC, internet-facing)
  → JSON by default (wire format and payload shape). Use YAML/Markdown only for offline/shared on-disk artefacts we control end-to-end, not for internet APIs.
Rationale
- Markdown preserves clinical narrative, nuance, and intent, and produces clear, reviewable Git diffs.
- YAML is human-readable, diff-friendly, and well suited to structured clinical data that may need manual review or audit.
- YAML is the preferred structured format when human review in Git matters; only fall back to JSON when an external consumer requires it.
- JSON remains available for interoperability edge cases, but should be avoided when YAML/Markdown will suffice.
- APIs use JSON for internet-facing REST/gRPC. YAML/Markdown stay for on-disk/shared artefacts or tightly controlled internal flows, not for public API payloads.
This approach keeps clinical records legible to clinicians, robust under version control, and straightforward to serialise for external systems. The underlying data model remains the same regardless of file format; only the on-disk representation differs.
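The rule of thumb can be written down as a single exhaustive match. The enum and function names below are hypothetical and exist only to make the decision table explicit; they are not VPR types.

```rust
/// How a piece of information is used and reviewed (illustrative categories).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContentKind {
    Narrative,
    StructuredMeasurement,
    /// High-volume data; the flag records whether external tooling demands JSON.
    HighVolume { external_tool_requires_json: bool },
    ApiPayload,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    MarkdownWithYamlFrontMatter,
    Yaml,
    Json,
}

/// Picks the format according to the rule of thumb.
pub fn format_for(kind: ContentKind) -> Format {
    match kind {
        ContentKind::Narrative => Format::MarkdownWithYamlFrontMatter,
        ContentKind::StructuredMeasurement => Format::Yaml,
        // YAML by default; JSON only when interoperability tooling requires it.
        ContentKind::HighVolume { external_tool_requires_json: true } => Format::Json,
        ContentKind::HighVolume { external_tool_requires_json: false } => Format::Yaml,
        ContentKind::ApiPayload => Format::Json,
    }
}
```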
Data flow and query model in VPR
VPR is designed around a clear separation between clinical truth, performance, and user experience. This separation is deliberate and underpins the system’s safety, auditability, and scalability.
Canonical source of truth
VPR stores clinical truth in Git-backed files. These files (YAML and Markdown with YAML front matter) are the authoritative record of what was written, by whom, and when.
Git provides:
- a complete, immutable history of change
- authorship and provenance
- the ability to reconstruct record state at any point in time
These files are optimised for correctness, audit, and human review, not for fast querying.
Interpretation into typed components
When files change, VPR:
- Reads the updated files
- Parses them into typed Rust components (the internal representation of clinical meaning)
These components are the semantic pivot of the system. They represent what the system understands clinically, independent of file format, Git, databases, or APIs.
Projection into databases and caches
Typed components are then projected into databases and caches to support:
- indexing
- fast search
- filtering
- aggregation
- responsive user interfaces
Databases and caches store derived representations, not the canonical files themselves. They exist to answer questions efficiently, not to define truth.
Serving user-facing queries
All interactive user queries are served from:
- databases
- search indexes
- caches
Git and on-disk files are not queried on the hot path. This keeps the user experience fast and predictable, even as the canonical record remains careful and auditable.
CQRS principles in VPR
This architecture follows the core principles of Command Query Responsibility Segregation (CQRS):
- Commands (writes)
  Change clinical state by creating or modifying Git-backed files.
  This path is slow, deliberate, validated, and fully auditable.
- Queries (reads)
  Retrieve current, useful views of the data from database projections and caches.
  This path is fast, flexible, and optimised for user needs.
The write model (files + Git) and the read model (databases + caches) are intentionally different and evolve independently.
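A toy model of this split is sketched below, with an in-memory append-only history standing in for Git and a HashMap standing in for the database projection. All type names are hypothetical; the point is only that reads go through a derived view that can be rebuilt from the history at any time.

```rust
use std::collections::HashMap;

/// One recorded change to a file (stand-in for a Git commit).
#[derive(Debug, Clone)]
pub struct Commit {
    pub author: String,
    pub path: String,
    pub content: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CommitError {
    MissingAuthor,
}

/// Write model: validated, append-only history. Nothing is overwritten.
#[derive(Default)]
pub struct WriteModel {
    history: Vec<Commit>,
}

impl WriteModel {
    /// Command path: validate first, then append. Returns the new history length.
    pub fn commit(&mut self, c: Commit) -> Result<usize, CommitError> {
        if c.author.trim().is_empty() {
            return Err(CommitError::MissingAuthor);
        }
        self.history.push(c);
        Ok(self.history.len())
    }

    pub fn history(&self) -> &[Commit] {
        &self.history
    }
}

/// Read model: a derived projection holding only the latest content per path.
#[derive(Default)]
pub struct ReadModel {
    latest: HashMap<String, String>,
}

impl ReadModel {
    /// Rebuilds the projection by replaying the full history in order.
    pub fn project(history: &[Commit]) -> Self {
        let mut latest = HashMap::new();
        for c in history {
            latest.insert(c.path.clone(), c.content.clone());
        }
        Self { latest }
    }

    /// Query path: answered from the projection, never from the history.
    pub fn get(&self, path: &str) -> Option<&str> {
        self.latest.get(path).map(String::as_str)
    }
}
```

Because the projection is derived, it can be discarded and rebuilt without any loss of clinical truth; the history remains the only authoritative source.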
A useful mental model
A simple way to think about the system is:
Git-backed files describe what happened; databases describe what is currently useful to know.
Both are essential. They answer different questions and are optimised for different purposes.
Summary
- Git-backed files are the canonical clinical record
- Rust components represent interpreted clinical meaning
- Databases and caches provide fast, queryable projections
- User-facing queries never depend on Git or raw files
- CQRS-style separation keeps the system auditable, performant, and safe
This design allows VPR to combine strong clinical governance with a responsive modern user experience, without compromising either.
APIs
gRPC API
The VPR gRPC API provides high-performance, type-safe access to all patient record operations.
Overview
The gRPC API is built using:
- tonic 0.12 - Rust gRPC framework
- Protocol Buffers - For message serialization
- Authentication - API key-based authentication via x-api-key header
Service Definition
The API is defined in crates/api-shared/vpr.proto.
Service: VPR
All RPC methods are grouped under the vpr.v1.VPR service.
Authentication
All requests require an x-api-key header:
grpcurl -plaintext -import-path crates/api-shared -proto vpr.proto -H 'x-api-key: YOUR_API_KEY' localhost:50051 vpr.v1.VPR/Health
The API key is configured via the API_KEY environment variable.
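The check itself can be sketched independently of any gRPC framework. The function below is a hypothetical illustration, not the actual interceptor: it takes the header value (if present) and the configured key, and distinguishes a missing key from a wrong one. Both cases would map to UNAUTHENTICATED at the API boundary.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    /// No `x-api-key` header was supplied.
    MissingKey,
    /// A key was supplied but does not match the configured one.
    InvalidKey,
}

/// Compares two byte strings without stopping at the first mismatch.
/// A length difference still returns early.
fn bytes_equal_no_early_exit(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Validates the value of the `x-api-key` header against the expected key.
pub fn check_api_key(provided: Option<&str>, expected: &str) -> Result<(), AuthError> {
    match provided {
        None => Err(AuthError::MissingKey),
        Some(key) if bytes_equal_no_early_exit(key.as_bytes(), expected.as_bytes()) => Ok(()),
        Some(_) => Err(AuthError::InvalidKey),
    }
}
```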
Available RPCs
Health Check
- Health - Returns service health status
Patient Management
- CreatePatient - Creates a new patient record (legacy)
- ListPatients - Lists all patients
- InitialiseFullRecord - Creates complete patient record (demographics, clinical, coordination)
Demographics
- InitialiseDemographics - Initialises new demographics repository
- UpdateDemographics - Updates patient demographics (given names, last name, birth date)
Clinical
- InitialiseClinical - Initialises new clinical repository
- LinkToDemographics - Links clinical repository to demographics via EHR status
- NewLetter - Creates new clinical letter with markdown content
- ReadLetter - Retrieves letter content and metadata
- NewLetterWithAttachments - Creates letter with binary file attachments
- GetLetterAttachments - Retrieves letter attachments (metadata and binary content)
Coordination
- InitialiseCoordination - Initialises new coordination repository
- CreateThread - Creates messaging thread with participants
- AddMessage - Adds message to existing thread
- ReadCommunication - Reads thread with ledger and all messages
- UpdateCommunicationLedger - Updates thread participants, status, visibility
- UpdateCoordinationStatus - Updates coordination lifecycle state and flags
Example Usage with grpcurl
Create Full Patient Record
grpcurl -plaintext -import-path crates/api-shared -proto vpr.proto \
-d '{
"given_names": ["Emily"],
"last_name": "Davis",
"birth_date": "1985-03-20",
"author_name": "Dr. Robert Brown",
"author_email": "robert.brown@example.com",
"author_role": "Clinician",
"author_registrations": [{"authority": "GMC", "number": "5555555"}],
"care_location": "City General Hospital"
}' \
-H 'x-api-key: YOUR_API_KEY' \
localhost:50051 vpr.v1.VPR/InitialiseFullRecord
Create Letter
grpcurl -plaintext -import-path crates/api-shared -proto vpr.proto \
-d '{
"clinical_uuid": "a701c3a94bf34a939d831d6183a78734",
"author_name": "Dr. Sarah Johnson",
"author_email": "sarah.johnson@example.com",
"author_role": "Clinician",
"author_registrations": [{"authority": "GMC", "number": "7654321"}],
"care_location": "GP Clinic",
"content": "# Consultation\n\nPatient presented with hypertension."
}' \
-H 'x-api-key: YOUR_API_KEY' \
localhost:50051 vpr.v1.VPR/NewLetter
Create Letter with Attachments
Binary attachments are sent as base64-encoded bytes:
# Encode file to base64
base64 -i /path/to/file.pdf
grpcurl -plaintext -import-path crates/api-shared -proto vpr.proto \
-d '{
"clinical_uuid": "a701c3a94bf34a939d831d6183a78734",
"author_name": "Dr. Chen",
"author_email": "chen@example.com",
"author_role": "Clinician",
"care_location": "Hospital Lab",
"attachment_files": ["<base64_content>"],
"attachment_names": ["lab_results.pdf"]
}' \
-H 'x-api-key: YOUR_API_KEY' \
localhost:50051 vpr.v1.VPR/NewLetterWithAttachments
Create Communication Thread
grpcurl -plaintext -import-path crates/api-shared -proto vpr.proto \
-d '{
"coordination_uuid": "da7e89a2a51647db89430dc3a781abb0",
"author_name": "Dr. Brown",
"author_email": "brown@example.com",
"author_role": "Clinician",
"care_location": "City Hospital",
"participants": [
{"id": "a701c3a94bf34a939d831d6183a78734", "name": "Dr. Brown", "role": "clinician"},
{"id": "d4c6547ee14a4255a568aa66d7335561", "name": "Emily Davis", "role": "patient"}
],
"initial_message_body": "Consultation scheduled.",
"initial_message_author": {
"id": "a701c3a94bf34a939d831d6183a78734",
"name": "Dr. Brown",
"role": "clinician"
}
}' \
-H 'x-api-key: YOUR_API_KEY' \
localhost:50051 vpr.v1.VPR/CreateThread
Message Types
Key message types defined in the protocol:
Author Registration
message AuthorRegistration {
string authority = 1; // e.g., "GMC", "NMC"
string number = 2; // Registration number
}
Message Author
message MessageAuthor {
string id = 1; // UUID
string name = 2; // Display name
string role = 3; // clinician, patient, system, etc.
}
Lifecycle States
Coordination lifecycle states:
- active - Operational and accepting updates
- suspended - Temporarily inactive
- closed - Permanently closed
Thread statuses:
- open - Active communication
- closed - Concluded communication
- archived - Historical record
Sensitivity levels:
- standard - Normal clinical communication
- confidential - Elevated privacy
- restricted - Highest privacy level
Server Configuration
The gRPC server runs on port 50051 by default. Configuration via environment variables:
- VPR_ADDR - Server bind address (default: 0.0.0.0:50051)
- API_KEY - Required API key for authentication
- VPR_ENABLE_REFLECTION - Enable gRPC reflection (default: false)
- RUST_LOG - Logging configuration
Implementation
The gRPC service is implemented in crates/api-grpc/src/service.rs.
Key characteristics:
- Authentication interceptor - Validates API key on all requests
- Author construction - Builds Author objects from proto fields
- Error handling - Maps Rust errors to gRPC status codes
- File handling - Writes attachments to temp directory, uses FilesService, cleans up
- Type conversions - Converts string enums to Rust enums (AuthorRole, ThreadStatus, etc.)
Error Handling
gRPC status codes used:
- OK - Success
- UNAUTHENTICATED - Invalid or missing API key
- INVALID_ARGUMENT - Invalid input parameters
- NOT_FOUND - Resource not found
- INTERNAL - Server error
Error messages include descriptive details for debugging.
REST API
The VPR REST API provides HTTP/JSON access to patient record operations with OpenAPI documentation.
Overview
The REST API is built using:
- axum 0.7 - Rust web framework
- utoipa 4.x - OpenAPI specification and Swagger UI generation
- JSON - Request and response format
Base URL
http://localhost:3000
Interactive Documentation
Swagger UI is available at:
http://localhost:3000/swagger-ui/
This provides interactive API documentation where you can test endpoints directly.
Authentication
Currently, the REST API does not require authentication (unlike the gRPC API). This is subject to change in future versions.
Available Endpoints
Health Check
- GET /health - Returns service health status
Patient Management
- POST /patients/full - Creates complete patient record (demographics, clinical, coordination)
Demographics
- POST /demographics - Initialises new demographics repository
- PUT /demographics/:id - Updates patient demographics
Clinical
- POST /clinical - Initialises new clinical repository
- POST /clinical/:id/link - Links clinical repository to demographics
- POST /clinical/:id/letters - Creates new letter
- GET /clinical/:id/letters/:letter_id - Retrieves letter content
Coordination
- POST /coordination - Initialises new coordination repository
Example Usage with curl
Create Full Patient Record
curl -X POST http://localhost:3000/patients/full \
-H 'Content-Type: application/json' \
-d '{
"given_names": ["Emily"],
"last_name": "Davis",
"birth_date": "1985-03-20",
"author": {
"name": "Dr. Robert Brown",
"email": "robert.brown@example.com",
"role": "Clinician",
"registrations": [{"authority": "GMC", "number": "5555555"}],
"care_location": "City General Hospital"
}
}'
Response:
{
"demographics_uuid": "d4c6547ee14a4255a568aa66d7335561",
"clinical_uuid": "a701c3a94bf34a939d831d6183a78734",
"coordination_uuid": "da7e89a2a51647db89430dc3a781abb0"
}
Initialise Demographics
curl -X POST http://localhost:3000/demographics \
-H 'Content-Type: application/json' \
-d '{
"author": {
"name": "Dr. Jane Smith",
"email": "jane.smith@example.com",
"role": "Clinician",
"registrations": [{"authority": "GMC", "number": "1234567"}],
"care_location": "St. Mary'\''s Hospital"
}
}'
Update Demographics
curl -X PUT http://localhost:3000/demographics/d4c6547ee14a4255a568aa66d7335561 \
-H 'Content-Type: application/json' \
-d '{
"given_names": ["Emily", "Rose"],
"last_name": "Davis",
"birth_date": "1985-03-20"
}'
Initialise Clinical Repository
curl -X POST http://localhost:3000/clinical \
-H 'Content-Type: application/json' \
-d '{
"author": {
"name": "Dr. Robert Brown",
"email": "robert.brown@example.com",
"role": "Clinician",
"care_location": "City Hospital"
}
}'
Link Clinical to Demographics
curl -X POST http://localhost:3000/clinical/a701c3a94bf34a939d831d6183a78734/link \
-H 'Content-Type: application/json' \
-d '{
"demographics_uuid": "d4c6547ee14a4255a568aa66d7335561",
"author": {
"name": "Dr. Brown",
"email": "brown@example.com",
"role": "Clinician",
"care_location": "City Hospital"
},
"namespace": "example.org"
}'
Create Letter
curl -X POST http://localhost:3000/clinical/a701c3a94bf34a939d831d6183a78734/letters \
-H 'Content-Type: application/json' \
-d '{
"content": "# Consultation Note\n\nPatient presented with hypertension.",
"author": {
"name": "Dr. Sarah Johnson",
"email": "sarah.johnson@example.com",
"role": "Clinician",
"registrations": [{"authority": "GMC", "number": "7654321"}],
"care_location": "GP Clinic"
}
}'
Response:
{
"timestamp_id": "20260125T125621.563Z-8d263432-d614-4d51-8611-22d365b6afa7"
}
Read Letter
curl http://localhost:3000/clinical/a701c3a94bf34a939d831d6183a78734/letters/20260125T125621.563Z-8d263432-d614-4d51-8611-22d365b6afa7
Response:
{
"body_content": "# Consultation Note\n\nPatient presented with hypertension.",
"rm_version": "1.0.4",
"composer_name": "Dr. Sarah Johnson",
"composer_role": "Clinician",
"start_time": "2026-01-25T12:56:21.563Z",
"clinical_lists": [...]
}
Initialise Coordination
curl -X POST http://localhost:3000/coordination \
-H 'Content-Type: application/json' \
-d '{
"clinical_uuid": "a701c3a94bf34a939d831d6183a78734",
"author": {
"name": "Dr. Brown",
"email": "brown@example.com",
"role": "Clinician",
"care_location": "City Hospital"
}
}'
Request/Response Formats
Author Object
All mutation endpoints accept an author object:
{
"author": {
"name": "Dr. John Smith",
"email": "john.smith@example.com",
"role": "Clinician",
"registrations": [
{
"authority": "GMC",
"number": "1234567"
}
],
"care_location": "City General Hospital",
"signature": "optional-pem-encoded-signature"
}
}
Error Responses
Errors return appropriate HTTP status codes with JSON error details:
{
"error": "Error message",
"details": "Additional context"
}
Common status codes:
- 200 OK - Success
- 201 Created - Resource created
- 400 Bad Request - Invalid input
- 404 Not Found - Resource not found
- 500 Internal Server Error - Server error
OpenAPI Specification
The OpenAPI specification is automatically generated from code annotations and available at:
http://localhost:3000/api-doc/openapi.json
Server Configuration
The REST server runs on port 3000 by default. Configuration via environment variables:
- VPR_REST_ADDR - Server bind address (default: 0.0.0.0:3000)
- RUST_LOG - Logging configuration
Implementation
The REST API is implemented in crates/api-rest/src/main.rs.
Key characteristics:
- Path parameter extraction - Uses axum Path extractor for UUIDs
- JSON payloads - Uses axum Json extractor for request bodies
- Author construction - Helper function builds Author from JSON
- Error handling - Maps errors to HTTP status codes
- OpenAPI annotations - Each handler annotated with #[utoipa::path]
Comparison with gRPC API
| Feature | REST API | gRPC API |
|---|---|---|
| Protocol | HTTP/JSON | HTTP/2 + Protocol Buffers |
| Performance | Good | Excellent |
| Authentication | None (currently) | API key required |
| Type Safety | Runtime validation | Compile-time |
| Documentation | OpenAPI/Swagger | Protocol Buffer IDL |
| Binary Data | Base64 encoding | Native bytes |
| Streaming | Not supported | Supported |
Future Enhancements
Planned additions:
- Authentication and authorization
- Additional endpoints for messaging operations
- File upload support for letter attachments
- Pagination for list operations
- Filtering and search capabilities
OpenEHR from database to file structure
Based on openEHR specifications, VPR organises clinical data into a structured file system that mirrors the openEHR Reference Model (RM) where practical.
OpenEHR EHR Status file
Below is an example of an EHR Status file in YAML format. The actual implementation may or may not include the other_details section depending on use case:
rm_version: rm_1_1_0
ehr_id:
  value: 1166765a-406a-4552-ac9b-8e141931a3dc
archetype_node_id: openEHR-EHR-STATUS.ehr_status.v1
name:
  value: EHR Status
subject:
  external_ref:
    id:
      value: 2db695ed-7cc0-4fc9-9b08-e0c738069b71
    namespace: vpr://mpi
    type: PERSON
is_queryable: true
is_modifiable: true
Note: The other_details field is optional and only included when additional metadata is needed for a specific use case.
Communications
VPR Letters – Design and Rationale
Purpose
The VPR letters system provides a clinical, auditable, interoperable record of formal written correspondence related to patient care.
It is designed to:
- support cross-site and cross-system communication,
- remain human-readable without specialist software,
- withstand audit, legal, and regulatory review.
This document intentionally avoids imposing stylistic rules on how letters are written. Clinical correspondence varies widely by specialty, country, organisation, and individual clinician. VPR preserves this freedom while enabling safe reuse of selected clinical context.
Letters are version-controlled via git
Letters can be edited after creation, with all changes tracked through git version control. OpenEHR does not specify that letters must be closed to further edits.
This means:
- Every edit creates a new git commit,
- The full history of changes is preserved and auditable,
- Previous versions can be retrieved at any time,
- No data is ever lost or overwritten.
This provides both flexibility and a complete audit trail for clinical governance and patient safety.
File layout
Each letter is stored as a self-contained folder:
correspondence/
    letter/
        <letter-id>/
            composition.yaml
            body.md
            attachments/
                letter.pdf
This structure ensures that all artefacts related to a single letter are co-located, versioned, and auditable.
Letter identity
The <letter-id> is generated in the format:
YYYYMMDDTHHMMSS.sssZ-UUID
- YYYYMMDDTHHMMSS.sssZ – timestamp with millisecond precision
- UUID – random UUID v4 in RFC 4122 format (lowercase with hyphens)
Example:
20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000
This ensures letters are:
- globally unique,
- chronologically sortable within a patient record,
- safe for distributed, batch-based, and concurrent systems.
Timestamps provide chronology, not global ordering guarantees.
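The format is regular enough to validate with fixed character positions. The sketch below checks the shape of a letter id and splits it into its two parts; split_letter_id is a hypothetical helper, and it deliberately checks shape only (it does not verify that the date is a real calendar date or that the UUID carries version 4 bits).

```rust
/// Splits a letter id of the form `YYYYMMDDTHHMMSS.sssZ-UUID` into
/// `(timestamp, uuid)`. Returns `None` if the shape does not match.
pub fn split_letter_id(id: &str) -> Option<(&str, &str)> {
    const TS_LEN: usize = 20; // e.g. "20260111T143522.045Z"
    const UUID_LEN: usize = 36; // e.g. "550e8400-e29b-41d4-a716-446655440000"

    // ASCII check first, so byte-index splitting below cannot hit a char boundary.
    if !id.is_ascii() || id.len() != TS_LEN + 1 + UUID_LEN {
        return None;
    }
    let (ts, rest) = id.split_at(TS_LEN);
    let uuid = rest.strip_prefix('-')?;
    if is_timestamp(ts) && is_uuid(uuid) {
        Some((ts, uuid))
    } else {
        None
    }
}

/// Digits everywhere except 'T' at index 8, '.' at index 15, 'Z' at index 19.
fn is_timestamp(ts: &str) -> bool {
    ts.bytes().enumerate().all(|(i, b)| match i {
        8 => b == b'T',
        15 => b == b'.',
        19 => b == b'Z',
        _ => b.is_ascii_digit(),
    })
}

/// Lowercase hex everywhere except hyphens at indices 8, 13, 18 and 23.
fn is_uuid(u: &str) -> bool {
    u.bytes().enumerate().all(|(i, b)| match i {
        8 | 13 | 18 | 23 => b == b'-',
        _ => matches!(b, b'0'..=b'9' | b'a'..=b'f'),
    })
}
```

Because the timestamp is fixed-width and leads the id, a plain lexicographic sort of ids within a patient record yields chronological order.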
composition.yaml – OpenEHR composition
The composition.yaml file contains the OpenEHR-aligned COMPOSITION envelope for the letter.
It captures:
- identity
- authorship
- time context
- semantic intent
- structured, reusable clinical snapshots (optional)
Example composition.yaml
rm_version: "rm_1_1_0"
uid: "20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000"
archetype_node_id: "openEHR-EHR-COMPOSITION.correspondence.v1"
name:
  value: "Clinical letter"
category:
  value: "event"
composer:
  name: "Dr Jane Smith"
  role: "Clinical Practitioner"
context:
  start_time: "2026-01-12T10:14:00Z"
content:
  - section:
      archetype_node_id: "openEHR-EHR-SECTION.correspondence.v1"
      name:
        value: "Correspondence"
      items:
        # Canonical narrative letter
        - evaluation:
            archetype_node_id: "openEHR-EHR-EVALUATION.clinical_correspondence.v1"
            name:
              value: "Clinical correspondence"
            data:
              narrative:
                type: "external_text"
                path: "./body.md"
        # Optional reusable clinical lists (snapshots)
        - evaluation:
            archetype_node_id: "openEHR-EHR-EVALUATION.snapshot.v1"
            name:
              value: "Diagnoses (snapshot)"
            data:
              kind:
                value: "diagnoses"
              items:
                - text: "Hypertension"
                  code:
                    terminology: "SNOMED-CT"
                    value: "38341003"
                - text: "Hyperlipidaemia"
                - text: "Chronic obstructive pulmonary disease"
                  code:
                    terminology: "SNOMED-CT"
                    value: "13645005"
        - evaluation:
            archetype_node_id: "openEHR-EHR-EVALUATION.snapshot.v1"
            name:
              value: "Medication summary (snapshot)"
            data:
              kind:
                value: "medications"
              items:
                - text: "Amlodipine 10 mg once daily"
                - text: "Atorvastatin 20 mg nocte"
Notes
- openEHR-EHR-EVALUATION.snapshot.v1 is a custom archetype, not a core OpenEHR entity.
- This is intentional and aligned with OpenEHR practice.
- Snapshots are letter-scoped, time-bound clinical summaries, not canonical state. We call these ClinicalLists.
Snapshot EVALUATION – design intent
The snapshot.v1 archetype is intentionally minimal and generic.
Its purpose is to support selective reuse of clinically relevant context without enforcing letter style or duplicating persistent records.
Snapshot properties
Each snapshot EVALUATION:
- represents one kind of reusable clinical context,
- is explicitly scoped to this letter only,
- may be copied forward by user choice,
- makes no claim of completeness or authority.
Minimal conceptual model
A snapshot contains:
- kind – a semantic label identifying what this snapshot represents (for example: diagnoses, medications, social_history, functional_status)
- items – zero or more entries
- optional narrative text (when structure is insufficient)
The set of possible kind values is open-ended. VPR does not enforce an enum.
Unknown kinds are valid and must degrade gracefully.
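One way to keep the set of kinds open-ended is an enum with a catch-all variant. The following is a minimal sketch, not the VPR implementation; the type name SnapshotKind and its methods are hypothetical and only illustrate how an unknown kind can be carried through unchanged.

```rust
/// A snapshot kind. Well-known kinds get their own variant; anything else is
/// preserved verbatim so that unknown kinds remain valid.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq, Eq)]
enum SnapshotKind {
    Diagnoses,
    Medications,
    SocialHistory,
    FunctionalStatus,
    Other(String),
}

impl SnapshotKind {
    /// Never fails: an unrecognised label becomes `Other`.
    fn parse(raw: &str) -> Self {
        match raw {
            "diagnoses" => Self::Diagnoses,
            "medications" => Self::Medications,
            "social_history" => Self::SocialHistory,
            "functional_status" => Self::FunctionalStatus,
            other => Self::Other(other.to_string()),
        }
    }

    /// Returns the original label, so parsing and re-serialising is lossless.
    fn as_str(&self) -> &str {
        match self {
            Self::Diagnoses => "diagnoses",
            Self::Medications => "medications",
            Self::SocialHistory => "social_history",
            Self::FunctionalStatus => "functional_status",
            Self::Other(label) => label.as_str(),
        }
    }
}
```

Because parse cannot fail, a consumer that does not recognise a kind can still display its label and items rather than rejecting the letter.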
Coded and uncoded items
Snapshot items may be:
- coded,
- uncoded, or
- mixed within the same snapshot.
Coding is optional and must never be required.
Example
items:
  - text: "Hypertension"
    code:
      terminology: "SNOMED-CT"
      value: "38341003"
  - text: "Lives alone, independent"
This supports real-world clinical practice where:
- some concepts are well-coded,
- others are contextual or narrative,
- and forcing codes would lose meaning.
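In a typed model, optional coding maps naturally onto an optional field. The sketch below assumes hypothetical SnapshotItem and Code types that mirror the YAML keys above; it is illustrative rather than the project's actual data model.

```rust
/// A terminology binding, mirroring the `code` mapping in the YAML example.
#[derive(Debug, Clone, PartialEq)]
struct Code {
    terminology: String,
    value: String,
}

/// A snapshot item: text is always present, coding never required.
#[derive(Debug, Clone, PartialEq)]
struct SnapshotItem {
    text: String,
    code: Option<Code>,
}

impl SnapshotItem {
    fn uncoded(text: &str) -> Self {
        SnapshotItem { text: text.to_string(), code: None }
    }

    fn coded(text: &str, terminology: &str, value: &str) -> Self {
        SnapshotItem {
            text: text.to_string(),
            code: Some(Code {
                terminology: terminology.to_string(),
                value: value.to_string(),
            }),
        }
    }

    /// Human-readable rendering that works for both coded and uncoded items.
    fn display(&self) -> String {
        match &self.code {
            Some(code) => format!("{} ({} {})", self.text, code.terminology, code.value),
            None => self.text.clone(),
        }
    }
}
```

A single snapshot can then hold a mixed list, for example a coded "Hypertension" next to an uncoded "Lives alone, independent".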
Relationship to persistent clinical lists
Snapshots are not persistent lists.
They answer a different question:
- Persistent list: “What do we currently believe is true?”
- Snapshot: “What did the author consider relevant for this letter at that time?”
Snapshots:
- may omit persistent items,
- may include provisional information,
- may differ between letters,
- must never automatically update canonical state.
Reconciliation occurs only through explicit clinical action and new COMPOSITIONs.
body.md – Canonical clinical letter
Purpose
body.md contains the canonical narrative letter.
It records:
- clinical prose only,
- written for human readers,
- editable after creation with full git version history.
It must not contain workflow, delivery, or coordination semantics.
Example body.md
Dear Dr Patel,
Thank you for seeing Mrs Jane Jones (DOB 12/04/1968) in the respiratory clinic today.
She reports an improvement in breathlessness since her last review. She confirms that she is currently taking amlodipine 10 mg once daily, rather than the previously documented dose of 5 mg.
We reviewed her medication list together. Atorvastatin was started during her recent admission. The intended dose is 20 mg nocte.
There are no new red flag symptoms. Examination today was unremarkable.
Plan:
- Continue amlodipine 10 mg once daily
- Continue atorvastatin 20 mg nocte
- Routine follow-up in six months
Kind regards,
Dr Jane Smith
Consultant Respiratory Physician
Example NHS Trust
Properties
- Editable after issue, with full git version history
- Human-readable Markdown
- Git-versioned with complete audit trail
- Suitable for audit, legal review, and patient access
Large binary artefacts
Large binary artefacts (for example PDFs with embedded images or scans) are stored using Git Large File Storage (Git LFS).
This means:
- a small pointer file is stored in the Git repository,
- binary content is stored in an external object store,
- pointers are versioned, immutable, and content-addressed.
From a clinical and audit perspective, these artefacts are first-class parts of the letter record.
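For orientation, a Git LFS pointer is a short text file of "key value" lines: a version line, an oid line carrying a SHA-256 digest, and a size line in bytes. The sketch below reads those three keys. The function and struct names are hypothetical, and VPR itself may rely on the Git LFS tooling rather than parsing pointers directly.

```rust
/// The fields of a Git LFS pointer that matter for locating the binary.
#[derive(Debug, PartialEq)]
struct LfsPointer {
    oid_sha256: String,
    size_bytes: u64,
}

/// Parses the text of a pointer file. Returns None if the version line,
/// the sha256 oid, or the size is missing or malformed.
fn parse_lfs_pointer(text: &str) -> Option<LfsPointer> {
    let mut version_ok = false;
    let mut oid = None;
    let mut size = None;

    for line in text.lines() {
        // Every pointer line has the form "<key> <value>".
        let (key, value) = line.split_once(' ')?;
        match key {
            "version" => version_ok = value == "https://git-lfs.github.com/spec/v1",
            "oid" => oid = value.strip_prefix("sha256:").map(str::to_string),
            "size" => size = value.parse::<u64>().ok(),
            _ => {} // additional keys are ignored in this sketch
        }
    }

    if !version_ok {
        return None;
    }
    Some(LfsPointer { oid_sha256: oid?, size_bytes: size? })
}
```

The oid is the content address: two attachments with identical bytes resolve to the same stored object, which is what makes the pointer safe to version alongside the letter.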
Explicit non-features
The following are deliberately excluded from the letter model:
- read or opened status
- acknowledgements
- urgency markers
- task or workflow state
Letters represent clinical documentation, not behaviour or process.
VPR prioritises clarity, honesty, and auditability over convenience.
File layout
Each letter is stored as a self-contained folder:
correspondence/
  letter/
    <letter-id>/
      composition.yaml
      body.md
      attachments/
        letter.pdf
This structure ensures that all artefacts related to a single letter are co-located and auditable.
Letter identity
The <letter-id> is generated in the format YYYYMMDDTHHMMSS.sssZ-UUID:
- YYYYMMDDTHHMMSS.sssZ – ISO 8601 timestamp with millisecond precision
- UUID – Randomly generated UUID v4 in RFC 4122 format (lowercase with hyphens)
Example: 20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000
This ensures letters are:
- globally unique,
- chronologically sortable,
- safe for distributed and concurrent systems.
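The identifier layout can be checked with the standard library alone. The following sketch validates shape only (digit positions, separators, lowercase hexadecimal, version nibble); it does not check calendar validity, and the function names are hypothetical rather than part of the VPR codebase.

```rust
/// Checks that `id` has the shape `YYYYMMDDTHHMMSS.sssZ-UUID`.
/// Only the layout is checked; calendar validity (e.g. month <= 12) is not.
fn is_valid_letter_id(id: &str) -> bool {
    // 20-character timestamp, one '-' separator, 36-character UUID.
    if !id.is_ascii() || id.len() != 20 + 1 + 36 {
        return false;
    }
    let (timestamp, rest) = id.split_at(20);
    rest.starts_with('-')
        && matches_pattern(timestamp, "ddddddddTdddddd.dddZ")
        && is_lowercase_uuid_v4(&rest[1..])
}

/// Matches `text` against `pattern`, where 'd' means "any ASCII digit"
/// and every other pattern character must match literally.
fn matches_pattern(text: &str, pattern: &str) -> bool {
    text.len() == pattern.len()
        && text.chars().zip(pattern.chars()).all(|(c, p)| match p {
            'd' => c.is_ascii_digit(),
            literal => c == literal,
        })
}

/// Checks the 8-4-4-4-12 lowercase hexadecimal layout with version nibble 4.
fn is_lowercase_uuid_v4(uuid: &str) -> bool {
    let lengths = [8, 4, 4, 4, 12];
    let groups: Vec<&str> = uuid.split('-').collect();
    groups.len() == lengths.len()
        && groups.iter().zip(lengths).all(|(group, len)| {
            group.len() == len && group.chars().all(|c| matches!(c, '0'..='9' | 'a'..='f'))
        })
        && groups[2].starts_with('4')
}
```

Rejecting uppercase hexadecimal keeps identifiers byte-for-byte comparable, which matters when they are used as directory names.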
composition.yaml – OpenEHR composition
The composition.yaml file contains the OpenEHR composition representing the letter’s metadata and structure, as below:
rm_version: "1.0.4" # updatable via api
uid: "20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000" # updatable via api
archetype_node_id: "openEHR-EHR-COMPOSITION.correspondence.v1"
name:
  value: "Clinical letter"
category:
  value: "event"
composer:
  name: "Dr Jane Smith" # updatable via api
  role: "Consultant Physician" # updatable via api
context:
  start_time: "2026-01-12T10:14:00Z" # updatable via api
content:
  - section:
      archetype_node_id: "openEHR-EHR-SECTION.correspondence.v1"
      name:
        value: "Correspondence"
      items:
        - evaluation:
            archetype_node_id: "openEHR-EHR-EVALUATION.clinical_correspondence.v1"
            name:
              value: "Clinical correspondence"
            data:
              narrative:
                type: "external_text"
                path: "./body.md"
        - evaluation:
            archetype_node_id: "openEHR-EHR-EVALUATION.problem_summary.v1"
            name:
              value: "Diagnoses at time of correspondence"
            data:
              diagnoses:
                - name: "Hypertension"
                - name: "Hyperlipidaemia"
                - name: "Chronic obstructive pulmonary disease"
NB: # updatable via api is placed to indicate fields that may be modified by the OpenEHR API.
body.md – Canonical clinical letter
Purpose
body.md is the canonical clinical representation of the letter.
It records:
- the full letter content only
Properties
- Editable after issue, with full git version history
- Human-readable Markdown with front matter metadata
- Git-versioned with complete audit trail
- Suitable for audit, legal review, and patient access
Required structure (conceptual)
A letter SHOULD clearly contain:
- header information (author, organisation, date),
- recipient(s),
- subject or reason for correspondence,
- clinical narrative,
- actions or recommendations (if any),
- signature block.
The exact formatting is intentionally flexible to accommodate different clinical contexts.
Letter identity (internal)
Every letter MUST include a globally unique letter_id (RFC 4122 UUID with hyphens), recorded within the document.
Letter IDs exist to:
- unambiguously reference letters,
- allow later letters to reference earlier correspondence,
- support indexing and cross-system linkage.
Timestamps provide chronology, not identity.
Corrections and follow-up
Errors or clarifications may be handled either by:
- Editing the existing letter (with git tracking all changes), or
- Issuing a new letter that references the prior one via references: <letter_id>.
Both approaches are valid. Git version control preserves an honest and legally defensible historical record of all changes.
Explicit non-features
body.md does NOT record:
- read or opened status,
- acknowledgement,
- urgency markers,
- task or workflow state.
Letters represent communication, not behaviour.
comments.md
See Comments section for details.
Large binary artefacts
Large binary artefacts (for example PDFs with embedded images or scans) are stored using Git Large File Storage (Git LFS).
In practice this means:
- a small pointer file is stored in the Git repository,
- the binary content is stored in a separate object store,
- the pointer is versioned, immutable, and content-addressed.
From a clinical and audit perspective, these artefacts are first-class parts of the letter record.
Design decisions explicitly rejected
The following were deliberately excluded:
- read receipts or confirmations,
- urgency flags,
- task or workflow semantics.
These features introduce legal ambiguity and false certainty.
VPR letters prioritise clarity, honesty, and auditability over convenience.
Demographics Repository
1. Purpose
The Demographics Repository is responsible for storing and managing patient identity and demographic information within VPR.
Its primary purpose is to provide a clear, authoritative, and interoperable representation of who the patient is, distinct from:
- what care they have received,
- what clinical observations have been recorded,
- and how care is coordinated.
Demographic data is foundational. Errors in demographics propagate risk across all other systems. For this reason, the Demographics Repository is deliberately separated from clinical and care coordination data.
2. Scope
The Demographics Repository contains identity and demographic information only. This includes, but is not limited to:
- Names and aliases
- Date of birth
- Sex and gender-related attributes
- Addresses and contact details
- Identifiers (NHS number, local identifiers)
- Deceased status
- Links to related persons where appropriate
It does not contain:
- clinical observations,
- diagnoses,
- procedures,
- correspondence content,
- care plans or workflows.
3. Use of FHIR
VPR uses FHIR (Fast Healthcare Interoperability Resources) as the canonical model for demographic data.
FHIR is used because it:
- is widely adopted across healthcare systems,
- has a clear and extensible Patient model,
- supports interoperability with existing NHS and international systems,
- cleanly separates identity from clinical content.
FHIR resources are stored and handled in a way that preserves their structure and semantics.
4. Primary FHIR Resource
4.1 Patient Resource
The core resource used in the Demographics Repository is the FHIR Patient resource.
The Patient resource represents:
- a single individual receiving or potentially receiving care,
- with zero or more identifiers,
- and zero or more contact and demographic attributes.
Only attributes relevant to identity and demographics are populated.
5. Separation from Clinical Repositories
The Demographics Repository is intentionally separate from the Clinical Repository.
Key reasons for this separation include:
- Demographic data changes more frequently and independently.
- Identity errors require different correction and governance processes.
- Many systems need demographic access without clinical access.
- Clinical data must not be invalidated by demographic corrections.
Clinical records reference patients by identifier rather than embedding demographic fields.
6. Corrections and Redactions
Demographic errors can have serious consequences.
When demographic information is determined to be incorrect or misattributed:
- Corrections are made by updating or superseding the relevant FHIR resource.
- Redacted demographic artefacts are moved to the Redaction Retention Repository (RRR).
- A reference remains to indicate that a correction has occurred.
Demographic information is never silently deleted.
7. Versioning and Change History
Demographic changes are expected and supported.
The Demographics Repository maintains:
- a full history of changes,
- attribution of who made each change,
- timestamps and reason codes where available.
This supports traceability, auditability, and patient safety.
8. Access and Authorisation
Access to demographic data is role-based and purpose-limited.
Different roles may have:
- read-only access,
- update access,
- linkage access for cross-system identity resolution.
Demographic access does not imply access to clinical content.
9. Relationship to Other VPR Components
- Clinical Repository: references patients by identifier only.
- Care Coordination Repository: links to patient identity without duplicating demographics.
- Redaction Retention Repository: stores superseded or misattributed demographic artefacts.
- External systems: demographic data may be exchanged using FHIR interfaces.
10. Design Principles
- Identity before care
- Correction without erasure
- Interoperability by default
- Clear separation of concerns
- Auditability without friction
11. Summary
The Demographics Repository provides a stable, interoperable, and auditable foundation for patient identity within VPR.
By using FHIR and maintaining strict separation from clinical and care coordination data, VPR ensures that identity errors can be corrected safely without compromising the integrity of the clinical record.
Care Coordination Repository
Overview
The Care Coordination Repository manages coordination state separate from clinical records.
It handles workflow coordination, cross-system state, and operational metadata that supports clinical care delivery without containing clinical content itself.
Repository Structure
The coordination repository follows the same sharded structure as clinical records:
patient_data/
  coordination/
    <s1>/
      <s2>/
        <uuid>/
          .git/
          COORDINATION_STATUS.yaml
          communications/
            <thread-id>/
              thread.md
              ledger.yaml
          encounters/
            ...
          appointments/
            ...
Root Status File
COORDINATION_STATUS.yaml
Each coordination repository includes a root status file that links it to the associated clinical record:
coordination_id: "7f4c2e9d-4b0a-4f3a-9a2c-0e9a6b5d1c88"
clinical_id: "a4f91c6d-3b2e-4c5f-9d7a-1e8b6c0a9f12"
status:
  lifecycle_state: active # active | suspended | closed
  record_open: true
  record_queryable: true
  record_modifiable: true
Purpose:
- Links coordination record to clinical record via clinical_id
- Tracks lifecycle state of the coordination repository
- Controls operational permissions (queryable, modifiable)
- Created during coordination repository initialization
Lifecycle states:
- active: Coordination record is operational and accepting updates
- suspended: Temporarily inactive (e.g., during data migration)
- closed: Permanently closed (e.g., patient deceased, record archived)
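The lifecycle states and permission flags can be modelled as a small enum plus a status struct. This is a sketch under stated assumptions: the Status struct is a trimmed, hypothetical stand-in, and the rule in accepts_updates (active and modifiable) is an assumed reading of how the two fields combine, not documented behaviour.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LifecycleState {
    Active,
    Suspended,
    Closed,
}

impl FromStr for LifecycleState {
    type Err = String;

    /// Accepts exactly the three lowercase values used in COORDINATION_STATUS.yaml.
    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        match raw {
            "active" => Ok(Self::Active),
            "suspended" => Ok(Self::Suspended),
            "closed" => Ok(Self::Closed),
            other => Err(format!("unknown lifecycle_state: {other}")),
        }
    }
}

/// Trimmed, hypothetical view of the status block.
struct Status {
    lifecycle_state: LifecycleState,
    record_modifiable: bool,
}

impl Status {
    /// Assumption: updates require both an active lifecycle and the
    /// record_modifiable flag.
    fn accepts_updates(&self) -> bool {
        self.lifecycle_state == LifecycleState::Active && self.record_modifiable
    }
}
```

Parsing into an enum rather than keeping the raw string means a misspelt state is rejected when the file is read, not when it is first acted upon.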
Properties:
- Mutable, overwriteable
- Git-versioned for audit trail
- Uses FHIR-aligned wire format for interoperability
- Validated against strict schema with UUID checks
Key Components
Messaging Coordination
Manages clinical communication threads between clinicians, patients, and authorized participants.
See Messaging Design for detailed specifications.
Encounter Management
Tracks patient encounters and episodes of care:
- Episode linkage and status
- Care team coordination
- Encounter documentation coordination
Appointment Coordination
Manages appointment scheduling and coordination:
- Cross-system availability
- Resource allocation
- Cancellation and rescheduling coordination
Design Principles
Separation of Concerns
Coordination data is strictly separated from clinical content:
- Clinical records (EHR): What happened, what was said, what was observed
- Coordination state: Who needs to know, what needs to be done, system state
Soft State
Coordination data is reconstructible and non-critical:
- Can be rebuilt from clinical records if lost
- Stale data causes inconvenience, not clinical harm
- Optimized for availability over consistency
Cross-System Coordination
Enables seamless care delivery across multiple systems:
- Shared state for care teams
- Consistent patient experience
- Reduced administrative overhead
Integration with VPR Components
Relationship to Clinical Repository
- Explicitly linked: Each coordination record has a clinical_id in COORDINATION_STATUS.yaml
- Initialization dependency: Coordination records require an existing clinical record UUID
- References not duplication: Does not duplicate clinical content
- Separation of concerns: Clinical facts vs. coordination state
- Enables coordination without coupling: Systems can coordinate without accessing clinical details
Relationship to Demographics
- Links coordination activities to patient identity
- Supports care team management
- Enables patient portal integration
API Integration
- REST and gRPC APIs provide coordination services
- Separate from clinical record APIs
- Optimized for coordination workflows
Lifecycle and Retention
Coordination data follows different retention policies than clinical records:
- Short-term retention: Active coordination state (weeks/months)
- Medium-term retention: Historical coordination for audit (years)
- Long-term retention: Minimal essential coordination metadata
Retention policies balance operational needs with privacy and storage costs.
Future Extensions
The coordination repository provides a foundation for:
- Advanced workflow management: Task assignment, delegation tracking
- Multi-organisation coordination: Cross-provider care coordination
- Patient engagement: Portal integration, preference management
- Quality improvement: Workflow analytics, performance metrics
References
- VPR Architecture Overview
- Clinical Repository Design
- Messaging Design
- FHIR Integration
- API Specifications
Care Coordination Repository (CCR) Messaging – Design and Rationale
Purpose
The CCR messaging system provides a clinical, auditable, interoperable record of asynchronous communication between clinicians, patients, and other authorised participants.
It is designed to:
- support cross-site and cross-system care coordination,
- remain human-readable without specialist software,
- withstand audit, legal, and regulatory review,
- avoid asserting certainty about human behaviour that the system cannot honestly know.
Messaging in the Care Coordination Repository is treated as clinical communication, not as a transient chat feature.
Conceptually, CCR messaging is FHIR-aligned, using the semantics of the FHIR Communication resource as a guiding model, without adopting FHIR storage formats or server behaviour.
Conceptual model (FHIR-aligned)
Each CCR message corresponds conceptually to a FHIR Communication:
- it represents something that has already been communicated,
- it has an author, recipients, a timestamp, content, and a status,
- it is a clinical artefact with medico-legal weight.
CCR does not implement FHIR JSON, REST endpoints, or transport semantics. Instead, it preserves FHIR meaning while using a versioned, repository-based storage model aligned with the Versioned Patient Repository.
This means CCR messaging can be projected to FHIR Communication in future integrations, without constraining internal design.
Core principles
1. Messaging is clinical
Messages exchanged between clinicians, patients, and other healthcare participants carry clinical and medico-legal weight equivalent to:
- written advice,
- clinic letters,
- documented telephone or video consultations.
As such, CCR messages form part of the clinical coordination record.
2. Messages are immutable
Once recorded, messages:
- MUST NOT be edited,
- MUST NOT be deleted.
This mirrors paper records, professional guidance, and legal expectations.
Errors or clarifications are handled via corrections (addenda), never by modifying the original message.
3. Context matters more than individual messages
Individual messages often do not make sense in isolation.
For example:
“Yes, I will do that doctor”
only has meaning when read alongside preceding and subsequent messages.
For this reason, the conversation thread is the meaningful clinical unit, not the individual message.
This aligns with FHIR Communication, which is frequently contextualised by related communications, encounters, or care plans.
Repository placement
Messaging is a first-class concern of the Care Coordination Repository (CCR).
It sits alongside other coordination artefacts (for example, tasks or referrals added later), and is explicitly separated from:
- clinical facts (clinical repository),
- demographics and identity (demographics repository).
File layout
Each messaging thread is stored as:
coordination/
  <shard1>/
    <shard2>/
      <coordination-id>/
        COORDINATION_STATUS.yaml
        communications/
          <communication-id>/
            thread.md
            ledger.yaml
The coordination repository is sharded by UUID for scalability, similar to clinical records.
Conceptually:
- A communication is a thread and a ledger file
- A thread is a list of messages stored in thread.md
- The ledger contains metadata such as participants, status, policies, and visibility settings
Where:
- <communication-id> is a timestamp-prefixed UUID (e.g., 20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000)
- thread.md contains the canonical clinical conversation (list of messages)
- ledger.yaml contains thread metadata (participants, status, policies, visibility)
Communication identity
The <communication-id> is generated using a timestamp-prefixed identifier:
- format: YYYYMMDDTHHMMSS.sssZ-UUID
- timestamp: UTC, ISO 8601, millisecond precision
- UUID: randomly generated
Example:
20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000
This ensures communication identifiers are:
- globally unique,
- chronologically sortable,
- suitable for distributed systems.
The existing TimestampId struct is used to generate and validate these identifiers.
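Chronological sortability follows from the fixed-width UTC timestamp prefix: with the most significant field first, plain byte-wise string ordering matches time ordering. The sketch below demonstrates this with standard-library sorting; the helper names are hypothetical and it does not use the TimestampId struct itself.

```rust
/// Returns the identifiers in chronological order.
/// Works because every identifier starts with a fixed-width UTC timestamp,
/// so lexicographic order equals time order.
fn sort_chronologically(mut ids: Vec<String>) -> Vec<String> {
    ids.sort();
    ids
}

/// Extracts the 20-character timestamp prefix, if the identifier is long enough.
fn timestamp_prefix(id: &str) -> Option<&str> {
    id.get(..20)
}
```

Identifiers that share the same millisecond fall back to ordering by UUID, which is arbitrary but stable.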
thread.md – Thread of messages
Purpose
thread.md is the canonical clinical record of the conversation thread.
It records:
- what was communicated,
- by whom,
- when,
- and in what coordination context.
Conceptually, each entry corresponds to a FHIR Communication instance.
Properties
- Append-only
- Immutable once written
- Human-readable
- Git-versioned
- Suitable for audit and legal review
Message identity
Every message MUST include a globally unique message_id (UUID).
Message identifiers exist to:
- unambiguously identify messages,
- allow corrections to reference prior messages,
- support projections, caches, and alert suppression.
Timestamps are used for ordering, not identity.
Message types
thread.md may contain:
- clinician messages
- patient messages
- system messages
- correction messages
System messages (for example, “participant added to thread”) are first-class entries, as they provide clinically and legally relevant coordination context.
Corrections (addenda)
Errors or clarifications are recorded as new messages, not edits.
A correction message:
- is a new message,
- has its own message_id,
- references the original message via corrects: <message_id>.
The original message is never modified.
This preserves a truthful, auditable historical record.
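The addendum rule can be expressed as an append-only collection in which a correction is just another message carrying a back-reference. This is a minimal in-memory sketch with hypothetical Message and Thread types; the real system persists to thread.md and uses UUIDs for message_id.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Message {
    message_id: String,
    body: String,
    /// Set only on correction messages; names the message being corrected.
    corrects: Option<String>,
}

/// An append-only thread: there is deliberately no edit or delete method.
#[derive(Default)]
struct Thread {
    messages: Vec<Message>,
}

impl Thread {
    /// Appends a message. A correction must reference a message that is
    /// already in the thread, and message IDs must be unique.
    fn append(&mut self, message: Message) -> Result<(), String> {
        if self.messages.iter().any(|m| m.message_id == message.message_id) {
            return Err(format!("duplicate message_id: {}", message.message_id));
        }
        if let Some(target) = &message.corrects {
            if !self.messages.iter().any(|m| &m.message_id == target) {
                return Err(format!("corrects unknown message: {target}"));
            }
        }
        self.messages.push(message);
        Ok(())
    }

    /// All corrections that reference the given message, in thread order.
    fn corrections_of(&self, message_id: &str) -> Vec<&Message> {
        self.messages
            .iter()
            .filter(|m| m.corrects.as_deref() == Some(message_id))
            .collect()
    }
}
```

A reader can then present the original alongside its corrections, while the original text stays exactly as it was written.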
Explicit non-features
thread.md does NOT record:
- read or seen status,
- urgency flags,
- acknowledgement or acceptance,
- task completion or responsibility transfer.
These concepts imply human cognition or behaviour that the system cannot verify and therefore does not assert.
Example structure
# Messages
## Message
**ID:** `3f7a8d2c-1e9b-4a6d-9f2e-5c8b7a4d1f92`
**Type:** clinician
**Timestamp:** 2026-01-11T14:36:15.234Z
**Author ID:** `4f8c2a1d-9e3b-4a7c-8f1e-6b0d2c5a9f12`
**Author:** Dr Jane Smith
Patient has reported increasing shortness of breath.
Please review chest X-ray and advise on next steps.
---
## Message
**ID:** `8b2f6a5c-3d1e-4a9b-8c7f-6d5e4a3b2c1d`
**Type:** clinician
**Timestamp:** 2026-01-11T15:42:30.567Z
**Author ID:** `a1d3c5e7-f9b2-4680-b2d4-f6e8c0a9d1e3`
**Author:** Dr Tom Patel
Reviewed X-ray. No acute changes. Continue current management
and reassess in 48 hours. If symptoms worsen, arrange urgent review.
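An entry in this layout can be produced with simple string formatting. The sketch below is illustrative only: MessageEntry and both functions are hypothetical, and the exact blank-line placement is an assumption, since only the field labels and the --- separator are visible in the example.

```rust
/// Borrowed view of the fields shown in the example entries.
struct MessageEntry<'a> {
    id: &'a str,
    kind: &'a str,
    timestamp: &'a str,
    author_id: &'a str,
    author: &'a str,
    body: &'a str,
}

/// Renders one entry using the field labels from the example.
fn render_entry(m: &MessageEntry<'_>) -> String {
    format!(
        "## Message\n\n**ID:** `{}`\n**Type:** {}\n**Timestamp:** {}\n**Author ID:** `{}`\n**Author:** {}\n\n{}\n",
        m.id, m.kind, m.timestamp, m.author_id, m.author, m.body
    )
}

/// Appends an entry to the thread text. Existing content is never rewritten:
/// the first entry adds the document heading, later entries add a separator.
fn append_entry(thread: &mut String, m: &MessageEntry<'_>) {
    if thread.is_empty() {
        thread.push_str("# Messages\n\n");
    } else {
        thread.push_str("\n---\n\n");
    }
    thread.push_str(&render_entry(m));
}
```

Because new text is only ever pushed onto the end, a Git diff for an added message shows additions and no modified lines.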
ledger.yaml – Thread context and policy
Purpose
ledger.yaml stores contextual and policy metadata, not clinical narrative.
It answers:
“Who is involved in this conversation, and under what rules?”
Typical contents
- participants and roles
- visibility and sensitivity flags
- thread status (open, closed, archived)
- organisational access rules
communication_id: 20260111T143522.045Z-550e8400-e29b-41d4-a716-446655440000
status: open
created_at: 2026-01-11T14:35:22.045Z
last_updated_at: 2026-01-11T15:10:04.912Z
participants:
  - participant_id: 4f8c2a1d-9e3b-4a7c-8f1e-6b0d2c5a9f12
    role: clinician
    display_name: Dr Jane Smith
  - participant_id: a1d3c5e7-f9b2-4680-b2d4-f6e8c0a9d1e3
    role: clinician
    display_name: Dr Tom Patel
  - participant_id: 9b7c6d5e-4f3a-2b1c-0e8d-7f6a5b4c3d2e
    role: patient
    display_name: John Doe
visibility:
  sensitivity: standard
  restricted: false
policies:
  allow_patient_participation: true
  allow_external_organisations: true
Properties
- Mutable
- Overwriteable
- Git-audited
- Changes are deliberate and relatively infrequent
- last_updated_at is automatically updated when messages are added
Thread-level metadata:
- Thread status: open, closed, or archived
- Participant list with roles (organisation field removed for simplicity)
- Visibility and sensitivity settings
- Participation policies (external organisations allowed by default)
Audit trail: Inherent in Git commit history and thread.md content, so no separate audit section is needed.
Explicit exclusions
ledger.yaml does NOT contain:
- message content,
- interaction or navigation state,
- user interface hints.
Git Versioning
All changes to coordination records are Git-versioned for audit purposes:
coordination:create: Created messaging thread
Care-Location: Oxford University Hospitals
coordination:update: Added message to thread
Care-Location: Oxford University Hospitals
coordination:update: Updated thread participant list
Care-Location: Oxford University Hospitals
Commits include:
- Structured commit messages with domain and action
- Care location metadata
- Optional cryptographic signatures
- Full audit trail of all changes
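A commit message in this shape can be assembled and read back with plain string handling. In the sketch below both helper functions are hypothetical, and the blank line between the subject and the Care-Location line is an assumption borrowed from the usual Git trailer convention; the examples above do not show whether one is present.

```rust
/// Builds a structured commit message: "<domain>:<action>: <summary>",
/// followed by a Care-Location trailer.
fn commit_message(domain: &str, action: &str, summary: &str, care_location: &str) -> String {
    format!("{domain}:{action}: {summary}\n\nCare-Location: {care_location}")
}

/// Reads the care location back out of a commit message, if present.
fn care_location(message: &str) -> Option<&str> {
    message
        .lines()
        .rev()
        .find_map(|line| line.strip_prefix("Care-Location: "))
}
```

Keeping the domain and action in the subject line means an audit query can filter history with ordinary text search over commit subjects.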
Alerting behaviour
CCR does not record:
- Read receipts or “seen” status
- Acknowledgements
- Urgency flags
- Task completion or responsibility transfer
These concepts imply human cognition or behaviour that the system cannot verify.
Consuming systems may implement alerting by:
- Tracking their own render/presentation state externally (not in VPR)
- Comparing message timestamps to their last-viewed records
- Presenting unread indicators in their user interface
This approach:
- Avoids false certainty about human understanding
- Reduces legal and clinical ambiguity
- Maintains truthful audit trails
- Enables consistent patient experience across systems
Alerting is a user-experience concern, not a clinical record.
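As an illustration of the consuming-system side, an unread count can be derived by comparing message timestamps with a locally stored last-viewed timestamp. The function below is a hypothetical sketch that lives outside VPR; it relies on all timestamps sharing the same fixed-width UTC format, so that string comparison matches time order.

```rust
/// Counts messages newer than the viewer's last-viewed timestamp.
/// `last_viewed` is state owned by the consuming system, never written to VPR.
/// All timestamps are assumed to use the same fixed-width UTC format,
/// for example "2026-01-11T14:36:15.234Z".
fn unread_count(message_timestamps: &[&str], last_viewed: Option<&str>) -> usize {
    match last_viewed {
        // The thread has never been rendered for this viewer.
        None => message_timestamps.len(),
        Some(seen) => message_timestamps.iter().filter(|ts| **ts > seen).count(),
    }
}
```

Note that this reports what a system has rendered, not what a person has read, which is why it stays out of the clinical record.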
Thread Lifecycle
Messaging threads follow a defined lifecycle:
Creation
Threads are created via CoordinationService::communication_create():
- Generates timestamp-prefixed communication ID
- Creates communications/<communication-id>/ directory
- Writes initial thread.md (optionally with first message)
- Writes ledger.yaml with participant list and policies
- Commits atomically to Git
Message Addition
Messages are added via CoordinationService::add_message():
- Generates unique message UUID
- Appends to thread.md (preserves immutability)
- Commits with structured message and care location
- Returns the message ID for reference
Metadata Updates
Thread metadata is updated via CoordinationService::update_communication_ledger():
- Modifies ledger.yaml (participants, status, policies)
- Git commit records the change
- Audit log tracks all modifications
Status Transitions
Threads can transition between states:
- Open → Closed: Thread completed, no new messages accepted
- Closed → Archived: Thread moved to archive, hidden from default views
- Open → Archived: Direct archival without closing
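The three listed transitions can be encoded as a small state machine. This sketch mirrors the ThreadStatus names used elsewhere in this documentation, but the methods are hypothetical, and treating every unlisted transition (such as reopening a closed thread) as invalid is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ThreadStatus {
    Open,
    Closed,
    Archived,
}

impl ThreadStatus {
    /// Allows only the three transitions listed above.
    /// Assumption: anything else, including reopening, is rejected.
    fn can_transition_to(self, next: ThreadStatus) -> bool {
        use ThreadStatus::*;
        matches!(
            (self, next),
            (Open, Closed) | (Closed, Archived) | (Open, Archived)
        )
    }

    /// Only open threads accept new messages.
    fn accepts_messages(self) -> bool {
        self == ThreadStatus::Open
    }
}
```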
Deletion
Threads are never deleted:
- Immutability is preserved
- Audit trail remains complete
- Archival is used instead of deletion
- Git history retains full record
Implementation Details
Initialization
Coordination repositories are initialized with:
CoordinationService::new(cfg)
    .initialise(author, care_location, clinical_id)
This creates:
- Sharded directory structure: coordination/<s1>/<s2>/<uuid>/
- COORDINATION_STATUS.yaml with link to clinical record
- Git repository with initial commit
- Lifecycle state set to active
Thread Creation
Messaging threads are created with:
service.communication_create(
    &author,
    care_location,
    participants,
    initial_message,
)
This:
- Generates timestamp-prefixed communication ID via TimestampIdGenerator
- Creates communications/<communication-id>/ directory
- Writes thread.md with optional initial message
- Writes ledger.yaml with participant list and policies
- Commits both files atomically to Git
Adding Messages
Messages are appended with:
service.add_message(
    &author,
    care_location,
    thread_id,
    message_content,
)
This:
- Generates unique message UUID
- Appends to thread.md (preserves immutability)
- Commits with structured message and care location
- Returns the message ID
Type Safety
The CoordinationService uses the type-state pattern:
- CoordinationService<Uninitialised> - Can only call initialise()
- CoordinationService<Initialised> - Can call thread and message operations
This prevents operations on non-existent repositories at compile time.
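The general shape of the type-state pattern is shown below. This is a simplified stand-in, not the real CoordinationService: the Service type, its single repo_path field, and the describe_append method are hypothetical, and the real constructor and operations take the arguments documented above.

```rust
use std::marker::PhantomData;

// Marker types for the two states. They carry no data.
struct Uninitialised;
struct Initialised;

/// Simplified stand-in for a service whose available methods depend on state.
struct Service<State> {
    repo_path: String,
    _state: PhantomData<State>,
}

impl Service<Uninitialised> {
    fn new(repo_path: &str) -> Self {
        Service { repo_path: repo_path.to_string(), _state: PhantomData }
    }

    /// Consumes the uninitialised service and returns an initialised one,
    /// so the old value cannot be used afterwards.
    fn initialise(self) -> Service<Initialised> {
        Service { repo_path: self.repo_path, _state: PhantomData }
    }
}

impl Service<Initialised> {
    /// Only exists on the initialised type; stands in for thread and
    /// message operations.
    fn describe_append(&self, content: &str) -> String {
        format!("{} <- {}", self.repo_path, content)
    }
}
```

Calling describe_append on the value returned by new, without first calling initialise, fails to compile because the method is not defined for Service<Uninitialised>.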
Error Handling
Operations return PatientResult<T> with comprehensive error types:
- Author validation errors
- Git operation failures
- File I/O errors
- FHIR wire format validation errors
- UUID parsing errors
Cleanup is attempted on initialization failure to prevent partial repositories.
Design decisions explicitly rejected
The following were deliberately excluded:
- read receipts (opening does not equal reading or understanding)
- urgency flags (asynchronous messaging is not suitable for urgent care)
- acknowledgement tracking (implies responsibility transfer)
- workflow or task semantics (these may be added later using FHIR-aligned Task concepts)
These exclusions reduce legal ambiguity, false certainty, and unintended clinical inference.
References
FHIR Integration
Overview
The coordination repository uses FHIR-aligned wire formats for interoperability without implementing FHIR JSON, REST endpoints, or transport semantics.
This approach:
- Preserves FHIR semantic meaning
- Uses repository-based storage model
- Enables future FHIR projections
- Maintains human-readable formats
- Provides strict schema validation
Coordination Status
Overview
The fhir::CoordinationStatus module handles parsing and rendering of COORDINATION_STATUS.yaml files.
API
// Parse from YAML
let status_data = fhir::CoordinationStatus::parse(yaml_text)?;

// Render to YAML
let yaml_text = fhir::CoordinationStatus::render(&status_data)?;
Domain Types
- CoordinationStatusData - Top-level status structure
  - coordination_id: Uuid - Coordination repository identifier
  - clinical_id: Uuid - Linked clinical record identifier
  - status: StatusInfo - Status information
- StatusInfo - Status details
  - lifecycle_state: LifecycleState - Current lifecycle state
  - record_open: bool - Whether accepting new entries
  - record_queryable: bool - Whether queries are permitted
  - record_modifiable: bool - Whether modifications are permitted
- LifecycleState - Enumeration
  - Active - Operational and accepting updates
  - Suspended - Temporarily inactive
  - Closed - Permanently closed
Validation
- UUID validation for coordination_id and clinical_id
- Enum validation for lifecycle_state
- Boolean validation for permission flags
- Strict schema with deny_unknown_fields
Wire Format
Internal wire types use string UUIDs, translated to proper Uuid types at boundaries:
// Wire format (internal)
struct CoordinationStatusWire {
    coordination_id: String,
    clinical_id: String,
    status: StatusWire,
}

// Domain format (public)
struct CoordinationStatusData {
    coordination_id: Uuid,
    clinical_id: Uuid,
    status: StatusInfo,
}
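The wire-to-domain translation can be sketched with TryFrom, so that invalid identifiers are rejected at the boundary. This example is self-contained and makes several assumptions: the Uuid type here is a minimal stand-in written for the sketch (the real code presumably uses a UUID library), the status field is omitted for brevity, and String is used as a placeholder error type.

```rust
use std::convert::TryFrom;

/// Minimal stand-in for a UUID type: validates the 8-4-4-4-12 hex layout.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Uuid(String);

impl Uuid {
    fn parse(raw: &str) -> Result<Self, String> {
        let lengths = [8, 4, 4, 4, 12];
        let groups: Vec<&str> = raw.split('-').collect();
        let well_formed = groups.len() == lengths.len()
            && groups.iter().zip(lengths).all(|(group, len)| {
                group.len() == len && group.chars().all(|c| c.is_ascii_hexdigit())
            });
        if well_formed {
            Ok(Uuid(raw.to_ascii_lowercase()))
        } else {
            Err(format!("invalid UUID: {raw}"))
        }
    }
}

// Wire format: plain strings, exactly as read from YAML (status omitted here).
struct CoordinationStatusWire {
    coordination_id: String,
    clinical_id: String,
}

// Domain format: identifiers are validated types.
#[derive(Debug)]
struct CoordinationStatusData {
    coordination_id: Uuid,
    clinical_id: Uuid,
}

impl TryFrom<CoordinationStatusWire> for CoordinationStatusData {
    type Error = String;

    /// Fails on the first identifier that is not a well-formed UUID.
    fn try_from(wire: CoordinationStatusWire) -> Result<Self, Self::Error> {
        Ok(Self {
            coordination_id: Uuid::parse(&wire.coordination_id)?,
            clinical_id: Uuid::parse(&wire.clinical_id)?,
        })
    }
}
```

Once a CoordinationStatusData value exists, downstream code can rely on both identifiers being well-formed without re-validating them.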
Thread Ledgers
Overview
The fhir::Messaging module handles parsing and rendering of messaging thread ledger.yaml files.
This implementation uses FHIR Communication resource semantics without FHIR JSON transport.
API
// Parse from YAML
let ledger_data = fhir::Messaging::ledger_parse(yaml_text)?;

// Render to YAML
let yaml_text = fhir::Messaging::ledger_render(&ledger_data)?;
Domain Types
- LedgerData - Top-level ledger structure
  - thread_id: TimestampId - Thread identifier
  - status: ThreadStatus - Thread status
  - created_at: DateTime<Utc> - Creation timestamp
  - last_updated_at: DateTime<Utc> - Last update timestamp
  - participants: Vec<LedgerParticipant> - Participant list
  - visibility: LedgerVisibility - Visibility settings
  - policies: LedgerPolicies - Participation policies
  - audit: LedgerAudit - Change audit trail
- ThreadStatus - Enumeration
  - Open - Active, accepting messages
  - Closed - Closed to new messages
  - Archived - Archived (hidden from default views)
- LedgerParticipant - Participant information
  - participant_id: Uuid - Participant identifier
  - role: ParticipantRole - Participant role
  - display_name: String - Human-readable name
  - organisation: Option<String> - Organization affiliation
- ParticipantRole - Enumeration
  - Clinician - Clinical staff member
  - Patient - Patient participant
  - CareTeam - Care team member or healthcare professional
  - System - System-generated participant
- LedgerVisibility - Visibility settings
  - sensitivity: String - Sensitivity level (standard, confidential, restricted)
  - restricted: bool - Whether access is restricted beyond normal rules
- LedgerPolicies - Participation policies
  - allow_patient_participation: bool - Patient participation permitted
  - allow_external_organisations: bool - External organizations permitted
- LedgerAudit - Audit trail
  - created_by: String - Creator identifier
  - change_log: Vec<AuditChangeLog> - Chronological change log
- AuditChangeLog - Single audit entry
  - changed_at: DateTime<Utc> - Change timestamp
  - changed_by: String - Actor identifier
  - description: String - Human-readable description
Validation
- UUID validation for thread_id (as TimestampId)
- UUID validation for all participant_id fields
- DateTime parsing with timezone handling
- Enum validation for status and role fields
- Strict schema with deny_unknown_fields
Wire Format
Internal wire types separate concerns:
// Wire format (internal)
struct Ledger {
    thread_id: String,
    status: ThreadStatus,
    created_at: DateTime<Utc>,
    // ... string UUIDs, raw timestamps
}

// Domain format (public)
struct LedgerData {
    thread_id: TimestampId,
    status: ThreadStatus,
    created_at: DateTime<Utc>,
    // ... proper UUID types, validated timestamps
}
Translation happens at parse/render boundaries using internal helper functions.
Wire Format Principles
Separation of Concerns
- Wire types are internal implementation details
- Domain types are public API surface
- Translation happens at boundaries only
- Consumers work with domain types exclusively
Strict Validation
All wire formats use #[serde(deny_unknown_fields)]:
- Unknown fields are rejected
- Prevents silent schema drift
- Ensures forward compatibility is explicit
- Catches typos and configuration errors
Type Safety
- String identifiers validated and converted to proper types
- UUIDs parsed and validated at boundaries
- Timestamps validated and converted to DateTime<Utc>
- Enumerations validated against allowed values
Human-Readable Formats
YAML is used for all wire formats:
- Git-friendly diffs
- Human-readable without tooling
- Suitable for manual review
- Easy to debug and inspect
Error Handling
Parse errors use serde_path_to_error for detailed diagnostics:
Thread ledger schema mismatch at participants[0].role:
unknown variant `doctor`, expected one of
`clinician`, `patient`, `careteam`, `system`
This provides:
- Precise error location in document
- Clear error description
- Expected values for enumerations
- Actionable feedback for corrections
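To make the enumeration part of that diagnostic concrete, here is a minimal standard-library sketch of rejecting an unknown role and listing the expected values. It is not how serde or serde_path_to_error work internally; the UnknownVariant type and ROLE_NAMES constant are hypothetical, and the variant spellings are taken from the error message shown above.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ParticipantRole {
    Clinician,
    Patient,
    CareTeam,
    System,
}

/// Accepted spellings, as listed in the example error message.
const ROLE_NAMES: [&str; 4] = ["clinician", "patient", "careteam", "system"];

/// Hypothetical error carrying the rejected input.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownVariant(pub String);

impl fmt::Display for UnknownVariant {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Quote each allowed value so the reader can copy a valid one.
        let expected: Vec<String> = ROLE_NAMES.iter().map(|n| format!("`{n}`")).collect();
        write!(
            f,
            "unknown variant `{}`, expected one of {}",
            self.0,
            expected.join(", ")
        )
    }
}

impl FromStr for ParticipantRole {
    type Err = UnknownVariant;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "clinician" => Ok(ParticipantRole::Clinician),
            "patient" => Ok(ParticipantRole::Patient),
            "careteam" => Ok(ParticipantRole::CareTeam),
            "system" => Ok(ParticipantRole::System),
            other => Err(UnknownVariant(other.to_string())),
        }
    }
}
```

Parsing "doctor" with this sketch yields an error whose text matches the second half of the diagnostic above; the document path prefix (participants[0].role) is what serde_path_to_error contributes in the real code.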
FHIR Alignment
Conceptual Model
VPR coordination uses FHIR resource semantics:
- COORDINATION_STATUS.yaml ≈ FHIR operational status tracking
- Thread ledger.yaml ≈ FHIR Communication metadata
- messages.md ≈ FHIR Communication content
This is conceptual alignment, not implementation:
- No FHIR JSON format
- No FHIR REST endpoints
- No FHIR server behavior
- No FHIR Bundle/Transaction semantics
Future Projections
FHIR-aligned wire formats enable future projections to:
- FHIR Communication resources - For messaging threads
- FHIR Task resources - For coordination tasks
- FHIR DocumentReference - For compositions
- FHIR RESTful APIs - For external integrations
Projection can happen:
- At API boundaries (gRPC/REST to FHIR)
- Via export tools (VPR to FHIR Bundle)
- Through ETL pipelines (VPR to FHIR data warehouse)
Semantic Preservation
Key FHIR concepts preserved:
- Communication.status → ThreadStatus (open, closed, archived)
- Communication.recipient → participants with roles
- Communication.sender → author metadata in messages
- Communication.sent → created_at timestamp
- Communication.payload → message content in messages.md
This ensures:
- No semantic loss in translation
- Clear mapping to FHIR when needed
- Compatibility with FHIR-based systems
- Standards-based interoperability
Implementation Details
Module Structure
crates/fhir/src/
  lib.rs                  # Public exports and error types
  coordination_status.rs  # COORDINATION_STATUS.yaml handling
  messaging.rs            # Thread ledger.yaml handling
Error Types
#[derive(Debug, thiserror::Error)]
pub enum FhirError {
    InvalidInput(String),
    InvalidYaml(serde_yaml::Error),
    Translation(String),
    InvalidUuid(String),
    // ...
}
Errors are converted to PatientError at boundaries via the From trait.
Testing
Each module includes comprehensive tests:
- Round-trip parsing (parse → render → parse)
- Schema validation (reject unknown fields)
- Type validation (reject wrong types)
- UUID validation (reject malformed UUIDs)
- Enum validation (reject unknown variants)
- Edge cases (minimal valid documents, optional fields)
Dependencies
- serde and serde_yaml - Serialization
- serde_path_to_error - Detailed error paths
- chrono - Timestamp handling
- uuid - UUID types
- vpr_uuid - TimestampId type
Usage Examples
Coordination Status
use std::fs;

use fhir::{CoordinationStatus, CoordinationStatusData, LifecycleState, StatusInfo};
use uuid::Uuid;

// Create new status.
// `existing_clinical_uuid` is the Uuid of an already-initialised clinical record.
let status_data = CoordinationStatusData {
    coordination_id: Uuid::new_v4(),
    clinical_id: existing_clinical_uuid,
    status: StatusInfo {
        lifecycle_state: LifecycleState::Active,
        record_open: true,
        record_queryable: true,
        record_modifiable: true,
    },
};

// Render to YAML
let yaml = CoordinationStatus::render(&status_data)?;

// Write to file
fs::write("COORDINATION_STATUS.yaml", yaml)?;

// Later, parse back
let yaml_text = fs::read_to_string("COORDINATION_STATUS.yaml")?;
let parsed = CoordinationStatus::parse(&yaml_text)?;
assert_eq!(status_data, parsed);
Thread Ledger
use std::fs;

use chrono::Utc;
use fhir::{LedgerData, LedgerParticipant, Messaging, ParticipantRole, ThreadStatus};

// Create ledger.
// `thread_id` and `clinician_uuid` are identifiers obtained elsewhere.
let ledger_data = LedgerData {
    thread_id,
    status: ThreadStatus::Open,
    created_at: Utc::now(),
    last_updated_at: Utc::now(),
    participants: vec![
        LedgerParticipant {
            participant_id: clinician_uuid,
            role: ParticipantRole::Clinician,
            display_name: "Dr Jane Smith".to_string(),
            organisation: Some("Example NHS Trust".to_string()),
        },
    ],
    // ... visibility, policies, audit
};

// Render to YAML
let yaml = Messaging::ledger_render(&ledger_data)?;

// Write to file
fs::write("ledger.yaml", yaml)?;

// Later, parse back
let yaml_text = fs::read_to_string("ledger.yaml")?;
let parsed = Messaging::ledger_parse(&yaml_text)?;
VPR File Storage (Binary and Non-Text Files)
Purpose
This document defines how non-text and binary files (for example PDFs, imaging, scans, waveforms, audio, and video) are stored, referenced, versioned, and governed within the Versioned Patient Repository (VPR).
The aim is to preserve clinical meaning, auditability, and long-term safety while remaining compatible with openEHR principles, offline use, and simple local operation (for example on a laptop), without introducing enterprise-only infrastructure.
Core Principles
- Clinical meaning and binary bytes are deliberately separated
- Binary files are not tracked in Git
- Binary files are immutable once added (new content creates a new file)
- References to files are explicit, auditable, and versioned
- Clinical repositories remain valid even when binary files are absent
- No global or cross-repository binary namespace exists
What Counts as a File
Files include, but are not limited to:
- Portable Document Format (PDF) documents
- Medical imaging (for example DICOM series)
- Scanned paper documents
- Audio or video recordings
- Physiological waveforms or monitoring exports
These files are treated as clinical material, but are not part of the primary structured clinical data.
Repository-Scoped Storage Model
VPR does not use a global binary store.
Instead, each repository is self-contained and stores its own associated files alongside its versioned content.
This document describes the pattern using the Clinical Repository (CR) as the example. The same pattern applies independently to other repositories (CCR, DR, RRR).
Clinical Repository Layout
For a single Clinical Repository:
clinical/
└── <clinical_id>/
    ├── .gitignore
    ├── compositions/
    ├── indexes/
    ├── metadata/
    ├── … other CR-specific content …
    └── files/          # gitignored
Invariants
- <clinical_id>/ is the repository root and Git root
- The CR is independently portable and versioned
- files/ is scoped only to this CR
- files/ is explicitly excluded from Git tracking
- The CR remains valid even if files/ is missing or incomplete
No patient identifier is implied or required by this structure.
The files/ Directory
The files/ directory:
- Contains binary files associated with this Clinical Repository
- May include documents, imaging, video, audio, or other binary formats
- Is not required to be present on all copies of the repository
- Is never authoritative for clinical meaning
The name files/ is intentionally neutral and does not imply format, size, or readability.
File Identity and Integrity
Each file is identified by its content, not by its filename.
VPR implements content-addressed storage using SHA-256 hashes:
- Files are stored using their SHA-256 hash as the filename
- Two-level sharding is used to prevent excessive files per directory
- Hashes are used to verify integrity
- If file contents change, a new file is created
Storage structure:
files/
└── sha256/
    └── ab/                     # First 2 characters of hash
        └── cd/                 # Next 2 characters of hash
            └── abcdef123456... # Full hash as filename
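The layout above can be derived mechanically from a hash. The following is a minimal sketch rather than the FilesService code: the function name sharded_path is hypothetical, and it assumes the hash is supplied as 64 lowercase hexadecimal characters. Validating the input first also means the resulting path cannot contain separators or parent-directory components.

```rust
use std::path::PathBuf;

/// Builds `files/sha256/<first 2>/<next 2>/<full hash>` for a SHA-256 hex digest.
/// Returns `None` unless the input is exactly 64 lowercase hex characters
/// (the lowercase requirement is an assumption of this sketch).
pub fn sharded_path(hash_hex: &str) -> Option<PathBuf> {
    let valid = hash_hex.len() == 64
        && hash_hex
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !valid {
        return None;
    }

    // Slicing by byte index is safe here because the input is pure ASCII.
    let mut path = PathBuf::from("files");
    path.push("sha256");
    path.push(&hash_hex[0..2]);
    path.push(&hash_hex[2..4]);
    path.push(hash_hex);
    Some(path)
}
```

Because the path is a pure function of the content hash, two copies of a repository will place the same bytes at the same relative location without any coordination.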
File References in the Clinical Repository
Purpose of a File Reference
Clinical artefacts do not embed binary data.
Instead, they include file references which:
- Assert that a file exists or existed
- Describe the file’s clinical role
- Bind the reference immutably in time
File references are small, human-readable, and versioned as part of the CR.
Typical Reference Metadata
A file reference records:
- Relative path to the file within files/
- Cryptographic hash (SHA-256)
- Hash algorithm identifier
- Media type (MIME type, best-effort detection)
- Original filename
- File size in bytes
- Storage timestamp (ISO 8601 format)
Example (matching FileMetadata structure):
file_reference:
  hash_algorithm: sha256
  hash: abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890
  relative_path: files/sha256/ab/cd/abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890
  size_bytes: 1048576
  media_type: application/pdf
  original_filename: discharge-letter.pdf
  stored_at: "2026-01-24T10:30:00Z"
Note: The media_type is detected automatically using file content inspection and should not be considered authoritative for clinical purposes.
Placement Rules
File references are stored where the clinical meaning lives:
- Letters, reports, results → referenced from CR artefacts
- Workflow or administrative material → referenced from CCR artefacts
- Withdrawn or redacted material → referenced from RRR artefacts
The origin of the file (patient, clinician, external organisation) does not determine placement.
Clinical meaning does.
External and Patient-Provided Files
Patient-provided or externally received files follow a simple, explicit workflow:
- The file is placed into the CR’s files/ directory
- A reference is created in an appropriate artefact
- A clinician may later incorporate or reinterpret the material
This mirrors real-world clinical practice (for example “patient brought letter – reviewed”).
Versioning Behaviour
- Files are immutable once added (enforced by the FilesService)
- New or corrected content results in a new file with a different hash
- References are append-only
- Historical references remain valid indefinitely
- Attempting to store a file with an existing hash returns an error
No reference is silently replaced or overwritten.
Redaction and Removal
VPR does not support silent deletion.
When a file must be withdrawn or redacted:
- The reference in CR is explicitly marked as withdrawn or redacted
- A tombstone remains in versioned history
- The file may be removed from files/ as a separate, explicit action
The Redaction Retention Repository (RRR) always retains evidence that the file once existed.
Why Git Large File Storage Is Not Used
Git Large File Storage (LFS) is not suitable because:
- It relies on repository paths rather than actual content identity
- It complicates offline and partial copies
- It does not align with openEHR-style separation of meaning and identity
Git is used to version clinical meaning, not binary bytes.
Enterprise Deployment and Acceleration (Non-Canonical Layer)
In enterprise deployments, VPR retains the on-disk Clinical Repository (CR) as the canonical source of truth, while performance, scale, and availability are achieved through derived acceleration layers. These include projection databases, indexes, and caches built by continuously reading the canonical CR and materialising fast read models for queries, lists, and search. Large files remain conceptually part of the CR but may be mirrored to object storage for durability and efficient delivery; such storage acts as a distribution and persistence layer, not a new authority. All enterprise components are explicitly rebuildable from the canonical repository, tolerate missing binary bytes, and never accept writes that bypass the CR. This preserves VPR’s laptop-first, openEHR-aligned philosophy while enabling high-throughput, low-latency operation at organisational scale.
Implementation
VPR provides the FilesService (in the vpr_files crate) for managing binary file storage:
Core Operations
- add(source_path) - Adds a file to content-addressed storage
  - Computes SHA-256 hash
  - Creates sharded storage path
  - Enforces immutability (errors if hash exists)
  - Detects media type automatically
  - Returns FileMetadata with all reference information
- read(hash) - Retrieves file contents by hash
  - Returns file as byte vector (Vec<u8>)
  - Suitable for network transmission
  - Errors if file not found
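The add/read contract can be illustrated with an in-memory stand-in. This sketch is not the FilesService: the InMemoryFiles and StoreError names are hypothetical, storage is a HashMap rather than the filesystem, and the caller supplies the hash instead of it being computed from the content.

```rust
use std::collections::HashMap;

/// Hypothetical error type mirroring the two failure cases described above.
#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    AlreadyExists(String),
    NotFound(String),
}

/// In-memory stand-in for content-addressed storage, keyed by hash.
#[derive(Default)]
pub struct InMemoryFiles {
    blobs: HashMap<String, Vec<u8>>,
}

impl InMemoryFiles {
    /// Stores `bytes` under `hash`; refuses to overwrite an existing entry.
    pub fn add(&mut self, hash: &str, bytes: &[u8]) -> Result<(), StoreError> {
        if self.blobs.contains_key(hash) {
            return Err(StoreError::AlreadyExists(hash.to_string()));
        }
        self.blobs.insert(hash.to_string(), bytes.to_vec());
        Ok(())
    }

    /// Returns a copy of the stored bytes, or an error if the hash is unknown.
    pub fn read(&self, hash: &str) -> Result<Vec<u8>, StoreError> {
        self.blobs
            .get(hash)
            .cloned()
            .ok_or_else(|| StoreError::NotFound(hash.to_string()))
    }
}
```

The key property is that a second add under the same hash fails instead of replacing the first, so a reference that records a hash keeps pointing at the same bytes.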
Service Characteristics
- Repository-scoped: Each service instance is bound to one repository
- Defensive: Validates all paths and prevents directory traversal
- Stateless: No persistent state beyond filesystem
- Safe: All paths canonicalised to prevent symlink attacks
See crates/files/src/files.rs for complete implementation details.
Summary
- Each repository stores its own files locally
- Files live in a files/ directory alongside versioned content
- Files are not tracked by Git
- References are explicit, relative, and auditable
- Clinical meaning always lives in versioned artefacts
This design keeps VPR simple, portable, openEHR-aligned, and clinically honest.
Redaction Retention Repository (RRR)
1. Purpose
The Redaction Retention Repository (RRR) exists to ensure that patient-related information which has been removed from routine views is retained permanently, safely, and transparently.
RRR supports correctness, accountability, and trust in the VPR system by ensuring that no information is silently lost, while also ensuring that routine clinical, demographic, and care coordination views remain accurate and appropriate for day-to-day use.
2. Scope
The RRR applies to all patient-related information managed by VPR, including but not limited to:
- Clinical entries
- Demographic data
- Care coordination artefacts
- Referrals and correspondence
- Attachments and structured documents
RRR is not limited to clinical data and is not patient-owned.
3. Core Principles
3.1 Retention, Not Deletion
Information placed into the RRR is never deleted. Retention is the default and permanent state unless explicitly governed by external policy or law.
3.2 Removal from Routine View
Items in the RRR must not appear in routine clinical or operational workflows. Their removal prevents inappropriate use while preserving traceability.
3.3 Neutrality
Placement into the RRR does not imply error, blame, review, or wrongdoing. It reflects a change in suitability for routine display only.
3.4 Transparency and Auditability
All movements into the RRR are recorded, attributable, and inspectable by authorised roles.
4. What “Redaction” Means in VPR
In VPR, redaction means:
Removal of an artefact from routine views while preserving the artefact in full elsewhere.
Redaction does not mean:
- deletion,
- erasure,
- masking of content in-place.
Redaction is a relocation and reclassification operation.
5. Reasons for Redaction
Common reasons an artefact may be placed into the RRR include:
- Wrong patient association
- Misfiled demographic information
- Incorrect referral or care coordination entry
- Entered in error
- Consent withdrawal
- Jurisdictional or policy constraints
Reasons are recorded explicitly and separately from the artefact itself.
6. Relationship to Patient Repositories
When an artefact is redacted:
- The artefact is removed from the relevant patient repository’s routine view.
- A tombstone or pointer remains in the original location.
- The artefact is stored in the RRR with full context and metadata.
The patient repository remains clinically clean while retaining traceability.
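The three steps above amount to a relocation that leaves a marker behind. A minimal in-memory sketch follows, with hypothetical names (Slot, Retained, redact) and plain vectors standing in for the patient repository and the RRR; real artefacts, metadata, and encryption are out of scope here.

```rust
/// An entry in a (hypothetical) patient repository.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Slot {
    Active { id: u32, content: String },
    Tombstone { id: u32 },
}

/// What the (hypothetical) retention store keeps for a redacted artefact.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Retained {
    pub id: u32,
    pub content: String,
    pub reason: String,
}

/// Moves the artefact with `id` into `rrr`, leaving a tombstone in `repo`.
/// Returns false if no active artefact with that id exists.
pub fn redact(repo: &mut [Slot], rrr: &mut Vec<Retained>, id: u32, reason: &str) -> bool {
    for slot in repo.iter_mut() {
        let is_target = matches!(&*slot, Slot::Active { id: slot_id, .. } if *slot_id == id);
        if is_target {
            // Swap in the tombstone and take ownership of the original content.
            if let Slot::Active { content, .. } = std::mem::replace(slot, Slot::Tombstone { id }) {
                rrr.push(Retained {
                    id,
                    content,
                    reason: reason.to_string(),
                });
            }
            return true;
        }
    }
    false
}
```

Note that the reason is stored alongside the retained content rather than inside it, and that nothing is dropped: the content exists in exactly one place before and after the call.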
7. Access and Authorisation
Access to the RRR is:
- Role-based
- Audited
- Intended for legitimate purposes such as governance, investigation, correction, or legal response
RRR access is expected and normal for authorised roles.
8. What the RRR Is Not
The RRR is not:
- A temporary holding area
- A review queue
- A punishment mechanism
- A hidden or secret store
- A patient-facing record
9. Lifecycle Overview
- Artefact created in a patient repository
- Determination made that artefact should not appear in routine view
- Redaction action performed
- Artefact placed into RRR
- Tombstone retained in original context
- Artefact remains retained indefinitely
10. Future Considerations
- Retention classes and policies
- Cross-referencing with corrected or re-associated artefacts
- Reporting and metrics on redaction activity
- External regulatory access models
11. Summary
The Redaction Retention Repository is a foundational component of VPR that ensures integrity, transparency, and long-term trust in patient records by separating routine use from permanent retention, without loss of information.
Concurrency and Correctness in VPR
Purpose
VPR is a file-based patient record system where each patient record is stored in its own Git repository (for example containing files such as ehr_status.yaml).
In a production deployment, multiple worker processes and multiple servers may handle requests concurrently. This document explains the simple, robust approach used by VPR to ensure:
- Only one update to a patient record happens at a time
- No updates are lost
- Git repositories are never left in an inconsistent or partially-written state
- The system remains safe across crashes and restarts
The design intentionally favours correctness and clarity over complexity.
Core Principle
For any given patient, only one writer is allowed at a time, and every write is checked before it is saved.
This is achieved using two layers:
- Per-patient serialisation (to decide whose turn it is to write)
- Optimistic concurrency checks at the Git layer (to prevent lost updates)
Layer 1: Per-Patient Serialisation
Problem
In a clustered environment, two workers may attempt to update the same patient record at the same time.
Solution
Before making any change, a worker must acquire a per-patient lock from a shared, trusted service (typically the main relational database).
- The lock is keyed by patient identifier
- Only one worker can hold the lock at a time
- Different patients can be updated in parallel
- If a worker crashes, the lock is automatically released
This guarantees that, at the system level, only one writer is active for a given patient at any moment.
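The locking behaviour can be sketched in-process. This is only an analogy for the shared database lock described above: PatientLocks and PatientLockGuard are hypothetical names, the registry lives in memory within one process, and automatic release is modelled by Rust's Drop rather than by a database session ending.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical in-process registry of patients that currently have a writer.
#[derive(Clone, Default)]
pub struct PatientLocks {
    held: Arc<Mutex<HashSet<String>>>,
}

/// Holding this guard means "you may edit this patient now".
pub struct PatientLockGuard {
    held: Arc<Mutex<HashSet<String>>>,
    patient_id: String,
}

impl PatientLocks {
    /// Returns a guard if nobody holds the lock for `patient_id`,
    /// otherwise `None` (the caller should wait or retry later).
    pub fn try_acquire(&self, patient_id: &str) -> Option<PatientLockGuard> {
        let mut held = self.held.lock().expect("lock registry poisoned");
        if held.insert(patient_id.to_string()) {
            Some(PatientLockGuard {
                held: Arc::clone(&self.held),
                patient_id: patient_id.to_string(),
            })
        } else {
            None
        }
    }
}

impl Drop for PatientLockGuard {
    /// Releasing on drop covers early returns and panics in the writer.
    fn drop(&mut self) {
        if let Ok(mut held) = self.held.lock() {
            held.remove(&self.patient_id);
        }
    }
}
```

Locks for different patient identifiers never contend with each other, while a second attempt on the same identifier fails until the first guard is dropped.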
Mental Model
- The database acts as a traffic light
- Green means “you may edit this patient now”
- Red means “wait or retry later”
Layer 2: Git-Based Optimistic Concurrency
Problem
Even with serialisation, extra protection is needed to ensure a write does not overwrite a newer version of the record.
Solution
Git already provides a suitable version check: a commit hash identifies one exact state of the repository.
- When a worker reads a patient repository, it records the current commit hash
- When pushing an update, the worker asserts that the repository is still at that commit
- If the repository has moved on, the push is rejected
This prevents:
- Lost updates
- Silent overwrites
- Inconsistent repository state
Mental Model
“Only save my changes if nothing has changed since I last looked.”
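That rule is a compare-and-update on the branch head. The sketch below models it with a hypothetical BranchRef held in memory; it does not call Git, and the commit identifiers are arbitrary strings chosen for the example.

```rust
/// Returned when the ref has moved since the writer last read it.
#[derive(Debug, PartialEq, Eq)]
pub struct StaleHead {
    pub expected: String,
    pub actual: String,
}

/// In-memory stand-in for a branch ref such as refs/heads/main.
pub struct BranchRef {
    head: String,
}

impl BranchRef {
    pub fn new(head: &str) -> Self {
        BranchRef { head: head.to_string() }
    }

    /// The commit the ref currently points at.
    pub fn head(&self) -> &str {
        &self.head
    }

    /// Moves the ref to `new_commit` only if it still points at `expected`.
    pub fn compare_and_update(&mut self, expected: &str, new_commit: &str) -> Result<(), StaleHead> {
        if self.head != expected {
            return Err(StaleHead {
                expected: expected.to_string(),
                actual: self.head.clone(),
            });
        }
        self.head = new_commit.to_string();
        Ok(())
    }
}
```

If two writers both read the same head, the first update succeeds and the second receives StaleHead, at which point it can re-read the repository and retry or abort.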
End-to-End Write Flow
For a single patient update, the system follows this sequence:
- Acquire the per-patient lock from the shared database
- Read the patient Git repository and record the current commit hash
- Apply changes locally in an isolated working copy
- Create a Git commit containing the update
- Push the commit, asserting the expected previous commit hash
- Release the per-patient lock
If any step fails, the operation is retried or aborted safely without corrupting the patient record.
Failure and Crash Safety
The system is designed so that failures are safe by default.
- If a worker crashes before pushing, the repository is unchanged
- If a worker crashes after pushing, the change is already complete
- Locks are not permanent and are released automatically
- Git guarantees atomic updates of repository state
No manual intervention is required to recover from partial failures.
What This Design Guarantees
- At most one writer per patient at any given time
- No lost or overwritten updates
- No partially-written files
- Safe operation across multiple machines
- Simple, auditable behaviour
What This Design Intentionally Avoids
At the current scale, VPR does not require:
- Distributed consensus systems
- Message queues for write coordination
- Shared filesystem locks
- Complex conflict resolution logic
These may be introduced later if throughput demands increase, but are not necessary for correctness.
Summary
VPR ensures correctness by combining:
- A shared per-patient lock to serialise writes
- Git commit checks to prevent overwriting newer data
This approach is intentionally boring, und
Git versioning and commit signatures
VPR stores each patient record as files on disk, and uses a Git repository per patient directory to version changes. This enables history, diffs, and (optionally) cryptographic signing of commits.
Immutability and Audit Trail Philosophy
Core Principle: Nothing is Ever Deleted
VPR maintains a completely immutable audit trail. Nothing is ever truly deleted from the version control history. This fundamental design choice ensures:
- Patient Safety: Every change is traceable to a specific author at a specific time
- Legal Compliance: Complete audit trail meets regulatory requirements
- Clinical Governance: Full accountability for all modifications
- Research and Quality: Historical data remains available for authorized retrospective analysis
Commit Actions and Their Meaning
VPR uses a controlled vocabulary for commit actions, each with specific semantics:
Create
Used when adding new content to a record, including the record's own initialisation. Examples:
- Creating a new clinical letter
- Adding a new observation
- Recording a new diagnosis
- Initializing a new patient record
This is the most common action for new data entry.
Update
Used when modifying existing content. Examples:
- Correcting a typo in a letter
- Updating patient demographics (address change, name change)
- Linking demographics to clinical records
- Amending administrative details
The previous version remains in Git history and can be compared via diff.
Superseded
Used when newer clinical information makes previous content obsolete. Examples:
- A revised diagnosis based on new test results
- An updated care plan
- Replacement of preliminary findings with final results
This is distinct from Update as it represents a clinical decision that previous information should be replaced rather than corrected. The superseded content remains in history but is marked as no longer current for clinical decision-making.
Redact
Used when data was entered into the wrong patient’s repository by mistake. This can occur in any of the three repositories: clinical, demographics, or coordination. This is the only action that removes data from active view. The process:
- Data is removed from the patient’s active record
- Data is encrypted and moved to the Redaction Retention Repository
- A non-human-readable tombstone/pointer remains in the Git history
- The commit message records the redaction action for audit purposes
Even redacted data is preserved in secure storage and remains accessible to authorized auditors, ensuring complete traceability while protecting patient privacy.
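A controlled vocabulary like this maps naturally onto an enum. The sketch below is illustrative only: the lowercase labels and the method names are assumptions, not the strings or API that VPR uses in commit messages.

```rust
/// The four commit actions described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CommitAction {
    Create,
    Update,
    Superseded,
    Redact,
}

impl CommitAction {
    /// Redact is the only action that removes data from active view.
    pub fn removes_from_active_view(self) -> bool {
        matches!(self, CommitAction::Redact)
    }

    /// Hypothetical textual label (assumption: lowercase variant name).
    pub fn label(self) -> &'static str {
        match self {
            CommitAction::Create => "create",
            CommitAction::Update => "update",
            CommitAction::Superseded => "superseded",
            CommitAction::Redact => "redact",
        }
    }

    /// Parses a label, rejecting anything outside the vocabulary.
    pub fn parse(label: &str) -> Option<Self> {
        match label {
            "create" => Some(CommitAction::Create),
            "update" => Some(CommitAction::Update),
            "superseded" => Some(CommitAction::Superseded),
            "redact" => Some(CommitAction::Redact),
            _ => None,
        }
    }
}
```

Keeping the vocabulary closed means a label such as "delete" cannot be recorded at all, which matches the principle that nothing is ever deleted.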
What This Means in Practice
- Every change is preserved: Git commits form an unbroken chain from initialization to present
- Diffs show what changed: You can compare any two points in time
- Authors are accountable: Each commit carries author metadata and can optionally be cryptographically signed
- No data loss: Even mistakes are preserved in history, allowing forensic analysis if needed
- Audit compliance: Regulators can verify that no data has been improperly deleted
Where Git repos live
Clinical records are stored under the sharded directory structure:
patient_data/clinical/<s1>/<s2>/<32-hex-uuid>/
That patient directory is initialised as a Git repository (.git/ lives inside it).
Initial commit creation
When a new clinical record is created:
- VPR copies the clinical-template/ directory into the patient directory.
- VPR writes the initial ehr_status.yaml.
- VPR stages all files (excluding .git/) and writes a tree.
- VPR creates the initial commit.
The implementation lives in crates/core/src/clinical.rs in ClinicalService::initialise.
Branch behaviour (main)
Signed commits are created with git2::Repository::commit_signed. A key detail of libgit2 is:
- commit_signed creates the commit object but does not update any refs (no branch ref, no HEAD update).
To ensure the repo behaves like a normal Git repo, VPR explicitly:
- sets HEAD to refs/heads/main before creating the first commit, and
- after the signed commit is created, creates/updates refs/heads/main to point at that commit and points HEAD to it.
Result: clinical repos “land on” the main branch.
How signing works
If Author.signature is provided during initialisation, VPR signs the initial commit using ECDSA P-256.
- Payload: the unsigned commit buffer produced by Repository::commit_create_buffer.
  - This is the exact byte payload that must be signed to match what commit_signed expects.
- Algorithm: ECDSA over P-256 (p256 crate).
- Signature encoding:
  - VPR uses the raw 64-byte signature format (r || s) and base64-encodes it.
  - This base64 string is passed to commit_signed and ends up stored in the commit header field gpgsig.
Notes:
- Despite the gpgsig name, this is not a GPG signature; it is an ECDSA signature stored in that header field.
- VPR currently focuses on “is this commit cryptographically valid for this key?”, not on GPG identity chains.
How verification works
VPR can verify that a commit was signed by the private key corresponding to a provided public key.
Verification steps (implemented in ClinicalService::verify_commit_signature):
- Open the patient Git repo.
- Resolve the latest commit from HEAD.
- Read the gpgsig header field from the commit.
- Normalise it (handle whitespace wrapping), base64-decode it, and parse as a P-256 ECDSA signature.
- Recreate the unsigned commit buffer with commit_create_buffer using the commit’s tree/parents/author/committer/message.
- Verify the signature over that recreated buffer using the provided public key.
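The normalise-and-decode step can be sketched with the standard library alone. This is not the code in ClinicalService::verify_commit_signature: the function names are hypothetical, the base64 decoder is a deliberately lenient hand-rolled one (a real implementation would more likely use a base64 crate), and the actual ECDSA check via the p256 crate is not shown.

```rust
/// Decodes standard-alphabet base64, skipping whitespace introduced by
/// header wrapping. Lenient sketch: stops at the first '=' and does not
/// validate padding length or leftover bits.
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buffer: u32 = 0;
    let mut bits: u32 = 0;
    for c in input.chars() {
        if c.is_whitespace() {
            continue;
        }
        if c == '=' {
            break;
        }
        let value = match c {
            'A'..='Z' => c as u32 - 'A' as u32,
            'a'..='z' => c as u32 - 'a' as u32 + 26,
            '0'..='9' => c as u32 - '0' as u32 + 52,
            '+' => 62,
            '/' => 63,
            _ => return None,
        };
        buffer = (buffer << 6) | value;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buffer >> bits) as u8);
            buffer &= (1 << bits) - 1;
        }
    }
    Some(out)
}

/// Splits a gpgsig value into the raw (r, s) halves of a 64-byte signature.
pub fn split_raw_signature(gpgsig: &str) -> Option<([u8; 32], [u8; 32])> {
    let bytes = decode_base64(gpgsig)?;
    if bytes.len() != 64 {
        return None;
    }
    let mut r = [0u8; 32];
    let mut s = [0u8; 32];
    r.copy_from_slice(&bytes[..32]);
    s.copy_from_slice(&bytes[32..]);
    Some((r, s))
}
```

Rejecting anything that does not decode to exactly 64 bytes is a cheap early check before the value is handed to a signature parser.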
Important behaviour:
- Verification currently requires a valid HEAD (it does not attempt to recover commits from an unborn branch).
- The verifier accepts either:
  - a PEM-encoded public key, or
  - a PEM-encoded X.509 certificate (.crt), in which case the EC public key is extracted and used.
CLI usage
The CLI exposes verification as:
vpr verify-clinical-commit-signature <clinical_uuid> <public_key_or_cert>
Examples:
vpr verify-clinical-commit-signature 572ae9ebde8c480ba20b359f82f6c2e7 dr_smith.crt
vpr verify-clinical-commit-signature 572ae9ebde8c480ba20b359f82f6c2e7 ./dr_smith_public_key.pem
What this does (and does not) prove
This verification proves:
- the commit’s signature is mathematically valid for the provided public key, over the exact commit payload VPR signs.
It does not (by itself) prove:
- that a certificate is trusted (no chain/CA validation),
- that the author identity is “real” (it’s still a local signature check).
Comments
VPR - Versioned Patient Repository
Install pre-commit hooks
pre-commit install
Install Rust locally if you want to test on your local machine
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Start a new terminal so that the Rust toolchain is available on your PATH
Install protobuf compiler
brew install protobuf
Build
cargo build
Nuke Docs
The docs are served from a cache, so you will likely need to clear it after removing files. To do so, manually run the nuke docs cache (manual) workflow from GitHub Actions.
Future: Database Projections
Note: The following database benchmarks and setup instructions are for planned future implementation of database projections (Postgres) and caching (Redis) for performance optimisation. The current system uses file-based storage only.
Time trial benchmarks
Preliminary benchmarks comparing Postgres vs Git for single entry operations:
- Postgres: 22.45 ops/sec
- Git: 8.11 ops/sec
Postgres is approximately 2.8 times faster for these operations (22.45 / 8.11 ≈ 2.77).
Postgres setup (for future implementation)
brew install hyperfine
brew install postgresql@16
brew services start postgresql@16
PGURL="postgres://user:pass@localhost:5432/postgres" N=10000 ./file_db_time_trial.sh
createuser -s postgres || true
psql -U postgres -c "ALTER USER postgres WITH PASSWORD 'postgres';" || true
Test VPR server
With server reflection enabled (set VPR_ENABLE_REFLECTION=true), you can use:
grpcurl -plaintext -d '{}' localhost:50051 vpr.v1.VPR/Health
To describe the service via reflection:
grpcurl -plaintext localhost:50051 describe vpr.v1.VPR
You can inspect the details of a specific message type like this:
grpcurl -plaintext localhost:50051 describe .vpr.v1.CreatePatientReq
Or with the proto file (without reflection):
grpcurl -plaintext \
-import-path crates/api/proto \
-proto crates/api/proto/vpr/v1/vpr.proto \
-d '{}' \
localhost:50051 vpr.v1.VPR/Health
Note: Server reflection is disabled by default for security in production. Set VPR_ENABLE_REFLECTION=true to enable it.
CLI
The VPR command-line interface (CLI) provides comprehensive tools for managing patient records, including demographics, clinical data, and care coordination.
Usage
Inside the ‘vpr-dev’ Docker container or after building the vpr-cli crate:
vpr --help
Available Commands
Patient Management
- list - Lists all patients in the system
- initialise-full-record - Creates a complete patient record (demographics, clinical, and coordination repositories)
Demographics
- initialise-demographics - Initialises a new demographics repository
- update-demographics - Updates demographic information (given names, last name, birth date)
Clinical Records
- initialise-clinical - Initialises a new clinical repository
- write-ehr-status - Links clinical repository to demographics by writing EHR status file
- new-letter - Creates a new clinical letter with markdown content
- new-letter-with-attachments - Creates a new letter with file attachments
- read-letter - Reads and displays a clinical letter
- get-letter-attachments - Retrieves attachments for a letter
Care Coordination
- initialise-coordination - Initialises a new coordination repository linked to clinical record
- create-thread - Creates a new messaging thread
- add-message - Adds a message to an existing thread
- read-communication - Reads a communication thread with all messages
- update-communication-ledger - Updates ledger (participants, status, visibility)
- update-coordination-status - Updates lifecycle status and flags
Security
- create-certificate - Creates a professional registration certificate with X.509 encoding
- verify-clinical-commit-signature - Verifies cryptographic signature on latest clinical commit
Development
- delete-all-data - DEV ONLY: Deletes all patient data (requires DEV_ENV=true)
Common Options
Author Registration
Many commands support professional registrations using the --registration flag, which can be repeated:
--registration "GMC" "1234567" --registration "NMC" "98765"
Digital Signatures
Commands that modify records support optional digital signatures using the --signature flag:
--signature <ecdsa_private_key_pem>
The signature can be provided as PEM text, base64-encoded PEM, or a file path.
Example Workflows
Creating a Complete Patient Record
# 1. Create full record
vpr initialise-full-record "Emily" "Davis" "1985-03-20" \
"Dr. Robert Brown" "robert.brown@example.com" "Clinician" "City Hospital"
# Outputs: Demographics UUID, Clinical UUID, Coordination UUID
Adding a Letter
vpr new-letter <clinical_uuid> "Dr. Sarah Johnson" "sarah.johnson@example.com" \
--role "Clinician" \
--care-location "GP Clinic" \
--content "# Clinical Note\n\nPatient assessment..."
Adding a Letter with Attachments
vpr new-letter-with-attachments <clinical_uuid> \
"Dr. Michael Chen" "michael.chen@example.com" \
--role "Clinician" \
--care-location "Hospital Laboratory" \
--attachment-file "/path/to/lab_results.pdf"
Creating a Communication Thread
vpr create-thread <coordination_uuid> "Dr. Brown" "brown@example.com" \
--role "Clinician" \
--care-location "City Hospital" \
--participant "<clinical_uuid>" "clinician" "Dr. Brown" \
--participant "<demographics_uuid>" "patient" "Emily Davis" \
--initial-message "Initial consultation scheduled."
Adding Messages to a Thread
vpr add-message <coordination_uuid> <thread_id> \
"Nurse Wilson" "wilson@example.com" \
--role "Clinician" \
--care-location "City Hospital" \
--message-type "clinician" \
--message-body "Patient vitals recorded." \
--message-author-id "<clinician_uuid>" \
--message-author-name "Nurse Wilson"
Getting Help
For detailed help on any command:
vpr <command> --help
Large Language Model (LLM) Documentation Index
VPR — AI contributor notes
These notes are for automated coding agents and should be short, concrete, and codebase-specific.
Specifications live in spec.md; roadmap is tracked in roadmap.md. Keep this document consistent with those sources.
Overview
- Purpose: VPR is a file-based patient record system with Git-like versioning, built as a Rust Cargo workspace. It provides dual gRPC and REST APIs for health checks and patient creation. The system stores patient data as JSON/YAML files in a sharded directory structure under patient_data/, with each patient having their own Git repositories (clinical, demographics, and coordination) for version control.
- Key crates:
  - crates/core (vpr-core) - PURE DATA OPERATIONS ONLY: File/folder management, patient data CRUD, Git versioning with X.509 commit signing. NO API concerns (authentication, HTTP/gRPC servers, service interfaces).
  - crates/api-shared - Shared utilities and definitions for both APIs: Protobuf types, HealthService, authentication utilities.
  - crates/api-grpc - gRPC-specific implementation: VprService, authentication interceptors, tonic integration.
  - crates/api-rest - REST-specific implementation: HTTP endpoints, OpenAPI/Swagger, axum integration.
  - crates/certificates (vpr-certificates) - Digital certificate generation utilities: X.509 certificate creation for user authentication and commit signing.
  - crates/cli (vpr-cli) - Command-line interface: CLI tools for patient record management and certificate generation.
- Main binary: vpr-run (defined in root Cargo.toml), runs both gRPC (port 50051) and REST (port 3000) servers concurrently using tokio::join.
Important files to reference
- src/main.rs - Main binary that performs startup validation (checks for patient_data, clinical-template directories; creates clinical/demographics/coordination subdirs), creates runtime constants, and starts both gRPC (port 50051) and REST (port 3000) servers concurrently using tokio::join.
- crates/core/src/lib.rs - PURE DATA OPERATIONS: Services for file/folder operations (sharded storage, directory traversal, Git repos per patient). NO API CODE.
- crates/core/src/config.rs - CoreConfig and helpers used to resolve/validate configuration once at startup.
- crates/core/src/clinical.rs - ClinicalService: Initialises patients with clinical template copy, creates Git repo, signs commits with X.509.
- crates/core/src/demographics.rs - DemographicsService: Updates patient demographics JSON, lists patients via directory traversal.
- crates/api-grpc/src/service.rs - gRPC service implementation (VprService) with authentication, using core services.
- crates/api-shared/vpr.proto - Canonical protobuf definitions for VPR service (note: national_id field present but unused in current impl).
- crates/api-shared/src/health.rs - Shared HealthService used by both gRPC and REST APIs.
- Justfile - Developer commands: just start-dev (Docker dev), just docs (mdBook site), just pre-commit.
- compose.dev.yml - Development Docker setup with cargo-watch live reload and healthcheck (grpcurl -plaintext localhost:50051 list && curl -f http://localhost:3000/health).
- scripts/check-all.sh - Quality checks: cargo fmt --check, cargo clippy -D warnings, cargo check, cargo test.
- docs/src/overview.md - Detailed project overview and architecture.
Build and test workflows (concrete)
- Local quick compile: cargo build -p api-grpc (or cargo run for full binary).
- Full workspace checks: ./scripts/check-all.sh (runs fmt, clippy, check, test).
- Docker dev runtime: just start-dev or docker compose -f compose.dev.yml up --build.
- Healthcheck: grpcurl -plaintext localhost:50051 list (gRPC) and curl http://localhost:3000/health (REST).
- Documentation: just docs serves mdBook site with integrated rustdoc.
Conventions and patterns to follow
- Protobufs: Canonical proto in crates/api-shared/vpr.proto; generated Rust in api_shared::pb via build script.
- Service wiring: crates/api-grpc implements VprService using core services; binaries construct it via VprService::new(Arc<CoreConfig>).
- Patient storage: Sharded under the configured patient data directory (default: patient_data), where s1/s2 are first 4 hex chars of UUID:
  - Clinical: clinical/<s1>/<s2>/<32hex-uuid>/ (ehr_status.yaml, copied clinical-template files, Git repo)
  - Demographics: demographics/<s1>/<s2>/<32hex-uuid>/patient.json (FHIR-like JSON, Git repo)
  - Coordination: coordination/<s1>/<s2>/<32hex-uuid>/ (Care Coordination Repository: encounters, appointments, episodes, referrals; Git repo)
- APIs: Dual gRPC/REST with identical functionality; REST uses axum, utoipa for OpenAPI.
- Logging: tracing with RUST_LOG env var (e.g., vpr=debug).
- Error handling: tonic Status for gRPC, axum StatusCode for REST; internal errors logged with tracing::error!.
- File I/O: Direct std::fs operations with serde_json/serde_yaml for patient data; no database layer.
- Git versioning: Each patient directory is a Git repo; commits signed with X.509 certificates from author.signature.
- Clinical template: templates/clinical/ directory copied to new patient clinical dirs; validated at startup.
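The sharded layout described above can be sketched as a small path helper. This is a minimal illustration, not the project's implementation: the function name sharded_dir is hypothetical, and it assumes that s1 is the first two hex characters of the UUID and s2 the next two, which the notes do not state explicitly.

```rust
use std::path::{Path, PathBuf};

/// Builds `<base>/<kind>/<s1>/<s2>/<uuid>` for a canonical 32-hex UUID.
///
/// Assumption for illustration: `s1` is hex chars 0..2 and `s2` is hex
/// chars 2..4. Returns `None` if `uuid_hex` is not 32 lowercase hex chars.
pub fn sharded_dir(base: &Path, kind: &str, uuid_hex: &str) -> Option<PathBuf> {
    let is_canonical = uuid_hex.len() == 32
        && uuid_hex
            .bytes()
            .all(|b| b.is_ascii_digit() || (b'a'..=b'f').contains(&b));
    if !is_canonical {
        return None;
    }
    // Safe to slice by byte index: every byte is ASCII at this point.
    let (s1, s2) = (&uuid_hex[0..2], &uuid_hex[2..4]);
    Some(base.join(kind).join(s1).join(s2).join(uuid_hex))
}
```

Under these assumptions, sharded_dir(Path::new("patient_data"), "clinical", uuid) yields a directory two shard levels below clinical/, and any non-canonical identifier is rejected before a path is built.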
Runtime configuration and environment variables
- Resolve environment variables once at process startup (or CLI startup) and pass configuration down.
- Create a vpr_core::CoreConfig (see crates/core/src/config.rs) in the binary entrypoints:
  - src/main.rs (vpr-run)
  - crates/api-grpc/src/main.rs (standalone gRPC)
  - crates/api-rest/src/main.rs (standalone REST)
  - crates/cli/src/main.rs (CLI)
- Typical env inputs: PATIENT_DATA_DIR, VPR_CLINICAL_TEMPLATE_DIR, RM_SYSTEM_VERSION, VPR_NAMESPACE.
- Use the helpers in crates/core/src/config.rs to resolve/validate template and parse the RM version.
- crates/core (vpr-core) must not read environment variables during operations.
  - Do not call std::env::var in core service methods or helpers.
  - Prefer constructors like ClinicalService::new(Arc<CoreConfig>) for uninitialised state, or ClinicalService::with_id(Arc<CoreConfig>, Uuid) for initialised state. Same for DemographicsService::new(Arc<CoreConfig>).
  - This avoids rare-but-real process-wide env races and keeps behaviour consistent within a request.
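The "resolve once, pass down" rule above might look roughly like the following. This is a sketch under stated assumptions: the CoreConfig and ClinicalService shown here are trimmed-down stand-ins, the resolve function and clinical_root method are hypothetical, and treating VPR_NAMESPACE as required is an assumption made only for illustration.

```rust
use std::path::PathBuf;
use std::sync::Arc;

/// Hypothetical, trimmed-down stand-in for `vpr_core::CoreConfig`.
#[derive(Debug)]
pub struct CoreConfig {
    pub patient_data_dir: PathBuf,
    pub namespace: String,
}

impl CoreConfig {
    /// Builds the config from a lookup function, so the environment is read
    /// in exactly one place (the binary entrypoint) and tests can inject values.
    pub fn resolve(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        let require = |key: &str| -> Result<String, String> {
            lookup(key)
                .filter(|v| !v.trim().is_empty())
                .ok_or_else(|| format!("missing or empty {}", key))
        };
        Ok(Self {
            // The documented default for the data directory is `patient_data`.
            patient_data_dir: lookup("PATIENT_DATA_DIR")
                .map(PathBuf::from)
                .unwrap_or_else(|| PathBuf::from("patient_data")),
            // Assumption: the namespace has no default and must be supplied.
            namespace: require("VPR_NAMESPACE")?,
        })
    }
}

/// Stand-in for a core service: it only ever sees the resolved config.
pub struct ClinicalService {
    cfg: Arc<CoreConfig>,
}

impl ClinicalService {
    pub fn new(cfg: Arc<CoreConfig>) -> Self {
        Self { cfg }
    }

    /// Derives a path from the config without touching the environment.
    pub fn clinical_root(&self) -> PathBuf {
        self.cfg.patient_data_dir.join("clinical")
    }
}
```

A binary entrypoint would call CoreConfig::resolve(|k| std::env::var(k).ok()) once at startup and share the result through Arc; nothing inside the service reads std::env afterwards.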
Defensive programming (clinical safety)
- Treat defensive programming as a non-negotiable requirement.
- Validate inputs and configuration early and fail fast (arguments, resolved startup configuration, parsed identifiers) before doing filesystem/Git side effects.
- Prefer bounded work over unbounded behaviour (retry limits, traversal depth, file counts/sizes, timeouts where applicable).
- Avoid silent fallbacks and “best effort” behaviour in core logic: return a typed error when something is invalid.
- Avoid panic!/expect() on paths influenced by inputs or environment; reserve them for internal invariants only.
- When partial work has occurred, attempt cleanup/rollback and do not ignore cleanup failures.
- Strong static typing: Leverage Rust’s type system to encode invariants and prevent errors at compile time. Use wrapper types to represent validated data (e.g., ShardableUuid for canonical UUIDs, Author for validated commit authors). Avoid stringly-typed data, primitive obsession, and runtime checks where types can express constraints. Prefer newtype patterns and distinct types over raw strings, integers, or booleans when domain concepts have specific rules.
- Formatting: All Rust code MUST follow cargo fmt standards. Before completing any changes, run cargo fmt on the workspace. Do not commit code that fails cargo fmt --check. The project uses rustfmt.toml for consistent formatting enforced by pre-commit hooks.
- Spelling: Use British English (en-GB) for documentation and other prose (mdBook pages, README, Rustdoc/comments).
- Documentation style:
  - Use Rustdoc (doc comments) with standard section headings.
  - For functions/methods (including private helpers), include clear # Arguments, # Returns, and # Errors sections when applicable.
    - Include # Arguments for all methods with parameters (public or private), documenting what each parameter represents.
    - Include # Returns for all methods that return non-unit values (public or private), describing what is returned.
    - If there are no arguments/meaningful return value/no error conditions to document, omit the empty section.
  - For # Errors, prefer a short, grouped bullet list describing the conditions under which an error is returned (not an exhaustive list of enum variants).
    - Use the form: Returns <ErrorType> if: then - ... bullets.
    - Group by category when helpful (validation/config, filesystem I/O, serialisation, Git, crypto).
  - For each module, start the file with //! module-level Rustdoc that outlines what the module does and what it is intended to do.
  - Documentation examples: In Rust, documentation examples are executable doctests and should be used deliberately, not everywhere by default. Examples are encouraged when they clarify lifecycle rules, state transitions, ordering constraints, or non-obvious correct usage, as they act as part of the correctness and safety contract of the code. Avoid adding examples to trivial helpers or internal plumbing where the signature is self-explanatory. Prefer a small number of minimal, focused examples that encode important invariants rather than repetitive or decorative usage snippets.
- Imports and naming:
  - Prefer adding clear use imports (for example, use crate::uuid::ShardableUuid;) rather than repeating long paths like crate::... throughout the file.
  - Prefer calling imported items directly (e.g. copy_dir_recursive(...)) instead of qualifying call sites with crate::copy_dir_recursive(...).
    - Exception: keep fully-qualified paths only when needed to disambiguate names.
  - For constants, prefer importing the specific items by name (for example use crate::constants::{EHR_STATUS_FILENAME, LATEST_RM};) so call sites don’t need constants::... prefixes.
  - Avoid glob imports (use crate::foo::*;) unless there is a strong reason.
  - Keep imports scoped to what the file uses; remove unused imports to satisfy clippy -D warnings.
  - If two imports would conflict, use explicit renaming (use crate::thing::Type as ThingType;) rather than falling back to fully-qualified paths everywhere.
- Architecture boundaries:
  - core: ONLY file/folder/git operations (ClinicalService, DemographicsService, data persistence)
  - api-shared: Shared API utilities (HealthService, auth, protobuf types)
  - api-grpc: gRPC-specific concerns (service implementation, interceptors)
  - api-rest: REST-specific concerns (HTTP endpoints, JSON handling)
  - main.rs: Startup validation (patient_data, clinical-template dirs), runtime constants, service orchestration
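The typing and documentation rules above can be illustrated together with a validated newtype. This is a hedged sketch: CanonicalUuid and UuidError are hypothetical stand-ins for the project's ShardableUuid, whose real API and validation rules may differ; the point is the shape (validate once in a constructor, document with # Arguments, # Returns, and a grouped # Errors list).

```rust
/// Reasons a string is rejected as a canonical UUID (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum UuidError {
    WrongLength(usize),
    NonHexCharacter(char),
}

/// A UUID known to be 32 lowercase hex characters.
///
/// Hypothetical stand-in for `ShardableUuid`: once a value of this type
/// exists, callers never need to re-check the format.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CanonicalUuid(String);

impl CanonicalUuid {
    /// Parses and validates a canonical UUID string.
    ///
    /// # Arguments
    ///
    /// * `input` - Candidate identifier, expected as 32 lowercase hex characters.
    ///
    /// # Returns
    ///
    /// A `CanonicalUuid` wrapping the validated string.
    ///
    /// # Errors
    ///
    /// Returns `UuidError` if:
    /// - `input` is not exactly 32 characters long,
    /// - `input` contains a character outside `0-9` and `a-f`.
    pub fn parse(input: &str) -> Result<Self, UuidError> {
        let count = input.chars().count();
        if count != 32 {
            return Err(UuidError::WrongLength(count));
        }
        if let Some(bad) = input.chars().find(|c| !matches!(c, '0'..='9' | 'a'..='f')) {
            return Err(UuidError::NonHexCharacter(bad));
        }
        Ok(Self(input.to_owned()))
    }

    /// Returns the validated identifier as a string slice.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Functions that accept CanonicalUuid instead of &str cannot be handed an unvalidated identifier, which moves the check from every call site to a single constructor.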
Testing boundaries
- Test where the rule lives:
  - If a function implements validation rules (for example Author::validate_commit_author), write exhaustive unit tests for each failure mode and a success case.
  - If a function merely calls validation (for example ClinicalService::initialise calling author.validate_commit_author()?), write only wiring tests:
    - validation errors are returned unchanged,
    - no side effects occur when validation fails.
- Prefer true unit tests (no filesystem/Git/network) where possible; use TempDir-backed tests only for integration-level behaviour (directory layout, Git repo creation, template copying).
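The split between rule tests and wiring tests can be sketched as follows. Everything here is illustrative: the Author fields, the AuthorError variants, and the two validation checks are assumptions, and the free function initialise with its side_effects log is a hypothetical stand-in for ClinicalService::initialise and its filesystem/Git work.

```rust
/// Hypothetical commit author; the real `Author` type may carry more fields.
pub struct Author {
    pub name: String,
    pub email: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AuthorError {
    EmptyName,
    InvalidEmail,
}

impl Author {
    /// The rule lives here, so this function gets the exhaustive unit tests.
    /// The checks are illustrative assumptions, not the project's real rules.
    pub fn validate_commit_author(&self) -> Result<(), AuthorError> {
        if self.name.trim().is_empty() {
            return Err(AuthorError::EmptyName);
        }
        if !self.email.contains('@') {
            return Err(AuthorError::InvalidEmail);
        }
        Ok(())
    }
}

/// Caller that merely wires validation in. `side_effects` stands in for
/// filesystem/Git work so a test can observe whether any of it happened.
pub fn initialise(author: &Author, side_effects: &mut Vec<String>) -> Result<(), AuthorError> {
    // Validate before any side effect; `?` returns the error unchanged.
    author.validate_commit_author()?;
    side_effects.push(format!("init repo for {}", author.name));
    Ok(())
}
```

A wiring test for initialise then needs only two assertions: the validation error comes back unchanged, and side_effects is still empty. Each individual failure mode is covered once, in the tests for validate_commit_author.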
Change policy and safety
- Prefer minimal, well-scoped PRs updating single crates or modules.
- Run ./scripts/check-all.sh before proposing changes; fix clippy warnings.
- When changing protos: Update crates/api-shared/vpr.proto, regenerate with cargo build.
- Patient data paths: Hardcoded sharding logic in core; avoid changing without testing directory traversal in list_patients.
- Environment config: Env vars are read in binaries/CLI at startup to build CoreConfig; avoid adding env reads to crates/core.
- Proto fields: Some fields (e.g., national_id) present but unused in current implementation.
Examples (copyable snippets)
- Start dev servers: just start-dev b (builds and runs Docker containers).
- Health check: grpcurl -plaintext localhost:50051 vpr.v1.VPR/Health or curl http://localhost:3000/health.
- Create patient: grpcurl -plaintext -d '{"first_name":"John","last_name":"Doe"}' localhost:50051 vpr.v1.VPR/CreatePatient.
- List patients: grpcurl -plaintext localhost:50051 vpr.v1.VPR/ListPatients or curl http://localhost:3000/patients.
Edge cases for automated edits
- Do not change workspace members in root Cargo.toml without verifying all crates build.
- Avoid altering patient directory sharding in core/src/lib.rs; list_patients relies on exact structure.
- Main.rs runs both servers; changes must maintain concurrency (tokio::join).
- Docker mounts ./patient_data for persistence; test with actual file creation/deletion.
- Clinical template validation: clinical-template/ must exist and contain files; clinical init copies it recursively.
If unsure, ask for clarification and provide a short plan: files to change, tests to add, and commands you will run to validate.
LLM Specification (Draft)
Purpose and Scope
- Define how LLM tooling supports the VPR project while respecting safety, auditability, and architecture boundaries.
- Focus on assistant-driven code/docs changes and developer workflows; avoid introducing runtime LLM features.
- Keep this spec aligned with docs/src/llm/copilot-instructions.md (canonical guidance for AI contributors).
System Context
- VPR is a Rust Cargo workspace delivering dual gRPC/REST services plus a CLI over a file-based, Git-versioned patient record store.
- Core data operations live in crates/core; transports live in crates/api-grpc and crates/api-rest; shared proto/auth/health in crates/api-shared; certificate utilities in crates/certificates; CLI in crates/cli.
- Patient data is stored on disk, sharded by UUID under patient_data/, with separate clinical, demographics, and coordination repos per patient and Git history for audit.
- Future separation: we may split the core VPR code into its own library crate; it must remain independent of organisational layers (security, APIs, enterprise back-office). Core must not depend on organisational code, but organisational layers may depend on core.
Patient-Centred Posture
- Put the patient at the centre of every decision: safety, clarity, and agency outweigh convenience.
- Treat the combination of patient-first intent and human-readable files as the keystone: files remain the canonical, inspectable record that patients and clinicians can understand and carry.
- Support two deployment shapes: (a) patient/self-hosted mode on a personal machine using CLI and a simple UX (to be built later in the epics), and (b) enterprise/organisation mode serving hundreds or thousands of patients.
- Design interfaces and logging with the assumption that patients may access their own records; avoid leaking PHI in tooling output while keeping auditability for clinical and organisational users.
- Keep on-disk formats human-readable (YAML and Markdown with front matter) so patients and clinicians can inspect history; use JSON only for internet-facing APIs (REST/gRPC) where required.
LLM Responsibilities (assistant mode)
- Follow canonical contributor instructions: defensive programming, British English docs, architecture boundaries, startup config resolution in binaries only.
- Generate scoped changes with clear rationale, minimal blast radius, and accompanying tests when behaviour changes.
- Keep docs consistent across mdBook sources (docs/src/**) and README; prefer linking to canonical sources instead of duplicating.
Data and Storage Invariants
- Sharded directories: patient_data/clinical/<s1>/<s2>/<uuid>/, patient_data/demographics/<s1>/<s2>/<uuid>/, and patient_data/coordination/<s1>/<s2>/<uuid>/ (s1/s2 are first 4 hex chars).
- Clinical repo seeded from validated clinical template directory (no symlinks; depth/size limits enforced); demographics repo holds FHIR-like patient.json; coordination repo (Care Coordination Repository) holds encounters, episodes, appointments, and referrals.
- Git repos per patient with signed commits (ECDSA P-256) where configured; single branch main.
- Clinical ehr_status links to demographics via external reference; coordination entries reference both clinical and demographics records.
API Surfaces (high level)
- gRPC service (tonic): health, patient creation, patient listing; API key interceptor expected on gRPC.
- REST service (axum/utoipa): mirrors gRPC behaviour; Swagger/OpenAPI exposed; currently open by default unless otherwise configured.
- Health endpoints on both transports; reflection optional for gRPC.
Configuration and Startup
- Env resolved once at startup in binaries/CLI, then passed via CoreConfig: PATIENT_DATA_DIR, VPR_CLINICAL_TEMPLATE_DIR, RM_SYSTEM_VERSION, VPR_NAMESPACE, API key, bind addresses, reflection flag, dev guard for destructive CLI.
- Startup flow (vpr-run): validate patient_data and template dirs, ensure shard subdirs exist (clinical, demographics, coordination), build config, launch REST and gRPC concurrently with tokio::join.
Safety and Quality Bar
- Fail fast on invalid config/inputs; avoid panics on input-driven paths; no silent fallbacks.
- Respect architecture boundaries: crates/core must not read env; transports handle auth and request wiring.
- Strong static typing: Prefer type safety over runtime checks. Use distinct types to encode invariants (e.g., ShardableUuid for canonical UUIDs, Author for validated commit authors). Avoid stringly-typed data and primitive obsession; let the type system catch errors at compile time.
- Add tests where rules live; wiring tests ensure errors propagate and side effects do not occur on failure.
- Use British English in prose and Rustdoc; prefer module-level //! docs and function docs with # Arguments, # Returns, # Errors when applicable.
Security Expectations (for LLM-driven changes)
- Default to least privilege and minimise new attack surface; do not introduce new network listeners, env reads in crates/core, or unsafe defaults.
- Keep authentication posture aligned with project decisions: API key for gRPC (and REST when enabled); defer mTLS or other mechanisms to explicit user approval.
- Handle secrets safely in code and docs: avoid logging API keys, certificates, or patient identifiers; redact in logs and examples.
- Preserve commit-signing and integrity paths: do not weaken signature requirements or verification flows without explicit agreement.
- Avoid introducing PHI (Protected Health Information) into logs, test fixtures, or examples; prefer synthetic/non-identifying data.
- When adding dependencies, prefer well-maintained crates with permissive licenses; avoid unsafe code unless strictly necessary and justified.
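One common way to support the "avoid logging API keys, certificates, or patient identifiers" expectation is a wrapper type whose formatted output is always redacted. The sketch below is an illustration only: the Redacted type and its expose method are hypothetical and not part of the VPR codebase.

```rust
use std::fmt;

/// Wrapper whose `Debug` and `Display` output never reveals the inner value.
/// Hypothetical helper for illustration; not an existing VPR type.
pub struct Redacted<T>(T);

impl<T> Redacted<T> {
    pub fn new(value: T) -> Self {
        Self(value)
    }

    /// Explicit, greppable access for the few code paths that need the value.
    pub fn expose(&self) -> &T {
        &self.0
    }
}

impl<T> fmt::Debug for Redacted<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}

impl<T> fmt::Display for Redacted<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}
```

Because both formatting traits are overridden, a struct that derives Debug and holds a Redacted<String> field can be logged without leaking the secret, while every deliberate access has to go through expose.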
Build, Test, and Tooling
- Primary check pipeline: ./scripts/check-all.sh (fmt, clippy -D warnings, check, test).
- Docker dev: docker compose -f compose.dev.yml up --build or just start-dev; health via grpcurl and REST /health.
- Proto changes: edit crates/api-shared/vpr.proto, rebuild to regenerate bindings.
Open Questions / Next Decisions
- Confirm scope: Is LLM limited to contributor assistance, or will user-facing LLM features (summaries/search) be added? If runtime features are desired, specify data access boundaries, PHI handling, and auditing requirements.
- Define authentication posture for REST (API key, mTLS, or other) to align with gRPC.
- Clarify expected commit-signing defaults (enforce vs optional) and how LLM-generated changes should treat signing in CI/local dev.
VPR Development Roadmap
Overview
Purpose:
This roadmap outlines the planned development work for the Versioned Patient Repository (VPR) – a Git-backed clinical record system designed to preserve verifiable clinical truth, authorship, and history over decades. VPR treats patient records as durable, inspectable artefacts with explicit provenance, rather than mutable database rows.
Guiding principles:
- Patients first and human readable
- Clinical truth is append-only and auditable
Phase Grouping
- Phase 1 – Foundations of Truth: Epics 1–3
- Phase 2 – Semantics and Meaning: Epics 4–6
- Phase 3 – Operational Reality: Epics 7–9
- Phase 4 – Access, Projections, and Record upload: Epics 10–15
Epic 1. Core Storage, Integrity, and Templates
Business Value:
Establishes the foundational storage and integrity model for VPR. Every patient record change is durable, inspectable, and tamper-evident.
- File-based patient record store with sharded layout and per-patient Git repositories (clinical + demographics separation)
- Clinical template seeding and validation at startup
- Commit signing optional in development environments
- Integrate cargo-audit into CI/CD
- Integrate cargo-deny into CI/CD
- Tighten traversal and allocation limits for patient discovery
- Implement retry and back-off strategy for filesystem and Git operations
- Validate all user-supplied identifiers and namespaces before side effects
- Add monitoring for template validation failures
- Conservative git gc strategy for per-patient repos
- Enforce “no symlinks ever” policy across templates, imports, and repos
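The retry and back-off item in this epic is still planned work, so the following is only one possible shape for it, not the project's design. The names backoff_delay and retry_bounded are hypothetical, and the exponential schedule with a cap is an assumed policy. The sleep function is injected so the helper stays bounded and testable without a real clock.

```rust
use std::time::Duration;

/// Exponential back-off delay for a zero-based attempt index, capped at `max`.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    // `checked_shl` guards against shifting past the width of `u32`.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    // On multiplication overflow, fall back to the cap.
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Runs `op` at least once and at most `max_attempts` times, sleeping
/// between failures. The last error is returned when attempts run out.
pub fn retry_bounded<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut sleep: impl FnMut(Duration),
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the error instead of hiding it.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(base, max, attempt));
                attempt += 1;
            }
        }
    }
}
```

In production code the sleep argument would typically be std::thread::sleep; in tests it can be a closure that records the requested delays.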
Epic 2. openEHR Alignment and Reference Model Semantics
Business Value:
Ensures long-term interoperability while preventing openEHR wire models from contaminating internal domain logic.
- Define supported openEHR RM versions and validation strategy
- Specify namespace formation and validation rules
- Publish RM/namespace compatibility matrix per deployment
- Validate ehr_status linkage to demographics (external_ref)
- Map clinical templates to openEHR archetype expectations
- Add RM/archetype validation or linting where practical
- Define supported artefact types:
  - ehr_status.yaml
  - Clinical letters (Markdown with YAML front matter)
  - Documents (PDF with sidecar metadata)
  - Structured messaging threads (YAML/JSON)
- Implement large-file-storage for binary artefacts (PDFs, images, scans) outside Git to preserve repository performance
- Support patient-contributed artefacts and annotations
- Explicitly document boundary between wire models and internal domain models
Epic 3. Demographics via FHIR
Business Value:
Separates patient identity from clinical truth while enabling interoperability.
- Separate demographics repository (FHIR-like patient.json)
- Implement demographics service parity with clinical service
- Validate demographics against selected FHIR profile
- Pagination and limits for demographics listing and queries
- Document demographics data contract and evolution strategy
Epic 4. Clinical Record Lifecycle and Semantic States
Business Value:
Removes ambiguity about what a clinical record means over time.
- Define lifecycle states (created, amended, corrected, superseded, closed)
- Define metadata conventions for lifecycle state
- Distinguish “wrong at the time” vs “correct then, obsolete now”
- Define closure and reopening semantics
- Document how consumers should interpret lifecycle state
- Explicitly document what VPR does not infer automatically
Epic 5. Temporal Semantics and Clinical Time
Business Value:
Ensures timestamps are clinically and legally interpretable.
- Define event time vs documentation time vs commit time
- Support retrospective documentation
- Define correction and amendment timing semantics
- Handle clock skew and external system timestamps
- Document required and optional temporal fields per artefact type
- Ensure Git commit time is never misrepresented as clinical event time
Epic 6. Logging, Auditability, and Provenance
Business Value:
Supports investigation, compliance, and forensic reconstruction.
- Define structured logging schema
- Enforce PHI redaction rules in logs
- Standardise error taxonomy
- Correlate operations with request IDs and commit hashes
- Log validation, security, and auth failures
- Log operational signals (retries, maintenance tasks)
- Document log retention, sinks, and access controls
Epic 7. Failure Modes and Recovery Semantics
Business Value:
Ensures predictable behaviour on bad days.
- Enumerate supported failure modes (partial writes, corruption, tampering)
- Classify failures (fatal, recoverable, operator intervention)
- Define system behaviour per failure class
- Define which failures must always be surfaced to operators
- Document guarantees around non-silent failure
Epic 8. Operational Hardening and Catastrophic Recovery
Business Value:
Ensures patient data survives hardware failure, human error, and attack.
- Define write-through backup strategy for patient repos
- Physically and administratively separate backup storage
- Offline cold backups at defined intervals
- Restore drills into clean environments
- Verify integrity and signatures on restore
- Define and document RPO and RTO targets
- Implement recovery marker commits with provenance
- Guarantee no silent history rewriting during restore
- Define encryption-at-rest and key management posture
- Finalise commit-signing policy for production
- Implement configurable signature verification on read paths
Epic 9. Governance, Authority, and Evolution Boundaries
Business Value:
Prevents architectural drift and unresolvable disputes.
- Define authority for RM version acceptance and deprecation
- Define schema evolution and incompatibility handling
- Document which decisions live outside the codebase
- Define escalation paths for semantic disputes
- Explicitly separate technical enforcement from organisational policy
Epic 10. Care Coordination and PAS-like Functions
Business Value:
Supports operational workflows without polluting clinical truth.
- Define coordination domain model (encounters, referrals, appointments)
- Implement Care Coordination Repository with Git-backed storage
- Link coordination artefacts to clinical and demographics records
- Define authorisation rules for coordination actions
- Define YAML schemas for coordination artefacts
- Support UX state (read/unread, task completion)
- Explicitly document non-authoritative status vs clinical record
Epic 11. API Transport, Auth, and Contracts
Business Value:
Provides secure, well-defined access to VPR.
- REST and gRPC transports with shared protobufs
- API key authentication for gRPC
- Configuration options to enable/disable gRPC and/or REST APIs independently (allow both, either, or neither)
- Disable reflection in production
- REST authentication parity with gRPC
- Optional mTLS design (future)
- Structured error models for REST and gRPC
- Pagination and validation for all listing APIs
- Secrets storage and rotation strategy
- API versioning and upgrade documentation
Epic 12. Read Models, Projections, and Performance
Business Value:
Improves performance without betraying truth.
- Define projection formats and cache semantics
- Explicitly mark projections as non-authoritative
- Ensure projections are disposable and rebuildable
- Link projections back to commit hashes
- Benchmark read and write paths under load
- Document acceptable projection lag
Epic 13. Patient Data Portability and Agency
Business Value:
Supports patient autonomy and regulatory compliance.
- Define patient download formats (full history vs snapshot)
- Implement authenticated download APIs
- Log and audit all patient downloads
- Define accepted upload formats and version compatibility
- Implement robust upload validation and sanitisation
- Reject symlinks, executables, and path traversal
- Support upload dry-run and preview
- Define merge and reconciliation strategies
- Log and audit all upload attempts
- Define trust boundaries for externally signed records
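As a rough illustration of the path traversal item above, an upload validator could accept only relative paths made entirely of plain name components. This is a sketch of one check, not the planned implementation: the function name is hypothetical, and it does not cover symlinks or executables, which need filesystem metadata (for example std::fs::symlink_metadata) rather than path inspection alone.

```rust
use std::path::{Component, Path};

/// Returns true only if every component is a plain name, so the path
/// cannot escape the directory it is later joined onto.
///
/// Rejects empty paths, absolute paths, and any use of `..`; a leading
/// `./` is also rejected to keep the accepted form unambiguous.
pub fn is_safe_relative_path(path: &Path) -> bool {
    let mut components = path.components().peekable();
    components.peek().is_some() && components.all(|c| matches!(c, Component::Normal(_)))
}
```

A caller would run this on every entry name in an upload before joining it onto the patient directory, and reject the whole upload if any entry fails.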
Epic 14. Education, Invariants, and Operational Literacy
Business Value:
Reduces institutional memory risk and misuse.
- Operator runbooks (backup, restore, failure handling)
- Developer invariants (what must never be violated)
- “What VPR does not do” documentation
- Shared mental model for contributors and operators
Epic 15. Core and Organisational Separation
Business Value:
Keeps the patient-record core reusable as a standalone library for patient/self-hosted deployments while allowing organisational layers (security, APIs, projections, back-office) to evolve independently without contaminating core invariants.
- Define the boundary for a standalone core library crate (patient data model, filesystem/Git, validation) that excludes organisational concerns.
- Document dependency direction: core must not depend on organisational code; organisational layers may depend on core.
- Identify organisational-only modules (authentication/authorisation, API transport, projections/cache, observability/ops) to reside outside the core crate.
- Evaluate packaging and repository split options (single repo with crates vs separate repositories) and their impact on versioning and CI.
- Plan migration and testing strategy for the split (CI matrices, contract tests, release cadence, documentation updates).
Code
Thanks to
The work that I have done with VPR is in many ways due to the time and effort that Dr Marcus Baw has put into his own version of a Git file-based patient record system called gitEHR. I openly admit to copying his ideas and implementations in my own approach to building VPR. Many thanks to Marcus for his pioneering work in this area.
Mark