FHIR and EHRbase staging issues

21 March 2026 — Staging infrastructure bootstrap

Summary

Staging returned "Database is initialising" because neither the FHIR nor the EHRbase container was running on the Compute Engine VM. Six cascading issues were discovered and resolved.

Issues found

1. Wrong Docker image on Cloud Run

Cloud Run was serving an admin Docker image instead of the prod target. The deploy workflow built without --target prod, so the image lacked the production entrypoint.

Fix: Manually built with --target prod, pushed to Artifact Registry, and redeployed.
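
The manual rebuild can be sketched roughly as below. This is a minimal sketch, not the actual deploy workflow: the function names and all arguments (service, location, project, repository, image, tag) are hypothetical, and it assumes the Artifact Registry location is also the Cloud Run region and that Docker is already authenticated against the registry.

```shell
# Build an Artifact Registry image reference:
# LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG
image_ref() {
  printf '%s-docker.pkg.dev/%s/%s/%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Build the prod stage explicitly, push it, and redeploy the Cloud Run service.
# Arguments: SERVICE LOCATION PROJECT REPOSITORY IMAGE TAG (all hypothetical).
build_push_deploy() {
  service=$1
  shift
  image=$(image_ref "$@")
  # --target prod selects the production stage of the multi-stage Dockerfile.
  docker build --target prod -t "$image" . &&
    docker push "$image" &&
    gcloud run deploy "$service" --image "$image" --region "$1" --project "$2"
}
```

Passing --target prod in the workflow's build step, rather than relying on whichever stage the build resolves by default, is what keeps the production entrypoint in the deployed image.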

2. COS read-only filesystem breaks startup script

The VM uses Container-Optimised OS (COS), which has a read-only root filesystem. The startup script tried to install Docker Compose to /usr/local/lib/docker/cli-plugins/ and write config to /opt/quill/. Both operations failed silently.

Key COS constraints:

  • /usr/, /opt/, /root/: read-only
  • /var/, /home/, /mnt/stateful_partition/: writable but noexec (cannot execute binaries)
  • Docker Compose binary cannot be installed anywhere on COS

Fix: Bypassed Docker Compose entirely and used direct docker run commands. The startup script still needs a permanent redesign: keep the docker run commands, run Compose via a container image, or switch to a standard VM image.
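
Because the failures were silent, a startup script could check mount flags before trying to install anything. The following is a minimal sketch with hypothetical function names. It assumes findmnt (from util-linux) is available on the VM, which has not been verified here.

```shell
# Return 0 if the comma-separated mount options in $1 include "noexec".
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the mount options of the filesystem that contains path $1.
mount_options() {
  findmnt --noheadings --output OPTIONS --target "$1"
}

# Report, for each directory given, whether binaries can be executed from it.
report_noexec() {
  for dir in "$@"; do
    if has_noexec "$(mount_options "$dir")"; then
      echo "$dir: noexec"
    else
      echo "$dir: exec allowed"
    fi
  done
}
```

For example, report_noexec /var /home /mnt/stateful_partition would be expected to flag all three directories on the VM described above.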

3. Missing PostgreSQL extension for EHRbase

EHRbase's Flyway migration V1__ehr.sql calls uuid_generate_v4(), which requires the uuid-ossp PostgreSQL extension. Cloud SQL does not create it by default.

Fix: Created the extension manually:

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

This was run via a disposable postgres:16-alpine container on the VM, connecting to Cloud SQL with the postgres admin user.
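
The disposable-container step could look roughly like this. It is a sketch rather than the exact command that was run: the function name, the host, and the database name are hypothetical parameters. PostgreSQL extensions are created per database, so the target must be the database that EHRbase migrates.

```shell
# Create uuid-ossp on Cloud SQL from a throwaway postgres:16-alpine container.
# Arguments: HOST DATABASE (hypothetical values supplied by the caller).
# PGPASSWORD must be exported and hold the postgres admin password;
# "-e PGPASSWORD" forwards it into the container without echoing it.
create_uuid_ossp() {
  docker run --rm -e PGPASSWORD postgres:16-alpine \
    psql -h "$1" -U postgres -d "$2" \
    -c 'CREATE EXTENSION IF NOT EXISTS "uuid-ossp";'
}
```

The --rm flag removes the container once psql exits, so nothing is left behind on the VM.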

4. Missing EHRbase environment variables

EHRbase requires DB_USER_ADMIN and DB_PASS_ADMIN for Flyway to create schemas and run DDL during migrations. Without them, Flyway tries to authenticate with the literal string ${DB_USER_ADMIN}.

Fix: Added the missing env vars (postgres admin credentials) to the EHRbase container config.
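
A guard in the script that starts the container would surface this earlier. The following is a minimal sketch with a hypothetical helper, require_env, that rejects variables that are unset, empty, or still hold an unresolved ${...} placeholder.

```shell
# Fail if any named variable is unset, empty, or still a literal ${...} placeholder.
require_env() {
  missing=""
  for name in "$@"; do
    # Indirect lookup: copy the value of the variable called $name into $value.
    eval "value=\${$name:-}"
    case $value in
      '' | '${'*) missing="$missing $name" ;;
    esac
  done
  if [ -n "$missing" ]; then
    echo "unset or unresolved:$missing" >&2
    return 1
  fi
}
```

Calling require_env DB_USER_ADMIN DB_PASS_ADMIN before docker run would stop the script with a clear message instead of letting Flyway fail at authentication.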

5. EHRbase username mismatch

The backend config default EHRBASE_API_USER was ehrbase_user (underscore), but the Docker Compose reference used ehrbase-user (hyphen). This caused 401 responses when Cloud Run tried to authenticate.

Fix: Standardised on ehrbase_user (underscore) everywhere — both the EHRbase container's SECURITY_AUTHUSER and the backend's config default.
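
A quick way to confirm which username the container accepts is to request any authenticated endpoint with Basic auth and look only at the status code. This is a sketch with a hypothetical helper name. The URL is left as a parameter because the exact EHRbase endpoint used on staging is not recorded here.

```shell
# Print the HTTP status code for a Basic-auth request.
# Arguments: URL USER PASSWORD.
auth_status() {
  curl -s -o /dev/null -w '%{http_code}' -u "$2:$3" "$1"
}
```

A 401 for ehrbase-user and a 200 for ehrbase_user would confirm the mismatch. A 401 for both is inconclusive while EHRbase is still starting, and 000 means no HTTP response was received at all.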

6. EHRbase startup timing

EHRbase takes 30–110 seconds to start (Flyway migrations, Spring Boot init, HikariPool). Health checks hitting EHRbase during startup received 401 because Spring Security wasn't initialised yet. This was misdiagnosed as a credentials issue during debugging.

Fix: Allowed sufficient startup time. The Cloud Run startup probe was also updated with timeout_seconds = 15 (was defaulting to 1 second) to prevent premature failure.
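
For manual checks or scripts on the VM, "allow sufficient startup time" can be made explicit with a bounded retry loop. The following is a minimal sketch with a hypothetical helper, wait_until. The attempt count and delay in the usage comment are illustrative values chosen to cover the 110-second upper bound.

```shell
# Run a command until it succeeds.
# Arguments: ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (hypothetical variables): 12 attempts, 10 s apart, about 110 s of waiting.
# wait_until 12 10 curl -fsS -o /dev/null -u "$EHRBASE_USER:$EHRBASE_PASS" "$EHRBASE_URL"
```

With curl -f, any HTTP status of 400 or above counts as a failed attempt, so a 401 during startup triggers a retry instead of being read as a credentials error.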

Infrastructure changes made

| Change | File or resource | Status |
|---|---|---|
| Startup probe timeout (1s → 15s) | infra/modules/cloud-run/main.tf | Committed (ed5b39b) |
| COS startup script path fixes | infra/modules/compute-fhir/startup.sh | Committed (ed5b39b, needs further work) |
| uuid-ossp extension on Cloud SQL | quill-ehrbase-staging instance | Applied manually |
| EHRbase container with correct env vars | fhir-ehrbase-staging VM | Running via docker run |
| FHIR container | fhir-ehrbase-staging VM | Running via docker run |

Remaining actions

See plans/todo.md for the tracked to-do list.