Teaching feature — implementation plan¶
Add a config-driven MCQ assessment engine. Each question bank is defined by a root YAML config file (e.g. `config.yaml`) that specifies the MCQ type, images, answer options, tagging, and compound pass criteria. The engine supports multiple MCQ types — `uniform` (every item has the same fixed structure) and `variable` (each item defines its own images, text, and options) — so the same system can host image-classification assessments, traditional MCQs, or mixed formats, and new formats can be added in the future. No restructuring of existing EPR code is needed: teaching gets its own `features/teaching/` directories in both backend and frontend, gated by a new `OrganisationFeature` model. The same codebase deploys to separate GCP projects via environment config. Question banks with images are version-controlled in a separate private GitHub repo (with Git LFS for binaries), then synced to a GCS bucket during CI/CD deployment and served via signed URLs at runtime.
Convention: new optional functionality goes in `features/`. Existing code (e.g. messaging) will be migrated into `features/` later when it becomes gated.
Question bank config format¶
Questions are stored in a private GitHub repository, `quill-question-bank`, under the `bailey-medics` organisation. Each question bank is stored in `questions/<question-bank-name>/`, e.g. `questions/colonoscopy-optical-diagnosis/`. This folder contains a `config.yaml` file setting out what type of questions the folder contains. Questions live in separate subfolders, each with its own `question.yaml` file and associated pictures, as below:
```
questions/
  colonoscopy-optical-diagnosis/
    config.yaml
    certificate_background.pdf
    question_1/
      question.yaml
      image_1.png
      image_2.png
    question_2/
      question.yaml
      image_1.png
      image_2.png
```
Each question bank can include a `certificate_background.pdf` — a PDF template with the institution's branding, borders, and logos. The certificate endpoint overlays dynamic text (candidate name, institution, date, score, question bank title) onto defined text areas on this background. If no background is provided, a default plain template is used.
The `config.yaml` file contains the following:

```yaml
id: colonoscopy-optical-diagnosis
version: 1
title: "Optical Diagnosis of Diminutive Colorectal Polyps"
description: >
  Assess colonoscopists' ability to optically diagnose diminutive (≤5mm)
  colorectal polyps using white light and narrow band imaging.
type: uniform # all items share the same image count + options (see MCQ types below)
images_per_item: 2
image_labels:
  - "White light (WLI)"
  - "Narrow band imaging (NBI)"

# item_text not used for this bank — polyp images speak for themselves.
# Future banks can add: item_text: { label: "Patient history", required: true }

options:
  - id: high_confidence_adenoma
    label: "High Confidence Adenoma"
    tags: [high_confidence, adenoma]
  - id: low_confidence_adenoma
    label: "Low Confidence Adenoma"
    tags: [low_confidence, adenoma]
  - id: high_confidence_serrated
    label: "High Confidence Serrated Polyp"
    tags: [high_confidence, serrated]
  - id: low_confidence_serrated
    label: "Low Confidence Serrated Polyp"
    tags: [low_confidence, serrated]

correct_answer_field: diagnosis # item metadata key holding the right answer
correct_answer_values: # valid values (used to validate educator uploads)
  - adenoma
  - serrated

# Correctness check at scoring time:
#   1. Candidate picks "high_confidence_adenoma" → option tags are [high_confidence, adenoma]
#   2. Engine strips confidence tags → remaining tag = "adenoma" (the diagnosis)
#   3. Compares "adenoma" against item's metadata["diagnosis"]
#   4. Match → correct; mismatch → incorrect

assessment:
  items_per_attempt: 120
  time_limit_minutes: 75
  min_pool_size: 200
  randomise_selection: true
  randomise_order: true
  allow_immediate_retry: true
  intro_page:
    title: "Before you begin"
    body: |
      You will be shown 120 polyp images, each displayed as a pair:
      white light (WLI) and narrow band imaging (NBI).

      For each image pair, select the single best answer from the four options.

      **Time limit**: 75 minutes. The timer starts when you click "Begin".

      **Marking criteria**:
      - ≥70% of your answers must be **high confidence**
      - ≥85% of your high-confidence answers must be **correct**

      You must meet **both** criteria to pass.
  closing_page:
    title: "Assessment complete"
    body: |
      Your answers have been submitted.
      Your scores and pass/fail result are shown below.
      Results have also been emailed to the assessment coordinator.

      Thank you for completing the optical diagnosis assessment.

pass_criteria:
  - name: "High confidence rate"
    description: "≥70% of answers must be high-confidence"
    rule: tag_percentage
    tag: high_confidence
    threshold: 0.70
  - name: "High confidence accuracy"
    description: "≥85% of high-confidence answers must be correct"
    rule: tag_accuracy
    tag: high_confidence
    threshold: 0.85

results:
  certificate_download: true # candidate can download a PDF certificate on pass
  certificate_background: certificate_background.pdf # PDF template in the bank folder
  certificate_text_areas:
    - field: candidate_name
      x: 300
      y: 400
      font_size: 24
    - field: date
      x: 300
      y: 450
      font_size: 16
    - field: institution
      x: 300
      y: 500
      font_size: 16
    - field: score_summary
      x: 300
      y: 550
      font_size: 14
  email_notification: true # send result to a coordinator email address
  email_subject: "Optical Diagnosis MCQ — Assessment Result"
  # The recipient email address is set per-organisation in the admin UI,
  # NOT in this config file — no email addresses in the codebase.
```
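The correctness check described in the comments can be sketched in Python. The names here (`CONFIDENCE_TAGS`, `is_correct`) are illustrative, not actual engine code:

```python
# Sketch of the uniform-type correctness check (illustrative names, not engine code).
CONFIDENCE_TAGS = {"high_confidence", "low_confidence"}

def is_correct(option_tags: list[str], item_metadata: dict, correct_answer_field: str) -> bool:
    """Strip confidence tags, then compare the remaining diagnosis tag
    against the item's metadata (e.g. metadata["diagnosis"])."""
    diagnosis_tags = [t for t in option_tags if t not in CONFIDENCE_TAGS]
    # A well-formed option carries exactly one non-confidence tag.
    return len(diagnosis_tags) == 1 and diagnosis_tags[0] == item_metadata[correct_answer_field]

# "high_confidence_adenoma" has tags [high_confidence, adenoma]; the item's diagnosis is adenoma
print(is_correct(["high_confidence", "adenoma"], {"diagnosis": "adenoma"}, "diagnosis"))  # True
```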
Config field reference¶
| Field | Type | Purpose |
|---|---|---|
| `id` | string | Unique identifier, matches filename |
| `version` | int | Bank version — bump when correcting items (see versioning below) |
| `title` | string | Display title in UI |
| `description` | string | Shown on assessment dashboard |
| `type` | string | MCQ type: `uniform` or `variable` (see MCQ types section) |
| `images_per_item` | int | (uniform only) Fixed number of images per question |
| `image_labels` | list[str] | (uniform only) Label for each image slot (displayed above image) |
| `item_text` | object? | Optional per-item text block shown below images |
| `item_text.label` | string | Heading displayed above the text (e.g. "Patient history") |
| `item_text.required` | bool | Whether educators must provide text when authoring items |
| `options` | list | (uniform only) Answer choices, each with `id`, `label`, `tags` |
| `options[].tags` | list[str] | (uniform only) Tags used by scoring rules |
| `correct_answer_field` | string | (uniform only) Item metadata key that holds the correct answer |
| `correct_answer_values` | list[str] | (uniform only) Valid values for that field |
| `assessment.items_per_attempt` | int | Questions per assessment |
| `assessment.time_limit_minutes` | int | Server-enforced time limit |
| `assessment.min_pool_size` | int | Minimum published items to start an assessment |
| `assessment.randomise_selection` | bool | Randomly draw from pool |
| `assessment.randomise_order` | bool | Randomise presentation order |
| `assessment.allow_immediate_retry` | bool | Can retry immediately after failure |
| `assessment.intro_page` | object? | Optional page shown before the first question |
| `assessment.intro_page.title` | string | Heading for the intro page |
| `assessment.intro_page.body` | string | Markdown body — instructions, marking criteria, time limit |
| `assessment.closing_page` | object? | Optional page shown after the last answer, before results |
| `assessment.closing_page.title` | string | Heading for the closing page |
| `assessment.closing_page.body` | string | Markdown body — submission confirmation, next steps |
| `pass_criteria` | list | Compound rules — ALL must pass |
| `pass_criteria[].rule` | string | `tag_percentage` or `tag_accuracy` (extensible) |
| `pass_criteria[].tag` | string | Which option tag the rule filters on |
| `pass_criteria[].threshold` | float | Required minimum (0.0–1.0) |
| `results.certificate_download` | bool | Candidate can download a PDF certificate on pass |
| `results.certificate_background` | string? | Filename of PDF background template (e.g. `certificate_background.pdf`) |
| `results.certificate_text_areas` | list? | Text area positions on certificate: `[{field, x, y, font_size}]` |
| `results.email_notification` | bool | Send result to a coordinator email (address set in admin UI) |
| `results.email_subject` | string? | Subject line for notification email (required if email enabled) |
Versioning¶
The version field (integer, starting at 1) ties items and assessments to a specific snapshot of the question bank. When an error is found — e.g. a polyp image has the wrong diagnosis — the workflow is:
- Bump `version` in the YAML config (e.g. 1 → 2)
- Commit corrected items under the new version via Git PR (or re-publish existing items unchanged)
- New assessments are created against version 2 and draw only from version-2 items
- Completed assessments remain tied to version 1 — their scores and pass/fail results are unchanged
Both `QuestionBankItem.bank_version` and `Assessment.bank_version` record the version, so historical results are always traceable to the exact item set that was used. The admin UI shows which version is current and lists past versions with their assessment counts.
Scoring engine rules¶
- `tag_percentage`: of all answers, what fraction have the specified tag? Must be ≥ threshold. Example: ≥70% of answers must be tagged `high_confidence`.
- `tag_accuracy`: of answers that have the specified tag, what fraction are correct? Must be ≥ threshold. Example: of `high_confidence` answers, ≥85% must match the item's correct diagnosis.
New rule types can be added to the scoring engine without changing the config schema (e.g. overall_accuracy, minimum_correct_count).
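A minimal sketch of the two built-in rules, assuming answers are represented as plain dicts with `tags` and `correct` keys (an assumed shape for illustration; the real engine reads `AssessmentAnswer` rows):

```python
def tag_percentage(answers: list[dict], tag: str, total_items: int) -> float:
    """Fraction of ALL items whose answer carries the tag.
    Unanswered items stay in the denominator, penalising incomplete attempts."""
    return sum(1 for a in answers if tag in a["tags"]) / total_items

def tag_accuracy(answers: list[dict], tag: str) -> float:
    """Of the answers carrying the tag, the fraction that are correct."""
    tagged = [a for a in answers if tag in a["tags"]]
    return sum(1 for a in tagged if a["correct"]) / len(tagged) if tagged else 0.0

# 120-item attempt: 90 high-confidence answers (80 correct), 30 low-confidence
answers = (
    [{"tags": ["high_confidence", "adenoma"], "correct": True}] * 80
    + [{"tags": ["high_confidence", "serrated"], "correct": False}] * 10
    + [{"tags": ["low_confidence", "adenoma"], "correct": True}] * 30
)
print(tag_percentage(answers, "high_confidence", total_items=120))  # 0.75  → passes ≥0.70
print(round(tag_accuracy(answers, "high_confidence"), 3))           # 0.889 → passes ≥0.85
```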
MCQ types¶
The `type` field determines the structural contract between the config and the items stored in the database. The frontend uses it to select the right rendering component.
uniform¶
Every item has the same structure: fixed image count, fixed image labels, and a shared set of options. The config defines `images_per_item`, `image_labels`, and `options` at the top level. Individual items store only their images, optional text, and metadata (correct answer).
- Use when: all questions are structurally identical (e.g. colonoscopy MCQ — always 2 images, always 4 options)
- Frontend: `QuestionView` renders a consistent layout for every item — N images side-by-side with labels, optional text, shared radio options
- Authoring: educator provides N images + metadata per item via Git PR; options are already defined in config
- Scoring: tag-based rules work because every option has the same tags across all items
variable¶
Each item defines its own images, text, and options. The config sets only assessment-level parameters (timing, pool size, pass criteria). Individual items store everything needed to render and score them.
- Use when: questions vary in format (e.g. some have 0 images, some have 3; each question has unique answer choices)
- Frontend: `QuestionView` adapts layout per item — renders whatever images, text, and options the item provides
- Authoring: educator provides all content per item via Git PR: images (0+) with labels, question text, options with tags, and the correct answer
- Scoring: tag-based rules still work — options still have tags, but they're defined per-item rather than globally
- Item-level fields (stored in the `QuestionBankItem` row, not the config):
  - `images`: list of `{key, label}` objects (0 or more)
  - `text`: question/scenario text (optional, depending on `item_text.required`)
  - `options`: list of `{id, label, tags}` — same structure as `uniform`, but per-item
  - `correct_option_id`: which option is correct (replaces the `metadata` + `correct_answer_field` lookup)
Implementation note: the `QuestionBankItem` model uses the same table for both types. For `uniform` items, `options` is null (read from config) and `images` stores `[{"key": "image_1.png"}, ...]` (labels read from the config's `image_labels`). For `variable` items, `options` and `images` (with `{key, label}` objects) are populated per-row from `question.yaml`. The API and frontend check `type` to know where to read options and image labels from.
The colonoscopy optical diagnosis MCQ could not use off-the-shelf platforms because of the compound pass criteria — most platforms support only a single overall percentage threshold. The config-driven approach with pluggable MCQ types means any future question bank — whether uniform image-classification (like colonoscopy) or variable mixed-format MCQs — can be added with just a YAML file and content uploads.
question.yaml¶
Each `question_<n>/` subfolder in the external question bank repo contains a `question.yaml` with item-level metadata, plus the associated image files.

For `uniform` type questions, the YAML is minimal — only the metadata fields defined by `correct_answer_field` and `correct_answer_values` in the bank config:
```yaml
# questions/colonoscopy-optical-diagnosis/question_001/question.yaml
diagnosis: adenoma
```
Image filenames within the folder must match the pattern `image_<n>.<ext>`, where `<n>` corresponds to the position in `image_labels` (1-indexed). For the colonoscopy bank: `image_1.png` = WLI, `image_2.png` = NBI.
For `variable` type questions, the `question.yaml` file includes the full item definition:

```yaml
# questions/medication-safety/question_001/question.yaml
text: "A 72-year-old patient with CKD stage 4 is prescribed..."
images: [] # explicitly empty — this question has no images
options:
  - id: reduce_dose
    label: "Reduce dose by 50%"
    tags: [correct]
  - id: no_change
    label: "No dose adjustment needed"
    tags: [incorrect]
  - id: stop_drug
    label: "Stop the medication"
    tags: [incorrect]
correct_option_id: reduce_dose
```
For `variable` items with images, the `images` field must list every image file with its label. The validator checks that each listed image file exists in the subfolder and that no unlisted `image_<n>.*` files are present:
```yaml
# questions/radiology-basics/question_001/question.yaml
text: "A 45-year-old patient presents with chest pain..."
images:
  - key: image_1.png
    label: "PA chest X-ray"
  - key: image_2.png
    label: "Lateral chest X-ray"
options:
  - id: pneumothorax
    label: "Pneumothorax"
    tags: [correct]
  - id: pleural_effusion
    label: "Pleural effusion"
    tags: [incorrect]
correct_option_id: pneumothorax
```
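The listing check can be sketched as a pure function over the set of filenames in the item's subfolder. The function name and return shape are assumptions; the real validator also checks options and `correct_option_id`:

```python
import re

def validate_images(item: dict, files_in_folder: set[str]) -> list[str]:
    """Check every listed image exists and no unlisted image_<n>.* files are present."""
    errors = []
    listed = {img["key"] for img in item.get("images", [])}
    for key in sorted(listed):
        if key not in files_in_folder:
            errors.append(f"listed image missing from folder: {key}")
    for name in sorted(files_in_folder):
        if re.match(r"image_\d+\.", name) and name not in listed:
            errors.append(f"unlisted image file present: {name}")
    return errors

item = {"images": [{"key": "image_1.png", "label": "PA chest X-ray"}]}
print(validate_images(item, {"question.yaml", "image_1.png", "image_2.png"}))
# ['unlisted image file present: image_2.png']
```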
Colonoscopy MCQ — first question bank¶
Context: Optical diagnosis of diminutive (≤5mm) colorectal polyps. Colonoscopists trained on high-resolution endoscopes can visually classify polyps and discard them without histological analysis — saving ~£30 per polyp. Currently accredited only within the bowel cancer screening programme (~15% of procedures). This MCQ extends accreditation to the remaining ~85% of symptomatic colonoscopists.
Assessment format (defined by the config above):
- 120 polyps per attempt, randomly selected from a pool of ~200
- Each polyp displayed as two side-by-side images: white light (WLI) and narrow band imaging (NBI)
- 4 answer choices per polyp (high/low confidence × adenoma/serrated)
- 75-minute time limit
- Random selection + random order means no two attempts are identical
Pass criteria (compound — both must be met):
- ≥70% High Confidence — at least 84 of 120 answers must be high-confidence
- ≥85% Accuracy of High Confidence — of those high-confidence answers, at least 85% must match the correct diagnosis
User flow:
- Admin creates candidate account (Name, Institution, Work Email) and assigns to teaching org
- Candidate logs in, selects the colonoscopy optical diagnosis question bank
- Starts assessment → 120 random polyps presented sequentially
- For each polyp: views WLI + NBI images, selects one of 4 options
- On completion (or time expiry): sees result with score breakdown
- Can immediately retry if failed
- Result also sent to a central email for accreditation tracking
Current state vs proposal¶
| Area | Current codebase | Change needed |
|---|---|---|
| Backend structure | Flat — all routes in `main.py`, models in `models.py` | Add `app/features/teaching/` with own router, models, schemas |
| Config | `FHIR_SERVER_URL` and `EHRBASE_URL` are hardcoded required strings; FHIR/EHRbase DB passwords are required `SecretStr` | Add `CLINICAL_SERVICES_ENABLED: bool = True` flag. Keep existing URL defaults and required passwords. Teaching sets flag to `False`. |
| Organisation features | `Organization` model exists, no feature gating | New `OrganisationFeature` model — runtime feature flags per org |
| CBAC | 34 clinical competencies, 18 professions | Add teaching competencies and professions (learner, educator) |
| Image storage | No object storage anywhere | Images version-controlled in Git (LFS), synced to GCS bucket on deploy, served via signed URLs |
| Frontend routes | Statically imported in `main.tsx` (~40 eager imports) | Add teaching routes under `/teaching/*`, gated by org feature |
| Navigation | Static links in `SideNavContent.tsx` | Conditionally show teaching nav items |
| Terraform | Teaching env already configured with `enable_fhir = false` | Add GCS bucket, CI/CD image sync step, storage env vars to Cloud Run |
Phase 1: OrganisationFeature + config changes¶
Step 1.1 — OrganisationFeature model¶
All-explicit model: no `OrganisationFeature` rows = no access. Features must be explicitly enabled per organisation. There are no implicit defaults — a brand-new org has zero features until an admin enables them.

Add to `backend/app/models.py`:

- `OrganisationFeature` table with `id` (UUID PK), `organisation_id` (FK → `organizations.id`), `feature_key` (str, e.g. `"epr"`, `"teaching"`, `"messaging"`), `enabled_at` (datetime), `enabled_by` (FK → `users.id`)
- Row existence = feature is enabled. Deleting the row disables the feature. No separate `enabled` boolean — avoids ambiguity between "row exists but disabled" and "row absent".
- Unique constraint on `(organisation_id, feature_key)`
- Relationship from `Organization` → `OrganisationFeature` (one-to-many)
Migration: `just migrate "add_organisation_features_table"`

Data migration: the same migration must seed `OrganisationFeature` rows for all existing organisations — enable `epr`, `messaging`, and `letters` on every current org so existing deployments are unaffected.

Admin UI: the organisation creation form (`pages/admin/organisations/`) must include a feature checklist so admins choose which features to enable at creation time.

Files: `backend/app/models.py`, new migration in `alembic/versions/`
Step 1.2 — Make FHIR/EHRbase conditionally required¶
Modify `backend/app/config.py`:

- Add `CLINICAL_SERVICES_ENABLED: bool = True` — a single flag that controls whether FHIR and EHRbase are required. FHIR and EHRbase are always enabled/disabled together (EHRbase depends on FHIR for patient context), so there is no reason to toggle them independently.
- Give FHIR/EHRbase passwords sensible defaults (matching Docker Compose dev values) instead of leaving them required with no default, e.g. `FHIR_DB_PASSWORD: SecretStr = SecretStr("fhir_password")`. This follows the same pattern as `FHIR_DB_USER`, `FHIR_DB_HOST`, etc., which already have defaults. With defaults on all fields, the app starts cleanly in both modes without needing `None` typing. Keep existing URL defaults (`FHIR_SERVER_URL: str = "http://fhir:8080/fhir"`).
- Add a Pydantic `model_validator(mode="after")`: if `CLINICAL_SERVICES_ENABLED` is `True`, verify FHIR/EHRbase URLs and passwords are present. This is belt-and-braces: it catches misconfigurations at startup rather than at runtime.
- Teaching deployment sets `CLINICAL_SERVICES_ENABLED=false` in env vars. Because the flag defaults to `True`, any EPR deployment that forgets it still gets FHIR/EHRbase — the safe default. Production overrides all defaults via env vars.
- Guard the `FHIR_DATABASE_URL`/`EHRBASE_DATABASE_URL` properties to return `None` when disabled
Clinical safety note: defaults are EPR-centric. An EPR deployment with zero config changes gets `CLINICAL_SERVICES_ENABLED=True` — clinical services are on by default. Only an explicit `= false` disables them.
Files: `backend/app/config.py`
Step 1.3 — Guard existing FHIR/EHRbase calls¶
Depends on 1.2
- `backend/app/fhir_client.py` — guard initialisation; raise `HTTPException(503)` if called when disabled
- `backend/app/ehrbase_client.py` — same pattern
- Patient routes in `main.py` that call FHIR (demographics, letters) — return 503 when FHIR disabled
- Health check endpoints — report FHIR/EHRbase as "not provisioned" rather than erroring

Files: `backend/app/fhir_client.py`, `backend/app/ehrbase_client.py`, `backend/app/main.py`
Step 1.4 — OrganisationFeature API endpoints¶
Depends on 1.1
- `GET /api/organizations/{id}/features` — list enabled features for an org (admin only)
- `PUT /api/organizations/{id}/features/{feature_key}` — enable/disable feature (admin only)
- Extend the `GET /api/auth/me` response to include `enabled_features: list[str]` for the user's primary org

Files: `backend/app/main.py`, new `backend/app/schemas/features.py`
Step 1.5 — requires_feature FastAPI dependency¶
Depends on 1.1
Reusable dependency for gating routes by feature (same pattern as the existing `has_competency()` in `backend/app/cbac/decorators.py`):

```python
def requires_feature(feature_key: str) -> Callable:
    """FastAPI dependency: checks user's org has feature enabled. Returns 403 if not."""
```

Resolves: user → primary org → `OrganisationFeature` → 403 if disabled.
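A minimal sketch of that resolution chain, using an in-memory stand-in for the `OrganisationFeature` table (the real code is a FastAPI dependency querying the database and raising `HTTPException(403)`; all names below are illustrative):

```python
# In-memory stand-in for OrganisationFeature rows: row existence = feature enabled.
FEATURE_ROWS = {("org-1", "teaching"), ("org-2", "epr")}

class FeatureDisabled(Exception):
    """Stand-in for HTTPException(403) in the real dependency."""

def requires_feature(feature_key: str):
    """Factory returning a check bound to one feature key (mirrors the dependency pattern)."""
    def check(user_org_id: str) -> None:
        # All-explicit model: absence of a row means the feature is disabled.
        if (user_org_id, feature_key) not in FEATURE_ROWS:
            raise FeatureDisabled(f"feature '{feature_key}' not enabled for org {user_org_id}")
    return check

check_teaching = requires_feature("teaching")
check_teaching("org-1")  # passes: row exists
# check_teaching("org-2") would raise FeatureDisabled (→ 403)
```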
Files: new `backend/app/features/__init__.py`
Phase 2: Teaching domain models + storage¶
Step 2.1 — Teaching models¶
Depends on 1.1
Create `backend/app/features/teaching/models.py`:

`QuestionBankItem` — one item in a question bank (e.g. one polyp with its images)

| Field | Type | Notes |
|---|---|---|
| `id` | UUID | Primary key |
| `organisation_id` | UUID (FK) | Owning org |
| `question_bank_id` | str | Matches config `id` (e.g. `"colonoscopy-optical-diagnosis"`) |
| `bank_version` | int | Version of the question bank this item belongs to |
| `images` | JSON list[dict] | `[{"key": "image_1.png"}]` for uniform; `[{"key": "image_1.png", "label": "CT scan"}]` for variable |
| `text` | str or None | Optional free-text shown below images (e.g. patient history) |
| `options` | JSON list or None | Per-item options (variable type only); null for uniform (read from config) |
| `correct_option_id` | str or None | Correct option id (variable type only); null for uniform (uses `metadata` + `correct_answer_field`) |
| `metadata` | JSON dict | Correct answer + any extra fields (e.g. `{"diagnosis": "adenoma"}`) |
| `status` | str | `"draft"` / `"published"` |
| `created_by` | UUID (FK → users.id) | User who ran the sync command |
| `created_at` | datetime | Auto-set |
The `metadata` field is freeform JSON validated against the question bank config at upload time. For the colonoscopy MCQ, it would be `{"diagnosis": "adenoma"}` or `{"diagnosis": "serrated"}`. The `correct_answer_field` in the config tells the scoring engine which metadata key to check.
`Assessment` — one attempt at a question bank

| Field | Type | Notes |
|---|---|---|
| `id` | UUID | Primary key |
| `user_id` | UUID (FK → users.id) | Candidate |
| `organisation_id` | UUID (FK) | Org context |
| `question_bank_id` | str | Which question bank config this assessment uses |
| `bank_version` | int | Version of the bank when the assessment was started |
| `started_at` | datetime | When the assessment began |
| `completed_at` | datetime or None | When submitted / timed out |
| `time_limit_minutes` | int | Copied from config at creation (e.g. 75) |
| `total_items` | int | Copied from config at creation (e.g. 120) |
| `score_breakdown` | JSON dict or None | Per-criterion results computed on completion |
| `is_passed` | bool or None | Null until completed |
`score_breakdown` stores the result of each pass criterion, e.g.:

```json
{
  "criteria": [
    {
      "name": "High confidence rate",
      "value": 0.73,
      "threshold": 0.7,
      "passed": true
    },
    {
      "name": "High confidence accuracy",
      "value": 0.88,
      "threshold": 0.85,
      "passed": true
    }
  ],
  "overall_passed": true
}
```
`AssessmentAnswer` — one answer within an assessment

| Field | Type | Notes |
|---|---|---|
| `id` | UUID | Primary key |
| `assessment_id` | UUID (FK) | Parent assessment |
| `item_id` | UUID (FK) | Which `QuestionBankItem` |
| `display_order` | int | Position in this assessment (1–N) |
| `selected_option` | str or None | Option id from config; null until answered |
| `is_correct` | bool or None | Null until answered; set when the answer is submitted |
| `resolved_tags` | JSON list[str] or None | Tags resolved from the selected option at answer time (audit trail) |
| `answered_at` | datetime or None | When answered |
`QuestionBankConfig` — cached config from the question bank repo

The backend pulls config from the GCS bucket (where it was synced from the question bank repo) and stores the parsed YAML in the database. Images stay in the bucket; only text/YAML data is persisted to the DB. This avoids requiring a clone of the question bank repo at runtime.

| Field | Type | Notes |
|---|---|---|
| `id` | UUID | Primary key |
| `organisation_id` | UUID (FK) | Owning org |
| `question_bank_id` | str | Matches config `id` (e.g. `"colonoscopy-optical-diagnosis"`) |
| `version` | int | Config version (matches the `version` field in `config.yaml`) |
| `title` | str | Display title |
| `description` | str | Shown on assessment dashboard |
| `type` | str | `"uniform"` or `"variable"` |
| `config_yaml` | JSON dict | Full parsed `config.yaml` content (options, pass_criteria, assessment, etc.) |
| `synced_at` | datetime | When this config was last pulled from the bucket |
| `synced_by` | UUID (FK → users.id) | Who triggered the sync |

Unique constraint on `(organisation_id, question_bank_id, version)`. The `GET /api/teaching/question-banks` endpoint reads from this table — no filesystem or bucket access is needed at request time.
`TeachingOrgSettings` — per-organisation teaching configuration

| Field | Type | Notes |
|---|---|---|
| `id` | UUID | Primary key |
| `organisation_id` | UUID (FK) | Unique — one settings row per org |
| `coordinator_email` | str | Recipient email for assessment result notifications |
| `institution_name` | str | Institution name for certificate generation and accreditation tracking |

Unique constraint on `organisation_id`. The admin UI for teaching organisations includes fields to set the coordinator email and institution name. The `POST /api/teaching/results/email` endpoint reads the coordinator email from this table.
`QuestionBankSync` — sync history for audit and the SyncStatus page

| Field | Type | Notes |
|---|---|---|
| `id` | UUID | Primary key |
| `organisation_id` | UUID (FK) | Owning org |
| `question_bank_id` | str | Which bank was synced |
| `version` | int | Version that was synced |
| `status` | str | `"success"` / `"failed"` / `"in_progress"` |
| `items_created` | int | Number of new items imported |
| `items_updated` | int | Number of existing items updated |
| `errors` | JSON list[dict] | Validation errors (empty on success) |
| `warnings` | JSON list[dict] | Validation warnings (non-blocking) |
| `started_at` | datetime | When sync began |
| `completed_at` | datetime or None | When sync finished (null if in progress or crashed) |
| `triggered_by` | UUID (FK → users.id) | Who triggered the sync |

The SyncStatus page reads from this table to show the last sync result, validation errors, and item counts per bank.
Per-answer scoring (answers scored individually on submission):

Each answer is scored immediately when submitted via `POST /assessments/{id}/answer`. The endpoint resolves the selected option's tags and correctness, then persists `is_correct` and `resolved_tags` on the `AssessmentAnswer` row. This gives a complete per-answer audit trail — if a question bank item is later found to have a wrong diagnosis, affected answers can be identified by querying `AssessmentAnswer` rows by `item_id` without re-running the scoring engine.
Tag resolution differs by MCQ type:

- `uniform`: the selected option's `tags` come from the config's `options` list (shared across all items). Correctness is checked by comparing the option's non-confidence tags against the item's `metadata[correct_answer_field]`.
- `variable`: the selected option's `tags` come from the item's own `options` list. Correctness is checked by comparing `selected_option` against the item's `correct_option_id`.
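The two resolution paths can be sketched together. Function name and dict shapes are illustrative, not the actual engine code:

```python
CONFIDENCE_TAGS = {"high_confidence", "low_confidence"}

def resolve_answer(bank_type: str, selected_id: str, item: dict, config: dict) -> tuple[list[str], bool]:
    """Return (resolved_tags, is_correct) for a submitted answer."""
    if bank_type == "uniform":
        options = config["options"]   # shared options from config.yaml
    else:
        options = item["options"]     # per-item options from question.yaml
    option = next(o for o in options if o["id"] == selected_id)
    tags = option["tags"]
    if bank_type == "uniform":
        # Compare non-confidence tags against the item's metadata value.
        field = config["correct_answer_field"]
        correct = [t for t in tags if t not in CONFIDENCE_TAGS] == [item["metadata"][field]]
    else:
        correct = selected_id == item["correct_option_id"]
    return tags, correct

config = {
    "options": [{"id": "high_confidence_adenoma", "tags": ["high_confidence", "adenoma"]}],
    "correct_answer_field": "diagnosis",
}
item = {"metadata": {"diagnosis": "adenoma"}}
print(resolve_answer("uniform", "high_confidence_adenoma", item, config))
# (['high_confidence', 'adenoma'], True)
```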
Assessment completion (aggregate scoring on `POST /assessments/{id}/complete`):

The scoring engine reads the already-scored `AssessmentAnswer` rows and evaluates the config's `pass_criteria` rules. Because `is_correct` and `resolved_tags` are already persisted per-answer, the completion step is purely an aggregation — no re-evaluation of individual answers.

- Read all `AssessmentAnswer` rows for the assessment where `selected_option` is not null (only submitted answers count)
- `tag_percentage` rule: count answered items whose option has the specified tag, divided by `total_items`. Must be ≥ threshold. Unanswered items have no tags and count against this threshold.
- `tag_accuracy` rule: of answered items that have the specified tag, count those where `is_correct = true`. Must be ≥ threshold.
- `is_passed` = ALL criteria pass
No auto-complete on timeout: the assessment is not automatically completed when the timer expires. The candidate must explicitly submit via `POST /assessments/{id}/complete` (or the frontend triggers this when the timer reaches zero). Unanswered items are simply not scored — they have no `selected_option`, no `is_correct`, no `resolved_tags`. They still count against `tag_percentage` rules (the denominator is always `total_items`), which naturally penalises incomplete assessments.
Example (colonoscopy MCQ, uniform): `high_confidence_adenoma` has tags `[high_confidence, adenoma]`. Item metadata is `{"diagnosis": "adenoma"}`. When the candidate submits this answer, `is_correct` is set to `true` and `resolved_tags` is set to `["high_confidence", "adenoma"]`. At completion, the `high_confidence` tag contributes to the 70% threshold; `is_correct = true` counts towards the 85% accuracy threshold.
Import these models in `alembic/env.py` so Alembic detects them for migration generation.

Migration: `just migrate "add_teaching_tables"`

Files: new `backend/app/features/teaching/__init__.py`, new `backend/app/features/teaching/models.py`, `backend/alembic/env.py`
Step 2.2 — Image storage (Git + GCS + CDN)¶
Parallel with 2.1
Images are version-controlled in Git (source of truth, PR review) and served via Cloud CDN + signed URLs in production (fast, scalable, secure). The backend never proxies images — it generates short-lived signed URLs and the browser fetches directly from Google's edge network.
Architecture overview¶
```
quill-question-bank repo (source of truth)
        │
        ▼ CI/CD (GitHub Actions)
        │
   ┌────┴────┐
   │         │
   ▼         ▼
  GCS      Backend (Cloud Run)
 bucket    reads YAML → DB sync (Option A: download to tempdir)
   │
   ▼
Cloud CDN (edge cache, ~150 locations)
   │
   ▼
Browser loads images directly (signed URL, 15 min expiry)
```
| Layer | Local dev | Production |
|---|---|---|
| Image storage | Local `question-bank/` folder (Docker mount) | GCS bucket |
| Image serving | FastAPI `StaticFiles` at `/api/teaching/images/` | Cloud CDN + signed URLs |
| DB sync source | Local folder (`TEACHING_QUESTION_BANK_PATH`) | GCS bucket → tempdir → existing sync code |
| Config | `TEACHING_QUESTION_BANK_PATH=/question-banks` | `TEACHING_GCS_BUCKET=quill-images-teaching` |
Why CDN + signed URLs at enterprise scale: the backend should never serve images. A single Cloud Run instance handling 1000 concurrent assessments (2 images each) would choke if proxying images. Cloud CDN handles it trivially from edge locations worldwide. Signed URLs expire after 15 minutes, preventing hotlinking and permanent sharing.
Git side — source of truth¶
Question bank content lives in a separate private repository (bailey-medics/quill-question-bank), not in the main application repo. This keeps large binary files (polyp images etc.) out of the application codebase while maintaining version control and PR-based review. Git LFS tracks binary files so the repo stays fast.
For local development, the question bank repo is cloned into a question-bank/ folder within the quillmedical workspace. This folder is gitignored so question bank content is never committed to the main application repo. Justfile commands manage the clone/pull lifecycle:
```
just question-bank-clone   # git clone into question-bank/
just question-bank-pull    # pull latest content
just question-bank-push    # push changes (educator workflow)
```
The `.gitignore` entry:

```
question-bank/
```
The directory structure within the question bank repo follows the format described in the Question bank config format section:
```
question-bank/                      ← gitignored, cloned from quill-question-bank
  questions/
    colonoscopy-optical-diagnosis/
      config.yaml                   # bank config (type, options, pass criteria)
      certificate_background.pdf    # optional certificate template
      question_001/
        question.yaml               # item metadata (e.g. diagnosis: adenoma)
        image_1.png                 # WLI image
        image_2.png                 # NBI image
      question_002/
        question.yaml
        image_1.png
        image_2.png
      ...
```
The config.yaml is the bank-level config documented above. Each question_<n>/ subfolder is one item, containing a question.yaml (item metadata) and image files named image_<n>.<ext> matching the config's image_labels order.
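For illustration, a `question.yaml` for a `uniform`-type item might look like this (the `diagnosis` key is an assumed example of the bank's configured `correct_answer_field`; the exact fields are defined by the bank config):

```yaml
# questions/colonoscopy-optical-diagnosis/question_001/question.yaml (illustrative)
diagnosis: adenoma   # must be one of the config's correct_answer_values
text: "Optional stem text, present if the bank config requires item_text"
```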
Add .gitattributes in the question bank repo:
questions/*/question_*/image_*.* filter=lfs diff=lfs merge=lfs -text
Local dev — filesystem serving¶
Docker Compose mounts the question bank folder into the backend container:
# compose.dev.yml (backend service)
volumes:
- ./question-bank/questions:/question-banks
environment:
TEACHING_QUESTION_BANK_PATH: /question-banks
TEACHING_IMAGES_BASE_URL: /api/teaching/images
The backend conditionally mounts a StaticFiles endpoint at /api/teaching/images/ from the TEACHING_QUESTION_BANK_PATH directory (only when TEACHING_STORAGE_BACKEND=local). This goes through the existing Caddy /api/* proxy with no Caddyfile changes needed.
`LocalStorageBackend` generates URLs like `/api/teaching/images/colonoscopy-optical-diagnosis/question_042/image_1.png`.
The sync endpoint resolves the bank path from the TEACHING_QUESTION_BANK_PATH config setting + the bank_id from the request body — it does not accept arbitrary filesystem paths (prevents path traversal).
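A minimal sketch of that containment check (function name and error handling are illustrative, not the actual implementation):

```python
from pathlib import Path

def resolve_bank_path(root: str, bank_id: str) -> Path:
    """Resolve a bank folder under the configured root and reject
    any bank_id that escapes it (e.g. "../../etc")."""
    base = Path(root).resolve()
    candidate = (base / bank_id).resolve()
    # resolve() collapses any ".." segments, so a traversal attempt
    # lands outside base and fails the containment check below
    if candidate == base or not candidate.is_relative_to(base):
        raise ValueError(f"invalid bank_id: {bank_id!r}")
    return candidate
```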
GCS side — production serving¶
Each GCP project has a dedicated GCS bucket (e.g. quill-images-teaching). The CI/CD pipeline (GitHub Actions on the quill-question-bank repo) syncs content to the bucket:
# In CI/CD step on quill-question-bank repo:
gsutil -m rsync -r questions/ gs://$TEACHING_GCS_BUCKET/questions/
The backend generates signed URLs (15-minute expiry) when serving items to candidates. The frontend never accesses GCS directly — it receives signed URLs from the API. Cloud CDN sits in front of the bucket for edge caching.
`GCSStorageBackend` generates URLs like `https://storage.googleapis.com/quill-images-teaching/questions/colonoscopy.../question_042/image_1.png?X-Goog-Signature=...`.
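A sketch of the `StorageBackend` abstraction described here (the `Protocol` shape is an assumption; only the local backend is implemented, with the GCS variant indicated in a comment):

```python
from typing import Protocol

class StorageBackend(Protocol):
    def get_image_url(self, bank_id: str, item_folder: str, filename: str) -> str: ...

class LocalStorageBackend:
    """Dev backend: URLs point at the FastAPI StaticFiles mount."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")

    def get_image_url(self, bank_id: str, item_folder: str, filename: str) -> str:
        return f"{self.base_url}/{bank_id}/{item_folder}/{filename}"

# A GCSStorageBackend would instead return
#   bucket.blob(f"questions/{bank_id}/{item_folder}/{filename}").generate_signed_url(
#       version="v4", expiration=timedelta(minutes=15), method="GET")
# using the google-cloud-storage client.
```

The router depends only on the protocol, so swapping backends is a config concern, not a routing one.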
GCS DB sync — Option A (download to tempdir)¶
In production, the sync endpoint needs to read config.yaml and question.yaml files from the GCS bucket to populate the database. Rather than rewriting sync.py to understand GCS natively, the sync endpoint downloads the YAML files from the bucket to a temporary directory, then passes that Path to the existing sync_question_bank() function. Images are not downloaded — they stay in the bucket and are only referenced by key in the database.
This keeps sync.py simple and filesystem-based while supporting both local and GCS environments. The tempdir is cleaned up after the sync completes.
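The Option A flow can be sketched as follows; the two callables stand in for GCS client operations (listing YAML keys and downloading blob bytes), which keeps the sketch independent of the google-cloud-storage API:

```python
import tempfile
from collections.abc import Callable, Iterable
from pathlib import Path

def sync_from_bucket(
    list_yaml_keys: Callable[[], Iterable[str]],
    download: Callable[[str], bytes],
    run_sync: Callable[[Path], None],
) -> None:
    """Download only the YAML files to a tempdir, then hand that path
    to the existing filesystem-based sync function. Images are never
    downloaded; they remain in the bucket, referenced by key."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for key in list_yaml_keys():  # e.g. "questions/<bank>/question_001/question.yaml"
            dest = root / key
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(download(key))
        run_sync(root)  # existing sync_question_bank(path)-style entry point
    # the tempdir is removed automatically when the context manager exits
```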
Config settings¶
Add to config.py:
- `TEACHING_GCS_BUCKET: str | None = None` — GCS bucket name (set per environment)
- `TEACHING_IMAGES_BASE_URL: str | None = None` — base URL for local dev image serving
- `TEACHING_QUESTION_BANK_PATH: str | None = None` — local filesystem path to question bank content
Create backend/app/features/teaching/storage.py:
- `get_image_url(bank_id: str, item_folder: str, filename: str) → str` — returns a signed GCS URL in production, or a local file URL in dev
- Uses `google-cloud-storage` for signed URL generation
Add google-cloud-storage and reportlab to pyproject.toml dependencies.
Why this hybrid approach?
- Version control: image changes tracked in Git history, tied to the YAML version bump in one PR
- PR review: educators submit images via PR → reviewers can inspect before merge
- Single source of truth: config, images, and manifest are co-located in the repo
- Fast serving: Cloud CDN + signed URLs — no load on the backend for image delivery
- Security: signed URLs expire, so images can't be hotlinked or shared permanently
- Dev parity: the same `StorageBackend` abstraction hides the difference — the router calls `storage.get_image_url()` and doesn't care whether it gets a local path or a signed CDN URL
Files: .gitattributes (in question bank repo), .gitignore, Justfile, compose.dev.yml, backend/app/config.py, backend/app/main.py, backend/pyproject.toml, new backend/app/features/teaching/storage.py, new backend/app/features/teaching/sync.py, new backend/app/features/teaching/validate.py
Step 2.3 — Question bank validation¶
Parallel with 2.1
A dedicated validation module (backend/app/features/teaching/validate.py) checks the structural and semantic integrity of question bank content in the external repo. This runs in three contexts:
- CI on the question bank repo — a GitHub Action runs validation on every PR/push to catch errors before merge
- Dry-run endpoint — `POST /api/teaching/items/validate` lets educators check content without importing
- Pre-sync gate — the sync command (`just sync-question-bank`) runs validation as its first step and aborts on any error
Structural checks (per question bank)¶
| Check | Rule |
|---|---|
| Config present | questions/<bank-id>/config.yaml must exist |
| Config schema valid | config.yaml must parse against the Pydantic QuestionBankConfig schema (all required fields present) |
| Subfolder naming | Item subfolders must match the `question_<n>/` pattern (sequential integers, 1-indexed, zero-padded to 3+ digits) |
| Question YAML present | Each question_<n>/ must contain exactly one question.yaml |
| No stray files | No unexpected files in bank root or item subfolders (only config.yaml, certificate_background.pdf, question_<n>/ dirs) |
Content checks (per item, validated against config)¶
| Check | `uniform` type | `variable` type |
|---|---|---|
| Image count | Exactly `images_per_item` image files (`image_1.*`, `image_2.*`, ...) per subfolder | `images` list in `question.yaml` is required (may be empty `[]`). Each listed `{key, label}` must have a matching file in the subfolder. No unlisted `image_<n>.*` files allowed. |
| Image format | Allowed extensions: `.png`, `.jpg`, `.jpeg`, `.webp` | Same |
| Metadata field present | `question.yaml` must contain the `correct_answer_field` key (e.g. `diagnosis: adenoma`) | `question.yaml` must contain `correct_option_id`, an `options` list, and an `images` list |
| Metadata value valid | Value of the answer field must be in the config's `correct_answer_values` list | `correct_option_id` must match one of the item's `options[].id` values |
| Options valid | N/A (options defined in config) | Each option must have `id`, `label`, `tags`; no duplicate ids |
| Item text | If config has `item_text.required: true`, `question.yaml` must include a `text` field | Same |
| Tags consistency | N/A (tags defined in config) | Tags referenced in `pass_criteria` must appear in at least one item's options |
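As a sketch, the `uniform`-type checks in this table might look like the following (function name and signature are illustrative; the real validator works on parsed YAML and folder listings):

```python
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def check_uniform_item(
    files: list[str],
    metadata: dict[str, object],
    images_per_item: int,
    answer_field: str,
    allowed_values: set[str],
) -> list[str]:
    """Return a list of error messages for one question_<n>/ folder."""
    errors: list[str] = []
    images = [f for f in files if f != "question.yaml"]
    # image count check
    if len(images) != images_per_item:
        errors.append(f"expected {images_per_item} images, found {len(images)}")
    # image format check
    for name in images:
        ext = "." + name.rsplit(".", 1)[-1].lower()
        if ext not in IMAGE_EXTS:
            errors.append(f"disallowed image format: {name}")
    # metadata field + value checks
    value = metadata.get(answer_field)
    if value is None:
        errors.append(f"missing required field '{answer_field}'")
    elif value not in allowed_values:
        errors.append(f"value '{value}' not in correct_answer_values")
    return errors
```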
Cross-item checks (whole bank)¶
| Check | Rule |
|---|---|
| Minimum pool size | Total item count must be ≥ assessment.min_pool_size (warning if below, error blocks sync) |
| No duplicate item IDs | Subfolder numbers must be unique and sequential |
| Version consistency | All items in a sync batch belong to the same version as declared in config.yaml |
| Answer distribution | Warning (not error) if answer distribution is heavily skewed (e.g. >80% of items have the same correct answer) |
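The answer-distribution warning could be computed along these lines (the 80% threshold mirrors the example above; the helper name is illustrative):

```python
from collections import Counter

def distribution_warnings(correct_answers: list[str], threshold: float = 0.8) -> list[str]:
    """Warn (never error) when one correct answer dominates the bank."""
    if not correct_answers:
        return []
    value, count = Counter(correct_answers).most_common(1)[0]
    share = count / len(correct_answers)
    if share > threshold:
        return [f"{share:.0%} of items have answer '{value}' (distribution skew)"]
    return []
```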
Validation output¶
The validator returns a structured result:
@dataclass
class ValidationResult:
bank_id: str
version: int
is_valid: bool # False if any errors
errors: list[ValidationError] # blocking — sync will not proceed
warnings: list[ValidationWarning] # non-blocking — logged but sync continues
item_count: int
summary: str # human-readable summary
Each error/warning includes the file path and a clear message, e.g.:
ERROR: questions/colonoscopy-optical-diagnosis/question_042/question.yaml — missing required field 'diagnosis'
ERROR: questions/colonoscopy-optical-diagnosis/question_017/ — expected 2 images, found 1
WARNING: questions/colonoscopy-optical-diagnosis/ — 78% of items have diagnosis 'adenoma' (distribution skew)
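For illustration, the records behind those messages might be shaped like this (field names assumed to parallel the `ValidationResult` dataclass above; `ValidationIssue` is a hypothetical name):

```python
from dataclasses import dataclass

@dataclass
class ValidationIssue:
    path: str      # file or folder where the problem was found
    message: str   # human-readable description

def format_issue(level: str, issue: ValidationIssue) -> str:
    """Render one issue in the log style shown above."""
    return f"{level}: {issue.path} — {issue.message}"
```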
Files: new backend/app/features/teaching/validate.py, new backend/tests/test_teaching_validate.py
Step 2.4 — Teaching API router¶
Depends on 1.5, 2.1, 2.2, 2.3
Create backend/app/features/teaching/router.py and schemas.py.
All routes gated by Depends(requires_feature("teaching")).
Educator endpoints (require manage_teaching_content competency):
| Method | Path | Purpose |
|---|---|---|
| GET | `/api/teaching/items` | List items in org's bank (filtered by `question_bank_id`) |
| POST | `/api/teaching/items/sync` | Trigger sync from GCS bucket to database (runs validate) |
| POST | `/api/teaching/items/validate` | Dry-run validation only — check bucket content without importing |
| GET | `/api/teaching/results` | All assessment results for org (reporting) |
| POST | `/api/teaching/results/email` | Trigger result email to coordinator address |
| GET | `/api/teaching/question-banks` | List available question bank configs (from DB) |
| GET | `/api/teaching/syncs` | List sync history for reporting/audit |
| PUT | `/api/teaching/settings` | Update teaching org settings (coordinator email, institution) |
Candidate endpoints (require view_teaching_cases competency):
| Method | Path | Purpose |
|---|---|---|
| GET | `/api/teaching/question-banks` | List question banks available to candidate (with titles, descriptions) |
| POST | `/api/teaching/assessments` | Start new assessment (specify `question_bank_id`; randomly selects items) |
| GET | `/api/teaching/assessments/{id}` | Get assessment state (progress, time remaining, question bank config) |
| GET | `/api/teaching/assessments/{id}/current` | Get current unanswered item (signed image URLs, option labels from config) |
| POST | `/api/teaching/assessments/{id}/answer` | Submit answer and score it immediately; returns next item inline |
| POST | `/api/teaching/assessments/{id}/complete` | Finalise assessment, aggregate scored answers, return result |
| GET | `/api/teaching/assessments/{id}/certificate` | Download PDF certificate (only if passed) |
| GET | `/api/teaching/assessments/history` | List user's past assessments with results |
Assessment lifecycle:
- `POST /assessments` with `{"question_bank_id": "colonoscopy-optical-diagnosis"}` → loads config from the `QuestionBankConfig` table, validates pool size ≥ `min_pool_size`, creates an `Assessment` row + N `AssessmentAnswer` rows (items randomly selected + ordered, `selected_option` null). Returns the first item inline (signed image URLs + options).
- `POST /assessments/{id}/answer` with `{"selected_option": "high_confidence_adenoma"}` → validates the time limit is not exceeded, validates `selected_option` is a valid option `id`, scores the answer immediately (sets `selected_option`, `answered_at`, `is_correct`, `resolved_tags`), then returns the next unanswered item inline (signed image URLs + options). If no more items remain, returns `{"next_item": null, "all_answered": true}`. This avoids a separate `GET /current` round-trip.
- `GET /assessments/{id}/current` → fallback endpoint: returns the first unanswered item. Useful for resuming after a disconnect. Returns 404 if all answered.
- `POST /assessments/{id}/complete` → aggregates the already-scored `AssessmentAnswer` rows against the config's `pass_criteria`, sets `score_breakdown`, `is_passed`, `completed_at`. If `results.email_notification` is enabled, sends the result email to the org's `coordinator_email` (from `TeachingOrgSettings`). Returns the full result breakdown.
- No auto-complete on timeout: the server does not automatically complete the assessment when the timer expires. The frontend triggers `POST /complete` when the timer reaches zero. The server validates `now ≤ started_at + time_limit_minutes` on `POST /answer` calls — any answer submitted after the time limit is rejected with HTTP 409. Unanswered items have no `selected_option` and count against `tag_percentage` rules (the denominator is always `total_items`).
- `GET /assessments/{id}/certificate` → generates a PDF certificate using the certificate background template from the question bank config (stored in `QuestionBankConfig.config_yaml`). Only available when `is_passed = true` and `results.certificate_download = true` in the config. Returns 403 if not passed, 404 if certificates are not enabled.
Register in main.py: app.include_router(teaching_router).
Files: new backend/app/features/teaching/router.py, new backend/app/features/teaching/schemas.py, backend/app/main.py
Phase 3: Teaching CBAC competencies¶
Parallel with Phase 2
Step 3.1 — Competencies and professions¶
Add to shared/competencies.yaml:
| Competency ID | Risk level | Description |
|---|---|---|
| `view_teaching_cases` | low | Take teaching assessments |
| `manage_teaching_content` | medium | Manage question bank items and view results |
| `view_teaching_analytics` | low | View aggregated assessment results |
Add to shared/base-professions.yaml:
| Profession | Base competencies |
|---|---|
| `learner` | `view_teaching_cases` |
| `educator` | `view_teaching_cases`, `manage_teaching_content`, `view_teaching_analytics` |
Step 3.2 — Question bank config loading¶
Depends on 3.1
Question bank configs (config.yaml per bank) are synced from the GCS bucket to the database. The sync process (POST /api/teaching/items/sync) pulls the config and question YAML files from the bucket, validates them against a Pydantic schema, and stores the parsed config in the QuestionBankConfig table. Images stay in the bucket and are served via signed URLs at runtime — only text/YAML data is persisted to the DB.
The GET /api/teaching/question-banks endpoint reads from the QuestionBankConfig table — no bucket or filesystem access at request time. When new materials are detected in the bucket (e.g. after a CI/CD deploy syncs the question bank repo to GCS), the sync process pulls the updated config and items.
The frontend generates types from the YAML competencies and professions (same existing pattern):
- `yarn generate:types` updates `src/generated/competencies.json` and `src/generated/base-professions.json` to include the new teaching competencies and professions
- Question bank metadata (titles, descriptions) is fetched from the API at runtime, not baked into the frontend build — this keeps the question bank repo decoupled from the frontend build pipeline
Files: frontend/scripts/generate-types.ts, frontend/src/generated/competencies.json (auto), frontend/src/generated/base-professions.json (auto)
Phase 4: Frontend — feature context and routing¶
Step 4.1 — Auth context and feature hook¶
Depends on 1.4
- Extend the `User` type in `frontend/src/auth/AuthContext.tsx` with `enabled_features: string[]`
- New `frontend/src/lib/features.ts`: `useHasFeature(key: string): boolean` hook
Step 4.2 — RequireFeature guard¶
Depends on 4.1
New frontend/src/auth/RequireFeature.tsx — same pattern as existing RequirePermission.tsx. Returns 404 if feature not enabled (hides feature existence from users without access).
Step 4.3 — Teaching routes¶
Depends on 4.2
Add to frontend/src/main.tsx (inside authenticated children array), all wrapped in <RequireFeature feature="teaching">:
| Path | Page | Access |
|---|---|---|
| `/teaching` | Question bank selection + recent attempts | All teaching users |
| `/teaching/assessment/:id` | Active assessment (images, options, timer from config) | All teaching users |
| `/teaching/assessment/:id/result` | Assessment result breakdown (criteria from config) | All teaching users |
| `/teaching/history` | Full attempt history across all question banks | All teaching users |
| `/teaching/manage` | Question bank item management + sync trigger | Educators only |
| `/teaching/results` | All candidate results (central reporting) | Educators only |
Step 4.4 — Feature-aware navigation¶
Depends on 4.1
Modify frontend/src/components/navigation/SideNavContent.tsx:
- Import the `useHasFeature` hook
- If `useHasFeature("teaching")` → show a "Teaching" nav section with sub-items: Assessments, My history
- If the user also has the `manage_teaching_content` competency → show "Manage items", "Results" sub-items
Phase 5: Frontend — teaching UI¶
Step 5.1 — Components (Storybook-first)¶
Parallel with Phase 4
All in frontend/src/components/teaching/ with .stories.tsx and .test.tsx:
| Component | Purpose |
|---|---|
| `QuestionView` | Renders item based on question bank type: `uniform` → N images with config labels + config options; `variable` → item-provided images, text, and options. Optional text block (label from config) shown below images in both modes. |
| `AssessmentTimer` | Countdown timer (from config `time_limit_minutes`), visual warning at 5 min remaining |
| `AssessmentProgress` | Progress bar showing question X of N (from config `items_per_attempt`) |
| `AssessmentResult` | Pass/fail display with config-driven score breakdown |
| `ScoreBreakdown` | Per-criterion results: name, value, threshold, visual pass/fail (from `pass_criteria`) |
| `ItemManagementTable` | Educator view of synced items: status, metadata, image thumbnails. Items are synced from Git — no direct upload UI. Supports publish/unpublish toggle. |
| `AssessmentHistoryTable` | Table of past attempts with date, question bank, scores, pass/fail badge |
| `QuestionBankCard` | Card: question bank title, description, item count, "Start assessment" button |
| `AssessmentIntro` | Intro page before questions: renders title + markdown body from config + "Begin" button |
| `AssessmentClosing` | Closing page after last answer: renders title + markdown body from config + "View results" button |
| `CertificateDownload` | Download button for PDF certificate (shown only when passed + `certificate_download` enabled). Calls `GET /assessments/{id}/certificate`. |
Step 5.2 — Pages¶
Depends on 4.3, 5.1
All in frontend/src/features/teaching/pages/, using <Container size="lg"> wrapper:
| Page | Purpose |
|---|---|
| `AssessmentDashboard` | Grid of `QuestionBankCard`s + recent attempts summary |
| `AssessmentAttempt` | Config-driven MCQ: intro page (from config) → `QuestionView` + `AssessmentTimer` + `AssessmentProgress` → closing page (from config) → results |
| `AssessmentResultPage` | Detailed result: `ScoreBreakdown` (per-criterion from config) + retry button |
| `AssessmentHistoryPage` | Complete attempt history with `AssessmentHistoryTable` |
| `ManageItems` | Educator view: synced items table (filtered by question bank), sync trigger button, validation status, publish/unpublish toggles |
| `SyncStatus` | Shows last sync result, validation errors if any, item counts per bank |
| `AllResults` | Educator view of all candidate results (filterable by question bank, CSV export) |
Phase 6: Infrastructure and CI/CD¶
Step 6.1 — Terraform¶
- `infra/environments/teaching/terraform.tfvars` is already configured (`enable_fhir = false`)
- Add `TEACHING_STORAGE_BACKEND=gcs` and `TEACHING_GCS_BUCKET` env vars to the Cloud Run backend config in `infra/main.tf`
- Verify the `cloud_storage` module provisions a bucket for teaching images
Step 6.2 — Docker Compose (local dev)¶
- No new services needed (uses local filesystem storage)
- Add `TEACHING_STORAGE_BACKEND=local` to backend env in `compose.dev.yml`
- Add volume mount `./question-bank/questions:/question-banks` to the backend service
- Add `TEACHING_QUESTION_BANK_PATH=/question-banks` to backend env
- Add `TEACHING_IMAGES_BASE_URL=/api/teaching/images` to backend env
- Add a conditional `StaticFiles` mount in `main.py` at `/api/teaching/images/` from `TEACHING_QUESTION_BANK_PATH` (only when `TEACHING_STORAGE_BACKEND=local`)
Step 6.3 — GitHub Actions¶
Discovery: already implemented in `.github/workflows/deploy-staging-teaching.yml`. Builds images, pushes to the teaching AR, deploys to Cloud Run, runs a smoke test. No changes needed for the main app deployment.
- Teaching deployment workflow (separate from EPR)
- Trigger: push to `main`
- Deploy to the `quill-medical-teaching` GCP project
- Same Docker image build, different env vars
- Workload Identity Federation (per existing pattern)
Additionally, the quill-question-bank repo needs its own CI/CD (see Step 6.5) with a gsutil -m rsync step to push content to the GCS bucket on merge to main. This requires Workload Identity Federation configured for that repo too.
Step 6.4 — Seed data¶
Create dev-scripts/seed-teaching-data.sh:
- Create a teaching organisation
- Enable "teaching" feature on it
- Create sample educator and learner users
- Sync sample question bank items into the database (e.g. polyp WLI + NBI pairs with correct diagnoses for the colonoscopy bank)
- Mark items as published (need ≥ `min_pool_size` for a valid assessment)
Step 6.5 — Question bank repo CI/CD¶
Note: This step targets the separate
bailey-medics/quill-question-bankrepository and should be implemented there, not in this workspace.
The question bank repo (bailey-medics/quill-question-bank) gets its own CI/CD pipeline, separate from the main application. Content changes go through PR review (Git-based quality gate for clinical data), validation runs automatically, and validated content is synced to GCS on merge to main.
Branch protection¶
Same pattern as quillmedical — managed via Terraform in infra/github/branch_rules.tf:
- Protected branches ruleset: `main` requires pull requests (no direct pushes), dismisses stale reviews, blocks force pushes and deletion
- Branch naming ruleset: all non-protected branches must match `feature/*` or `hotfix/*`
- Add `quill-question-bank` as a second repository in the existing Terraform config (new variable or separate resource blocks with the same rules)
Standalone validation script¶
The backend's validate.py uses only the standard library + PyYAML (no FastAPI, SQLAlchemy, or other backend dependencies). Rather than pulling the full backend into the question bank repo's CI, include a standalone copy of the validator:
- File: `scripts/validate.py` in the question bank repo
- Dependencies: Python 3.13 + PyYAML (a single `pip install pyyaml`)
- Logic: mirrors `backend/app/features/teaching/validate.py` from quillmedical — config schema checks, item-level validation (uniform + variable), cross-item checks (pool size, answer distribution)
- CLI interface: `python scripts/validate.py questions/` — discovers all `questions/*/config.yaml` banks and validates each one
- Exit code: 0 if all banks pass, 1 if any errors
- Output: structured summary per bank (item count, errors, warnings), suitable for CI logs and PR comments
Keeping in sync: if the validation logic changes in the backend (e.g. new MCQ type, new config field), the standalone script must be updated too. A comment at the top of both files cross-references the other. Future improvement: extract into a shared Python package published to a private PyPI registry.
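A skeleton for the standalone script's discovery and exit-code behaviour (the per-bank checks themselves are elided; this structure is an assumption, not the actual file):

```python
from pathlib import Path

def discover_banks(root: Path) -> list[Path]:
    """Every immediate subfolder of questions/ containing a
    config.yaml is treated as one question bank."""
    return sorted(p.parent for p in root.glob("*/config.yaml"))

def main(argv: list[str]) -> int:
    root = Path(argv[1]) if len(argv) > 1 else Path("questions")
    failed = 0
    for bank in discover_banks(root):
        errors: list[str] = []  # run structural/content/cross-item checks here
        status = "OK" if not errors else f"{len(errors)} error(s)"
        print(f"{bank.name}: {status}")
        failed += 1 if errors else 0
    return 1 if failed else 0

# Under `if __name__ == "__main__":` the real script would call
# sys.exit(main(sys.argv)) so CI gets the 0/1 exit code.
```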
GitHub Actions workflows¶
Two workflows in .github/workflows/:
1. validate.yml — PR validation (feature branches + PRs)
name: Validate question banks
on:
push:
branches: ["feature/**"]
pull_request:
branches: [main]
Steps:
- Checkout repo (with LFS — `lfs: true` to pull image binaries)
- Set up Python 3.13
- `pip install pyyaml`
- Run `python scripts/validate.py questions/`
- On failure: validation errors in the CI log block the PR merge
- On warnings: logged but non-blocking
This workflow is a required status check on the main branch ruleset, so PRs cannot merge if validation fails. This is the quality gate for clinical content — incorrect diagnoses, missing images, or malformed YAML are caught before they reach the GCS bucket.
2. deploy.yml — GCS sync (merge to main)
name: Deploy to GCS
on:
push:
branches: [main]
Steps:
- Checkout repo (with LFS)
- Set up Python 3.13 + validate (belt-and-braces — re-validate even though PR already passed, in case of merge conflicts or manual main commits)
- Authenticate to GCP via Workload Identity Federation (same pattern as quillmedical's staging/teaching deploy)
- `gsutil -m rsync -r -d questions/ gs://$TEACHING_GCS_BUCKET/questions/` — mirror the `questions/` directory to the bucket. The `-d` flag deletes files from the bucket that no longer exist in the repo (keeps the bucket in sync with Git).
- Slack notification on success/failure
GCP setup required:
- Workload Identity Federation pool + provider for the `bailey-medics/quill-question-bank` repo (same teaching GCP project)
- Service account with `roles/storage.objectAdmin` on the teaching GCS bucket
- Repository secrets: `GCP_TEACHING_WIF_PROVIDER`, `GCP_TEACHING_SERVICE_ACCOUNT`, `GCP_TEACHING_GCS_BUCKET`, `SLACK_WEBHOOK_URL`
Sync flow (end to end)¶
Educator creates/updates question bank content
│
▼
Feature branch + PR to main
│
▼ validate.yml (CI)
│
Validation passes? ──── No ──→ PR blocked, fix errors
│
Yes
│
▼
PR merged to main
│
▼ deploy.yml
│
gsutil rsync → GCS bucket
│
▼
Educator triggers sync in teaching app
(POST /api/teaching/items/sync)
│
▼
Backend reads YAML from bucket → DB
Images served via signed URLs from CDN
The two-step process (GCS sync + app sync) is intentional: GCS holds the validated content as a staging area, and the app sync is a deliberate educator action that imports the content into the database. This prevents automatic database changes from unexpected content updates.
Step 6.6 — Justfile question bank commands¶
Add Justfile commands for local development with the question bank repo:
question-bank-clone: # git clone bailey-medics/quill-question-bank into question-bank/
question-bank-pull: # git -C question-bank pull
question-bank-push: # git -C question-bank push
Also add question-bank/ to .gitignore so the cloned content stays out of the main application repo.
Phase 7: Testing¶
Backend¶
- `OrganisationFeature` model CRUD
- `requires_feature` dependency (enabled, disabled, no-org cases)
- Question bank config loading + validation (valid YAML, required fields, option tag consistency)
- Validation tool: structural checks (missing config, missing question.yaml, wrong image count, stray files), content checks (invalid metadata values, missing required fields, bad option IDs), cross-item checks (pool size, duplicate IDs, answer distribution warnings)
- Validation dry-run endpoint returns structured errors/warnings without importing
- Item sync endpoints (sync from repo, validate-then-import, reject on validation errors)
- Assessment lifecycle (start → answer+score → complete → aggregate)
- Per-answer scoring: `is_correct` and `resolved_tags` set on each `POST /answer`
- Scoring engine: test `tag_percentage` and `tag_accuracy` rules independently, plus compound criteria
- Edge cases: time expiry mid-assessment (answers rejected after limit), resume after disconnect, pool < `min_pool_size`
- Storage backends (local + GCS mock)
- Config with `CLINICAL_SERVICES_ENABLED=false` (app starts without clinical services)
- Result email notification to coordinator address (from `TeachingOrgSettings`)
- Certificate PDF generation (background template + text overlay)
- Sync history: `QuestionBankSync` records created on sync, errors/warnings persisted
- `QuestionBankConfig` loaded from GCS bucket and stored in DB
Frontend¶
- `useHasFeature` hook
- `RequireFeature` guard (show/hide routes)
- `QuestionView` component (renders based on `type`: `uniform` → config-driven images/options; `variable` → per-item images/options; optional text in both modes)
- `AssessmentTimer` (countdown from config `time_limit_minutes`, expiry handling)
- `AssessmentProgress` (question X of N from config)
- `AssessmentResult` + `ScoreBreakdown` (config-driven criteria display, pass/fail)
- `QuestionBankCard` (title, description, item count from config)
- All teaching components (Storybook stories + test files)
- Teaching pages (render with mock API data + mock config)
- Navigation conditional rendering
Integration¶
- Full flow: educator syncs items from GCS bucket → candidate selects question bank → starts assessment → answers N items (each scored immediately) → completes assessment → aggregate scoring runs → pass/fail displayed → email sent → certificate downloadable
- Scoring engine: test each rule type with different configs (single criterion, compound criteria, different thresholds)
- Per-answer audit: verify `is_correct` and `resolved_tags` persisted on each answer submission
- Timer expiry: answers rejected after the time limit, assessment remains open until an explicit `POST /complete`, unanswered items penalise `tag_percentage` rules
- Feature gating: teaching routes return 403 when the feature is disabled
- EPR routes still work when the teaching feature is enabled alongside them
- Certificate: PDF generated with the correct background template, text areas populated, only available when passed
Verification checklist¶
- `just start-dev b` — app starts with `CLINICAL_SERVICES_ENABLED=false`
- `just start-dev b` — app starts normally with defaults (FHIR/EHRbase enabled, EPR mode)
- `just unit-tests-backend` — all existing and new tests pass
- `just unit-tests-frontend` — all existing and new tests pass
- `just storybook` — teaching components render correctly
- `just pre-commit` — mypy strict, ruff, eslint all pass
- Manual: teaching org → enable feature → set coordinator email + institution → educator syncs items for the colonoscopy bank → candidate selects bank → starts assessment → answers 120 items → compound score correct → result email received → certificate downloads
- Manual: EPR routes return 503 (not crash) when FHIR/EHRbase disabled
- Manual: teaching nav hidden for EPR-only organisations
- Manual: new org has zero features until admin explicitly enables them
- `terraform plan -var-file=environments/teaching/terraform.tfvars` — no unexpected diffs
- Validation: run the validator against a valid question bank → passes. Run against a bank with missing images, bad metadata, or a missing config → returns the correct errors.
- Manual: per-answer `is_correct` and `resolved_tags` persisted on each submitted answer
- Manual: answers rejected with 409 after the time limit expires; assessment not auto-completed
- Manual: sync history visible on SyncStatus page after running sync
Decisions¶
- Additive only: existing EPR code stays in place. New optional functionality goes in
features/. Teaching code goes inapp/features/teaching/(backend) andsrc/features/teaching/(frontend). Existing code (e.g. messaging) will be migrated intofeatures/later. - Config-driven question banks: each assessment type is a YAML
config.yamlin the external question bank repo (bailey-medics/quill-question-bank) defining MCQ type, images, options, tags, scoring rules, and assessment parameters. The engine is generic; adding a new question bank = new YAML file + content, validated and synced. No code changes required. - MCQ types:
uniform(fixed structure per item — same images + options, defined in config) vsvariable(each item has its own images, text, and options). Same scoring engine, different rendering and upload flows. New types can be added without schema changes (just a new frontend renderer + upload form). - All-explicit feature model: no
OrganisationFeaturerows = no access. Row existence = feature enabled; deleting the row disables it. Noenabledboolean — avoids ambiguity. Data migration seeds existing orgs withepr,messaging,lettersrows. - EPR-safe defaults:
CLINICAL_SERVICES_ENABLED: bool = True— clinical services (FHIR + EHRbase) are on by default. Teaching deployment must explicitly set= false. A missing env var never silently disables clinical functionality. - Belt-and-braces: infrastructure flag (
CLINICAL_SERVICES_ENABLED) controls whether services start; feature rows (OrganisationFeature) control whether users can access features. Both layers must agree. - Storage: GCS for cloud, local filesystem for dev. No MinIO. Abstract
StorageBackendinterface allows future backends. - Single migration history: teaching tables always created everywhere (dormant in EPR env). No Alembic branching.
- Feature keys: string constants (
"epr","teaching","messaging","letters"), not an enum — extensible without migrations. - Route prefixes:
/api/teaching/*(backend),/teaching/*(frontend). - Item status:
draft/publishedfield — only published items enter the random selection pool for assessments. - Tag-based scoring: option tags drive the scoring engine.
tag_percentageandtag_accuracyrules are composable — any number of criteria, any combination of tags. Works identically foruniform(tags from config) andvariable(tags from per-item options). New rule types (e.g.overall_accuracy,minimum_correct_count) can be added without schema changes. - Random selection: each assessment randomly draws N items from the published pool per config. Order is also randomised — no two attempts are identical.
- Timer: server-validated, duration from config. The assessment is not auto-completed on timeout. The frontend triggers `POST /complete` when the timer reaches zero. `POST /answer` calls are rejected with HTTP 409 after the time limit. Unanswered items count against `tag_percentage` rules (the denominator is always `total_items`).
- Per-answer scoring: each answer is scored immediately on submission (`POST /answer`). `is_correct` and `resolved_tags` are persisted per row on `AssessmentAnswer`. Assessment completion (`POST /complete`) aggregates already-scored answers — no re-evaluation. This provides a complete audit trail for rebuttal or retrospective item correction.
- Metadata validation: item metadata is validated against the question bank config at sync time — preventing orphan data that the scoring engine can't evaluate. The validation tool also runs as a pre-sync check and can be triggered independently (dry-run).
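The timer check and immediate scoring could combine like this. A sketch only: the dataclass fields and the item-dict shape are assumptions; the route layer (not shown) would map `TimeLimitExceeded` to HTTP 409:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class AssessmentAnswer:
    item_id: str
    selected_option: str
    is_correct: bool          # scored at submission time, never re-evaluated
    resolved_tags: list[str]  # persisted per row for the audit trail


@dataclass
class Assessment:
    started_at: datetime
    time_limit_minutes: int
    answers: list[AssessmentAnswer] = field(default_factory=list)


class TimeLimitExceeded(Exception):
    """Mapped to HTTP 409 by the route layer."""


def submit_answer(assessment: Assessment, item: dict, selected_option: str,
                  now: datetime | None = None) -> AssessmentAnswer:
    now = now or datetime.now(timezone.utc)
    deadline = assessment.started_at + timedelta(minutes=assessment.time_limit_minutes)
    if now > deadline:
        raise TimeLimitExceeded  # POST /answer rejected after the time limit
    # Score immediately and persist per row; /complete only aggregates.
    answer = AssessmentAnswer(
        item_id=item["id"],
        selected_option=selected_option,
        is_correct=selected_option == item["correct_option"],
        resolved_tags=item["option_tags"].get(selected_option, []),
    )
    assessment.answers.append(answer)
    return answer
```

Because scoring happens here, `POST /complete` never needs the question content again — it reads the persisted rows.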
- Content lives in a separate repo: question bank configs, images, and item metadata live in `bailey-medics/quill-question-bank`, not in the main application repo. This keeps large binaries out of the app codebase. Content is synced to GCS (production) or the local filesystem (dev) and imported into the database via the sync command. Educators submit content via Git PRs — no direct web upload.
- Question bank validation: a dedicated validation tool checks repo structure (config presence, subfolder naming, image counts, YAML schema, metadata values) before any sync. It runs in CI on the question bank repo, as a dry-run endpoint, and as the first step of every sync.
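The structural part of that validation could be sketched as below. Illustrative only: the function name is an assumption, it accepts either `config.yaml` or `config.yml` (the plan uses both spellings), and the image-count check would need relaxing for text-only `variable` banks:

```python
from __future__ import annotations

import re
from pathlib import Path


def validate_bank(bank_dir: str | Path) -> list[str]:
    """Structural checks run before any sync (and in CI on the content repo).

    Illustrative subset: the real tool would also validate YAML schemas
    and metadata values against the question bank config.
    """
    bank = Path(bank_dir)
    errors: list[str] = []
    if not (bank / "config.yaml").exists() and not (bank / "config.yml").exists():
        errors.append("missing config.yaml")
    for sub in sorted(p for p in bank.iterdir() if p.is_dir()):
        if not re.fullmatch(r"question_\d+", sub.name):
            errors.append(f"{sub.name}: bad folder name (expected question_<n>)")
            continue
        if not (sub / "question.yaml").exists():
            errors.append(f"{sub.name}: missing question.yaml")
        if not list(sub.glob("image_*.png")):
            errors.append(f"{sub.name}: no images found")
    return errors
```

Returning a list of errors (rather than raising on the first) suits all three call sites: CI annotations, the dry-run endpoint, and the pre-sync gate.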
- Config versioning: config changes (e.g. new pass criteria thresholds, updated options) only take effect with a `version` bump in `config.yaml`. The `Assessment.bank_version` field ties each assessment to the config version that was active when it started. In-progress assessments are always scored against the config version they were created with (read from the `QuestionBankConfig` table by version). Two assessments running concurrently can therefore use different configs if a version bump happened between their starts.
- Answer + next item combined: `POST /assessments/{id}/answer` scores the submitted answer and returns the next unanswered item inline, avoiding a separate `GET /current` round-trip. `GET /current` remains as a fallback for resuming after a disconnect.
- Certificate generation: PDF certificates use a background template (`certificate_background.pdf`) from the question bank repo, with configurable text areas for candidate name, date, institution, and score. Generated server-side using `reportlab` (or `PyPDF2` for the overlay). Only available for passed assessments with `results.certificate_download = true`.
- Coordinator email and institution: stored per org in the `TeachingOrgSettings` table, not in the question bank config. The admin UI provides fields to set these. The email is used for result notifications; the institution name appears on certificates.
- Sync history: every sync operation creates a `QuestionBankSync` row with status, item counts, errors, and warnings. The `SyncStatus` page reads from this table. Sync history persists across server restarts.
- Config stored in DB: question bank configs are pulled from the GCS bucket during sync and stored in the `QuestionBankConfig` table. The API reads configs from the DB at request time — no bucket or filesystem access is needed for serving question bank metadata.
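Pulling the decisions above together, a `config.yaml` for a `uniform` bank might look like this. Field names beyond those quoted in the plan (`version`, `type`, `min_pool_size`, `tag_percentage`, `tag_accuracy`, `results.certificate_download`) are assumptions, not the final schema:

```yaml
# Illustrative config.yaml; keys not named in the plan are assumptions.
version: 3                    # bump to make config changes take effect
type: uniform                 # every item shares this fixed structure
title: Colonoscopy optical diagnosis
items_per_assessment: 20      # N items drawn at random from the published pool
min_pool_size: 30             # refuse to start an assessment below this (HTTP 409)
time_limit_minutes: 30        # server-validated timer
options:
  - label: Adenoma
    tags: [adenoma]
  - label: Hyperplastic
    tags: [hyperplastic]
pass_criteria:                # compound: every rule must pass
  - type: tag_accuracy
    tag: adenoma
    min_percent: 80
  - type: tag_percentage
    tag: hyperplastic
    min_percent: 20
results:
  certificate_download: true
```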
Teaching deployment flow¶
How a teaching-only environment goes from zero to working:
- Terraform provisions GCP project — `terraform apply -var-file=environments/teaching/terraform.tfvars` creates the Cloud Run service, Cloud SQL (Postgres), and GCS bucket. No HAPI FHIR or EHRbase services.
- Same Docker image deploys — the backend image is identical to EPR, but env vars set `CLINICAL_SERVICES_ENABLED=false`, `TEACHING_STORAGE_BACKEND=gcs`, `TEACHING_GCS_BUCKET=quill-teaching-images`.
- App starts without clinical services — FHIR/EHRbase clients are not initialised. Clinical routes return 503. Teaching routes are available.
- Admin creates organisation — e.g. "Gastroenterology MCQs" via the admin UI or API. Enables the `teaching` feature on it. Does NOT enable `epr`, `messaging`, or `letters`. Sets the coordinator email and institution name in `TeachingOrgSettings`.
- Admin creates users — educator and learner accounts, assigned to the teaching org with the appropriate professions (`educator`/`learner`).
- Educator syncs items — triggers a sync via the admin UI or `just sync-question-bank colonoscopy-optical-diagnosis`. The sync pulls config and question YAML files from the GCS bucket (where CI/CD already deployed them), validates the content, and imports item metadata into the database. Images stay in the bucket and are served via signed URLs.
- Users see a teaching-only UI — no EPR nav items, no patient demographics, no clinical letters. Only an assessment dashboard listing available question banks, MCQ assessments, scoring results, and attempt history.
Open considerations¶
- Multi-org users: a user could belong to both an EPR org and a teaching org. `requires_feature` should check the user's active org context, not just the primary one. May need an org-switching UI in future.
- Central reporting: results are emailed to the coordinator address (from `TeachingOrgSettings`) for accreditation tracking. May need a structured export (CSV) for integration with accreditation bodies. The `AllResults` page provides the educator-facing view; the email provides the external audit trail.
- Self-registration: the spec describes users creating their own accounts (name, institution, work email). The current system uses admin-created accounts. Options: (a) a self-registration endpoint for teaching orgs with email verification, (b) admin creates accounts and sends invitation links, (c) open registration with auto-assignment to the teaching org. Decision deferred — start with admin-created accounts.
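The `requires_feature` check reduces to row existence against the user's active org. A sketch, with the `OrganisationFeature` table stood in by a set of `(org_id, feature_key)` pairs; the exception name and signature are assumptions:

```python
from __future__ import annotations


class FeatureDisabled(Exception):
    """Mapped to an HTTP 403 (or 404) by the route layer."""


def requires_feature(feature_key: str, active_org_id: int,
                     enabled_rows: set[tuple[int, str]]) -> None:
    """Row existence = enabled: no (org, feature) row means no access.

    `enabled_rows` stands in for an OrganisationFeature table query.
    Note the check uses the user's ACTIVE org, not just their primary one,
    which is what makes multi-org membership work.
    """
    if (active_org_id, feature_key) not in enabled_rows:
        raise FeatureDisabled(f"{feature_key!r} not enabled for org {active_org_id}")
```

Deleting the row disables the feature with no `enabled` boolean to get out of sync.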
- Timer behaviour: if the candidate's browser closes mid-assessment, the assessment remains open and resumable within the time limit. `GET /assessments/{id}/current` picks up where they left off. The server rejects `POST /answer` calls after the time limit with HTTP 409. The frontend triggers `POST /complete` when the timer reaches zero. If the candidate never returns, the assessment remains in an incomplete state (no `completed_at`, no `score_breakdown`) — it can be viewed by educators in reporting but does not count as a pass or fail.
- Pool size safety: if the item pool has fewer than `min_pool_size` published items, refuse to start an assessment (HTTP 409). The educator management page should show a warning banner when the pool is below the threshold.
- EPR document storage: the same `StorageBackend` abstraction could later serve EPR binary documents (clinical scans, letters). Out of scope for this plan.
- Educator analytics: future work could add cohort-level analytics (commonly misclassified items, confidence calibration curves, pass rates over time) — generic across all question banks.
- Future question bank examples: the same engine could host radiology image classification (`uniform`), medication safety MCQs (`variable` — text-only, no images), dermatology lesion assessment (`uniform` — single image), or mixed clinical scenarios (`variable` — varying images + text per question) — each as a new YAML file with the appropriate `type`.
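Two of the decisions above (random selection and pool-size safety) meet at assessment start. A sketch of that gate; the function and exception names are assumptions:

```python
from __future__ import annotations

import random


class PoolTooSmall(Exception):
    """Mapped to HTTP 409 by the route layer."""


def draw_items(published_item_ids: list[str], items_per_assessment: int,
               min_pool_size: int, rng: random.Random | None = None) -> list[str]:
    """Randomly draw N published items for a new assessment.

    Refuses to start when the published pool is below min_pool_size.
    random.sample both selects and orders at random, so the draw and the
    presentation order differ between attempts.
    """
    if len(published_item_ids) < min_pool_size:
        raise PoolTooSmall(
            f"pool has {len(published_item_ids)} items, need {min_pool_size}"
        )
    rng = rng or random.Random()
    return rng.sample(published_item_ids, items_per_assessment)
```

Only `published` items would be passed in here; `draft` items never reach the pool.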