Human code review plan¶
Systematically review all LLM-generated code in risk-prioritised order: security-critical backend first, then healthcare data handling, then business logic, then frontend auth/API, then UI components. Each file gets specific review focus areas. ~40 backend files + ~50 key frontend files across 7 phases.
Principles¶
- Risk-first: Security and healthcare-critical code reviewed before UI
- Outside-in: Start from system boundaries (auth, API, external integrations) then move inward
- Tests alongside code: Review the test for each file immediately after reviewing the file itself — catch gaps while context is fresh
- One phase at a time: Complete and sign off each phase before starting the next
- Track progress: Check off files as reviewed in the plan document
Phase 1: Security core (highest risk)¶
Review the code that protects the entire system. A bug here affects everything.
Files¶
- [ ]
backend/app/security.py(~350 lines) - Focus: Argon2 password hashing params (memory cost, time cost), JWT algorithm and claims validation, token expiry enforcement, TOTP implementation against RFC 6238, CSRF token generation and validation (itsdangerous signing), no PHI in tokens
- Test:
backend/tests/test_security.py - [ ]
backend/app/config.py(~120 lines) - Focus: All
SecretStrfields actually markedSecretStr, no secrets with defaults that could leak to prod, database URL construction (no injection), environment variable completeness - Test:
backend/tests/test_config.py - [ ]
backend/app/system_permissions/permissions.py(~70 lines) - Focus: Permission hierarchy is correct and complete,
check_permission_level()cannot be bypassed, no off-by-one in hierarchy - Also review:
backend/app/system_permissions/decorators.py— FastAPIDepends()gates raise 403 correctly - Also review:
backend/app/system_permissions/__init__.py— exports are correct - [ ]
backend/app/cbac/decorators.py(~80 lines) - Focus:
has_competency()dependency actually blocks access (not just logs), high-risk competencies are audit-logged, competency resolution logic is correct - Also review:
backend/app/cbac/competencies.pyandbackend/app/cbac/base_professions.py - Test: related CBAC tests
- [ ]
backend/app/db/auth_db.py(~60 lines) - Focus: Connection pooling params,
pool_pre_ping, session lifecycle (no leaked sessions), engine disposal - Test:
backend/tests/test_db.py
Phase 1 checklist¶
- [ ] No hardcoded secrets or default passwords
- [ ] All crypto uses standard libraries (no hand-rolled crypto)
- [ ] JWT validation checks expiry, issuer, algorithm
- [ ] TOTP window is tight (1-2 intervals max)
- [ ] Permission checks are deny-by-default
- [ ] CBAC resolution formula correct:
base + additional - removed - [ ] No SQL injection vectors (parameterised queries only)
- [ ] Error messages don't leak internal state
Phase 2: API routes and input validation¶
The main attack surface — every endpoint needs correct auth, input validation, and authorisation.
Files¶
- [ ]
backend/app/main.py(~350 lines) - Focus: Every mutating endpoint has CSRF protection, every endpoint has appropriate auth dependency, correct HTTP methods used, rate limiting on login/register, error responses don't leak PHI, CORS configuration is correct
- Test:
backend/tests/test_auth.py - [ ]
backend/app/schemas/auth.py(~60 lines) - Focus: Password validation (min length, complexity?), username sanitisation, email validation,
extra='forbid'on all models, noAnytypes - [ ]
backend/app/schemas/cbac.py(~70 lines) - Focus: Competency IDs validated against known set, no arbitrary strings accepted
- [ ]
backend/app/schemas/messaging.py(~120 lines) - Focus: Message content length limits, conversation access properly scoped
- [ ]
backend/app/schemas/letters.py(~25 lines) - Focus: Letter body sanitised (Markdown injection?), author validated
- [ ]
backend/app/schemas/features.py(~25 lines)
Phase 2 checklist¶
- [ ] Every route has auth dependency (no accidentally public endpoints)
- [ ] All mutating endpoints require CSRF
- [ ] Input validation at API boundary (Pydantic models with constraints)
- [ ] Admin endpoints check admin/superadmin permission
- [ ] Error responses use safe HTTP status codes (no 500 leaks)
- [ ] No mass assignment vulnerabilities (
extra='forbid')
Phase 3: Healthcare data handling¶
Clinical data — correctness is patient safety. FHIR/OpenEHR integrations must be robust.
Files¶
- [ ]
backend/app/fhir_client.py(~370 lines) - Focus: Patient data mapping correctness (NHS number format, name parsing, DOB), FHIR resource creation follows spec, error handling for unavailable FHIR server, no PHI in logs, avatar gradient extension is non-clinical and safe to lose
- Tests:
backend/tests/test_fhir_client.pyandtest_fhir_integration.py - [ ]
backend/app/ehrbase_client.py(~270 lines) - Focus: EHR creation idempotency (
get_or_create_ehr), subject_id validation, letter composition structure follows OpenEHR templates, Basic Auth credentials not logged, error handling - Tests:
backend/tests/test_ehrbase_client.pyandtest_ehrbase_integration.py - [ ]
backend/app/messaging.py(~280 lines) - Focus: CQRS pattern correctness (FHIR write then SQL projection), org snowballing logic, message access control (can user A see user B's messages?), participant management, no PHI in SQL projection beyond what's needed
- Test:
backend/tests/test_messaging.py - [ ]
backend/app/organisations.py(~140 lines) - Focus: Org membership queries are correct (no data leakage across orgs),
check_user_patient_access()is deny-by-default, shared org calculation - Test: related tests
- [ ]
backend/app/patient_records.py(~90 lines) - Flag as deprecated: Confirm file-based storage is actually unused, or mark for removal
Phase 3 checklist¶
- [ ] FHIR resources validate against FHIR R4 spec
- [ ] EHRbase templates are correct OpenEHR archetypes
- [ ] No PHI in application logs
- [ ] Error states fail safely (deny access, not grant)
- [ ] Idempotent operations are actually idempotent
- [ ] Cross-org data isolation is enforced
Phase 4: Data models, migrations, and teaching feature¶
Database schema correctness and the teaching module business logic.
Files¶
- [ ]
backend/app/models.py(~750 lines) - Focus: All relationships correct (FK constraints, cascade deletes),
Mapped[Type]annotations match actual DB types, no missing indexes on frequently queried columns, sensitive fields appropriately typed, password column length sufficient for Argon2 hash - Test:
backend/tests/test_models.py - [ ]
backend/alembic/versions/(19 migrations) - Focus: Migration chain is unbroken, no data-loss migrations, FK constraints present, indexes on join columns
- Spot-check 3-4 key migrations: init, CBAC, messaging, orgs
- [ ]
backend/app/features/teaching/router.py(~350 lines) - Focus: Feature gate (
requires_feature("teaching")) on all routes, educator routes require competency, assessment creation validates bank exists, scoring endpoint validates assessment ownership - [ ]
backend/app/features/teaching/scoring.py(~90 lines) - Focus: Scoring algorithm correctness, edge cases (empty answers, all wrong, all right), pass criteria evaluation
- [ ]
backend/app/features/teaching/validate.py(~220 lines) - Focus: Validation catches malformed YAML, path traversal in image refs, handles missing files gracefully
- [ ]
backend/app/features/teaching/sync.py(~170 lines) - Focus: Upsert logic (no duplicates), audit trail recorded, handles partial sync failures
- [ ]
backend/app/features/teaching/storage.py(~120 lines) - Focus: GCS bucket access is read-only, local storage doesn't escape intended directory (path traversal), temp file cleanup
- [ ]
backend/app/features/teaching/models.py(~180 lines) - Also review:
backend/app/features/teaching/schemas.py - Tests:
backend/tests/test_teaching_*.py
Phase 4 checklist¶
- [ ] No path traversal in file operations
- [ ] Database constraints match business rules
- [ ] Scoring is deterministic and correct
- [ ] Feature gates are not bypassable
- [ ] Migrations are reversible
Phase 5: Frontend security and data handling¶
Frontend auth, API communication, and data-sensitive code.
Files¶
- [ ]
frontend/src/lib/api.ts(~160 lines) - Focus: 401 retry logic doesn't loop infinitely, CSRF token attached to mutating requests,
credentials: 'include'on all requests, no token stored in localStorage, error handling doesn't expose internals - [ ]
frontend/src/auth/AuthContext.tsx - Focus: Auth state transitions are correct, logout clears all state, no stale auth after session expiry, reload doesn't race
- [ ]
frontend/src/auth/RequireAuth.tsx - Focus: Unauthenticated users redirected (not shown content briefly), no flash of protected content
- [ ]
frontend/src/auth/RequirePermission.tsx - Focus: Permission check is deny-by-default, hierarchy comparison matches backend
- [ ]
frontend/src/auth/GuestOnly.tsx - Focus: Authenticated users redirected away from login
- [ ]
frontend/src/auth/RequireFeature.tsx - Focus: Feature check correct
- [ ]
frontend/src/main.tsx - Focus: All routes wrapped in correct guards, no unprotected admin routes, route definitions match expected permissions
- [ ]
frontend/src/lib/cbac/hooks.ts - Focus: Competency checks match backend logic,
useHasCompetencydoesn't default totrue - [ ]
frontend/src/types/cbac.ts - Focus: Types match backend schema, type guards are correct
- [ ]
frontend/src/lib/fhir-patient.ts - Focus: Patient data extraction handles missing fields, no crash on malformed FHIR response
- [ ]
frontend/src/RootLayout.tsx - Focus: Patient context provider, layout structure
Phase 5 checklist¶
- [ ] No tokens or credentials in localStorage/sessionStorage
- [ ] API client uses httpOnly cookies only
- [ ] Auth guards are deny-by-default
- [ ] No XSS vectors (React handles most, but check
dangerouslySetInnerHTML) - [ ] Permission/CBAC checks mirror backend (defence in depth)
- [ ] Route guards cannot be bypassed by direct URL navigation
Phase 6: Frontend pages and components (lower risk)¶
UI code — focus on correctness and UX rather than security (already gated by Phase 5 guards). Spot-check key pages rather than all 34.
Priority pages (data mutations or PHI)¶
- [ ] Login/Register pages (credential handling)
- [ ] Admin user management pages (permission assignment)
- [ ] Patient creation/messaging pages (PHI display)
- [ ] Teaching assessment pages (scoring display)
Priority components (sensitive data)¶
- [ ]
Messagingcomponent (message content) - [ ]
LetterView(clinical content rendering — check for XSS in Markdown) - [ ]
Demographics(PHI display) - [ ]
MarkdownView(sanitisation) - [ ]
MultiStepForm(form submission)
Phase 6 checklist¶
- [ ] No
dangerouslySetInnerHTMLwithout sanitisation - [ ] Markdown rendering is sanitised (XSS via Markdown)
- [ ] Forms validate before submission
- [ ] Error states handled (loading, error, empty)
- [ ] No PHI in
console.logstatements - [ ] Consistent UI patterns (Container, responsive breakpoints)
Phase 7: Infrastructure and configuration¶
Files¶
- [ ]
compose.dev.yml— Hardcoded passwords (acceptable for dev?), network isolation, health checks - [ ]
Justfile— Command correctness, no destructive commands without confirmation - [ ]
caddy/prod/Caddyfile— Security headers complete, CSP policy, HSTS, caching - [ ]
shared/*.yaml— Competency definitions, profession templates, jurisdiction config (clinical accuracy — may need clinical safety officer input) - [ ]
infra/(Terraform) — Spot-check secrets management, IAM, firewall rules - [ ] Dev scripts — Guard mechanism works, no accidental prod access
Phase 7 checklist¶
- [ ] Dev credentials not usable in production
- [ ] Prod security headers are comprehensive
- [ ] Clinical competency definitions are clinically accurate
- [ ] Terraform secrets use Secret Manager (no plaintext)
- [ ] Destructive dev scripts have safety guards
Review workflow per file¶
- Read the file end-to-end
- Check against the focus areas listed for that phase
- Add inline comments for anything unclear or suspicious
- Review the corresponding test file — are edge cases covered?
- Run the tests if changes are made
- Mark as reviewed with initials and date
Key decisions¶
- Review order follows risk: security, API, healthcare, models, frontend auth, UI, infra
- Tests reviewed alongside source files, not in a separate pass
- Deprecated code (
patient_records.py) flagged for removal rather than deep review - Infrastructure reviewed last because it's less likely to have LLM-introduced bugs
- Shared YAML files may need clinical safety officer review for clinical accuracy, not just code review
- Phase 6 is spot-check only — reviewing all 34 pages and 180+ components isn't practical; focus on data-handling ones