Security audit and penetration testing plan¶
Comprehensive white-hat security audit of Quill Medical — a healthcare SPA with FastAPI backend, React frontend, FHIR/OpenEHR integrations, and Docker infrastructure. The audit covers code review (Phase 1) then automated + manual testing (Phase 2). One confirmed vulnerability found during discovery (unsanitised Markdown→HTML in email templates). Overall security posture is strong — Bandit SAST and Dependabot/Renovate dependency scanning are already in CI. Remaining gap: no secret scanning in CI.
Phase 1: Code review (static analysis)¶
1A. ~~Confirmed vulnerability — unsanitised Markdown in email templates~~ FIXED¶
- File:
backend/app/features/teaching/email_templates.py(line ~111) - Issue:
markdown.markdown(body_md)output was used directly without HTML sanitisation - Fix applied: Added
nh3.clean()aroundmarkdown.markdown()output; addednh3dependency - Tests: Added
test_sanitises_malicious_html_in_bodyandtest_preserves_safe_html_after_sanitisation - Severity: Medium (email clients often strip scripts, but not guaranteed)
1B. ~~Authentication and session management review~~ REVIEWED AND FIXED¶
- Files:
backend/app/security.py,backend/app/main.py(auth endpoints),backend/app/schemas/auth.py
Confirmed secure (no action needed):
- JWT algorithm pinned:
algorithms=[settings.JWT_ALG]— no"none"confusion possible - JWT secret enforces
min_length=32via Pydantic field constraint - Cookie flags correct: HttpOnly, SameSite=lax, Secure=True in production configs
- Login returns generic "Invalid credentials" — no account enumeration via login
- Forgot-password always returns
{"detail": "ok"}— no account enumeration - CSRF double-submit cookie validates header + cookie match + signature
- Refresh token scoped to
/api/auth/refreshpath (minimal exposure) - Argon2id password hashing with OWASP-recommended defaults
Vulnerabilities found and fixed:
| # | Severity | Issue | Fix |
|---|---|---|---|
| 1 | Medium | No session invalidation on password change/reset — old refresh tokens valid for 7 days | Added token_version column to User model; included tv claim in access + refresh JWTs; current_user and refresh dependencies reject stale tokens; change_password and reset_password increment token_version |
| 2 | Medium | Account enumeration via registration — distinct "Username already exists" / "Email already exists" errors | Changed both messages to generic "Username or email already in use" |
| 3 | Low | Inconsistent password minimum: register=6 chars, change/reset=8 chars | Standardised to 8 characters across all endpoints |
| 4 | Low | TOTP disable didn't require password re-entry — session-hijack could silently disable 2FA | Added TotpDisableIn schema requiring password field; endpoint now verifies password before disabling |
| 5 | Low | Missing rate limiting on TOTP setup/verify/disable endpoints | Added @limiter.limit("5/minute") to all three TOTP endpoints |
| 6 | Info | Auth schemas missing extra='forbid' (Pydantic defence-in-depth) |
Added model_config = ConfigDict(extra="forbid") to all auth schemas |
Remaining informational items (accepted risk):
- No TOTP replay protection: Same 6-digit code can be reused within the 30-second window. Low risk: requires valid session + code + CSRF, and the window is very short. Server-side used-code tracking would add Redis complexity disproportionate to the risk.
- Logout doesn't invalidate tokens server-side: Clears cookies only. With 15-minute access token TTL and token_version on password change, the exposure window is minimal.
1C. ~~Authorisation and access control review~~ REVIEWED AND FIXED¶
- Files:
backend/app/main.py,backend/app/push.py,backend/app/messaging.py,backend/app/cbac/
Confirmed secure (no action needed):
- All messaging endpoints enforce participant-based access control via
check_user_patient_access - External patient invites properly validate granter authority (patient-self or admin)
- Organisation list/detail endpoints appropriately restrict admin-only operations
- Conversation detail returns 404 (not 403) for unauthorised access — prevents IDOR enumeration
- Teaching feature endpoints gated by feature flags and organisation membership
Vulnerabilities found and fixed:
| # | Severity | Issue | Fix |
|---|---|---|---|
| 1 | Critical | PATCH /cbac/my-competencies let any authenticated user modify their own competencies — could grant clinical or admin-level access |
Added admin/superadmin permission check + CSRF protection |
| 2 | Medium | POST /users, PATCH /users/{user_id} (admin endpoints) missing CSRF protection |
Changed from DEP_CURRENT_USER to DEP_REQUIRE_CSRF |
| 3 | Medium | PUT /organizations/{org_id}, POST /organizations, POST /organizations/{org_id}/staff missing CSRF protection |
Changed from DEP_CURRENT_USER to DEP_REQUIRE_CSRF |
| 4 | Medium | POST /patients, PATCH /patients/{patient_id}, POST /patients/{patient_id}/deactivate, POST /patients/{patient_id}/activate missing CSRF protection |
Added DEP_REQUIRE_CSRF to dependencies |
| 5 | Low | POST /push/subscribe had no authentication — any anonymous client could register push subscriptions |
Added cookie-based authentication check |
Remaining informational items:
POST /patientsallows any authenticated user to create patients (no role check). This may be intentional for certain workflows. Consider addingDEP_REQUIRE_ROLES_CLINICIANif only clinicians should create patients.PATCH /patients/{patient_id}similarly allows any authenticated user to edit demographics. Consider role restriction.POST /accept-invitesetssystem_permissionsfrom the invite token'suser_typefield. The token is cryptographically signed, so this is safe from manipulation, but theuser_typevalues ("external_hcp", "patient_advocate") aren't in the standard permission hierarchy.
1D. ~~Input validation and injection review~~ REVIEWED AND FIXED¶
- Files:
backend/app/schemas/(all),backend/app/main.py(inline schemas),backend/app/messaging.py,backend/app/ehrbase_client.py,backend/app/fhir_client.py
Confirmed secure (no action needed):
- FHIR queries use object-based construction via
fhirclientlibrary — no injection risk - Frontend renders Markdown with DOMPurify sanitisation — stored XSS mitigated on output
- Email sending via Resend API (structured params, not raw SMTP) — header injection mitigated
Vulnerabilities found and fixed:
| # | Severity | Issue | Fix |
|---|---|---|---|
| 1 | Medium | Multiple input schemas missing extra='forbid' — LetterIn, CompetencyCheck, UpdateCompetenciesRequest, PrescriptionRequest, FeatureToggleIn, AdminUserCreateIn, AdminUserUpdateIn, TotpVerifyIn, FHIRPatientCreateIn |
Added model_config = {"extra": "forbid"} / ConfigDict(extra="forbid") to all request schemas |
| 2 | Low | Email fields in RegisterIn and ForgotPasswordIn used plain str — no RFC validation or header injection checks |
Changed to EmailStr (Pydantic + email-validator) for RFC-compliant validation |
Remaining informational items:
- Message body not sanitised server-side: Server stores raw Markdown. DOMPurify on the frontend handles XSS. Defence-in-depth suggests server-side sanitisation too, but this is low risk given the existing frontend protection. Could add
nh3.clean()on message body before storage as a future improvement. - Letter content not sanitised server-side: Same as messages — stored as raw Markdown in EHRbase, rendered safely by frontend DOMPurify.
- Password complexity: Only length is validated (min 8 chars). No uppercase/digit/special character requirement. This is a trade-off — NIST SP 800-63B recommends length over complexity rules. Current approach is acceptable.
1E. ~~Infrastructure and configuration review~~ REVIEWED — NO FIXES NEEDED¶
- Files:
compose.dev.yml,caddy/prod/Caddyfile, Dockerfiles,backend/app/config.py
Confirmed secure (all areas):
- CSP: Strict policy —
default-src 'self',script-src 'self',frame-ancestors 'none'. Image exceptions for GCS only. - HSTS: 2-year max-age with includeSubDomains and preload.
- Security headers: X-Frame-Options DENY, X-Content-Type-Options nosniff, Referrer-Policy strict-origin-when-cross-origin, Permissions-Policy disables camera/microphone/geolocation, Server header removed.
- Docker security: Non-root user (
appuserUID 10001), multi-stage builds, slim/alpine base images, digest-pinned for reproducibility. - Network isolation: Backend on private network only (not mapped to host); only Caddy/frontend exposed.
- Secrets:
.envin.gitignore, production secrets via GCP Secret Manager,JWT_SECRETenforcesmin_length=32. - CORS: Default
["*"]in dev, production set via Cloud Run environment variables. - SECURE_COOKIES: Default
falsein dev, explicitlytrueincompose.prod.cloud-run.ymlandinfra/main.tf. - Rate limiting (infrastructure): GCP Cloud Armor WAF handles global rate limiting in production.
Informational notes:
- Dev compose has a hardcoded Postgres password (
vspct8I2iLUkfC3JsXxefy). This is development-only; the DB is on the private Docker network and not exposed to the host. Acceptable for dev. - Application-level rate limiting covers auth endpoints (login 5/min, register 3/min, forgot-password 3/min, TOTP 5/min). Other endpoints rely on Cloud Armor in production.
1F. ~~Dependency security audit~~ REVIEWED — MIGRATION RECOMMENDED¶
- Files:
backend/pyproject.toml,frontend/package.json
Confirmed in place:
- Dependabot vulnerability alerts (weekly scan of pip, npm, Docker, Terraform, GitHub Actions) + Renovate version-bump PRs with 3-tier severity policy and immediate alerts for critical/high CVEs
pip-audit findings (31 CVEs across 9 packages):
| Package | Installed | CVEs | Fix Version | Impact |
|---|---|---|---|---|
| starlette | 0.40.0 | CVE-2025-54121, CVE-2025-62727 | 0.47.2+ | ASGI framework (via FastAPI) — update via Renovate |
| ecdsa | 0.19.1 | CVE-2024-23342 (timing attack) | No fix available | Used only by python-jose — migration to PyJWT removes this |
| aiohttp | 3.12.15 | 18 CVEs | 3.13.3+ | Transitive dependency — update via Renovate |
| urllib3 | 2.5.0 | 3 CVEs | 2.6.0+ | Update via Renovate |
| requests | 2.32.5 | 1 CVE | 2.33.0+ | Update via Renovate |
Actionable recommendation:
- Migrate from
python-josetoPyJWT:python-joseis no longer actively maintained and pulls in theecdsalibrary (CVE-2024-23342 — timing side-channel has no fix).PyJWTis actively maintained and usescryptographynatively withoutecdsa. The migration is straightforward since both libraries have similar APIs (jwt.encode(),jwt.decode()). This is the only dependency issue that Renovate/Dependabot cannot automatically resolve. - All other CVEs: Should be addressed by Renovate version-bump PRs. No manual action needed.
1G. ~~CI/CD security pipeline gaps~~ REVIEWED¶
- Files:
.github/workflows/,.pre-commit-config.yaml
Confirmed in place:
- Bandit SAST runs via pre-commit on every commit (backend Python code)
- Semgrep SAST runs in CI for frontend JavaScript/TypeScript code
- Dependabot vulnerability alerts with weekly scan
- Renovate version-bump PRs with severity-based auto-merge policy
- Pre-commit checks: ruff, black, mypy, cspell, bandit
Gap identified:
- No secret scanning in CI: No
trufflehog,gitleaks, ordetect-secretsintegration. GitHub's built-in secret scanning / push protection should be enabled in the repository settings (Settings → Code security → Secret scanning → Enable). This is a GitHub Enterprise feature and requires no workflow changes — just enable it in the repository settings.
Phase 2: Active testing (penetration testing)¶
Tooling¶
| Tool | Purpose | Install |
|---|---|---|
| Schemathesis | Property-based API fuzzing from OpenAPI spec — auto-generates thousands of test cases covering boundary values, malformed inputs, type confusion, response schema validation, and stateful chained requests | poetry add --group dev schemathesis |
| Hypothesis | Property-based testing for individual functions (token parsing, password hashing, input validation) — generates random inputs to find edge cases | poetry add --group dev hypothesis |
| pytest | Targeted manual security test cases (already installed) | — |
| httpx | Craft malicious HTTP requests for CSRF bypass, cookie manipulation, header injection (already installed) | — |
Implementation status¶
All Phase 2 tests are in backend/tests/test_security_pentest.py (38 tests, all passing).
| Category | Tests | Status |
|---|---|---|
| Crypto round-trips (Hypothesis) | TestPasswordHashRoundTrip (2), TestCSRFTokenRoundTrip (2), TestPasswordResetTokenRoundTrip (1), TestJWTTokenRoundTrip (2) |
✅ All pass — 200 examples each |
| JWT manipulation | TestJWTManipulation — alg:none, wrong secret, expired, tampered payload, stale token_version |
✅ All 5 attacks rejected |
| CSRF bypass | TestCSRFBypass — missing header, wrong token, cross-user token |
✅ All 3 bypasses rejected |
| Authentication attacks | TestAuthenticationAttacks — login rate limit, register rate limit, anti-enumeration, short password |
✅ All 4 pass |
| Privilege escalation | TestPrivilegeEscalation — user→admin POST/PATCH /users, CBAC self-update |
✅ All 3 escalations blocked |
| Input validation | TestInputValidation — extra fields, SQL injection, XSS, fuzz login (Hypothesis, 50 examples) |
✅ All 4 pass, no 500s |
| Cookie security | TestCookieSecurity — HttpOnly flags, logout clears cookies |
✅ All 2 pass |
| Unauthenticated access | TestUnauthenticatedAccess — 10 protected endpoints parametrised |
✅ All return 401 |
Hypothesis findings during development:
- Whitespace-only strings (e.g. \r) correctly rejected by create_csrf_token input validation — strategy constrained to non-whitespace-only strings
Monthly CI workflow¶
.github/workflows/security-pentest.yml — runs on 1st of each month at 03:00 UTC and via workflow_dispatch. Uploads JUnit XML results as a 90-day artefact. Slack notification on failure.
~~2A. Authentication attacks~~ TESTED — ALL PASS¶
Covered by TestJWTManipulation, TestAuthenticationAttacks, TestCookieSecurity, and TestJWTTokenRoundTrip in test_security_pentest.py:
- JWT alg:none, wrong secret, expired tokens, tampered payloads → all rejected
- Stale token_version after password change → rejected
- Login and registration rate limiting → enforced
- Registration anti-enumeration → identical generic errors
- Cookie HttpOnly and SameSite flags → verified
- Logout clears cookies → verified
~~2B. Authorisation testing (IDOR / privilege escalation)~~ TESTED — ALL PASS¶
Covered by TestPrivilegeEscalation and TestUnauthenticatedAccess:
- Regular user → POST /api/users (admin endpoint) → 403
- Regular user → PATCH /api/users/{id} (privilege escalation) → 403
- Regular user → PATCH /api/cbac/my-competencies (self-promote) → 403
- 10 protected endpoints → 401 without authentication
~~2C. Injection testing~~ TESTED — ALL PASS¶
Covered by TestInputValidation:
- SQL injection payloads in login → safe (400/401, no 500s)
- XSS payloads in registration → safe (no 500s)
- Extra/unexpected fields → rejected by Pydantic extra='forbid' (422)
- Hypothesis fuzz: 50 random input pairs to login → no 500s
~~2D. API abuse testing~~ TESTED — ALL PASS¶
Covered by TestAuthenticationAttacks and TestCSRFBypass:
- Login brute force → rate limited after 5 attempts (429)
- Registration spam → rate limited after 3 attempts (429)
- CSRF bypass with missing/forged/cross-user tokens → all rejected
~~2E. Web push subscription abuse~~ TESTED IN PHASE 1C¶
Covered by test_subscribe_requires_auth in test_push.py:
- Anonymous push subscribe → rejected (authentication required)
Key files¶
| Area | Files |
|---|---|
| Auth/Security | security.py, main.py, config.py, models.py |
| Access control | messaging.py, organisations.py, cbac/, system_permissions/ |
| Input validation | schemas/ |
| Vulnerable | email_templates.py — unsanitised markdown |
| Infrastructure | compose.dev.yml, caddy/prod/Caddyfile, Dockerfiles |
| Frontend security | api.ts, MarkdownView.tsx, auth/ |
| Pentest tests | tests/test_security_pentest.py |
| CI | .github/workflows/, .github/workflows/security-pentest.yml |
Verification¶
- ~~Fix 1A → run
just ub, write test confirming HTML tags are stripped~~ ✅ - ~~Phase 1 produces a findings document with CVSS-like severity ratings~~ ✅ Above
- ~~Phase 2 produces pytest tests in
backend/tests/test_security_pentest.pyas regression tests~~ ✅ 38 tests - Review Dependabot alerts dashboard for any open vulnerability advisories
- Review Bandit output from pre-commit:
pre-commit run bandit --all-files - ~~Each finding should be reproducible with a curl command or test case~~ ✅ All findings have tests
- ~~Pentest run produces zero server errors (
5xx) against all endpoints~~ ✅ Hypothesis fuzz confirmed - ~~Monthly CI workflow created and posts artefact with results~~ ✅
security-pentest.yml
Decisions¶
- All testing inside Docker containers per project conventions
- Severity ratings: Critical / High / Medium / Low / Informational
- Focus: OWASP Top 10 + healthcare-specific risks (PHI exposure, clinical data integrity)
- Out of scope: Social engineering, physical security, third-party SaaS (Resend, GCS)
Further considerations¶
- python-jose vs PyJWT —
python-josehas had historical CVEs; consider flagging for future migration toPyJWT - TOTP secret encryption at rest — currently plaintext in DB; could encrypt with application-level key (medium priority, depends on DB encryption posture)
- Password reset flow — now implemented (
/api/auth/forgot-password,/api/auth/reset-password) with time-limited signed tokens; needs to be included in Schemathesis test coverage