v1.3.4: AppSec auth + panel metrics UX
Two bug fixes surfaced while operating a real AppSec-enabled stack (post-setup-appsec.sh) on v1.3.3.
Bug A: panel health probe spams CrowdSec with "missing API key"
Symptom — CrowdSec container log shows the panel's IP (typically 172.20.0.4) hitting :7423 every 5 minutes with:
level=error msg="Unauthorized request from '172.20.0.4:...' (real IP = ):
missing API key" module=acquisition.appsec name=argos-appsec-detect type=appsec
Correction to the initial hypothesis. The task filing suspected Caddy's bouncer plugin was missing the AppSec API key. It wasn't: the plugin v0.12.1 source (internal/core/appsec.go) does set X-Crowdsec-Appsec-Api-Key from c.APIKey on every request, and Caddy's crowdsec.appsec logger shows zero "appsec component not authenticated" errors. The noise comes from the panel's own AppSec health probe, added in v1.3.2, which hit :7423 with no auth headers.
Fix. backend/internal/appsec/healthcheck.go now reads CROWDSEC_BOUNCER_API_KEY from the panel container env (the same env var Caddy reads) and sets the X-Crowdsec-Appsec-Api-Key header on every probe, matching what the bouncer plugin does at request time. After upgrade, CrowdSec logs the probe as a normal 4xx (whatever AppSec returns for a GET to the root, usually 405) and the "missing API key" spam stops.
Status interpretation simplified to a pure liveness probe:
- Any HTTP response other than 404 (200, 401, 403, 405, 500, etc.) → healthy. CrowdSec AppSec returns 500 to a GET-without-AppSec-headers even when the sidecar is perfectly up; an earlier draft of this fix treated 5xx as a hard down signal and fired false appsec_unavailable events on healthy stacks.
- 404 → unhealthy. This one is specific: the AppSec route doesn't exist, usually because setup-appsec.sh never ran on the CrowdSec container.
- Network error (dial refused, timeout, DNS) → unhealthy.
The same simplification is applied to appsec/hub.go, which backs the Status card's "collections installed" count. That probe was previously sending unauthenticated GETs and relying on CrowdSec's 401 response as proof-of-life — same pattern, same log spam. Now it sends the bouncer key (one more silencer for the CrowdSec log noise) and accepts any HTTP status as "listener up".
Bug B: "metrics from lapi: crowdsec not configured" was misleading
Symptom — AppSec page in the UI bailed with a red banner:
Could not load AppSec state: metrics from lapi: crowdsec not configured. Check that the crowdsec container is healthy.
This fired whenever CrowdSec machine credentials (user + password) weren't set up, which is common: argos-edge only requires the bouncer key to function, and the machine credentials are an extra setup step (cscli machines add argos-panel --password) that most homelab operators defer or skip entirely.
Root cause. The AppSec metrics endpoint calls LAPI.ListAlerts, which hits GET /v1/alerts. That endpoint requires a machine JWT and is unreachable with bouncer credentials alone. The panel's API returned the "crowdsec not configured" sentinel verbatim as a 502, and the UI treated that as a fatal error blocking the whole page.
The message was technically true but operationally misleading: it suggested the whole CrowdSec stack was broken when in fact the bouncer was fine, the AppSec endpoint was fine, and only the metrics aggregation needed extra credentials.
Fix. Split the response:
- GET /api/appsec/metrics now returns HTTP 200 with a partial payload when the backing LAPI call fails with crowdsec.ErrNotConfigured:
  {
    "window": "24h",
    "mode": "detect",
    "total_hits": 0,
    "by_category": [],
    ...
    "degraded": {
      "code": "machine_credentials_missing",
      "message": "AppSec metrics require CrowdSec machine credentials..."
    }
  }
- Other LAPI errors still return 502 (the endpoint-is-broken case is genuinely a hard error).
- The UI now reads metrics.degraded and renders a scoped yellow banner where the charts would go, with a how-to link. The AppSec status card above, which reads from a different code path that works with bouncer-only creds, renders normally.
- The degraded.code enum has three slots: machine_credentials_missing, crowdsec_unreachable, lapi_error. Only the first is emitted today; the others are reserved for future diagnostics.
Docs
- docs/features/appsec.md grows a new "Panel metrics vs endpoint reachability" section explaining the bouncer-vs-machine credential split and how to add machine credentials to unlock metrics.
- docs/operations/troubleshooting.md gains two new entries:
  - CrowdSec logs: "missing API key" from the panel's IP every 5 minutes (cause + v1.3.4 fix).
  - AppSec page shows "metrics unavailable: machine credentials missing" (not a bug, explains the scoped banner, points at the walk-through on the feature page).
Not changed
- Caddy / bouncer plugin behaviour is unchanged. Caddy continues to send the AppSec API key correctly via the plugin's built-in header-set logic. No changes to caddycfg or the emitted Caddy config.
- No DB migrations. The degraded field is additive on the JSON response; old frontend builds against the new backend simply ignore it.
- No settings changes. appsec.mode + appsec.fail_open semantics unchanged.
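Why an additive field is safe for older clients can be illustrated with a small Go sketch. It assumes a hypothetical pre-v1.3.4 client that decodes the metrics response into a fixed shape (oldMetrics, invented here); the real frontend is not shown in these notes. Go's encoding/json ignores unknown keys by default, which is the same tolerance the notes describe for old frontend builds.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldMetrics is a hypothetical client-side shape that predates the
// degraded field.
type oldMetrics struct {
	Window    string `json:"window"`
	Mode      string `json:"mode"`
	TotalHits int    `json:"total_hits"`
}

// newPayload mimics a v1.3.4 response carrying the extra degraded block.
const newPayload = `{
  "window": "24h",
  "mode": "detect",
  "total_hits": 0,
  "by_category": [],
  "degraded": {"code": "machine_credentials_missing", "message": "..."}
}`

// decodeOld parses a response the way the older client would; keys it does
// not know about (by_category, degraded) are skipped without error.
func decodeOld(payload []byte) (oldMetrics, error) {
	var m oldMetrics
	err := json.Unmarshal(payload, &m)
	return m, err
}

func main() {
	m, err := decodeOld([]byte(newPayload))
	fmt.Println(m, err)
}
```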
Upgrade
Expected post-upgrade:
- CrowdSec "missing API key" log entries stop (usually within 5 min of the next probe cycle).
- AppSec page renders without the red top-level error, even if machine credentials are still missing; the scoped banner appears instead.
- Request-time AppSec behaviour unchanged.
Related
- v1.3.2 release notes: where appsec.fail_open and the appsec_unavailable notification shipped.
- AppSec (WAF-inline): the new section on bouncer vs machine credentials.
- Troubleshooting → AppSec: both new entries for this release.