# v1.3.2 — AppSec fail-open (bug fix)
Patch release. Fixes an availability regression where a misconfigured or missing AppSec sidecar would cascade into HTTP 500 responses for every host on the panel.
## The bug
Pre-v1.3.2, the panel emitted `apps.crowdsec.appsec_url` in its Caddy config whenever `appsec.mode != disabled`. The bouncer plugin (`hslatman/caddy-crowdsec-bouncer/appsec`) defaults to fail-closed behaviour: if the AppSec HTTP endpoint is unreachable, the handler returns an error and Caddy serves HTTP 500.
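For intuition, a minimal sketch of what fail-closed versus fail-open looks like at the handler level. This is not the plugin's code; `queryAppSec` is a hypothetical stand-in for the inline AppSec round-trip.

```go
// Conceptual sketch only, not the bouncer plugin's implementation.
// Shows why an unreachable AppSec endpoint becomes HTTP 500 under
// fail-closed, and a pass-through under fail-open.
package appsec

import (
	"errors"
	"net/http"
)

// queryAppSec stands in for the inline WAF round-trip; it fails when the
// sidecar endpoint (e.g. :7423) refuses connections.
func queryAppSec(r *http.Request) (blocked bool, err error) {
	return false, errors.New("dial tcp 172.20.0.2:7423: connect: connection refused")
}

func appSecMiddleware(next http.Handler, failOpen bool) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		blocked, err := queryAppSec(r)
		if err != nil {
			if failOpen {
				next.ServeHTTP(w, r) // skip the WAF check for this request only
				return
			}
			http.Error(w, "appsec component unavailable", http.StatusInternalServerError)
			return
		}
		if blocked {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```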
That default is dangerous for argos-edge installations because:
- The CrowdSec image ships with zero AppSec collections by default. Ports `:7422` and `:7423` refuse connections until `setup-appsec.sh` is run.
- AppSec mode defaults to `detect` at first boot, which means the panel emits the `appsec_url` from minute zero.
- A single dead sidecar = every host on the panel 500s. Twenty hosts, one broken CrowdSec, zero working sites.
Confirmed in the wild on a freshly deployed v1.3.1:

    caddy_error.log → "crowdsec.appsec" "appsec component unavailable"
                      "dial tcp 172.20.0.2:7423: connect: connection refused"
    curl app.example.com → HTTP 500
## The fix
The plugin already exposes an `appsec_fail_open` knob on its app-level config (v0.12.1). v1.3.2 starts emitting it, defaulted to `true`. Requests now pass through to the backend when the AppSec sidecar is unreachable or errors; the WAF-inline round-trip is skipped for that request.
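A minimal sketch of the panel-side emission under the new default; the struct, function names, and exact JSON shape are illustrative assumptions rather than a copy of caddycfg.

```go
// Illustrative sketch of the v1.3.2 emission rule: appsec_fail_open rides
// along with appsec_url, and neither field appears when AppSec is disabled,
// so Caddy never sees an orphan field.
package caddycfg

type crowdSecApp struct {
	APIURL         string `json:"api_url"`
	APIKey         string `json:"api_key"`
	AppSecURL      string `json:"appsec_url,omitempty"`
	AppSecFailOpen *bool  `json:"appsec_fail_open,omitempty"` // pointer: omitted entirely when AppSec is off
}

func applyAppSec(app *crowdSecApp, mode, appSecURL string, failOpen bool) {
	if mode == "disabled" {
		return // no appsec_url, no appsec_fail_open
	}
	app.AppSecURL = appSecURL
	app.AppSecFailOpen = &failOpen // true unless the operator selects "Fail-closed (strict)"
}
```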
Operators who actively run AppSec and want strict enforcement can flip the new AppSec → Fail policy card in the panel from "Fail-open (default, recommended)" to "Fail-closed (strict)".
## Pieces shipped
- `caddycfg` emits `appsec_fail_open: true|false` inside the `apps.crowdsec` block whenever `appsec_url` is set. Absent when AppSec is fully disabled so Caddy doesn't get an orphan field.
- New setting `appsec.fail_open` (bool, default `true`) wired through `api/settings.go` + the reconciler.
- New notification event `appsec_unavailable` — severity warning, fires on the reachable → unreachable transition of a background 5-minute probe of the AppSec URL (see the sketch after this list). `appsec.fail_open` true means traffic keeps flowing; the notification is how the operator learns their WAF-inline silently degraded.
- New UI card AppSec → Fail policy with a two-radio chooser (fail-open vs fail-closed). Only shown when AppSec mode is not `disabled`.
- New troubleshooting section in `docs/operations/troubleshooting.md` covering the "connection refused on :7423" symptom, the `setup-appsec.sh` runbook, and how to hook up the new notification rule.
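As referenced above, a sketch of the edge-triggered probe behind `appsec_unavailable`; the prober and notifier types here are assumptions, not the panel's real interfaces.

```go
// Sketch of the background probe: every 5 minutes the AppSec URL is
// dialled, and appsec_unavailable fires only on the reachable → unreachable
// edge, not on every failed probe.
package probe

import (
	"context"
	"net/http"
	"time"
)

// notifyFunc stands in for the panel's notification dispatcher.
type notifyFunc func(event, severity, detail string)

func watchAppSec(ctx context.Context, appSecURL string, notify notifyFunc) {
	client := &http.Client{Timeout: 5 * time.Second}
	ticker := time.NewTicker(5 * time.Minute)
	defer ticker.Stop()

	reachable := true // assume healthy until the first probe says otherwise
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			resp, err := client.Get(appSecURL)
			ok := err == nil
			if ok {
				resp.Body.Close()
			}
			if reachable && !ok {
				// Edge-triggered: fires once when the sidecar goes away.
				notify("appsec_unavailable", "warning", err.Error())
			}
			reachable = ok
		}
	}
}
```

Edge-triggering keeps the event from re-firing every five minutes while the sidecar stays down.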
## What's NOT here
- Automatic `setup-appsec.sh` execution: the panel does not try to install AppSec collections for you. Fail-open prevents breakage; getting AppSec actually running is still an operator task.
- Block-mode auto-downgrade: the panel does not flip a host from `block` to `detect` when AppSec is down. Fail-open is per-request; the policy is at the bouncer level, not the host level.
- AppSec health badge in the status card: out of scope for the hotfix. Today the signal is via the notification event + `GET /api/appsec/status`, consumed by the card as before.
## Migration
Drop-in. The new setting defaults to "true" without a DB migration (it's a key/value setting, read-on-demand; absence = default). Existing stacks pick up the fix on the next reconcile after deploying the new panel + caddy images.
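A sketch of what "absence = default" means for a read-on-demand key/value setting, assuming a plain SQL settings table; the panel's actual storage layer may differ.

```go
// Illustrative only: reading appsec.fail_open on demand, with a missing
// row meaning the default (true). No migration is needed because nothing
// has to exist in the settings table for the default to apply.
package settings

import "database/sql"

func appSecFailOpen(db *sql.DB) (bool, error) {
	var raw string
	err := db.QueryRow(
		`SELECT value FROM settings WHERE key = ?`, "appsec.fail_open",
	).Scan(&raw)
	if err == sql.ErrNoRows {
		return true, nil // absence = default: fail-open
	}
	if err != nil {
		return false, err
	}
	return raw == "true", nil
}
```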
No data touched. The `appsec.fail_open` setting surfaces in the UI immediately as "Fail-open (default)".
## Rollback
    git checkout v1.3.1
    docker compose build && docker compose up -d
The setting row may remain in `settings` on rollback — v1.3.1's `settingWhitelist` doesn't know about it, but the row is inert: the older reconciler never reads it. Harmless orphan, gone on the next v1.3.2+ upgrade.
## Related
- AppSec troubleshooting section — the "connection refused on :7423" runbook.
- Notifications event catalog — `appsec_unavailable` is the new event.
- v1.3.1 — previous release (screenshots patch).