v1.3.29 -- Per-host true_detect_mode (dormant column activated)

Closes the longest-running deferred feature in argos-edge. hosts.true_detect_mode has existed as a dormant column since migration 028 (v1.3.19). v1.3.27 attempted to ship it via Caddy per-handler appsec_url but was blocked by upstream plugin limitations. v1.3.28 deferred it to v1.3.29 with a planning doc. v1.3.29 ships it via CrowdSec profiles.yaml, empirically validated end-to-end.

What it does

Toggling True detect mode on a host (Edit host -> Access section) makes the panel write a profiles.yaml entry that suppresses LAPI decision creation for AppSec alerts whose target_fqdn / target_host matches the host. Alerts continue to be logged (visible in cscli alerts list + the panel's Activity tab). Only the alert -> scenario -> ban pipeline is intercepted.

Useful for hosts whose legitimate traffic triggers AppSec false positives (socket.io polling, monitoring tools, hot-reload dev servers). The host's actual AppSec block-or-allow behaviour is unchanged; only the cross-host IP-banning side effect is silenced.

Why this required a spike + smoke

The v1.3.27 release rejected the profiles.yaml path as "upstream-unsupported" and suggested that the operator pivot to a custom Caddy template emit. v1.3.27's own pre-flight verification then proved the Caddy path architecturally impossible against the pinned plugin.

v1.3.29's spike re-tested profiles.yaml empirically:

  • PHASE 0 (~30 min): confirmed Alert.Events[].Meta is accessible from filter expressions via expr-lang's any() iterator. The argos-managed marker block already exists in /etc/crowdsec/profiles.yaml -- the v1.3.19 author had started this implementation, written the panel-side WriteTrueDetectHosts writer, but stopped before wiring the script-side consumer. v1.3.29 finishes the wiring.
  • PHASE 1 (smoke surfaced two issues):
      • Inband WAF alerts use target_fqdn meta; the spike sampled an outofband-scenario alert with target_host. Filter must check both keys.
      • The original "scenario contains 'appsec'" gate was too narrow -- inband WAF scenarios are named like "anomaly score block: ..." with no "appsec" substring.
  • PHASE 2 (smoke design pivot): a real-attack-burst smoke produced 0 decisions in BOTH the detect-on and detect-off phases, so the detect-on "pass" was a false positive. The cause: the argos stack's inband AppSec listener does not feed crowdsec-appsec-outofband (which filters on evt.Appsec.HasOutBandMatches == true). Switched the smoke to synthetic LAPI alert injection: hand-craft an alert with remediation=true + target_fqdn=<test-host> and POST it directly to LAPI. This isolates the profile filter as the only thing capable of suppressing the resulting ban.
  • PHASE 3 (final blocker): smoke kept failing with "filter loaded but didn't match". Root cause: docker bind mounts pin the inode at container-start time. rsync replaces files via tempfile+rename (new inode), so the freshly-synced setup-appsec.sh was NOT visible inside the running crowdsec container. Fix: docker compose restart crowdsec after make sync-prod. Documented in the upgrade section + the seven-strike memo.
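
The two PHASE 1 findings boil down to one matching rule: look at every event's meta, accept either key, and do not gate on the scenario name. The Go sketch below only models that rule for illustration; the Meta and Event types and the matchesTrueDetect name are hypothetical stand-ins, not CrowdSec's own types, and the real evaluation is done by expr-lang inside CrowdSec.

```go
package main

import "fmt"

// Meta is a hypothetical stand-in for one key/value pair of event metadata.
type Meta struct{ Key, Value string }

// Event is a hypothetical, simplified stand-in for one entry of Alert.Events.
type Event struct{ Meta []Meta }

// matchesTrueDetect reports whether any event carries a target_host or
// target_fqdn meta whose value is one of the toggled hosts. There is
// deliberately no check on the scenario name.
func matchesTrueDetect(events []Event, hosts map[string]bool) bool {
	for _, ev := range events {
		for _, m := range ev.Meta {
			if (m.Key == "target_host" || m.Key == "target_fqdn") && hosts[m.Value] {
				return true
			}
		}
	}
	return false
}

func main() {
	hosts := map[string]bool{"host1.example": true}

	inband := []Event{{Meta: []Meta{{Key: "target_fqdn", Value: "host1.example"}}}}
	outofband := []Event{{Meta: []Meta{{Key: "target_host", Value: "host1.example"}}}}
	other := []Event{{Meta: []Meta{{Key: "target_fqdn", Value: "other.example"}}}}

	fmt.Println(matchesTrueDetect(inband, hosts))    // true
	fmt.Println(matchesTrueDetect(outofband, hosts)) // true
	fmt.Println(matchesTrueDetect(other, hosts))     // false
}
```

A filter that checked only target_host would return false for the inband case here, which is the gap the smoke surfaced.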

Smoke 8/8 green after the bind-mount fix:

[1/8] PUT true_detect_mode=true       OK
[2/8] sentinel contains test host      OK
[3/8] setup-appsec.sh splices + bounces crowdsec   OK
[4/8] inject synthetic alert (target_fqdn=h, remediation=true)  OK
[5/8] alert appears in cscli alerts list   OK
[6/8] zero decisions for source IP   <-- detect ON suppression
[7/8] PUT true_detect_mode=false; re-splice; re-inject   OK
[8/8] one decision for source IP    <-- detect OFF baseline

What ships

backend/internal/security/files.go::WriteProfilesYAML

A pure-string formatter, formatProfilesYAML(domains []string) string, wrapped by WriteProfilesYAML, which runs the DB query and performs the atomic write. Five unit tests cover the zero-hosts placeholder, the single-host filter shape, the multi-host in-list join (sorted, alpha-first), idempotent re-runs (byte-identical output), and quote-escaping for defensive future-proofing.

The emitted YAML when at least one host has the toggle on:

name: argos_true_detect_mode
debug: false
filters:
 - 'len(Alert.Events) > 0 && any(Alert.Events, any(.Meta, (.Key == "target_host" || .Key == "target_fqdn") && .Value in ["host1.example", "host2.example"]))'
decisions: []
on_success: break
---

When the filter matches, decisions: [] creates no decision and on_success: break stops profile evaluation, so the alert never falls through to default_ip_remediation.
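
For orientation, a formatter that produces the block above could look roughly like the following. This is a minimal sketch, not the code in files.go: the zero-hosts placeholder text and the escaping scheme are assumptions, and only the function signature and the emitted YAML shape come from these notes.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// placeholder is a hypothetical zero-hosts output; the real text may differ.
const placeholder = "# argos: no hosts in true detect mode\n"

// profileTemplate mirrors the YAML shown above; %s receives the host list.
const profileTemplate = `name: argos_true_detect_mode
debug: false
filters:
 - 'len(Alert.Events) > 0 && any(Alert.Events, any(.Meta, (.Key == "target_host" || .Key == "target_fqdn") && .Value in [%s]))'
decisions: []
on_success: break
---
`

// formatProfilesYAML renders the profile for the given domains. It sorts a
// copy of the input so the output is byte-identical across re-runs.
func formatProfilesYAML(domains []string) string {
	if len(domains) == 0 {
		return placeholder
	}
	sorted := append([]string(nil), domains...)
	sort.Strings(sorted)

	quoted := make([]string, len(sorted))
	for i, d := range sorted {
		// Assumed escaping: backslash-escape for the double-quoted expr
		// string, then double any single quote for the YAML scalar.
		d = strings.ReplaceAll(d, `\`, `\\`)
		d = strings.ReplaceAll(d, `"`, `\"`)
		d = strings.ReplaceAll(d, `'`, `''`)
		quoted[i] = `"` + d + `"`
	}
	return fmt.Sprintf(profileTemplate, strings.Join(quoted, ", "))
}

func main() {
	fmt.Print(formatProfilesYAML([]string{"host2.example", "host1.example"}))
}
```

Sorting before joining is what makes the idempotency test meaningful: the same set of hosts yields the same bytes regardless of query order, so an unchanged file does not trigger a restart.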

crowdsec/setup-appsec.sh::splice_profiles_yaml

awk-based block replacement between the existing markers in /etc/crowdsec/profiles.yaml. Idempotent; sets PROFILES_CHANGED=1 only if the file actually changed. main inspects that global flag and, when it is set, runs kill -TERM 1 to bounce the container (CrowdSec does NOT hot-reload profile changes via SIGHUP; a full restart is required). docker's restart: unless-stopped policy brings the container back in ~5s.
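
The shipped implementation is awk inside a shell script. Purely to illustrate the replace-between-markers idea and the "changed" flag, here is the same logic sketched in Go. The marker strings are hypothetical (the real markers in profiles.yaml are not reproduced in these notes), and the sketch assumes each marker sits alone on its own line.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical marker lines; the real ones in profiles.yaml may differ.
const (
	beginMarker = "# BEGIN argos-managed"
	endMarker   = "# END argos-managed"
)

// spliceBlock replaces everything between the markers with body and reports
// whether the content changed, which is the role PROFILES_CHANGED plays.
func spliceBlock(content, body string) (string, bool, error) {
	i := strings.Index(content, beginMarker)
	if i < 0 {
		return "", false, errors.New("begin marker not found")
	}
	start := i + len(beginMarker)
	j := strings.Index(content[start:], endMarker)
	if j < 0 {
		return "", false, errors.New("end marker not found")
	}
	if body != "" && !strings.HasSuffix(body, "\n") {
		body += "\n" // keep the end marker on its own line
	}
	out := content[:start] + "\n" + body + content[start+j:]
	return out, out != content, nil
}

func main() {
	original := beginMarker + "\nold: block\n" + endMarker + "\nname: default_ip_remediation\n"

	updated, changed, err := spliceBlock(original, "name: argos_true_detect_mode\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("changed:", changed)

	// A second run with the same body is a no-op, so no restart is needed.
	_, changedAgain, _ := spliceBlock(updated, "name: argos_true_detect_mode\n")
	fmt.Println("changed on re-run:", changedAgain)
}
```

Reporting "changed" separately from performing the write is what keeps the restart conditional: an unchanged splice never costs the ~5s bounce.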

Frontend

  • "True detect mode" checkbox in the Edit Host modal Access section (next to "LAN-only access"). Tooltip explains the semantics + reload requirement.
  • DETECT badge on the hosts list beside hostnames where the flag is set.
  • Host + HostInput TypeScript types in client.ts now carry true_detect_mode.

Smoke

scripts/smoke/true-detect-mode.sh -- the EFFECT-verifying smoke described above. The operator passes TEST_HOST=<existing host> as an env var; the script handles JWT login + synthetic alert injection automatically.

Files changed

  • backend/internal/security/files.go (WriteProfilesYAML + formatProfilesYAML)
  • backend/internal/security/files_test.go (new, 5 tests)
  • backend/internal/reconciler/reconciler.go (call swap)
  • backend/cmd/argos/main.go (argosVersion bump)
  • frontend/src/api/client.ts (Host + HostInput types)
  • frontend/src/pages/Hosts.tsx (form state + checkbox + list badge)
  • frontend/package.json (version bump)
  • crowdsec/setup-appsec.sh (splice_profiles_yaml + main hook)
  • scripts/smoke/true-detect-mode.sh (new)
  • docs/release-notes/v1.3.29.md (this file)
  • CHANGELOG.md, mkdocs.yml

Upgrade

cd ~/argos-edge
git pull
make sync-prod                 # rsyncs the new setup-appsec.sh
                               # + scripts/smoke/* into the
                               # operational dir
docker compose -f /path/to/argos-prod/docker-compose.yml \
    restart crowdsec           # CRITICAL: refreshes the
                               # bind-mount inode for the new
                               # setup-appsec.sh; without this
                               # the container keeps running
                               # the old script

Then rebuild + redeploy the panel for the version-string bump:

cd ~/argos-prod
docker build -f backend/Dockerfile -t argos-prod-argos:v1.3.29 .
# update docker-compose.override.yml: image: argos-prod-argos:v1.3.29
docker compose up -d --force-recreate --no-deps argos

The first time you toggle a host's true_detect_mode and run setup-appsec.sh, the script splices the filter and restarts crowdsec automatically (~5s downtime).

Bind-mount inode invalidation: documented gotcha

Adding to the seven-strike upstream-behaviour memo: rsync replaces files via tempfile+rename, which changes the inode, while docker bind mounts pin the inode at container start. After a make sync-prod that touches any bind-mounted file, the operator must run docker compose restart <service> for the new file to be visible inside the container. Affects: setup-appsec.sh, Caddyfile, crowdsec/acquis.{yaml,d/}, crowdsec/appsec-*. This is NOT a CrowdSec or Caddy bug; it is standard docker bind-mount semantics. Documented in docs/operations/deployment.md's recovery section + this release's upgrade path.
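
The inode change itself is easy to reproduce outside docker. The sketch below is a standalone Go program, not part of argos-edge: it replaces a file via tempfile+rename and then via an in-place write, and uses os.SameFile to compare the file identity before and after each update. The helper names and the temp directory are illustrative only.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// replaceViaRename mimics the tempfile+rename update: write a sibling temp
// file, then rename it over the target.
func replaceViaRename(path string, data []byte) error {
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o755); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// sameFileAfter reports whether path still refers to the same underlying
// file (same device and inode on Unix) after update has run.
func sameFileAfter(path string, update func() error) (bool, error) {
	before, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	if err := update(); err != nil {
		return false, err
	}
	after, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	return os.SameFile(before, after), nil
}

// demo runs both update styles against a scratch file.
func demo() (renameSame, inPlaceSame bool, err error) {
	dir, err := os.MkdirTemp("", "inode-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "setup-appsec.sh")
	if err = os.WriteFile(path, []byte("echo old\n"), 0o755); err != nil {
		return false, false, err
	}
	renameSame, err = sameFileAfter(path, func() error {
		return replaceViaRename(path, []byte("echo new\n"))
	})
	if err != nil {
		return false, false, err
	}
	inPlaceSame, err = sameFileAfter(path, func() error {
		return os.WriteFile(path, []byte("echo newer\n"), 0o755)
	})
	return renameSame, inPlaceSame, err
}

func main() {
	renameSame, inPlaceSame, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("same file after tempfile+rename:", renameSame)
	fmt.Println("same file after in-place write: ", inPlaceSame)
}
```

The rename case reports a different file, which is the situation a container holding the old inode cannot see; the in-place write keeps the identity. The restart in the upgrade steps remains the documented fix.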

Not changed

  • All v1.3.28 backend / frontend / migration code unchanged.
  • LAPI WAL mode (v1.3.28) untouched.
  • Drift detector (v1.3.27) untouched.
  • Migration 031 still latest.
  • No new migration in v1.3.29 (column already existed since 028).