v1.3.25 -- Scenarios + AppSec threshold tuning UI

The remaining two items from the v1.3.20+ elevated scope. Both follow the v1.3.19 sentinel-file pattern and share the script-extension work, so they were co-developed in this release.

Five-strike upstream pattern (continued)

A 5-min pre-implementation check confirmed LAPI v1.7.7 has no hub-state API. Probed /v1/hub, /v1/scenarios, /v1/collections -- all 404. Read the full v1 route table in pkg/apiserver/controllers/controller.go: only alerts/decisions/heartbeat/watchers exist. The CrowdSec hub state lives entirely on the local filesystem managed by cscli, never exposed via the API.

This makes v1.3.25 the fifth release where the documented upstream surface turned out to be narrower than expected (v1.3.18 client_ip semantics, v1.3.20 plugin scope=Country absence, v1.3.22×2 LAPI filter asymmetry + silent chunk drops). The 5-min pre-check pattern saved hours of building against a nonexistent endpoint -- documented as a working-agreement memory.

What ships

Source-of-truth: read-only mount of the crowdsec config volume

docker-compose.yml adds one mount to the argos service:

volumes:
  - argos_data:/data
  - argos_shared_setup:/data/shared
  - crowdsec_config:/crowdsec-state:ro    # NEW

The panel reads /crowdsec-state/scenarios/*.yaml to enumerate installed scenarios. Each .yaml file is a symlink whose target encodes the owner prefix (crowdsecurity/foo); the reader recovers the canonical name from the link target.
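The name-recovery step can be sketched as follows. This is a minimal illustration, assuming link targets follow the usual hub layout `.../hub/scenarios/<owner>/<name>.yaml`; the function name `canonicalFromTarget` is hypothetical and not taken from the argos source.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// canonicalFromTarget derives "owner/name" from a symlink target such as
// "/etc/crowdsec/hub/scenarios/crowdsecurity/ssh-bf.yaml".
// Assumption: the last two path elements are <owner>/<name>.yaml.
// The function name and the layout are illustrative only.
func canonicalFromTarget(target string) (string, bool) {
	if !strings.HasSuffix(target, ".yaml") {
		return "", false
	}
	name := strings.TrimSuffix(path.Base(target), ".yaml")
	owner := path.Base(path.Dir(target))
	// Reject targets that carry no owner directory at all.
	if name == "" || owner == "" || owner == "." || owner == "/" {
		return "", false
	}
	return owner + "/" + name, true
}

func main() {
	// In a real reader the target would come from os.Readlink on each
	// /crowdsec-state/scenarios/*.yaml entry.
	got, ok := canonicalFromTarget("/etc/crowdsec/hub/scenarios/crowdsecurity/ssh-bf.yaml")
	fmt.Println(got, ok)
}
```

Returning a boolean instead of an error keeps the caller free to skip entries whose target does not match the assumed layout.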

Read-only is critical -- the panel never writes here; setup-appsec.sh owns the writes via cscli. The mount is operator-trusted (same volume crowdsec already uses).

`backend/internal/security/scenarios` package

Reads the mount, applies the operator-disabled set from settings, returns a typed list. Graceful failure: missing / empty mount yields IsAvailable=false so the UI can render "is crowdsec running?" rather than crashing.

6 unit tests cover the empty-mount case, symlink resolution, disabled-set application, short-name tolerance (operators tend to type appsec-native, not crowdsecurity/appsec-native), non-YAML file filtering, and CSV-format dedupe.
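The short-name tolerance and CSV dedupe behaviors can be sketched roughly as below. The helper names `parseDisabledCSV` and `isDisabled` are hypothetical, and storing the disabled set as a comma-separated setting is an assumption drawn from the "CSV format dedupe" test name.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDisabledCSV splits a comma-separated setting into a set.
// Using a map collapses duplicates; blank items are ignored.
// Hypothetical helper, not the actual argos function.
func parseDisabledCSV(csv string) map[string]bool {
	set := make(map[string]bool)
	for _, item := range strings.Split(csv, ",") {
		if item = strings.TrimSpace(item); item != "" {
			set[item] = true
		}
	}
	return set
}

// isDisabled reports whether canonical ("owner/name") is in the set,
// accepting either the canonical form or the bare short name.
func isDisabled(set map[string]bool, canonical string) bool {
	if set[canonical] {
		return true
	}
	// LastIndex returns -1 when there is no slash, so short == canonical.
	short := canonical[strings.LastIndex(canonical, "/")+1:]
	return set[short]
}

func main() {
	set := parseDisabledCSV("appsec-native, crowdsecurity/ssh-bf,appsec-native")
	fmt.Println(len(set))
	fmt.Println(isDisabled(set, "crowdsecurity/appsec-native"))
	fmt.Println(isDisabled(set, "crowdsecurity/http-probing"))
}
```

One trade-off of matching on the bare short name: two scenarios from different owners that share a short name would both be treated as disabled.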

Two new sentinels

/data/shared/argos-disabled-scenarios.txt -- one canonical scenario name per line. setup-appsec.sh runs cscli scenarios remove --force per line. Idempotent on re-run.
/data/shared/argos-appsec-tuning.txt -- key=value format with inbound_threshold and outbound_threshold. The script regenerates /etc/crowdsec/appsec-rules/argos-tuning.yaml from these values on its next run; a missing or empty sentinel falls back to the v1.3.19 defaults (15 inbound, 4 outbound).

backend/internal/security/files.go gains WriteDisabledScenarios and WriteAppSecTuning. Reuses the v1.3.19 atomic-write helper.
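A minimal sketch of the tuning sentinel's key=value format and of a temp-file-plus-rename atomic write. The names `renderTuning`, `parseTuning`, and `writeAtomic` are hypothetical; the actual v1.3.19 helper may differ, and the parser here only illustrates the documented fallback to 15 inbound / 4 outbound.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// renderTuning produces the key=value sentinel body.
func renderTuning(inbound, outbound int) string {
	return fmt.Sprintf("inbound_threshold=%d\noutbound_threshold=%d\n", inbound, outbound)
}

// parseTuning reads the sentinel body; missing or malformed keys keep the
// v1.3.19 defaults (15 inbound, 4 outbound).
func parseTuning(text string) (inbound, outbound int) {
	inbound, outbound = 15, 4
	for _, line := range strings.Split(text, "\n") {
		key, val, ok := strings.Cut(strings.TrimSpace(line), "=")
		if !ok {
			continue
		}
		n, err := strconv.Atoi(strings.TrimSpace(val))
		if err != nil {
			continue
		}
		switch strings.TrimSpace(key) {
		case "inbound_threshold":
			inbound = n
		case "outbound_threshold":
			outbound = n
		}
	}
	return inbound, outbound
}

// writeAtomic writes to a temp file in the destination directory, then
// renames it over dest so readers never observe a half-written sentinel.
// Illustrative only; not the argos helper itself.
func writeAtomic(dest string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(dest), ".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dest)
}

func main() {
	dir, err := os.MkdirTemp("", "sentinel-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	dest := filepath.Join(dir, "argos-appsec-tuning.txt")
	if err := writeAtomic(dest, []byte(renderTuning(10, 4))); err != nil {
		panic(err)
	}
	raw, err := os.ReadFile(dest)
	if err != nil {
		panic(err)
	}
	in, out := parseTuning(string(raw))
	fmt.Println(in, out)
}
```

Creating the temp file in the same directory as the destination matters: a rename is only atomic when both paths are on the same filesystem.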

Six new endpoints

GET    /api/security/scenarios
       -> { scenarios: [{short_name, source, canonical_name,
                         path, disabled}],
            is_available, mount_path, disabled_count,
            last_modified_at, last_applied_at, reload_needed }

PATCH  /api/security/scenarios/{name}
       body: {"disabled": true|false}
       -> 200 + { name, disabled, changed }
       (Idempotent: setting to current state is a no-op.)

POST   /api/security/scenarios/mark-applied
       -> 200 + { last_applied_at }
       (Operator asserts they ran setup-appsec.sh; clears the
        Pending Reload badge.)

GET    /api/security/appsec-tuning
       -> { inbound_threshold, outbound_threshold,
            last_modified_at, last_applied_at, reload_needed }

PATCH  /api/security/appsec-tuning
       body: { inbound_threshold?, outbound_threshold? }
       -> 200 + new state
       (Partial update; validates 1..100 range.)

POST   /api/security/appsec-tuning/mark-applied
       -> 200 + { last_applied_at }

All audit-logged. PATCH-with-state (not POST .../toggle) for idempotency, matching the existing /appsec/mode pattern.
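The partial-update semantics of PATCH /api/security/appsec-tuning could look roughly like the sketch below. The type and function names (`Tuning`, `TuningPatch`, `applyPatch`) are hypothetical; pointer fields are one common Go way to distinguish "field omitted" from "field set", and the `changed` flag is an assumed convenience for skipping a no-op sentinel rewrite.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Tuning is the current state; TuningPatch is the request body.
// A nil pointer means the field was omitted from the PATCH.
type Tuning struct{ Inbound, Outbound int }

type TuningPatch struct {
	Inbound  *int `json:"inbound_threshold,omitempty"`
	Outbound *int `json:"outbound_threshold,omitempty"`
}

var errRange = errors.New("threshold must be in 1..100")

func inRange(n int) bool { return n >= 1 && n <= 100 }

// applyPatch validates every supplied field before returning a new state,
// so a bad value never produces a half-applied update. changed is false
// when the patch restates the current values (idempotent no-op).
func applyPatch(cur Tuning, p TuningPatch) (Tuning, bool, error) {
	next := cur
	if p.Inbound != nil {
		if !inRange(*p.Inbound) {
			return cur, false, errRange
		}
		next.Inbound = *p.Inbound
	}
	if p.Outbound != nil {
		if !inRange(*p.Outbound) {
			return cur, false, errRange
		}
		next.Outbound = *p.Outbound
	}
	return next, next != cur, nil
}

func main() {
	var p TuningPatch
	if err := json.Unmarshal([]byte(`{"inbound_threshold": 10}`), &p); err != nil {
		panic(err)
	}
	next, changed, err := applyPatch(Tuning{Inbound: 15, Outbound: 4}, p)
	fmt.Println(next, changed, err)
}
```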

setup-appsec.sh extension

apply_panel_sentinels() gains two new behaviors:

Regenerate argos-tuning.yaml from /shared/argos-appsec-tuning.txt. Runs after copy_file stages the v1.3.19-default version, so operator-set values override the upstream copy.
Apply panel-disabled scenarios by reading /shared/argos-disabled-scenarios.txt and running cscli scenarios remove --force per non-comment line. Tolerates blank lines + # comments. Idempotent.

Order in the script: install collections (idempotent restore) → copy files → regenerate tuning → run hardcoded v1.3.19 removes → apply panel disables → write whitelist → reload. An operator who re-enables a previously-disabled scenario via the UI sees it actually come back on the next script run because install_collection runs first.

UI: two new tabs in `/security`

Tab strip grows from 3 to 5 tabs: Banned IPs · Whitelist · Activity · Scenarios · AppSec + the existing Hosts ↗ link on the right.

Layout fits comfortably on 1200px+ viewports. Narrower viewports may overflow; switching to a dropdown is deferred to a future release if the issue surfaces in browser smoke.

Scenarios tab

Lists installed scenarios with source / status / per-row Disable or Re-enable button. The empty-mount state renders an explainer with the three likely causes (crowdsec not running, missing volume mount, scenarios not yet installed).

AppSec tab

Single form: inbound threshold + outbound threshold (1..100). Preset hints: "CRS default: 5; argos default: 15" so the operator sees the trade-off at a glance.

Pending reload badge (shared)

Both tabs render a persistent amber badge at the top when last_modified_at > last_applied_at. The badge tells the operator to run setup-appsec.sh and click "Mark as applied". The badge survives page refreshes -- a step better than the v1.3.19 whitelist's toast-only UX.
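The badge condition reduces to a timestamp comparison. A minimal sketch, assuming the backend computes the reload_needed field this way and that a zero timestamp means "never happened"; `reloadNeeded` and the `at` helper are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// at builds a timestamp from Unix seconds; demo helper only.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// reloadNeeded mirrors the badge rule last_modified_at > last_applied_at.
// Assumption: a zero lastModified means nothing was ever changed, and a
// zero lastApplied means the operator never clicked "Mark as applied".
func reloadNeeded(lastModified, lastApplied time.Time) bool {
	if lastModified.IsZero() {
		return false
	}
	return lastModified.After(lastApplied)
}

func main() {
	fmt.Println(reloadNeeded(at(200), at(100)))      // modified after last apply
	fmt.Println(reloadNeeded(at(100), at(200)))      // applied after last change
	fmt.Println(reloadNeeded(at(100), time.Time{})) // changed, never applied
}
```

Because the rule depends only on two persisted timestamps, the badge state is reproducible after a page refresh.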

Limitation documented in the badge tooltip: if setup-appsec.sh errors, marking applied won't fix the underlying CrowdSec state. v1.3.25 trusts the operator's assertion; real drift detection (panel queries cscli scenarios list directly) is v1.3.26+ work.

Tests

6 scenarios package tests (mount-availability, symlink resolution, disabled-set application, short-name tolerance, non-yaml filtering, CSV dedupe).
All 23 existing backend test packages are still green (sessions, country, publicip, crowdsec, db, api, etc.).

Smoke gate

Per the working agreement (smoke verifies effect, not specs):

Disable scenario via UI → panel writes argos-disabled-scenarios.txt → "Pending reload" badge appears → operator runs docker compose exec crowdsec /setup-appsec.sh → cscli scenarios list confirms the scenario was removed → operator clicks "Mark as applied" → badge clears.
Re-enable scenario via UI → sentinel rewritten without that scenario → re-run setup-appsec.sh → install_collection reinstalls the previously-removed scenario → cscli scenarios list shows it back.
Change inbound_threshold via UI (e.g. 15 → 10) → sentinel rewritten → reload script regenerates argos-tuning.yaml with the new value → head /etc/crowdsec/appsec-rules/argos-tuning.yaml shows the new threshold.
Mark as applied → badge clears → no false-positive pending state.
Empty crowdsec scenarios dir → UI shows the explainer, no crash.

No tag until the smoke run genuinely passes against the prod stack.

Files changed

Backend

backend/internal/security/scenarios/scenarios.go (new)
backend/internal/security/scenarios/scenarios_test.go (new) -- 6 tests
backend/internal/security/files.go -- WriteDisabledScenarios, WriteAppSecTuning
backend/internal/api/security_scenarios.go (new) -- 6 handlers
backend/internal/api/handlers.go -- ScenariosReader field
backend/internal/server/server.go -- 6 new routes
crowdsec/setup-appsec.sh -- regenerate argos-tuning.yaml and apply panel-disabled scenarios
docker-compose.yml -- crowdsec_config:/crowdsec-state:ro read-only mount on argos

Frontend

frontend/src/api/client.ts -- 6 new methods + types (SecurityScenarioItem, SecurityScenariosResponse, SecurityAppSecTuning).
frontend/src/pages/Security.tsx -- two new tabs (ScenariosTab, AppSecTab) + shared PendingReloadBadge component. Tab strip TABS const grows from 3 to 5.

Docs

docs/release-notes/v1.3.25.md (this file)
CHANGELOG.md, mkdocs.yml, version bump

Upgrade

cd argos-edge
git pull
docker compose build
docker compose up -d

The new compose-volume mount means docker compose up -d recreates the argos container (mount surface change). No migrations. No env vars.

After the deploy, visit /security, click Scenarios -- the table should populate from the read-only mount within seconds.

If your operational dir is separate from your git checkout

crowdsec/setup-appsec.sh is bind-mounted from the crowdsec/ directory next to your docker-compose.yml. If you keep your git checkout (argos-edge/) separate from the dir docker compose runs from (argos-prod/ or similar), you must sync the crowdsec/ tree on every release that touches the script. v1.3.25 prod-smoke caught this: the compose-volume mount + the panel image were both at v1.3.25, but /setup-appsec.sh (bind-mounted) was the v1.3.19 version, breaking the entire script-reload chain.

# After git pull on the source dir, sync to the operational dir:
rsync -a --delete \
    /path/to/argos-edge/crowdsec/ \
    /path/to/argos-prod/crowdsec/
# Then docker compose up -d as above.

Future releases: the dual-dir gap is documented in docs/operations/persistence.md (the "Bind-mount repo files" note clarifies that ./crowdsec/* lives in git and must be present at the path the running compose resolves it from). A Makefile deploy target or a scripts/sync-prod.sh helper is on the v1.3.26 backlog if the manual sync proves error-prone in practice.

Smoke automation

scripts/smoke/scenarios-toggle.sh and scripts/smoke/appsec-tuning.sh automate the full panel-PATCH → sentinel → script reload → cscli/yaml inspect chain. They are the canonical smoke gate for v1.3.25 work going forward; an operator-side browser smoke is no longer the gate.

# Capture an active session token first:
SESSION=$(docker run --rm -v argos_prod_data:/data alpine sh -c \
    "apk add --no-cache sqlite >/dev/null 2>&1
     sqlite3 /data/argos.db \"
       SELECT token FROM sessions
        WHERE expires_at > datetime('now')
        ORDER BY id DESC LIMIT 1;\"")

ARGOS_SESSION_TOKEN="${SESSION}" \
PANEL_BASE_URL=http://localhost:9180 \
CROWDSEC_CONTAINER=argos-prod-crowdsec \
TEST_SCENARIO=crowdsecurity/CVE-2017-9841 \
    ./scripts/smoke/scenarios-toggle.sh

ARGOS_SESSION_TOKEN="${SESSION}" \
PANEL_BASE_URL=http://localhost:9180 \
CROWDSEC_CONTAINER=argos-prod-crowdsec \
    ./scripts/smoke/appsec-tuning.sh

Both scripts include cleanup traps so a partial failure restores the panel state on exit.

Not in v1.3.25

Drift detection (panel queries cscli scenarios list to compare panel-intent vs actual CrowdSec state). v1.3.26+ if the manual "Mark as applied" trust model proves unsatisfactory in dogfood.
Scenario descriptions (parse .index.json). Deferred until the format is confirmed stable across CrowdSec versions.
Per-scenario rule-ID disable (e.g. disable just CRS rule 920420 without disabling the whole crs collection). Not asked for; would be a larger redesign.