Key Concepts

These concepts appear throughout the system. Understanding them is essential before proceeding to configuration and operation.

The nftables Mark — Layout and Origin

Every DHCP packet that traverses the system carries a 32-bit nftables mark. The mark has structure: the high byte encodes the DHCP message type, the low 24 bits identify the client.

  bit  31              24 23                         0
       +----------------+----------------------------+
       | DHCP msg type  |     24-bit client mark     |
       +----------------+----------------------------+
              0xFF000000          0x00FFFFFF

High byte (8 bits) — DHCP message type code (0x01 = DISCOVER, 0x03 = REQUEST, 0x11 = SOLICIT v6, etc.). Set by the userspace inspector before returning the verdict. The route_by_type chain at prerouting priority -98 vmaps on this byte to dispatch the packet to the matching per-msg-type chain.
Low 24 bits (client mark) — derived from the last 3 bytes of the client MAC. Used as the lookup key in llm_*_marks, blocked_macs, and tracked_marks.

MAC:         AA:BB:CC:DD:EE:FF
Client mark: 0x00DDEEFF        (DD<<16 | EE<<8 | FF)
For a DHCPREQUEST (0x03), composite mark seen in nftables:
             0x03DDEEFF

Per-msg-type chains strip the type bits with `meta mark set meta mark & 0x00FFFFFF` before consulting llm_*_marks, so set lookups always operate on the bare 24-bit client mark.
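The mark arithmetic above can be sketched in a few lines of Python. This is an illustration only, not the inspector's actual code: the function names client_mark_from_mac, compose_mark, and split_mark are hypothetical, and the only facts taken from the text are the two masks and the byte layout.

```python
TYPE_MASK = 0xFF000000    # high byte: DHCP message type
CLIENT_MASK = 0x00FFFFFF  # low 24 bits: client mark


def client_mark_from_mac(mac: str) -> int:
    """Return the 24-bit client mark built from the last 3 bytes of a MAC."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"not a 6-byte MAC address: {mac!r}")
    return (octets[3] << 16) | (octets[4] << 8) | octets[5]


def compose_mark(msg_type: int, client_mark: int) -> int:
    """Place the message type in the high byte above the client mark."""
    if not 0 <= msg_type <= 0xFF:
        raise ValueError("message type must fit in one byte")
    return (msg_type << 24) | (client_mark & CLIENT_MASK)


def split_mark(mark: int) -> tuple[int, int]:
    """Return (message type, client mark) from a composite mark."""
    return (mark & TYPE_MASK) >> 24, mark & CLIENT_MASK


# DHCPREQUEST (0x03) from the example MAC used above.
mark = compose_mark(0x03, client_mark_from_mac("AA:BB:CC:DD:EE:FF"))
print(f"0x{mark:08X}")  # 0x03DDEEFF
```

Calling split_mark on the composite value returns the type byte and the bare client mark separately, which mirrors what the per-msg-type chains do when they mask the mark before a set lookup.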

Collision Risk

Two MACs that share the same last 3 bytes will receive the same client mark.

At typical network scale (thousands of clients) the chance that any particular client shares its mark with another is very small: roughly n / 2^24 for n clients with evenly spread suffixes, or about 0.03% at 5,000 clients. At very large scale (millions of unique MACs during load testing) collisions become far more likely. Events in ClickHouse are always logged with the full 6-byte MAC, so a mark collision only affects kernel-level enforcement precision and the GUI’s mark-to-MAC reverse lookup (which displays as ??:??:??:xx:xx:xx because only the last 3 bytes can be recovered from the mark).

A collision only hurts when the colliding mark ends up in an actionable set such as blocked_macs or llm_throttled_marks — that is, when one of the two devices has communicated frequently enough or behaved suspiciously enough to be enforced against. Two MACs sharing the last 3 bytes but both behaving normally never trigger any enforcement and the collision is invisible. The scenario where a collision causes real harm — a legitimate device and a concurrently active attacker sharing the same 24-bit suffix, with the attacker triggering a block that incidentally catches the legitimate device — requires both clients to be present, both to be actively talking on the wire, and one to be misbehaving in a way that earns enforcement. At realistic deployment sizes this combination is highly unlikely.
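For a rough sense of scale, the sketch below estimates collision likelihood for a given client population. It assumes MAC suffixes are spread uniformly over the 24-bit space, which real vendor-assigned MACs are not, so treat the results as order-of-magnitude guidance only. It separates two questions that are easy to conflate: how likely one particular client is to share its mark, and how many colliding pairs to expect across the whole population. The function names are hypothetical.

```python
MARK_SPACE = 2 ** 24  # number of distinct 24-bit client marks


def per_client_collision_probability(clients: int) -> float:
    """Chance that one given client shares its mark with at least one other."""
    if clients < 2:
        return 0.0
    return 1.0 - (1.0 - 1.0 / MARK_SPACE) ** (clients - 1)


def expected_colliding_pairs(clients: int) -> float:
    """Expected number of client pairs that share a mark."""
    return clients * (clients - 1) / (2 * MARK_SPACE)


for n in (1_000, 5_000, 1_000_000):
    p = per_client_collision_probability(n)
    pairs = expected_colliding_pairs(n)
    print(f"{n:>9} clients: per-client {p:.6f}, expected pairs {pairs:.2f}")
```

Neither figure accounts for the further condition described above, namely that a collision only matters when one of the two colliding clients is actually placed in an enforcement set.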

nftables Sets

Sets are kernel-level data structures that track marks for enforcement decisions.

The active ruleset (nft-config/nft-v2.sh) defines seven enforcement sets:

Set	Default timeout	Size cap	Purpose
`blocked_macs`	2 min	2.5M	Auto-blocked clients (per-client rate-limit overage)
`tracked_marks`	1 min	2.5M	Per-mark rate-tracking entries
`llm_blocked_marks`	2 h	1M	LLM-recommended temporary blocks
`llm_denied_marks`	2 h	1M	LLM-recommended denies (drops + userspace event suppression)
`llm_throttled_marks`	7 d	1M	LLM-recommended rate limit (10/min)
`llm_allowed_marks`	30 d	1M	LLM-trusted clients, bypass all enforcement
`llm_monitored_marks`	14 d	1M	LLM-monitored clients, log only

The timeouts and size caps shown above are shipped defaults, not hard-coded limits. Every value in the table — per-set timeout, size cap, even the chain/set composition itself — is editable from the GUI’s Firewall Manager (chapter 20). The Firewall Manager exposes the active ruleset as JSON, lets you swap in pre-built throttling profiles (permissive / medium / strict), and applies your changes to the running kernel atomically. Use it whenever you need to tune enforcement to your deployment rather than reaching for nft-v2.sh by hand.

The set timeout is the kernel-side cap — it determines how long any entry can live in the set regardless of how it was created. The Action Manager additionally writes an action-level default duration into each entry it creates (sourced from config_action_definitions); see chapter 15 — Actions.

How Sets Work

Adding a mark to a set changes enforcement behavior immediately — no application restart required.

When the Action Manager adds a mark to llm_blocked_marks, the kernel begins dropping packets with that mark on the very next per-msg-type chain traversal. After the entry’s effective timeout expires (the smaller of the action default duration and the set-level timeout), the mark is removed automatically and the client resumes normal processing.
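The "smaller of the two" rule can be sketched as follows. This is a minimal illustration, not the Action Manager's code: the function name effective_timeout is hypothetical, and it assumes that an action default of 0 (listed as "Permanent" in the Enforcement Actions table) means "no action-level limit", so the set-level cap applies on its own.

```python
HOUR = 3600
DAY = 24 * HOUR


def effective_timeout(action_default_s: int, set_cap_s: int) -> int:
    """Return how long an entry lives, in seconds.

    Assumption: an action default of 0 means "no action-level limit",
    so only the kernel-side set cap bounds the entry's lifetime.
    """
    if action_default_s < 0 or set_cap_s <= 0:
        raise ValueError("durations must be non-negative and the cap positive")
    if action_default_s == 0:
        return set_cap_s
    return min(action_default_s, set_cap_s)


# Throttle: action default 1 d, set cap 7 d.
print(effective_timeout(1 * DAY, 7 * DAY) // HOUR)  # 24
# Deny: action default 0 (permanent), set cap 2 h.
print(effective_timeout(0, 2 * HOUR) // HOUR)       # 2
```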

The blocked_macs set is special: it is populated automatically by the per-msg-type chains and by the dhcp_block safety-net chain when a client exceeds its rate limit. No userspace involvement is required for that path.

Enforcement Actions

Five actions control how the system treats a device’s traffic.

Action	Effect	Action default duration	nft set timeout cap
Block	All packets dropped at the kernel; events still recorded	2 h	2 h
Deny	All packets dropped at the kernel AND event generation suppressed in the userspace inspector	Permanent (0)	2 h
Throttle	Limited to 10 packets/min; excess dropped	1 d	7 d
Allow	Bypass all enforcement chains (trusted)	30 d	30 d
Monitor	Packets logged but not restricted	7 d	14 d

Note: These actions are essentially named nftables sets paired with a fixed enforcement behaviour at the kernel level. They look similar to the action types named by Automation rules (see chapter 16), and in most cases they line up — an automation rule that chooses block lands a mark in llm_blocked_marks. The two lists are not guaranteed to map 1:1 in every edge case, however: Automation also exposes the analyze pseudo-action (route to the LLM, which then emits one of these), and future profile changes in Firewall Manager could rename or merge sets without breaking the Automation API. Treat them as closely related but distinct concepts.

Actions are applied by writing the client mark to the corresponding llm_*_marks set. The per-msg-type chains evaluate the sets in fixed order on every packet:

1. llm_allowed_marks: accept (skip the rest).
2. llm_blocked_marks: drop.
3. llm_denied_marks: drop; inspector also suppresses event emission.
4. llm_throttled_marks: limit rate 10/minute accept, then drop overage.
5. llm_monitored_marks: log, then continue to per-client rate limit.

Allow takes precedence over block. If a mark ends up in both llm_allowed_marks and llm_blocked_marks (e.g. mid-transition), the allow wins.

mac_actions (prerouting priority 90) and dhcp_block (priority 100) act as safety nets for any packet whose mark bypassed the per-msg-type vmap dispatch — they re-run the LLM-set ladder and per-client rate limit. They are not the primary enforcement path on nft-v2.sh.

DHCP Message Type Encoding in the Mark

The high byte of the 32-bit mark identifies the DHCP message type. This is the byte route_by_type vmaps on.

DHCPv4

High byte	Message	Per-type chain (inbound)	Per-type chain (outbound)
`0x01`	DISCOVER	`v4_discover_chain`	—
`0x02`	OFFER	—	`v4_offer_out_chain`
`0x03`	REQUEST	`v4_request_chain`	—
`0x04`	DECLINE	`v4_decline_chain`	—
`0x05`	ACK	—	`v4_ack_out_chain`
`0x06`	NAK	—	`v4_nak_out_chain`
`0x07`	RELEASE	`v4_release_chain`	—
`0x08`	INFORM	`v4_inform_chain`	—

DHCPv6

High byte	Message	Per-type chain (inbound)	Per-type chain (outbound)
`0x11`	SOLICIT	`v6_solicit_chain`	—
`0x12`	ADVERTISE	—	`v6_advertise_out_chain`
`0x13`	REQUEST	`v6_request_chain`	—
`0x14`	CONFIRM	`v6_confirm_chain`	—
`0x15`	RENEW	`v6_renew_chain`	—
`0x16`	REBIND	`v6_rebind_chain`	—
`0x17`	REPLY	—	`v6_reply_out_chain`
`0x18`	RELEASE	`v6_release_chain`	—
`0x19`	DECLINE	`v6_decline_chain`	—
`0x1A`	RECONFIGURE	—	`v6_reconfigure_out_chain`
`0x1B`	INFORMATION-REQUEST	`v6_info_request_chain`	—
`0x1C`	RELAY-FORW	`v6_relay_forw_chain`	—
`0x1D`	RELAY-REPL	—	`v6_relay_repl_out_chain`

Inbound packets without recognised type bits but with a non-zero mark fall through route_by_type into default_enforce_chain.

Enable vs Active Pattern

A feature must be both enabled and active to operate.

The system separates feature availability from feature operation:

enabled (Core config, YAML, set by the administrator): is the feature available? Cannot be changed at runtime. If false, the feature’s code path is not initialised.
active (Operational config, database, set by the operator): is the feature currently running? Can be toggled at runtime through the GUI.

Both must be true for a feature to function. This two-level control lets an administrator deploy the system with certain features available but not yet activated, and lets an operator toggle features without editing config files or restarting the service.

Example

# config.yaml (Core — set by administrator)
<feature>:
  enabled: true     # feature module is loaded

# system_config table (Operational — set by operator)
# operational.<feature>.active = false   # feature is not running yet

In this state, the feature module is loaded and ready, but it does not run until the operator activates it.
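The combination of the two flags can be summarised as a small truth table. The following sketch uses a hypothetical helper, feature_state, with descriptive labels that are not names used by the system.

```python
def feature_state(enabled: bool, active: bool) -> str:
    """Describe a feature given its Core (enabled) and Operational (active) flags."""
    if not enabled:
        # Code path is never initialised, so the active flag has no effect.
        return "unavailable"
    return "running" if active else "loaded, not running"


for enabled in (False, True):
    for active in (False, True):
        print(f"enabled={enabled!s:5} active={active!s:5} -> {feature_state(enabled, active)}")
```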

Core vs Operational Config

Core config lives in YAML and requires a restart. Operational config lives in the database and changes at runtime.

Core Configuration

Core settings define infrastructure: where to connect, what to load, how many workers to run.

Core settings are set in config.yaml and read at startup. Changing them requires restarting the service. Examples:

Database connection strings (ClickHouse host, port, credentials)
Network interface and NFQueue binding (queue numbers, count)
Optional integration endpoints
API server bind address and port
Feature enabled flags

Operational Configuration

Operational settings tune behaviour: how aggressively to throttle, and other runtime tuning knobs.

Operational settings are stored in ClickHouse’s system_config table (a ReplacingMergeTree) and edited through the GUI without a restart. The system checks the database first and falls back to YAML defaults when no database entry exists. Examples:

Throttling and burst-detection knobs
WebSocket rate limits
Feature active toggles
Support session limits, alarm system on/off, validator middleware on/off

The Override Pattern

Database values always take precedence over YAML values.

Request for config value "operational.<feature>.setting"
    │
    ├── Check system_config table (database, queried with FINAL)
    │     Found?  → use database value
    │
    └── Not found? → use config.yaml operational.<feature>.setting value
              │
              └── Not in YAML? → use compiled default in DefaultConfig()

YAML values therefore serve as defaults for fresh installations. Once an operator changes a setting through the GUI, the database value persists across restarts and overrides YAML.
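The three-level lookup can be sketched as a chain of dictionaries. This is a simplified model under stated assumptions: resolve_setting is a hypothetical helper, the plain dicts stand in for the system_config table, the parsed config.yaml, and the compiled defaults, and the key and values in the usage lines are made up for illustration.

```python
from typing import Any, Mapping


def resolve_setting(
    key: str,
    database: Mapping[str, Any],
    yaml_config: Mapping[str, Any],
    compiled_defaults: Mapping[str, Any],
) -> tuple[Any, str]:
    """Return (value, source) using database, then YAML, then compiled default."""
    sources = (
        ("database", database),
        ("yaml", yaml_config),
        ("compiled default", compiled_defaults),
    )
    for source_name, source in sources:
        if key in source:
            return source[key], source_name
    raise KeyError(f"no value configured for {key!r}")


# Hypothetical key and values, for illustration only.
key = "operational.example.setting"
defaults = {key: 10}
yaml_cfg = {key: 20}

print(resolve_setting(key, {}, yaml_cfg, defaults))         # (20, 'yaml')
print(resolve_setting(key, {key: 30}, yaml_cfg, defaults))  # (30, 'database')
```

The second call shows the effect of an operator change made through the GUI: once a database row exists, the YAML value is no longer consulted.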

Configuration tables (system_config, system_analysis_prompts, config_action_definitions, automation_rules) use the ReplacingMergeTree engine, which does not merge rows immediately after INSERT. Every read against these tables must use the FINAL keyword or risk seeing stale duplicates. Event tables (dhcp_events, users_audit_log, the Stage-1 MVs) are plain MergeTree/SummingMergeTree and must NOT use FINAL. The full key-by-key configuration breakdown ships with the install package.
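One way to keep the FINAL rule from being forgotten is to build queries through a helper that knows which tables are ReplacingMergeTree. The sketch below is an illustration, not part of the product: select_all is a hypothetical function, the table list is copied from the paragraph above, and SELECT * is used only because the column names are not given here.

```python
# Configuration tables that use ReplacingMergeTree and therefore need FINAL.
REPLACING_TABLES = frozenset({
    "system_config",
    "system_analysis_prompts",
    "config_action_definitions",
    "automation_rules",
})


def select_all(table: str) -> str:
    """Build a SELECT that adds FINAL only for ReplacingMergeTree tables."""
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    suffix = " FINAL" if table in REPLACING_TABLES else ""
    return f"SELECT * FROM {table}{suffix}"


print(select_all("system_config"))  # SELECT * FROM system_config FINAL
print(select_all("dhcp_events"))    # SELECT * FROM dhcp_events
```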

Why store config in ClickHouse alongside events? This is a deliberate operational choice. The appliance keeps configuration, audit, automation rules, and event data in one database engine so operators have a single backup target, a single set of credentials to manage, a single connection to monitor, and no extra service (Postgres, SQLite, etcd) to install, patch, or fail-over. The ReplacingMergeTree + FINAL discipline is the small price paid for that simplicity.