Changelog — Tiyi

2026-06-24 Shipped

One-binary scale licensing + first-run admin

One binary whose single enforced distinction is scale, not features. Every WAF capability stays in every edition; a vendor-signed license changes only the remote-node budget and displayed plan.

Community = local standalone WAF, free forever. With no valid license the full product runs on the local node and permits zero remote agents. Importing a license activates its Pro or Enterprise node budget without replacing the binary.
One always-on gate, inside the enrollment transaction. Legacy license.mode rows are ignored and writes are rejected. Count and insert share one transaction, so concurrent enrollments cannot slip past the signed budget. Existing sites, known agents, and local administration are never affected by license state.
One trust anchor. Licenses are Ed25519-signed against the vendor public key embedded in the binary; public-key overrides are removed. The release target pins its SHA-256 fingerprint, and schema-versioned claims reject unknown plans, invalid budgets, and bad timestamp ordering.
Safe degradation. Missing, invalid, tampered, or expired licenses resolve to Community. The data plane never stops, and known fingerprints always reconnect.
Authenticated live status. System → About shows edition, licensee/expiry, and the live "N of M remote nodes" count. License metadata is no longer exposed by the public Version RPC.
First-run admin. A standalone first boot with no users and no supplied credentials now auto-creates an admin with a random one-time password (printed once, stored only as a hash); tiyi server prints an actionable warning instead of leaving a silent dead-end login.
The vendor tool now creates private keys as 0600 with exclusive no-overwrite semantics; date-only expiries are inclusive through 23:59:59Z.

2026-06-18 Shipped

Site path routing (M9)

A site used to match on host only. It can now fan out by path prefix to multiple upstream pools behind one host and one certificate — API-gateway-style routing — with the data plane and every cross-layer path scope kept consistent.

Translator fan-out. One host route expands into per-prefix subroutes, each to its own upstream pool; the existing zero-route output stays byte-identical to before — a golden test pins it.
Per-route health. Each route gets a post-apply health probe (200–399 only); a down route backend fails the apply gate even when the site default is healthy.
Unmatched policy. Unmatched paths fall through to the site default upstream, or to a strict 404 allowlist that needs no default.
Full attribution. Every access-log row records the route_id and upstream_id that served it — through the parser, daily partitions, API/CLI filters, export, and the web log views.
One canonical path scope. /admin vs /Admin resolve identically across the WAF, IP lists, and the rate limiter; a case-insensitive hint, live canonical echo, and named case-duplicate error guard every path input on web + CLI.
Drive it from tiyi site routing get/set or the new web routes editor. Post-implementation review hardened zero-route/404 validation, atomic compensation capture, stable route IDs across replace, strict per-pool endpoint/scheme/health-URI checks, and canonical scoped CRS overrides.
Verified: go test ./... green, buf lint clean, vue-tsc clean. Interactive browser QA was skipped at operator request.

2026-06-18 Shipped

Policy rate-limit editor

The policy Rate Limits tab graduated from a sketch into an operator-usable editor, with validation that refuses any rule the agent could not actually enforce.

Endpoint rows. Path-pattern limits (e.g. /login, /api/auth/**) keyed by the resolved client IP, returning HTTP 429 when blocking — or a log-only canary mode to observe a new limit before enforcing.
Client scopes. Global ceilings across all paths keyed by ip, the session cookie, or a named header (header:<name>). Burst 0 means the burst equals the per-minute limit.
Inline validation. Save is blocked for missing paths/headers, non-positive limits, negative bursts, duplicate patterns/scopes, and unknown actions.
Canonicalization. The API folds legacy client_ip → ip and detection → log; the compiler and runtime stay backward-compatible with historical rows. Enforced before Coraza, with no persistent collections.
Verified: go test ./... green, buf lint clean, vue-tsc clean.

2026-06-09 Shipped

AI advisory layer + interactive Copilot

An optional, default-off LLM advisory layer that sits beside the deterministic WAF — in the control plane, never in the request path, and never mutating state on its own. It enriches incidents, drafts human-approved tuning, and answers plain-English questions about your logs.

Streamed analysis. A global AI Copilot slide-over explains any incident or single log event, or analyzes a filtered log result set — streamed token-by-token. Subjects are reduced to redacted whitelist projections, never raw audit JSON.
Conversational agent. A StreamChat assistant answers open questions ("top 10 attackers in the last hour", "was example.com attacked today, from which IPs?") by tool-calling over five read-only, tenant-scoped log queries — it adds no new log surface.
Advisory, never autonomous. Incident enrichment, policy-tuning patches, NL→log-query translation, draft custom rules (forced log-only canary), and campaign correlation are all proposals; the only state-changing path is an explicit operator Apply routed through the existing audited mutation, gated by ai:apply.
Provider-agnostic + safe. OpenAI-compatible (OpenAI / Ollama / vLLM / LM Studio / llama.cpp) and Azure OpenAI; the key is KEK-encrypted, never a setting. Every prompt is redacted (§ 8.6), dual-RBAC gated (ai:read plus the subject's own incident:read/log:read), rate-limited, and panic-isolated.
Verified: go test ./... green, buf lint clean, vue-tsc clean, i18n_check.py 0 missing, plus a Playwright QA pass against a mock provider.

2026-06-09 Shipped

Air-gap icons, telemetry explorer, static binary

Offline icon bundle. The admin console used to fetch icon SVGs from the public Iconify API at runtime — broken in air-gapped deployments. Icons are now collected from the frontend source + backend menu seeds and bundled at build time, registered before first paint.
Telemetry explorer. A new Top-N / time-series view over the built-in telemetry pipeline (QPS, top-K, series) — no external time-series database required.
Static binary. Release builds are now statically linked (CGO_ENABLED=0, pure-Go SQLite), dropping the build-host glibc dependency so the same artifact runs on older distributions (CentOS 7+).
Operator fixes. Stable per-event access-log IDs for exact lookup, boot-time config apply, and a fix that lands permission-restricted operators on the first page they can actually open instead of a hard-coded dashboard.

2026-05-31 Shipped

Security incident aggregation — Phase 1–4

A security_incident layer on top of the per-request security_event stream. It composes with — does not replace — the source-of-truth event rows, the audit chain, the alert evaluator, and SIEM egress.

Phase 1 — aggregation. Same (tenant, site, client_ip, attack_class) over a sliding idle window collapses into one incident. A security_incident_event link table (with event_id uniqueness) holds the relationship; security_event.incident_id is a query-accelerator cache. Detection follows OWASP AppSensor; the wire shape follows the OCSF class taxonomy. Migrations 0020–0021.
Phase 2 — operator polish. Reopen, merge, live-tail, and per-site incident overrides. The UI keeps incidents in view after merge/reopen and closes the drawer on re-activate.
Phase 3 — deterministic enrichment. Every incident is tagged at creation with its MITRE ATT&CK technique + sub-techniques and kill-chain stage (derived from attack_class + rule-id families) and a most-frequent country/ASN geo rollup. An operator-extensible mitre_mapping overlay (migration 0023) overrides the built-in mapping. Example: an RCE + 933 event yields T1190 [T1059, T1505.003] / execution and geo CN / AS4134.
Phase 4 — automated response. An incident_response_rule resource + IncidentResponseRuleService CRUD, and a responder that mutates live controls (deny IP list / rate-limit endpoint / alert webhook) on incident state transitions through the existing audited mutation paths. Default-off at three layers — no rule until created, every rule ships enabled=0, and the tenant kill switch incident.response_enabled defaults false. Every mutation is attributed to system:incident-response in the audit chain. A 60-second TTL sweeper auto-reverts expired deny entries through the audited RemoveIPListEntries path and recompiles the affected policies.
Surfaced via the tiyi incident / tiyi incident-response CLI verbs and a web UI under WAF → Automated Response. The Sigma-style operator-defined correlation DSL is explicitly deferred — the fixed 12-class + 4-axis-key correlation is the shipped v1.
Verified: go test ./... green, buf lint clean, vue-tsc clean, i18n_check.py 0 missing, plus a per-phase CLI/API runtime smoke test.

2026-05-31 Shipped

Alerting redesign — Phases 1–4 (D1–D14)

The evaluator's deterministic, poll-based shape and SQLite storage are kept; the missing half of every mature alerting system is added — the evaluation/notification split the industry converged on (Prometheus + Alertmanager), with no new external dependencies.

Correctness fixes. Closed a UI-exposed rule kind the backend never implemented, threshold counts that were silently capped at the query page size, and channel kinds whose UI/API contracts did not match their delivery implementation.
Lifecycle, not fire-once. A notification dispatcher drives pending → firing → recovered/resolved with for-duration debouncing, re-notification while an alert stays open, and auto-resolve — replacing the old "page exactly once, then go quiet" behavior.
Durability. A durable notification outbox (migration 0026) plus retry-with-backoff closes the crash-window that could permanently drop the only notification for a new alert. Evaluator state persists in alert_eval_state (migration 0025).
Security. Channel secrets (webhook URLs, PagerDuty keys, Slack tokens) moved from plaintext JSON to KEK-encrypted storage and are no longer returned verbatim on read or into the audit chain (migration 0024).
Quiet by default. Grouping, inhibition, and silences, plus a per-alert note timeline, rule lifecycle, and channel routing.
Implemented the previously-missing RPCs, audited silence mutations, and pruned dead cert UI along the way.

2026-05-31 Shipped

Pre-merge security + QA hardening

Fail-open RBAC gap closed — several system and certificate admin RPCs did not enforce a permission check and defaulted to allowing the call. Now gated, with a rbac_coverage_test.go regression test that asserts coverage across the surface.
Standalone dashboards read zero traffic — telemetry.enabled defaults true, which gates off the legacy rollup writer, but the embedded StoreAccess path never called telemetry.Ingest, so access traffic was recorded nowhere the dashboard reads. Fixed by ingesting in StoreAccess, with a regression test.
Race-aware test budget — TestFlushReturnsTrueWhenDrained had a hard 2s drain budget the race detector's overhead blew past; made it race-aware. Also fixed a pre-existing gofmt drift so make lint is green.

v3.0.0-rc.1 2026-05-27 Release Candidate

Direct page-jump pagination for log views

CursorRequest.offset with a 100k cap, mutually exclusive with the cursor path. Offset accounting runs over post-filter rows so CIDR/rule filters skip the right number of matched rows. The frontend drops cursorStack + walkForwardTo in favor of computing offset directly from the target page.

4 store tests + 2 wire tests pin offset semantics and rejection of cursor+offset misuse
Files: proto/tiyi/v1/common.proto, internal/store/log_repo.go, internal/api/log_handler.go, three log list .vue files
Verified: go test ./... green, buf lint clean, vue-tsc clean

2026-05-26 Shipped

Six post-deploy QA fixes

DEF-002 — IP-list geo:CC entries accepted at the API boundary (alpha-2/alpha-3 + curated PRIVATE/LOOPBACK buckets); the seclang compiler already expanded them at compile time, so the entry now flows through end-to-end without forking the dataset.
DEF-003 — bookmarkable parametric routes (/agents/:id, /waf/policies/:id) re-registered after the backend menu generation drops them; router.hasRoute guards against double-registration.
DEF-004 — ResolveAgentGroup now resolves selectors against ListAgents for either a saved-group id or an ad-hoc spec preview (matching by match_agent_ids ∪ tag_selectors).
DEF-005 — standalone agent commands now drained through a 2-second consumer; reload calls ApplyActiveSites; restart succeeds as a no-op (tearing down the local management plane is the wrong default).
DEF-006 — trust-profile mutations now write trust.update_profile, trust.reset_profile, trust.update_site_override, trust.reset_site_override, trust.refresh_cdn rows into the audit chain.
DEF-007 — access log default RecordingMode flipped from "off" to "on" at three layers, matching the System Settings copy "默认访问日志模式 = 完整（自动）".

2026-05-24 Shipped

Security hardening + frontend cleanup

Code-review pass on the full backend tree closed every actionable finding without dropping a feature.

KEK persistence — server only calls db.SetKEK(envelope.LoadKEK) when crypto.kek_file is explicitly configured. Previous unconditional override generated an ephemeral KEK on every restart, silently breaking encrypted cert keys, ACME accounts, and DNS provider credentials.
Persistent ed25519 bundle signing key — singleton row in bundle_signing_key, KEK-encrypted at rest. Pre-warmed in store.Open after migrations and Bootstrap. Replaces the per-restart in-memory generator that was defeating the agent's TOFU pin.
Agent revision replay protection — applyConfigUpdate rejects any incoming bundle whose revision is not strictly greater than LastAppliedRevision.
Migration drift detection — internal/store/migrate.go records SHA-256 of every applied migration; boot aborts on drift.
Frontend cleanup — dropped 16 Vben demo pages and 5 supporting .ts files, rewrote the layout shell, fixed broken role checks comparing against 'standby'/'active' (real values are standalone|primary|secondary).

2026-05-24 Shipped

WAF tune review — PRs 1–8

Deep review of the policy tune drawer surfaced and closed every actionable defect across eight coordinated PRs. Highlights:

Audit chain coverage — every Upsert/Update/Delete on rule_override, custom_rule, ip_list, ip_list_binding, rate_limit_endpoint now appends a hash-chained mutation row; previous code silently bypassed it.
Score-override emit — switched from additive setvar:tx.inbound_anomaly_score=+N to per-bucket tx.<sev>_anomaly_score_pl<pl> reset+set, eliminating double-count.
Bulk upserts atomic — BulkUpsertRuleOverrides runs N upserts in a single transaction, all-or-nothing, with one bundle-hash refresh and one apply at the end.
Rule override action enum — DEFAULT|DISABLE|LOG_ONLY|SCORE_OVERRIDE; legacy disabled bool and action string deprecated; score_override migrated to optional int32.
Field-mask discipline — UpdatePolicy enforces that empty ip_list_bindings only wipes when update_mask.paths includes the field.
Path-prefix validation — rejects non-printable ASCII and SecLang quoting metacharacters at the API boundary.
i18n locale parity — 1786/1786 keys balanced across en-US and zh-CN with vue-i18n linked-message escapes for every @-prefixed literal.

2026-05-16 Shipped

Client-IP trust pipeline

One trust pipeline ends per-component XFF parsing fragmentation. Replaces the blind XFF-first parse in the rate-limiter — the security-class fix.

tenant_settings.trust_profile_json + site.trust_profile_override_json
Auto-fetched CIDR snapshots for Cloudflare, Fastly, Akamai, CloudFront, Front Door, GCLB; refresher with per-job goroutine, jittered interval, and panic isolation
Translator emits trusted_proxies + trusted_proxies_strict per http.servers block
Strict right-to-left walk with verbatim Caddy v2 byte-for-byte semantics
TrustService.Explain traces every header to the resolved client IP for any (peer, headers) tuple
Default-on alert: Origin Bypass Attempt (5-minute window, severity high)

2026-05-10 Shipped

Telemetry pipeline (Phase 1–6)

Full counters → samples → API tree → read API → Prometheus exporter pipeline end-to-end.

Phase 1 — sharded ingress ring, 10s open-bucket aggregator, per-day SQLite partitions at logs/rollup_10s/YYYY-MM-DD.db
Phase 2 — Misra-Gries Top-K with __other__ bucket so SUM(*) equals true traffic; per-day dictionary log; hour/day/month downsamplers
Phase 3 — per-IP LRU + global sample rings; append-only circular-file WAL with CRC; replay-on-boot
Phase 4 — per-site URL prefix tree with MaxNodesPerSite + MaxDepth; janitor evicts idle low-traffic leaves into api_tree_archive
Phase 5 — REST API at /api/v1/telemetry/* + frontend explorer
Phase 6 — Prometheus exporter at /metrics on the local admin socket
Legacy cutover — logsink.recordAccessRollup short-circuits when telemetry is active, ending the per-event traffic_rollup_minute write that PRD-TELEMETRY identified as the 100k EPS I/O amplifier

2026-05-09 Shipped

Frontend shell complete

Every server-seeded menu route resolves to a real page; every PRD-UI § 4 page is wired end-to-end; every reusable Tiyi component from PRD-UI § 5 ships.

Dashboard with GetDashboardStats, GetTrafficTimeseries, GetAttackDistribution, GetTopAttackers, StreamAlerts
Policy editor with 12 tabs, deep-linkable: CRS Core · HTTP Policy · Limits · Rule Tuning · IP Lists · Custom Rules · Plugins · Rate Limits · Exclusions · Preview · Versions · Test Lab
System branch: Health, Log Pipeline, Users, Roles, Settings (10 tabs), Replication, Updates (4 tabs), About
Alert Rules, Alert Channels, IP Lists, Agents detail, Telemetry Explorer, Audit + diff drawer
Resilience primitives: useResilientStream, formatRpcError, useResponsiveWidth
Full zh-CN + en-US, all $t() call sites resolve

2026-05-07 Shipped

WAF policies — full lifecycle

Store/API/Web/CLI CRUD; built-in templates (Strict/Standard/Permissive); version snapshots and rollback; engine-state switch; CRS binding; rule-override / custom-rule / IP-list / plugin / rate-limit tuning paths; SecLang preview; CRS impact preview; policy test lab.

Deterministic SecLang compilation with CRS best-practice advisory warnings
Per-site policy overrides overlay scalars without forking the policy
CRS rule-exclusion packages (WordPress, Drupal, Nextcloud, phpBB, phpMyAdmin, XenForo, cPanel, DokuWiki) with offline archive upload

2026-05-07 Shipped

Remote agents

One-use enrollment tokens, mounted AgentStream, tiyi agent runtime, live remote bundle delivery / cache / apply-result reporting, best-effort remote access/security/error log upload.

Pinned bundle-signing public-key verification, TOFU on first contact
Periodic metrics push (memory/CPU)
Agent role tracking (primary / secondary / agent)
tiyi agents issue-token | install-script | send-command | commands CLI

2026-05-06 Shipped

ACME — Phases A through D

Phase A — real RFC 8555 client (wrapping golang.org/x/crypto/acme) replaces the self-signed stub; per-(tenant, directory) account lifecycle
Phase B — multi-agent HTTP-01 with a loopback responder on every agent and (token, key_authorization) broadcast through the agent stream
Phase C — DNS-01 with pluggable driver registry: full Cloudflare driver, Route53 + Aliyun credential-validating stubs; authoritative-NS-aware propagation checker
Phase D — observability: dashboard exposes certs_expiring_14d, acme_renewals_failed_24h, etc.; default alerts seeded
UI: "Issue (ACME)" modal, per-row Renew action, "DNS Providers" page

2026-05-05 Shipped

CRS catalog + sites + log forwarding

OWASP CRS 4.25.0 ruleset embedded in the binary, auto-imported on first boot
GitHub-release fetch via SystemService.ListUpstreamCrsReleases; offline archive upload also supported
Sites lifecycle: store/API/Web/CLI CRUD/status; real Caddy JSON preview; uploaded cert and WAF policy selection
Standalone runtime: Caddy access logs through tiyi.log_forwarder; Coraza SecAuditLogType tiyi; live security tail

2026-05-03 Shipped

WAF interception fix + boundary defer recover

Closed the "empty reply from server" bug on blocked requests. Root cause: a typed-nil-in-interface panic in audit_writer.auditMessages (message.Data() returned a non-nil interface wrapping a nil *MessageData) unwound coraza-caddy's deferred tx.ProcessLogging before the WAF middleware returned its caddyhttp.HandlerError.

Three boundaries now have defer recover() at function level: audit_writer.Writer.Write, log_forwarder.writeAccessLine/writeWAFLine, and logsink workers. Cross-layer canary at /debug/logsink/stats exposes a panicked counter.

2026-05-01 Shipped

Alerts, audit SIEM, full standalone observability

Alert evaluator with security_threshold, error_threshold, audit_failure kinds and webhook / Slack / PagerDuty / Feishu / WeCom delivery
Audit-chain rows forwarded through SIEM with explicit siem.filter.include_audit flag
RFC 5424 / CEF / LEEF formats over TCP / UDP / unixgram
Vben pages for security, access, error, live-tail, audit, alerts (active / rules / channels)

2026-04-28 Shipped

Foundation

Go 1.25.0; ConnectRPC 1.19.2; 14 services under proto/tiyi/v1/*.proto
SQLite schema (full Phase-1) — audit, CRS, policy_version, config_bundle
Server modes: standalone, server/primary, secondary/standby, dashboard, agent
Local admin socket with OS-file-permission auth
Argon2id login, JWT issue/verify, refresh-token chain
Koanf config loader; SQLite open/migrate/bootstrap; permission and menu seeds
Audit spine — repo, hash chain, verifier, AuditService, mutation audit rows, tiyi audit CLI

The road to v3.0.0.