API
The hub speaks two protocols:
- HTTP/JSON under /api/* for control + history + auth
- WebSocket at /api/stream for live updates
A canonical OpenAPI 3.1 spec
lives in the repo at api/openapi.yaml — import it into Postman /
Insomnia / Bruno / Hoppscotch for ready-to-use request collections.
There’s also an api/lumen.http file for the VS Code REST Client.
Authentication models
Two different schemes, picked per endpoint:
| Scheme | Used by | Sent how |
|---|---|---|
| Session cookie (lumen_session) | Browser + UI endpoints | HS256 JWT in HttpOnly cookie set by /api/login |
| Bearer token (lum_…) | Agent ingest and policy | Authorization: Bearer lum_… header |
The session cookie is set with HttpOnly, SameSite=Lax, and
Secure (under HTTPS), and has a 30-day TTL. The bearer token is minted per
host in Settings → Hosts → Create; the plaintext is shown once
and never stored — only its SHA-256 hash lives in the DB.
Endpoints
Health
```http
GET /healthz
→ 200 {"status":"ok"}
```
No auth. Returns 200 as long as the hub process is up. Use this for container healthchecks and uptime monitors.
Ingest
```http
POST /api/ingest
Authorization: Bearer lum_REPLACE_ME
Content-Type: application/json

{
  "host": "ignored-server-overrides",
  "ts": "2026-05-26T08:14:00Z",
  "cpu_pct": 12.5,
  "cpu_per_core": [10.1, 14.3, 12.7, 13.0],
  "ram_pct": 63.2,
  "swap_pct": 0.0,
  "disk_pct": 41.5,
  "load1": 0.42,
  "load5": 0.51,
  "load15": 0.49,
  "net_rx_bps": 10240,
  "net_tx_bps": 5120,
  "disk_r_bps": 0,
  "disk_w_bps": 20480,
  "temp_c": 48.3,
  "containers": [
    {
      "id": "abc123def456",
      "name": "nginx",
      "image": "nginx:1.27",
      "state": "running",
      "cpu_pct": 0.3,
      "mem_used_bytes": 78643200,
      "mem_limit_bytes": 536870912,
      "mem_pct": 14.6
    }
  ]
}
→ 204 No Content
```
The hub looks up the token, sets req.host from the host record
(any value the agent sent in host is overwritten), validates
ranges, then stores the snapshot. A 401 means the token is invalid
or absent; a 400 means a field is out of range (e.g. cpu_pct > 100).
cpu_per_core and containers are live-only — they flow
through the WebSocket but aren’t persisted. The historical
/api/hosts/{id}/metrics endpoint returns only the aggregate
scalars.
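Since an out-of-range field costs a round trip and a 400, an agent-side pre-check can catch it earlier. The following is a minimal sketch, not the hub's validator: the function name `checkIngestRanges` is hypothetical, and treating every `*_pct` scalar as bounded by [0, 100] is an assumption (the text above only confirms the bound for cpu_pct).

```typescript
// Top-level percentage fields taken from the ingest example above.
// Assumption: all of them share the [0, 100] range confirmed for cpu_pct.
const PERCENT_FIELDS = ["cpu_pct", "ram_pct", "swap_pct", "disk_pct"] as const;

// Returns a list of problems; an empty list means no range issue was found.
// Fields that are absent are skipped, since this sketch does not know
// which fields the hub treats as required.
function checkIngestRanges(payload: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of PERCENT_FIELDS) {
    const value = payload[field];
    if (value === undefined) continue;
    if (typeof value !== "number" || !isFinite(value)) {
      problems.push(`${field} is not a finite number`);
    } else if (value < 0 || value > 100) {
      problems.push(`${field} out of [0,100]`);
    }
  }
  return problems;
}
```

A caller would run this on the snapshot just before the POST and log the problems instead of sending a payload that is likely to be rejected.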
Auth — setup status
```http
GET /api/setup-status
→ 200 {"admin_exists": true}
```
The bootstrap flag the UI uses to decide between Register and Login. Once an admin exists, register is closed.
Auth — register first admin
```http
POST /api/register
Content-Type: application/json

{"username":"admin","password":"…"}
→ 201 {"user":{"id":1,"username":"admin","created_at":"…"}} + Set-Cookie: lumen_session=…
```
One-shot. Returns 409 if an admin already exists. Password is Argon2id-hashed; minimum 8 chars.
Auth — login
```http
POST /api/login

{"username":"admin","password":"…"}
→ 200 {"id":1,"username":"admin","created_at":"…"} + Set-Cookie: lumen_session=…
```
Returns 401 on wrong credentials. The body intentionally doesn’t distinguish “no such user” from “bad password”: the response is the same either way.
Auth — logout / me
```http
POST /api/logout
→ 204 (idempotent)

GET /api/me
→ 200 {"id":1,"username":"admin","created_at":"…"}
→ 401 if no/expired session cookie
```
Auth — change password
```http
POST /api/account/password
Cookie: lumen_session=…

{"current":"…","new":"…"}
→ 204
→ 401 if current password is wrong
→ 400 if new password is too short (<8)
```
Rehashes with Argon2id. The session cookie stays valid (we don’t force a re-login on password change).
Hosts — list
```http
GET /api/hosts
Cookie: lumen_session=…
→ 200 [{"id":1,"name":"webA","created_at":"…","last_seen_at":"…"}, …]
```
last_seen_at updates on every successful ingest; null until the
host first checks in.
Hosts — create
```http
POST /api/hosts

{"name":"webA"}
→ 201 {"host":{"id":1,"name":"webA",…},"token":"lum_…"}
```
The bearer token is returned once: copy it now or rotate later. The DB only stores its SHA-256 hash.
Hosts — rotate token
```http
POST /api/hosts/{id}/rotate
→ 200 {"token":"lum_…"}
```
Invalidates the previous token; the existing agent will start receiving 401 responses until you redeploy it with the new token.
Hosts — delete
```http
DELETE /api/hosts/{id}
→ 204
```
Also evicts that host’s in-memory snapshot so the dashboard stops showing a ghost card.
Hosts — set tags
```http
PUT /api/hosts/{id}/tags

{ "tags": { "tier": "prod", "env": "prod", "team": "ops" } }
→ 200 { "tags": { "tier": "prod", "env": "prod", "team": "ops" } }
```
Replaces the host’s tag set wholesale (no patch semantics). Empty map clears all tags. Keys may contain letters/digits/-_., max 64 chars; values max 128 chars, may not contain = or ,. Up to 32 tags per host. Tags surface on GET /api/hosts (tags field on each host) and feed alert rule host_selector.
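The tag constraints above translate directly into a client-side check. This is a minimal sketch under stated assumptions, not the hub's validator: `checkTags` is a hypothetical name, "letters" is read as ASCII letters, empty keys are rejected, and lengths are counted in UTF-16 code units.

```typescript
type TagMap = Record<string, string>;

// Letters, digits, and the characters - _ . ; 1 to 64 characters long.
// Assumptions: ASCII letters only, and an empty key is invalid.
const TAG_KEY_PATTERN = /^[A-Za-z0-9._-]{1,64}$/;

// Returns a list of problems; an empty list means the tag set passes
// every constraint listed in the text above.
function checkTags(tags: TagMap): string[] {
  const problems: string[] = [];
  const keys = Object.keys(tags);
  if (keys.length > 32) problems.push("more than 32 tags");
  for (const key of keys) {
    const value = tags[key];
    if (!TAG_KEY_PATTERN.test(key)) problems.push(`invalid key: ${key}`);
    if (typeof value !== "string") {
      problems.push(`value for ${key} is not a string`);
      continue;
    }
    if (value.length > 128) problems.push(`value for ${key} exceeds 128 chars`);
    if (value.indexOf("=") !== -1 || value.indexOf(",") !== -1) {
      problems.push(`value for ${key} contains = or ,`);
    }
  }
  return problems;
}
```

Because the PUT replaces the whole tag set, it is worth validating the complete map, not just the tag being edited.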
Hosts — metric history
```http
GET /api/hosts/{id}/metrics?from=2026-05-25T08:00:00Z&to=2026-05-26T08:00:00Z&step=60
→ 200 {
  "host": "webA",
  "from": "2026-05-25T08:00:00Z",
  "to": "2026-05-26T08:00:00Z",
  "step_seconds": 60,
  "points": [
    {"ts":"2026-05-25T08:00:00Z","cpu_pct":12.5,"ram_pct":63.2,…},
    …
  ]
}
```
Server-side AVG bucketing on the
idx_snapshots_host_ts index. Limits:
- Window ≤ 7 days (rejected with 400 otherwise)
- step ≥ 5 s
- Maximum 2000 points per response (auto-step picks ~120 buckets if omitted)
Fields returned match the persisted scalars (no cpu_per_core,
containers, or cpu_series).
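To see how the three limits interact, the sketch below plans a query before it is sent. It is illustrative only: `planMetricsQuery` is a hypothetical helper, the auto-step formula (window divided by 120, never below 5 s) is an assumption about how "~120 buckets" is derived, and whether the hub rejects or truncates a response over 2000 points is not stated above, so this sketch simply refuses such a request.

```typescript
const MAX_WINDOW_SECONDS = 7 * 24 * 60 * 60; // window ≤ 7 days
const MIN_STEP_SECONDS = 5;                  // step ≥ 5 s
const MAX_POINTS = 2000;                     // maximum points per response
const AUTO_BUCKETS = 120;                    // approximate auto-step target

type MetricsPlan = { stepSeconds: number; points: number };

// Throws RangeError when the request would violate a documented limit.
function planMetricsQuery(from: string, to: string, stepSeconds?: number): MetricsPlan {
  const windowSeconds = (Date.parse(to) - Date.parse(from)) / 1000;
  if (!(windowSeconds > 0)) throw new RangeError("from must be earlier than to");
  if (windowSeconds > MAX_WINDOW_SECONDS) throw new RangeError("window exceeds 7 days");
  // Assumed auto-step: split the window into about 120 buckets.
  const step = stepSeconds ?? Math.max(MIN_STEP_SECONDS, Math.ceil(windowSeconds / AUTO_BUCKETS));
  if (step < MIN_STEP_SECONDS) throw new RangeError("step below 5 s");
  const points = Math.ceil(windowSeconds / step);
  if (points > MAX_POINTS) throw new RangeError("more than 2000 points");
  return { stepSeconds: step, points };
}
```

For the request shown above (a 24-hour window with step=60) this yields 1440 points, which fits under the cap; the same window with step=5 would need 17280 points and is refused.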
Version
```http
GET /api/version
Cookie: lumen_session=…
→ 200 { "hub_version": "v0.2.0", "latest_agent_version": "v0.2.0" }
```
The running hub’s build version. Because the hub and agent ship from the
same release train, latest_agent_version mirrors hub_version — it is
the newest agent a host could be running. The web UI compares each host’s
reported system.agent_version (sent in every ingest) against this and
shows an update-available badge when a host’s agent is behind. Source
builds (no -ldflags) report "dev", which suppresses the badge to avoid
noise.
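The badge decision described above can be sketched as a small comparison. This is not the web UI's actual code: `updateAvailable` and `parseVersion` are hypothetical names, versions are assumed to look like vMAJOR.MINOR.PATCH, and anything else (including "dev" on either side) is treated as "no badge".

```typescript
type VersionTriple = [number, number, number];

// Parses "v0.2.0" into [0, 2, 0]; returns null for anything else,
// including "dev" and any suffixed form this sketch does not understand.
function parseVersion(version: string): VersionTriple | null {
  const match = /^v(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (match === null) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when the agent is strictly older than the newest known agent.
function updateAvailable(agentVersion: string, latestAgentVersion: string): boolean {
  const agent = parseVersion(agentVersion);
  const latest = parseVersion(latestAgentVersion);
  if (agent === null || latest === null) return false; // "dev" or unparseable: no badge
  const [aMajor, aMinor, aPatch] = agent;
  const [lMajor, lMinor, lPatch] = latest;
  if (aMajor !== lMajor) return aMajor < lMajor;
  if (aMinor !== lMinor) return aMinor < lMinor;
  return aPatch < lPatch;
}
```

Comparing the numeric parts rather than the raw strings matters once a component reaches two digits: as strings, "v0.10.0" sorts before "v0.9.0".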
Settings — get / put
```http
GET /api/settings
→ 200 {
  "retention_window": "24h",
  "retention_interval": "1h",
  "agent_interval": "5s",
  "downsample_bucket_size": "5m",
  "downsample_hot_window": "24h",
  "downsample_archive_window": "8760h"
}
```
```http
PUT /api/settings

{"retention_window":"6h","downsample_bucket_size":"10m"}
→ 200 {
  "retention_window": "6h",
  "retention_interval": "1h",
  "agent_interval": "5s",
  "downsample_bucket_size": "10m",
  "downsample_hot_window": "24h",
  "downsample_archive_window": "8760h"
}
```
Bounds:
| Key | Range |
|---|---|
| retention_window | 5 m – 365 d (or 0 to disable) |
| retention_interval | 1 m – 24 h (or 0 to disable) |
| agent_interval | 2 s – 1 h |
| downsample_bucket_size | 1 m – 24 h |
| downsample_hot_window | 1 h – 30 d |
| downsample_archive_window | 1 d – 365 d |
The downsample values configure the future Parquet cold tier: bucket size is the time span represented by one archived point (5m averages old samples into one point every 5 minutes), hot window is how long full-detail raw SQLite rows are kept (24h keeps every sample for the last day), and archive window is how long compressed history is kept (8760h is about one year). Out-of-range or unparseable durations return 400. UI edits propagate to the retention loop within 30 s for retention fields; downsample fields are stored now and consumed once cold-tier compaction lands.
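A settings form can check a value against the bounds table before issuing the PUT. The sketch below is an approximation rather than the hub's parser: `parseDurationSeconds` and `settingInRange` are hypothetical names, only whole-number h/m/s segments are understood (the hub may accept more forms), and the bounds are assumed to be inclusive.

```typescript
// Simplified parser: whole-number h/m/s segments only ("24h", "1h30m", "0").
// Returns null for anything it does not understand.
function parseDurationSeconds(text: string): number | null {
  if (text === "0") return 0;
  if (!/^(\d+[hms])+$/.test(text)) return null;
  const segment = /(\d+)([hms])/g;
  let total = 0;
  let match = segment.exec(text);
  while (match !== null) {
    const [, digits = "0", unit = "s"] = match;
    total += Number(digits) * (unit === "h" ? 3600 : unit === "m" ? 60 : 1);
    match = segment.exec(text);
  }
  return total;
}

type DurationBound = { minSeconds: number; maxSeconds: number; zeroDisables: boolean };

const HOUR = 3600;
const DAY = 24 * HOUR;

// Values transcribed from the bounds table above.
const SETTING_BOUNDS: Record<string, DurationBound> = {
  retention_window: { minSeconds: 5 * 60, maxSeconds: 365 * DAY, zeroDisables: true },
  retention_interval: { minSeconds: 60, maxSeconds: 24 * HOUR, zeroDisables: true },
  agent_interval: { minSeconds: 2, maxSeconds: HOUR, zeroDisables: false },
  downsample_bucket_size: { minSeconds: 60, maxSeconds: 24 * HOUR, zeroDisables: false },
  downsample_hot_window: { minSeconds: HOUR, maxSeconds: 30 * DAY, zeroDisables: false },
  downsample_archive_window: { minSeconds: DAY, maxSeconds: 365 * DAY, zeroDisables: false },
};

// True when the value parses and sits inside the (assumed inclusive) range.
function settingInRange(key: string, value: string): boolean {
  const bound = SETTING_BOUNDS[key];
  const seconds = parseDurationSeconds(value);
  if (bound === undefined || seconds === null) return false;
  if (seconds === 0) return bound.zeroDisables;
  return seconds >= bound.minSeconds && seconds <= bound.maxSeconds;
}
```

The hub remains the authority: a value this check accepts can still come back as a 400 if the server's parsing or bounds differ from the assumptions here.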
Alerts — rules
```http
GET /api/alerts/rules
POST /api/alerts/rules
PUT /api/alerts/rules/{id}
DELETE /api/alerts/rules/{id}
```
Body for create/update:
```json
{
  "name": "CPU hot",
  "metric": "cpu_pct",
  "comparator": "gt",
  "threshold": 80,
  "for_seconds": 60,
  "host": "web-1,db-1",
  "host_selector": "tier=prod,env=prod",
  "severity": "warning",
  "enabled": true,
  "channel_ids": [1, 3]
}
```
metric ∈ cpu_pct | ram_pct | swap_pct | disk_pct | load1 | offline. The offline metric ignores comparator/threshold and clamps for_seconds up to a 60 s minimum.
Host targeting precedence (first non-empty wins):
- host_selector: comma-separated key=value pairs. All must match a host’s tag set. Bare key (no =) matches when the tag exists with empty value.
- host: empty (all), exact name, comma list (web-1,db-1), or glob (web-*, *-prod, db-[0-9]*) using path.Match semantics. Comma list = OR across segments; each segment may itself be exact or glob.
- Both empty → every registered + ever-seen host.
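The host_selector rule above (every pair must match, and a bare key matches a tag with an empty value) can be sketched as follows. This covers only the selector branch, not the glob matching used by host, and it is an illustration rather than the hub's evaluator: `selectorMatches` is a hypothetical name, and trimming whitespace around segments is an assumption.

```typescript
type HostTags = Record<string, string>;

// Returns true when every comma-separated segment matches the host's tags.
// Intended for non-empty selectors only: an empty selector means "fall
// through to the host field" in the precedence described above.
function selectorMatches(selector: string, tags: HostTags): boolean {
  const segments = selector
    .split(",")
    .map((segment) => segment.trim())
    .filter((segment) => segment.length > 0);
  return segments.every((segment) => {
    const eq = segment.indexOf("=");
    const key = eq === -1 ? segment : segment.slice(0, eq);
    // A bare key is treated like "key=": the tag must exist with an empty value.
    const wanted = eq === -1 ? "" : segment.slice(eq + 1);
    return Object.prototype.hasOwnProperty.call(tags, key) && tags[key] === wanted;
  });
}
```

With the tag set from the tags example, `tier=prod,env=prod` matches, while `tier=prod,env=dev` does not because one pair fails.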
channel_ids is the rule’s routing link set. Omit or pass null to leave links unchanged on PUT. Pass an empty array to clear all links → broadcast to every enabled channel (Milestone-A default). Pass IDs to scope the rule to those channels only. The response always includes the current linked IDs.
Alerts — channels
```http
GET /api/alerts/channels
POST /api/alerts/channels
PUT /api/alerts/channels/{id}
DELETE /api/alerts/channels/{id}
POST /api/alerts/channels/{id}/test
```
Body for create/update:
```json
{
  "name": "Ops ntfy",
  "type": "ntfy",
  "config": { "url": "https://ntfy.sh/lumen-alerts", "priority": "high" },
  "enabled": true,
  "min_severity": "warning"
}
```
type ∈ ntfy | discord | webhook | telegram. min_severity (info | warning | critical, default info) makes the channel ignore events below that rank. The test action dispatches a synthetic firing notification synchronously and returns { "ok": true } on success or { "ok": false, "error": "…" } on failure (HTTP 502).
Per-type config shape:
- ntfy – { "url": "<topic url>", "priority"?: "min|low|default|high|urgent", "topic"?: "..." }
- discord – { "url": "<webhook url>" }
- webhook – { "url": "<post endpoint>" }
- telegram – { "bot_token": "<BotFather token>", "chat_id": "<numeric or @username>", "parse_mode"?: "HTML|Markdown|MarkdownV2" }
On GET/list, the telegram bot_token is masked to **********; PUT with the mask preserves the stored token, PUT with a real value rotates it.
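The min_severity filter on a channel amounts to a rank comparison. A minimal sketch, assuming the ranks follow the order in which the severities are listed (info lowest, critical highest); `channelAccepts` is a hypothetical name, not a hub function.

```typescript
type Severity = "info" | "warning" | "critical";

// Assumed ranking: info < warning < critical.
const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, critical: 2 };

// A channel accepts an event whose severity is at or above its
// min_severity; a channel without min_severity defaults to "info".
function channelAccepts(eventSeverity: Severity, minSeverity: Severity = "info"): boolean {
  return SEVERITY_RANK[eventSeverity] >= SEVERITY_RANK[minSeverity];
}
```

With the example channel above (min_severity "warning"), warning and critical events are delivered and info events are ignored.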
Alerts — events
```http
GET /api/alerts/events?state=firing|all&limit=100
```
Returns the persisted alert history newest-first. state=firing returns only currently-firing events; state=all includes resolved ones. limit defaults to 100, caps at 500.
Alerts — deliveries
```http
GET /api/alerts/deliveries?status=pending|inflight|sent|failed|dropped&channel_id=N&severity=critical|warning|info&limit=100
```
Returns the persisted notification dispatch log newest-first. Every (event × channel) attempt is one row. Empty filter values = no filter.
Each row:
```json
{
  "id": 42,
  "event_id": 17,
  "channel_id": 1,
  "channel_name": "pager",
  "channel_type": "ntfy",
  "severity": "critical",
  "status": "sent",
  "attempts": 1,
  "http_status": 200,
  "error": null,
  "next_retry_at": null,
  "payload": { /* the Notification JSON dispatched */ },
  "created_at": "2026-05-29T08:30:00Z",
  "sent_at": "2026-05-29T08:30:01Z"
}
```
```http
POST /api/alerts/deliveries/{id}/retry
```
Resets a failed or dropped row back to pending with attempts=0 and next_retry_at=NULL. The next dispatcher tick picks it up. Returns 404 if the id is not in a retryable state.
Agent policy
```http
GET /api/agent/policy
Authorization: Bearer lum_REPLACE_ME
→ 200 {"collection_interval":"5s"}
```
Agents call this endpoint after successful ticks to pick up runtime policy from the hub. Today the policy contains the effective collection interval from Settings → Runtime. The agent keeps its env/YAML interval as the bootstrap default, then applies this hub policy without a redeploy.
A 401 means the token is invalid or rotated. The response is intentionally small so agents can poll it cheaply.
WebSocket — /api/stream
Open with the session cookie attached:
```js
const ws = new WebSocket("wss://lumen.example.lan/api/stream");
ws.onmessage = (e) => {
  const snapshots = JSON.parse(e.data); // HostSnapshot[]
};
```
The hub pushes a snapshot every LUMEN_HUB_STREAM_INTERVAL
(default 5 s). Each frame is the entire array of currently-known
hosts (subject to the subscription filter — see below).
HostSnapshot shape
```ts
type HostSnapshot = {
  host: string;
  ts: string;                    // RFC3339
  cpu_pct: number;
  cpu_per_core?: number[];       // live-only
  ram_pct: number;
  swap_pct: number;
  disk_pct: number;
  load1: number;
  load5: number;
  load15: number;
  net_rx_bps: number;
  net_tx_bps: number;
  disk_r_bps: number;
  disk_w_bps: number;
  temp_c: number;
  containers?: ContainerInfo[];  // live-only
  cpu_series?: number[];         // last ~120 CPU% values, oldest first
};

type ContainerInfo = {
  id: string;
  name: string;
  image: string;
  state: string;
  cpu_pct: number;
  mem_used_bytes: number;
  mem_limit_bytes: number;
  mem_pct: number;
};
```
Subscribe (control frame)
By default a connection receives every host snapshot. Send a control frame to narrow:
```js
// Filter to one host (used by the detail view)
ws.send(JSON.stringify({type: "subscribe", hosts: ["webA"]}));

// Revert to firehose
ws.send(JSON.stringify({type: "subscribe", hosts: ["*"]}));
```
Empty list or no frame ever sent = firehose (Phase 1 dashboard behavior). Unknown control types are ignored.
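On the client, the subscribe frame and the incoming snapshot array can be wrapped in two small helpers. This is a sketch, not part of the shipped UI: `subscribeFrame`, `parseStreamFrame`, and the reduced `SnapshotSummary` type are hypothetical, and the parser checks only three fields of each entry rather than the full HostSnapshot shape.

```typescript
// Builds the subscribe control frame; pass ["*"] to return to the firehose.
function subscribeFrame(hosts: string[]): string {
  return JSON.stringify({ type: "subscribe", hosts });
}

// Reduced view of a snapshot; the full HostSnapshot type has more fields.
type SnapshotSummary = { host: string; ts: string; cpu_pct: number };

// Parses one stream frame (a JSON array) and keeps only entries that
// carry the three fields this sketch cares about. JSON.parse throws on
// malformed input, so callers may want a try/catch around this.
function parseStreamFrame(data: string): SnapshotSummary[] {
  const parsed: unknown = JSON.parse(data);
  if (!Array.isArray(parsed)) return [];
  const summaries: SnapshotSummary[] = [];
  for (const item of parsed) {
    if (
      item !== null &&
      typeof item === "object" &&
      typeof item.host === "string" &&
      typeof item.ts === "string" &&
      typeof item.cpu_pct === "number"
    ) {
      summaries.push({ host: item.host, ts: item.ts, cpu_pct: item.cpu_pct });
    }
  }
  return summaries;
}
```

In a browser these would be used as `ws.send(subscribeFrame(["webA"]))` and, inside `ws.onmessage`, `parseStreamFrame(e.data)`. Because each frame carries the whole array of known hosts, the result can replace the previous state rather than being merged into it.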
Error format
Every 4xx / 5xx response from /api/* is JSON:
```json
{"error": "human-readable message"}
```
Validation errors include the offending field name when possible
(e.g. "error":"cpu_pct out of [0,100]"). 401 / 403 messages are
intentionally terse to avoid information leaks.
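A client can rely on the single error shape but should still tolerate bodies that do not follow it, for example an HTML page from a reverse proxy in front of the hub. A minimal sketch; `apiErrorMessage` is a hypothetical helper and the "HTTP <status>" fallback text is an arbitrary choice.

```typescript
// Extracts the message from an error body, falling back to the status
// code when the body is not the documented {"error": "..."} shape.
function apiErrorMessage(status: number, bodyText: string): string {
  try {
    const parsed: unknown = JSON.parse(bodyText);
    if (parsed !== null && typeof parsed === "object") {
      const message = (parsed as { error?: unknown }).error;
      if (typeof message === "string") return message;
    }
  } catch (e) {
    // Not JSON; fall through to the generic message below.
  }
  return `HTTP ${status}`;
}
```

Typical use is to read the response body as text whenever the status is 400 or above and pass both to this function before showing the result to the user.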
Versioning
Pre-v0.1.0 the API is /api/* and can break between commits. At
v0.1.0 the stable surface moves to /api/v1/* and the unversioned
paths become aliases for one release. After v0.2.0, unversioned
paths are removed.