API


The hub speaks two protocols:

HTTP/JSON under /api/* for control + history + auth
WebSocket at /api/stream for live updates

A canonical OpenAPI 3.1 spec lives in the repo at api/openapi.yaml; import it into Postman, Insomnia, Bruno, or Hoppscotch for ready-to-use request collections. There’s also an api/lumen.http file for the VS Code REST Client.

Authentication models

Two different schemes, picked per endpoint:

Scheme	Used by	Sent how
Session cookie (`lumen_session`)	Browser + UI endpoints	HS256 JWT in HttpOnly cookie set by `/api/login`
Bearer token (`lum_…`)	Agent ingest and policy	`Authorization: Bearer lum_…` header

The session cookie is set with HttpOnly, SameSite=Lax, and Secure (under HTTPS), and has a 30-day TTL. The bearer token is minted per host in Settings → Hosts → Create; the plaintext is shown once and never stored, and only its SHA-256 hash lives in the DB.
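For a non-browser client, the two schemes could be attached roughly as follows. This is a minimal sketch: the Credentials type and authHeaders helper are hypothetical names, and sending the session JWT in an explicit Cookie header is an assumption for scripted clients (browsers attach the cookie on their own).

```typescript
// Hypothetical helper: build request headers for one of the two schemes.
type Credentials =
  | { scheme: "session"; jwt: string }   // value of the lumen_session cookie
  | { scheme: "bearer"; token: string }; // per-host token, starts with "lum_"

function authHeaders(c: Credentials): Record<string, string> {
  if (c.scheme === "bearer") {
    if (!c.token.startsWith("lum_")) {
      throw new Error("bearer tokens are expected to start with lum_");
    }
    return { Authorization: `Bearer ${c.token}` };
  }
  // Assumption: scripted clients pass the cookie explicitly.
  return { Cookie: `lumen_session=${c.jwt}` };
}
```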

Endpoints

Health

GET /healthz
→ 200 {"status":"ok"}

No auth. Returns 200 as long as the hub process is up. Use this for container healthchecks and uptime monitors.

Ingest

POST /api/ingest
Authorization: Bearer lum_REPLACE_ME
Content-Type: application/json

{
  "host": "ignored-server-overrides",
  "ts":   "2026-05-26T08:14:00Z",
  "cpu_pct":     12.5,
  "cpu_per_core": [10.1, 14.3, 12.7, 13.0],
  "ram_pct":     63.2,
  "swap_pct":     0.0,
  "disk_pct":    41.5,
  "load1":        0.42,
  "load5":        0.51,
  "load15":       0.49,
  "net_rx_bps":  10240,
  "net_tx_bps":   5120,
  "disk_r_bps":      0,
  "disk_w_bps":  20480,
  "temp_c":      48.3,
  "containers": [
    {
      "id": "abc123def456",
      "name": "nginx",
      "image": "nginx:1.27",
      "state": "running",
      "cpu_pct": 0.3,
      "mem_used_bytes":  78643200,
      "mem_limit_bytes": 536870912,
      "mem_pct": 14.6
    }
  ]
}
→ 204 No Content

The hub looks up the token, sets req.host from the host record (any value the agent sent in host is overwritten), validates ranges, then stores the snapshot. A 401 means the token is invalid or absent; a 400 means a field is out of range (e.g. cpu_pct > 100).

cpu_per_core and containers are live-only — they flow through the WebSocket but aren’t persisted. The historical /api/hosts/{id}/metrics endpoint returns only the aggregate scalars.
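An agent can catch obvious range errors before sending, which avoids a round trip that ends in a 400. The sketch below is an assumption-heavy illustration: only the cpu_pct > 100 case is confirmed above, so treating ram_pct, swap_pct, and disk_pct as percentages in [0, 100] is a guess, and validateSnapshot is a hypothetical name.

```typescript
// Assumption: these four fields are percentages bounded to [0, 100].
// Only cpu_pct is confirmed by the text above.
const PERCENT_FIELDS = ["cpu_pct", "ram_pct", "swap_pct", "disk_pct"] as const;

// Returns an error message shaped like the hub's validation errors,
// or null when no checked field is out of range.
function validateSnapshot(snapshot: Record<string, unknown>): string | null {
  for (const field of PERCENT_FIELDS) {
    const value = snapshot[field];
    if (typeof value !== "number" || !Number.isFinite(value) || value < 0 || value > 100) {
      return `${field} out of [0,100]`;
    }
  }
  return null;
}
```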

Auth — setup status

GET /api/setup-status
→ 200 {"admin_exists": true}

The bootstrap flag the UI uses to decide between Register and Login. Once an admin exists, register is closed.

Auth — register first admin

POST /api/register
Content-Type: application/json
{"username":"admin","password":"…"}
→ 201 {"user":{"id":1,"username":"admin","created_at":"…"}}
  + Set-Cookie: lumen_session=…

One-shot. Returns 409 if an admin already exists. Password is Argon2id-hashed; minimum 8 chars.

Auth — login

POST /api/login
{"username":"admin","password":"…"}
→ 200 {"id":1,"username":"admin","created_at":"…"}
  + Set-Cookie: lumen_session=…

Returns 401 on wrong credentials. The body intentionally doesn’t distinguish “no such user” from “bad password” — same response either way.

Auth — logout / me

POST /api/logout
→ 204  (idempotent)

GET /api/me
→ 200 {"id":1,"username":"admin","created_at":"…"}
→ 401 if no/expired session cookie

Auth — change password

POST /api/account/password
Cookie: lumen_session=…
{"current":"…","new":"…"}
→ 204
→ 401 if current password is wrong
→ 400 if new password is too short (<8)

Rehashes with Argon2id. The session cookie stays valid (we don’t force a re-login on password change).

Hosts — list

GET /api/hosts
Cookie: lumen_session=…
→ 200 [{"id":1,"name":"webA","created_at":"…","last_seen_at":"…"}, …]

last_seen_at updates on every successful ingest; null until the host first checks in.

Hosts — create

POST /api/hosts
{"name":"webA"}
→ 201 {"host":{"id":1,"name":"webA",…},"token":"lum_…"}

The bearer token is returned only once: copy it now, or rotate the token later to get a new one. The DB stores only its SHA-256 hash.

Hosts — rotate token

POST /api/hosts/{id}/rotate
→ 200 {"token":"lum_…"}

Invalidates the previous token; the existing agent will start receiving 401 until you re-deploy it with the new token.

Hosts — delete

DELETE /api/hosts/{id}
→ 204

Also evicts that host’s in-memory snapshot so the dashboard stops showing a ghost card.

Hosts — set tags

PUT /api/hosts/{id}/tags
{ "tags": { "tier": "prod", "env": "prod", "team": "ops" } }
→ 200 { "tags": { "tier": "prod", "env": "prod", "team": "ops" } }

Replaces the host’s tag set wholesale (no patch semantics). An empty map clears all tags. Keys may contain letters, digits, and the characters - _ . (max 64 chars); values are at most 128 chars and may not contain = or a comma. Up to 32 tags per host. Tags surface on GET /api/hosts (tags field on each host) and feed the alert rule host_selector.
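The tag limits can be checked client-side before the PUT. The following is a sketch under stated assumptions: "letters" is read as ASCII letters, empty keys are treated as invalid, and validateTags is a hypothetical helper, not part of the hub.

```typescript
// Assumption: keys are 1-64 ASCII letters, digits, '-', '_' or '.'.
const TAG_KEY_PATTERN = /^[A-Za-z0-9._-]{1,64}$/;

// Returns a list of problems; an empty list means the tag set looks valid.
function validateTags(tags: Record<string, string>): string[] {
  const problems: string[] = [];
  const entries = Object.entries(tags);
  if (entries.length > 32) {
    problems.push(`too many tags: ${entries.length} > 32`);
  }
  for (const [key, value] of entries) {
    if (!TAG_KEY_PATTERN.test(key)) problems.push(`invalid key: ${key}`);
    if (value.length > 128) problems.push(`value too long for ${key}`);
    if (/[=,]/.test(value)) problems.push(`value for ${key} contains = or ,`);
  }
  return problems;
}
```

An empty object passes, which matches the "empty map clears all tags" behavior.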

Hosts — metric history

GET /api/hosts/{id}/metrics?from=2026-05-25T08:00:00Z&to=2026-05-26T08:00:00Z&step=60
→ 200 {
    "host": "webA",
    "from": "2026-05-25T08:00:00Z",
    "to":   "2026-05-26T08:00:00Z",
    "step_seconds": 60,
    "points": [
      {"ts":"2026-05-25T08:00:00Z","cpu_pct":12.5,"ram_pct":63.2,…},
      …
    ]
  }

Server-side AVG bucketing on the idx_snapshots_host_ts index. Limits:

Window ≤ 7 days (rejected with 400 otherwise)
step ≥ 5 s
Maximum 2000 points per response (auto-step picks ~120 buckets if omitted)

Fields returned match the persisted scalars (no cpu_per_core, containers, or cpu_series).
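The limits above interact: a long window with a small step can exceed the point cap. A rough way to pre-check a query is sketched below. The name planMetricsQuery is hypothetical, the auto-step formula (window divided by 120, floored at 5 s) is an approximation of "~120 buckets", and treating an over-cap request as a client-side error is an assumption, since the text does not say how the hub reacts.

```typescript
const MAX_WINDOW_S = 7 * 24 * 3600; // window limit: 7 days
const MIN_STEP_S = 5;               // smallest allowed step
const MAX_POINTS = 2000;            // cap on points per response
const AUTO_BUCKETS = 120;           // approximate bucket count when step is omitted

function planMetricsQuery(from: Date, to: Date, step?: number): { step: number; points: number } {
  const windowS = (to.getTime() - from.getTime()) / 1000;
  if (windowS <= 0) throw new Error("to must be after from");
  if (windowS > MAX_WINDOW_S) throw new Error("window exceeds 7 days");

  // Assumption: auto-step is roughly window / 120, never below the minimum.
  const chosen = step ?? Math.max(MIN_STEP_S, Math.ceil(windowS / AUTO_BUCKETS));
  if (chosen < MIN_STEP_S) throw new Error("step must be >= 5 s");

  const points = Math.ceil(windowS / chosen);
  if (points > MAX_POINTS) throw new Error(`${points} points exceeds ${MAX_POINTS}`);
  return { step: chosen, points };
}
```

For the 24-hour example request with step=60, this yields 1440 points, which fits under the cap.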

Version

GET /api/version
Cookie: lumen_session=…
→ 200 {
    "hub_version": "v0.2.0",
    "latest_agent_version": "v0.2.0"
  }

The running hub’s build version. Because the hub and agent ship from the same release train, latest_agent_version mirrors hub_version — it is the newest agent a host could be running. The web UI compares each host’s reported system.agent_version (sent in every ingest) against this and shows an update-available badge when a host’s agent is behind. Source builds (no -ldflags) report "dev", which suppresses the badge to avoid noise.
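The badge decision described above could look roughly like this. It is a sketch, not the UI's actual code: it assumes versions are of the form vMAJOR.MINOR.PATCH, and it treats any string that does not parse (including "dev") on either side as a reason to hide the badge. Both function names are hypothetical.

```typescript
// Parse "v0.2.0" into [0, 2, 0]; returns null for "dev" or anything else.
function parseVersion(v: string): number[] | null {
  const m = /^v(\d+)\.(\d+)\.(\d+)$/.exec(v);
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

// True when the agent is strictly older than the latest known agent version.
function showUpdateBadge(agentVersion: string, latestAgentVersion: string): boolean {
  const agent = parseVersion(agentVersion);
  const latest = parseVersion(latestAgentVersion);
  if (!agent || !latest) return false; // unparseable, e.g. "dev": no badge
  for (let i = 0; i < 3; i++) {
    if (agent[i] !== latest[i]) return agent[i] < latest[i];
  }
  return false; // identical versions
}
```

Comparing components numerically rather than as strings matters once a component reaches two digits (v0.10.0 is newer than v0.9.0).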

Settings — get / put

GET /api/settings
→ 200 {
  "retention_window":"24h",
  "retention_interval":"1h",
  "agent_interval":"5s",
  "downsample_bucket_size":"5m",
  "downsample_hot_window":"24h",
  "downsample_archive_window":"8760h"
}

PUT /api/settings
{"retention_window":"6h","downsample_bucket_size":"10m"}
→ 200 {
  "retention_window":"6h",
  "retention_interval":"1h",
  "agent_interval":"5s",
  "downsample_bucket_size":"10m",
  "downsample_hot_window":"24h",
  "downsample_archive_window":"8760h"
}

Bounds:

Key	Range
`retention_window`	5 m – 365 d (or `0` to disable)
`retention_interval`	1 m – 24 h (or `0` to disable)
`agent_interval`	2 s – 1 h
`downsample_bucket_size`	1 m – 24 h
`downsample_hot_window`	1 h – 30 d
`downsample_archive_window`	1 d – 365 d

The downsample values configure the future Parquet cold tier: bucket size is the time span represented by one archived point (5m averages old samples into one point every 5 minutes), hot window is how long full-detail raw SQLite rows are kept (24h keeps every sample for the last day), and archive window is how long compressed history is kept (8760h is about one year). Out-of-range or unparseable durations return 400. UI edits propagate to the retention loop within 30 s for retention fields; downsample fields are stored now and consumed once cold-tier compaction lands.
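A client can apply the bounds table before sending a PUT. The sketch below assumes the hub accepts Go-style duration strings built from h, m, and s units (inferred from values such as 8760h), that bounds are inclusive, and that day-based bounds in the table translate to multiples of 24h. The names parseDurationSeconds and checkSetting are hypothetical.

```typescript
const UNIT_SECONDS: Record<string, number> = { h: 3600, m: 60, s: 1 };

// Parses "24h", "5m", "1h30m" or "0" into seconds; null when unparseable.
function parseDurationSeconds(text: string): number | null {
  if (text === "0") return 0;
  if (!/^(\d+[hms])+$/.test(text)) return null;
  let total = 0;
  for (const part of text.match(/\d+[hms]/g) ?? []) {
    total += Number(part.slice(0, -1)) * UNIT_SECONDS[part.slice(-1)];
  }
  return total;
}

type Bound = { min: number; max: number; zeroDisables: boolean };
const DAY = 86400;
const BOUNDS: Record<string, Bound> = {
  retention_window: { min: 5 * 60, max: 365 * DAY, zeroDisables: true },
  retention_interval: { min: 60, max: DAY, zeroDisables: true },
  agent_interval: { min: 2, max: 3600, zeroDisables: false },
  downsample_bucket_size: { min: 60, max: DAY, zeroDisables: false },
  downsample_hot_window: { min: 3600, max: 30 * DAY, zeroDisables: false },
  downsample_archive_window: { min: DAY, max: 365 * DAY, zeroDisables: false },
};

// Returns null when the value looks acceptable, else a short reason.
function checkSetting(key: string, value: string): string | null {
  const bound = BOUNDS[key];
  if (!bound) return `unknown setting ${key}`;
  const seconds = parseDurationSeconds(value);
  if (seconds === null) return `${key}: unparseable duration`;
  if (seconds === 0 && bound.zeroDisables) return null; // 0 disables retention
  if (seconds < bound.min || seconds > bound.max) return `${key}: out of range`;
  return null;
}
```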

Alerts — rules

GET    /api/alerts/rules
POST   /api/alerts/rules
PUT    /api/alerts/rules/{id}
DELETE /api/alerts/rules/{id}

Body for create/update:

{
  "name": "CPU hot",
  "metric": "cpu_pct",
  "comparator": "gt",
  "threshold": 80,
  "for_seconds": 60,
  "host": "web-1,db-1",
  "host_selector": "tier=prod,env=prod",
  "severity": "warning",
  "enabled": true,
  "channel_ids": [1, 3]
}

Host targeting precedence (first non-empty wins):

host_selector — comma-separated key=value pairs. All must match a host’s tag set. Bare key (no =) matches when the tag exists with empty value.
host — empty (all), exact name, comma list (web-1,db-1), or glob (web-*, *-prod, db-[0-9]*) using path.Match semantics. Comma list = OR across segments; each segment may itself be exact or glob.
Both empty → every registered + ever-seen host.
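The precedence rules above can be sketched as a single matching function. This is an illustration only: globToRegExp merely approximates path.Match (it supports *, ? and [...] classes, and treats an unclosed [ as a literal instead of reporting a bad pattern), no whitespace trimming is done, and all names here are hypothetical.

```typescript
type Host = { name: string; tags: Record<string, string> };
type RuleTarget = { host?: string; host_selector?: string };

// Approximate a path.Match-style glob as an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  let out = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") out += "[^/]*";
    else if (ch === "?") out += "[^/]";
    else if (ch === "[") {
      const end = glob.indexOf("]", i + 1);
      if (end === -1) { out += "\\["; continue; } // unclosed class: literal
      out += "[" + glob.slice(i + 1, end) + "]";
      i = end;
    } else out += ch.replace(/[.+^${}()|\\\]]/g, "\\$&");
  }
  return new RegExp(`^${out}$`);
}

// Every key=value pair must match; a bare key requires an empty value.
function matchesSelector(selector: string, tags: Record<string, string>): boolean {
  return selector.split(",").every((pair) => {
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const want = eq === -1 ? "" : pair.slice(eq + 1);
    return Object.prototype.hasOwnProperty.call(tags, key) && tags[key] === want;
  });
}

// Comma list is an OR across segments; each segment is exact or glob.
function matchesHostField(field: string, name: string): boolean {
  return field.split(",").some((segment) => globToRegExp(segment).test(name));
}

function ruleTargets(rule: RuleTarget, host: Host): boolean {
  if (rule.host_selector) return matchesSelector(rule.host_selector, host.tags);
  if (rule.host) return matchesHostField(rule.host, host.name);
  return true; // both empty: every host
}
```

With the example rule body, host_selector wins, so a host named cache-9 tagged tier=prod and env=prod is targeted even though it matches neither web-1 nor db-1.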

channel_ids is the rule’s routing link set. Omit or pass null to leave links unchanged on PUT. Pass an empty array to clear all links → broadcast to every enabled channel (Milestone-A default). Pass IDs to scope the rule to those channels only. The response always includes the current linked IDs.

Alerts — channels

GET    /api/alerts/channels
POST   /api/alerts/channels
PUT    /api/alerts/channels/{id}
DELETE /api/alerts/channels/{id}
POST   /api/alerts/channels/{id}/test

Body for create/update:

{
  "name": "Ops ntfy",
  "type": "ntfy",
  "config": { "url": "https://ntfy.sh/lumen-alerts", "priority": "high" },
  "enabled": true,
  "min_severity": "warning"
}

type ∈ ntfy | discord | webhook | telegram. min_severity (info | warning | critical, default info) makes the channel ignore events below that rank. The test action dispatches a synthetic firing notification synchronously and returns { "ok": true } on success or { "ok": false, "error": "…" } on failure (HTTP 502).
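The min_severity filter amounts to a rank comparison. A minimal sketch, assuming the rank order info < warning < critical implied by the list above (channelAccepts is a hypothetical name):

```typescript
type Severity = "info" | "warning" | "critical";

// Assumed rank order: info < warning < critical.
const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, critical: 2 };

// A channel accepts an event when the event's rank is at or above the
// channel's min_severity (which defaults to info).
function channelAccepts(eventSeverity: Severity, minSeverity: Severity = "info"): boolean {
  return SEVERITY_RANK[eventSeverity] >= SEVERITY_RANK[minSeverity];
}
```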

Per-type config shape:

ntfy – { "url": "<topic url>", "priority"?: "min|low|default|high|urgent", "topic"?: "..." }
discord – { "url": "<webhook url>" }
webhook – { "url": "<post endpoint>" }
telegram – { "bot_token": "<BotFather token>", "chat_id": "<numeric or @username>", "parse_mode"?: "HTML|Markdown|MarkdownV2" }

On GET/list, the telegram bot_token is masked to **********; PUT with the mask preserves the stored token, PUT with a real value rotates it.

Alerts — events

GET /api/alerts/events?state=firing|all&limit=100

Returns the persisted alert history newest-first. state=firing returns only currently-firing events; state=all includes resolved ones. limit defaults to 100, caps at 500.

Alerts — deliveries

GET /api/alerts/deliveries?status=pending|inflight|sent|failed|dropped&channel_id=N&severity=critical|warning|info&limit=100

Returns the persisted notification dispatch log newest-first. Every (event × channel) attempt is one row. Empty filter values = no filter.
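Because empty filter values mean "no filter", a client can simply skip them when building the query string. The following sketch shows one way to do that; deliveriesUrl and the DeliveryFilter type are hypothetical names, and dropping empty values client-side is a design choice rather than something the hub requires.

```typescript
type DeliveryFilter = {
  status?: "pending" | "inflight" | "sent" | "failed" | "dropped" | "";
  channel_id?: number;
  severity?: "critical" | "warning" | "info" | "";
  limit?: number;
};

// Build the request path, omitting filters that are unset or empty.
function deliveriesUrl(filter: DeliveryFilter = {}): string {
  const parts: string[] = [];
  for (const [key, value] of Object.entries(filter)) {
    if (value === undefined || value === "") continue; // empty = no filter
    parts.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  }
  const base = "/api/alerts/deliveries";
  return parts.length > 0 ? `${base}?${parts.join("&")}` : base;
}
```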

Each row:

{
  "id": 42,
  "event_id": 17,
  "channel_id": 1,
  "channel_name": "pager",
  "channel_type": "ntfy",
  "severity": "critical",
  "status": "sent",
  "attempts": 1,
  "http_status": 200,
  "error": null,
  "next_retry_at": null,
  "payload": { /* the Notification JSON dispatched */ },
  "created_at": "2026-05-29T08:30:00Z",
  "sent_at":    "2026-05-29T08:30:01Z"
}

POST /api/alerts/deliveries/{id}/retry

Resets a failed or dropped row back to pending with attempts=0 and next_retry_at=NULL. The next dispatcher tick picks it up. Returns 404 if the id is not in a retryable state.

Agent policy

GET /api/agent/policy
Authorization: Bearer lum_REPLACE_ME
→ 200 {"collection_interval":"5s"}

Agents call this endpoint after successful ticks to pick up runtime policy from the hub. Today the policy contains the effective collection interval from Settings → Runtime. The agent keeps its env/YAML interval as the bootstrap default, then applies this hub policy without a redeploy.

A 401 means the token is invalid or rotated. The response is intentionally small so agents can poll it cheaply.
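The "bootstrap default, then hub policy" behavior can be pictured as a small fallback function. This is a sketch of the idea, not the agent's implementation: parseIntervalMs handles only single-unit values such as 5s, 2m, or 1h, and falling back to the bootstrap interval when the policy is missing or unparseable is an assumption. Both names are hypothetical.

```typescript
type Policy = { collection_interval?: string };

// Parse a single-unit duration ("5s", "2m", "1h") into milliseconds.
function parseIntervalMs(text: string): number | null {
  const m = /^(\d+)(s|m|h)$/.exec(text);
  if (!m) return null;
  const unitMs = m[2] === "h" ? 3600000 : m[2] === "m" ? 60000 : 1000;
  return Number(m[1]) * unitMs;
}

// Prefer the hub's interval; keep the env/YAML bootstrap value otherwise.
function effectiveIntervalMs(bootstrapMs: number, policy: Policy | null): number {
  const raw = policy?.collection_interval;
  const fromHub = raw ? parseIntervalMs(raw) : null;
  return fromHub !== null && fromHub > 0 ? fromHub : bootstrapMs;
}
```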

WebSocket — `/api/stream`

Open with the session cookie attached:

const ws = new WebSocket("wss://lumen.example.lan/api/stream");
ws.onmessage = (e) => {
  const snapshots = JSON.parse(e.data); // HostSnapshot[]
};

The hub pushes a snapshot every LUMEN_HUB_STREAM_INTERVAL (default 5 s). Each frame is the entire array of currently-known hosts (subject to the subscription filter — see below).

HostSnapshot shape

type HostSnapshot = {
  host: string;
  ts: string;          // RFC3339
  cpu_pct: number;
  cpu_per_core?: number[];  // live-only
  ram_pct: number;
  swap_pct: number;
  disk_pct: number;
  load1: number; load5: number; load15: number;
  net_rx_bps: number; net_tx_bps: number;
  disk_r_bps: number; disk_w_bps: number;
  temp_c: number;
  containers?: ContainerInfo[];  // live-only
  cpu_series?: number[];         // last ~120 CPU% values, oldest first
};

type ContainerInfo = {
  id: string; name: string; image: string; state: string;
  cpu_pct: number;
  mem_used_bytes: number; mem_limit_bytes: number; mem_pct: number;
};

By default a connection receives every host snapshot. Send a control frame to narrow:

// Filter to one host (used by the detail view)
ws.send(JSON.stringify({type: "subscribe", hosts: ["webA"]}));

// Revert to firehose
ws.send(JSON.stringify({type: "subscribe", hosts: ["*"]}));

Empty list or no frame ever sent = firehose (Phase 1 dashboard behavior). Unknown control types are ignored.
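Since every frame carries the full array for the current subscription, a client can rebuild its view from each frame instead of merging. The helpers below are a hedged sketch: HostSnapshotLite is a trimmed stand-in for HostSnapshot, and subscribeFrame, isFirehose, and latestByHost are hypothetical names.

```typescript
// Trimmed stand-in for the HostSnapshot shape, for illustration only.
type HostSnapshotLite = { host: string; ts: string; cpu_pct: number };

// Serialize a subscribe control frame for ws.send().
function subscribeFrame(hosts: string[]): string {
  return JSON.stringify({ type: "subscribe", hosts });
}

// An empty list or the single wildcard means "all hosts".
function isFirehose(hosts: string[]): boolean {
  return hosts.length === 0 || (hosts.length === 1 && hosts[0] === "*");
}

// Replace the whole view on each frame, keyed by host name.
function latestByHost(frame: string): Map<string, HostSnapshotLite> {
  const snapshots = JSON.parse(frame) as HostSnapshotLite[];
  return new Map(snapshots.map((s) => [s.host, s] as [string, HostSnapshotLite]));
}
```

In an onmessage handler this would be used as latestByHost(e.data), with subscribeFrame(["webA"]) sent once the socket is open.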

Error format

Every 4xx / 5xx response from /api/* is JSON:

{"error": "human-readable message"}

Validation errors include the offending field name when possible (e.g. "error":"cpu_pct out of [0,100]"). 401 / 403 messages are intentionally terse to avoid information leaks.
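A client can turn this envelope into a typed error in one place. The sketch below takes the status code and raw body text so it stays independent of any HTTP library; ApiError and toApiError are hypothetical names, and the fallback to a generic message for non-JSON bodies (for example, from a proxy in front of the hub) is an assumption.

```typescript
class ApiError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = "ApiError";
    this.status = status;
  }
}

// Extract the "error" message from a 4xx/5xx body, with a generic fallback.
function toApiError(status: number, bodyText: string): ApiError {
  try {
    const parsed: unknown = JSON.parse(bodyText);
    if (typeof parsed === "object" && parsed !== null && "error" in parsed) {
      const message = (parsed as { error: unknown }).error;
      if (typeof message === "string") return new ApiError(status, message);
    }
  } catch (e) {
    // Body was not JSON; fall through to the generic message.
  }
  return new ApiError(status, `HTTP ${status}`);
}
```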

Versioning

Pre-v0.1.0 the API is /api/* and can break between commits. At v0.1.0 the stable surface moves to /api/v1/* and the unversioned paths become aliases for one release. After v0.2.0, unversioned paths are removed.
