API

The hub speaks two protocols:

  • HTTP/JSON under /api/* for control + history + auth
  • WebSocket at /api/stream for live updates

A canonical OpenAPI 3.1 spec lives in the repo at api/openapi.yaml; import it into Postman, Insomnia, Bruno, or Hoppscotch for ready-to-use request collections. There is also an api/lumen.http file for the VS Code REST Client.

Authentication models

Two different schemes, picked per endpoint:

Scheme | Used by | Sent how
Session cookie (lumen_session) | Browser + UI endpoints | HS256 JWT in HttpOnly cookie set by /api/login
Bearer token (lum_…) | Agent ingest and policy | Authorization: Bearer lum_… header

The session cookie is set with HttpOnly, SameSite=Lax, and Secure (under HTTPS), and has a 30-day TTL. The bearer token is minted per host in Settings → Hosts → Create; the plaintext is shown once and never stored. Only its SHA-256 hash lives in the DB.
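As an illustration of the cookie attributes described above, here is a minimal sketch of how such a Set-Cookie value could be assembled. The helper name sessionCookieHeader, the use of Max-Age rather than Expires, and the Path=/ attribute are assumptions made for the example; the source only states HttpOnly, SameSite=Lax, Secure under HTTPS, and the 30-day TTL.

```typescript
// Hypothetical helper: formats a Set-Cookie value with the attributes listed above.
const SESSION_TTL_SECONDS = 30 * 24 * 60 * 60; // 30-day TTL

function sessionCookieHeader(jwt: string, isHttps: boolean): string {
  const parts = [
    `lumen_session=${jwt}`,
    "Path=/", // assumption: the cookie path is not stated in this document
    "HttpOnly",
    "SameSite=Lax",
    `Max-Age=${SESSION_TTL_SECONDS}`, // assumption: TTL expressed as Max-Age
  ];
  if (isHttps) parts.push("Secure"); // Secure is only added under HTTPS
  return parts.join("; ");
}
```

Calling sessionCookieHeader(token, false) on a plain-HTTP deployment simply omits the Secure attribute.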

Endpoints

Health

GET /healthz
→ 200 {"status":"ok"}

No auth. Returns 200 as long as the hub process is up. Use this for container healthchecks and uptime monitors.

Ingest

POST /api/ingest
Authorization: Bearer lum_REPLACE_ME
Content-Type: application/json
{
  "host": "ignored-server-overrides",
  "ts": "2026-05-26T08:14:00Z",
  "cpu_pct": 12.5,
  "cpu_per_core": [10.1, 14.3, 12.7, 13.0],
  "ram_pct": 63.2,
  "swap_pct": 0.0,
  "disk_pct": 41.5,
  "load1": 0.42,
  "load5": 0.51,
  "load15": 0.49,
  "net_rx_bps": 10240,
  "net_tx_bps": 5120,
  "disk_r_bps": 0,
  "disk_w_bps": 20480,
  "temp_c": 48.3,
  "containers": [
    {
      "id": "abc123def456",
      "name": "nginx",
      "image": "nginx:1.27",
      "state": "running",
      "cpu_pct": 0.3,
      "mem_used_bytes": 78643200,
      "mem_limit_bytes": 536870912,
      "mem_pct": 14.6
    }
  ]
}
204 No Content

The hub looks up the token, sets req.host from the host record (any value the agent sent in host is overwritten), validates ranges, then stores the snapshot. A 401 means the token is invalid or absent; a 400 means a field is out of range (e.g. cpu_pct > 100).
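An agent-side pre-check can catch range errors before the hub answers with a 400. The sketch below is illustrative only: the document confirms the [0,100] range for cpu_pct, while applying the same range to ram_pct, swap_pct, and disk_pct is an assumption, and checkPercentRanges is a hypothetical name.

```typescript
// Percentage fields from the ingest payload; other fields are omitted for brevity.
type IngestScalars = {
  cpu_pct: number;
  ram_pct: number;
  swap_pct: number;
  disk_pct: number;
};

// Returns one message per percentage field that is not a finite number in [0, 100].
function checkPercentRanges(s: IngestScalars): string[] {
  const errors: string[] = [];
  const fields: (keyof IngestScalars)[] = ["cpu_pct", "ram_pct", "swap_pct", "disk_pct"];
  for (const f of fields) {
    const v = s[f];
    if (!Number.isFinite(v) || v < 0 || v > 100) {
      errors.push(`${f} out of [0,100]`);
    }
  }
  return errors;
}
```

An empty array means the payload passes this particular check; the hub may validate more than what is sketched here.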

cpu_per_core and containers are live-only — they flow through the WebSocket but aren’t persisted. The historical /api/hosts/{id}/metrics endpoint returns only the aggregate scalars.

Auth — setup status

GET /api/setup-status
→ 200 {"admin_exists": true}

The bootstrap flag the UI uses to decide between Register and Login. Once an admin exists, register is closed.

Auth — register first admin

POST /api/register
Content-Type: application/json
{"username":"admin","password":""}
201 {"user":{"id":1,"username":"admin","created_at":""}}
+ Set-Cookie: lumen_session=…

One-shot. Returns 409 if an admin already exists. Password is Argon2id-hashed; minimum 8 chars.

Auth — login

POST /api/login
{"username":"admin","password":""}
200 {"id":1,"username":"admin","created_at":""}
+ Set-Cookie: lumen_session=…

Returns 401 on wrong credentials. The body intentionally doesn’t distinguish “no such user” from “bad password” — same response either way.

Auth — logout / me

POST /api/logout
→ 204 (idempotent)
GET /api/me
→ 200 {"id":1,"username":"admin","created_at":""}
401 if no/expired session cookie

Auth — change password

POST /api/account/password
Cookie: lumen_session=…
{"current":"","new":""}
204
401 if current password is wrong
400 if new password is too short (<8)

Rehashes with Argon2id. The session cookie stays valid (we don’t force a re-login on password change).

Hosts — list

GET /api/hosts
Cookie: lumen_session=…
→ 200 [{"id":1,"name":"webA","created_at":"","last_seen_at":""}, …]

last_seen_at updates on every successful ingest; null until the host first checks in.

Hosts — create

POST /api/hosts
{"name":"webA"}
201 {"host":{"id":1,"name":"webA",…},"token":"lum_…"}

The bearer token is returned once — copy it now or rotate. The DB only stores its SHA-256 hash.

Hosts — rotate token

POST /api/hosts/{id}/rotate
→ 200 {"token":"lum_…"}

Invalidates the previous token; the existing agent will start receiving 401 responses until you redeploy it with the new token.

Hosts — delete

DELETE /api/hosts/{id}
→ 204

Also evicts that host’s in-memory snapshot so the dashboard stops showing a ghost card.

Hosts — set tags

PUT /api/hosts/{id}/tags
{ "tags": { "tier": "prod", "env": "prod", "team": "ops" } }
200 { "tags": { "tier": "prod", "env": "prod", "team": "ops" } }

Replaces the host’s tag set wholesale (no patch semantics). An empty map clears all tags. Keys may contain letters, digits, and the characters - _ . (max 64 chars); values may be up to 128 chars and may not contain = or ,. Up to 32 tags per host. Tags surface on GET /api/hosts (the tags field on each host) and feed the alert rule host_selector.
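The tag limits above can be checked client-side before sending the PUT. This is a sketch under stated assumptions: validateTags is a hypothetical helper, "letters" is taken to mean ASCII letters, and keys are assumed to be non-empty; the hub's own validation may differ.

```typescript
// Assumption: keys are 1 to 64 ASCII letters, digits, '-', '_' or '.'.
const TAG_KEY = /^[A-Za-z0-9._-]{1,64}$/;

// Returns a list of problems; an empty list means the tag map fits the documented limits.
function validateTags(tags: Record<string, string>): string[] {
  const errors: string[] = [];
  const entries = Object.entries(tags);
  if (entries.length > 32) errors.push("too many tags (max 32)");
  for (const [key, value] of entries) {
    if (!TAG_KEY.test(key)) errors.push(`bad key: ${key}`);
    if (value.length > 128 || /[=,]/.test(value)) errors.push(`bad value for ${key}`);
  }
  return errors;
}
```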

Hosts — metric history

GET /api/hosts/{id}/metrics?from=2026-05-25T08:00:00Z&to=2026-05-26T08:00:00Z&step=60
→ 200 {
  "host": "webA",
  "from": "2026-05-25T08:00:00Z",
  "to": "2026-05-26T08:00:00Z",
  "step_seconds": 60,
  "points": [
    {"ts":"2026-05-25T08:00:00Z","cpu_pct":12.5,"ram_pct":63.2,…},
    …
  ]
}

Server-side AVG bucketing on the idx_snapshots_host_ts index. Limits:

  • Window ≤ 7 days (rejected with 400 otherwise)
  • step ≥ 5 s
  • Maximum 2000 points per response (auto-step picks ~120 buckets if omitted)

Fields returned match the persisted scalars (no cpu_per_core, containers, or cpu_series).
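To make the limits concrete, the following sketch plans a history query on the client. It is not the hub's implementation: planMetricsQuery is a hypothetical name, the auto-step formula (window divided by 120, never below 5 s) is an assumption derived from the "~120 buckets" note, and rejecting requests that would exceed 2000 points is likewise assumed.

```typescript
const MAX_WINDOW_S = 7 * 24 * 3600; // window must be at most 7 days
const MIN_STEP_S = 5;               // step must be at least 5 s
const MAX_POINTS = 2000;            // maximum points per response
const AUTO_BUCKETS = 120;           // assumption: auto-step targets about 120 buckets

function planMetricsQuery(from: string, to: string, step?: number): { step: number; points: number } {
  const windowS = (Date.parse(to) - Date.parse(from)) / 1000;
  if (!(windowS > 0)) throw new Error("to must be after from");
  if (windowS > MAX_WINDOW_S) throw new Error("window exceeds 7 days");
  // When step is omitted, derive one from the window size.
  const chosen = step ?? Math.max(MIN_STEP_S, Math.ceil(windowS / AUTO_BUCKETS));
  if (chosen < MIN_STEP_S) throw new Error("step must be >= 5 s");
  const points = Math.ceil(windowS / chosen);
  if (points > MAX_POINTS) throw new Error("more than 2000 points");
  return { step: chosen, points };
}
```

For the 24-hour window in the example request, an omitted step works out to 720 s under these assumptions, while step=60 yields 1440 points.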

Version

GET /api/version
Cookie: lumen_session=…
→ 200 {
  "hub_version": "v0.2.0",
  "latest_agent_version": "v0.2.0"
}

The running hub’s build version. Because the hub and agent ship from the same release train, latest_agent_version mirrors hub_version — it is the newest agent a host could be running. The web UI compares each host’s reported system.agent_version (sent in every ingest) against this and shows an update-available badge when a host’s agent is behind. Source builds (no -ldflags) report "dev", which suppresses the badge to avoid noise.
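The badge decision described above could look roughly like this. It is a sketch, not the web UI's code: updateAvailable and versionRank are hypothetical names, versions are assumed to follow the vMAJOR.MINOR.PATCH form with each component below 1000, and any unparseable value (including "dev" on either side) is assumed to suppress the badge.

```typescript
// Maps "v0.2.0" to a comparable number, or null when the string is not vX.Y.Z.
function versionRank(v: string): number | null {
  const m = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (!m) return null;
  // Assumption: each component is below 1000, so the weighted sum preserves ordering.
  return Number(m[1]) * 1000000 + Number(m[2]) * 1000 + Number(m[3]);
}

// True when the host's agent is older than the newest agent version the hub reports.
function updateAvailable(agentVersion: string, latestAgentVersion: string): boolean {
  const agent = versionRank(agentVersion);
  const latest = versionRank(latestAgentVersion);
  if (agent === null || latest === null) return false; // "dev" or unknown: no badge
  return agent < latest;
}
```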

Settings — get / put

GET /api/settings
→ 200 {
  "retention_window":"24h",
  "retention_interval":"1h",
  "agent_interval":"5s",
  "downsample_bucket_size":"5m",
  "downsample_hot_window":"24h",
  "downsample_archive_window":"8760h"
}
PUT /api/settings
{"retention_window":"6h","downsample_bucket_size":"10m"}
200 {
  "retention_window":"6h",
  "retention_interval":"1h",
  "agent_interval":"5s",
  "downsample_bucket_size":"10m",
  "downsample_hot_window":"24h",
  "downsample_archive_window":"8760h"
}

Bounds:

Key | Range
retention_window | 5 m – 365 d (or 0 to disable)
retention_interval | 1 m – 24 h (or 0 to disable)
agent_interval | 2 s – 1 h
downsample_bucket_size | 1 m – 24 h
downsample_hot_window | 1 h – 30 d
downsample_archive_window | 1 d – 365 d

The downsample values configure the planned Parquet cold tier. downsample_bucket_size is the time span represented by one archived point (5m averages old samples into one point per 5 minutes). downsample_hot_window is how long full-detail raw SQLite rows are kept (24h keeps every sample for the last day). downsample_archive_window is how long compressed history is kept (8760h is 365 days). Out-of-range or unparseable durations return 400. UI edits to retention fields reach the retention loop within 30 s; downsample fields are stored now and take effect once cold-tier compaction lands.
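The bounds table can be mirrored in a small client-side check. The sketch below makes several assumptions: settingIsValid and parseDurationSeconds are hypothetical helpers, only the s, m, and h units seen in the examples are parsed, and a zero duration is accepted solely for the two retention keys.

```typescript
// Parses a subset of duration strings such as "5s", "10m", "1h30m", or "0".
function parseDurationSeconds(text: string): number | null {
  if (text === "0") return 0;
  if (!/^(\d+[smh])+$/.test(text)) return null;
  const re = /(\d+)([smh])/g;
  let total = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const unit = m[2] === "h" ? 3600 : m[2] === "m" ? 60 : 1;
    total += Number(m[1]) * unit;
  }
  return total;
}

const DAY = 86400;
// Bounds in seconds, copied from the table above.
const BOUNDS: Record<string, { min: number; max: number; zeroDisables: boolean }> = {
  retention_window: { min: 5 * 60, max: 365 * DAY, zeroDisables: true },
  retention_interval: { min: 60, max: DAY, zeroDisables: true },
  agent_interval: { min: 2, max: 3600, zeroDisables: false },
  downsample_bucket_size: { min: 60, max: DAY, zeroDisables: false },
  downsample_hot_window: { min: 3600, max: 30 * DAY, zeroDisables: false },
  downsample_archive_window: { min: DAY, max: 365 * DAY, zeroDisables: false },
};

function settingIsValid(key: string, value: string): boolean {
  const bounds = BOUNDS[key];
  const secs = parseDurationSeconds(value);
  if (!bounds || secs === null) return false; // unknown key or unparseable duration
  if (secs === 0) return bounds.zeroDisables;
  return secs >= bounds.min && secs <= bounds.max;
}
```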

Alerts — rules

GET /api/alerts/rules
POST /api/alerts/rules
PUT /api/alerts/rules/{id}
DELETE /api/alerts/rules/{id}

Body for create/update:

{
  "name": "CPU hot",
  "metric": "cpu_pct",
  "comparator": "gt",
  "threshold": 80,
  "for_seconds": 60,
  "host": "web-1,db-1",
  "host_selector": "tier=prod,env=prod",
  "severity": "warning",
  "enabled": true,
  "channel_ids": [1, 3]
}

metric: cpu_pct | ram_pct | swap_pct | disk_pct | load1 | offline. The offline metric ignores comparator and threshold, and raises for_seconds to a minimum of 60 s.

Host targeting precedence (first non-empty wins):

  1. host_selector — comma-separated key=value pairs. All must match a host’s tag set. Bare key (no =) matches when the tag exists with empty value.
  2. host — empty (all), exact name, comma list (web-1,db-1), or glob (web-*, *-prod, db-[0-9]*) using path.Match semantics. Comma list = OR across segments; each segment may itself be exact or glob.
  3. Both empty → every registered + ever-seen host.
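The precedence rules above can be sketched as a single matching function. This is an approximation rather than the hub's code: ruleTargets, selectorMatches, and globToRegExp are hypothetical names, and the glob translation covers only *, ?, and bracket classes, without the backslash escaping that path.Match also supports.

```typescript
type Host = { name: string; tags: Record<string, string> };

// Approximates path.Match-style globs: * and ? never cross "/", [..] is a character class.
function globToRegExp(glob: string): RegExp {
  let out = "";
  for (let i = 0; i < glob.length; i++) {
    const c = glob.charAt(i);
    if (c === "*") out += "[^/]*";
    else if (c === "?") out += "[^/]";
    else if (c === "[") {
      const end = glob.indexOf("]", i + 1);
      if (end < 0) throw new Error("unclosed [ in pattern");
      out += glob.slice(i, end + 1); // copy the class as-is
      i = end;
    } else out += c.replace(/[.+^${}()|\\]/g, "\\$&"); // escape regex metacharacters
  }
  return new RegExp(`^${out}$`);
}

// Every key=value pair must match; a bare key requires the tag to exist with an empty value.
function selectorMatches(selector: string, tags: Record<string, string>): boolean {
  return selector.split(",").every((pair) => {
    const eq = pair.indexOf("=");
    const key = (eq < 0 ? pair : pair.slice(0, eq)).trim();
    const want = eq < 0 ? "" : pair.slice(eq + 1).trim();
    return tags[key] === want;
  });
}

// host_selector wins when non-empty, then host; both empty targets every host.
function ruleTargets(rule: { host: string; host_selector: string }, h: Host): boolean {
  if (rule.host_selector !== "") return selectorMatches(rule.host_selector, h.tags);
  if (rule.host === "") return true;
  return rule.host.split(",").some((seg) => globToRegExp(seg.trim()).test(h.name));
}
```

With the example rule body (host "web-1,db-1" and host_selector "tier=prod,env=prod"), the selector takes precedence and the host list is never consulted.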

channel_ids is the rule’s routing link set. Omit it or pass null to leave links unchanged on PUT. Pass an empty array to clear all links, which makes the rule broadcast to every enabled channel (the Milestone-A default). Pass IDs to scope the rule to those channels only. The response always includes the current linked IDs.
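The three channel_ids cases can be expressed as two small functions. This is a sketch with hypothetical names (nextChannelIds, deliveryTargets); it also assumes that a disabled channel is skipped even when a rule links to it, which the text above does not state.

```typescript
// undefined or null leaves the links untouched; any array (including []) replaces them.
function nextChannelIds(current: number[], incoming: number[] | null | undefined): number[] {
  return incoming == null ? current : [...incoming];
}

// An empty link set broadcasts to every enabled channel; otherwise only linked ones are used.
function deliveryTargets(linked: number[], channels: { id: number; enabled: boolean }[]): number[] {
  const enabled = channels.filter((c) => c.enabled).map((c) => c.id);
  // Assumption: linked but disabled channels receive nothing.
  return linked.length === 0 ? enabled : enabled.filter((id) => linked.includes(id));
}
```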

Alerts — channels

GET /api/alerts/channels
POST /api/alerts/channels
PUT /api/alerts/channels/{id}
DELETE /api/alerts/channels/{id}
POST /api/alerts/channels/{id}/test

Body for create/update:

{
  "name": "Ops ntfy",
  "type": "ntfy",
  "config": { "url": "https://ntfy.sh/lumen-alerts", "priority": "high" },
  "enabled": true,
  "min_severity": "warning"
}

type: ntfy | discord | webhook | telegram. min_severity (info | warning | critical, default info) makes the channel ignore events below that rank. The test action dispatches a synthetic firing notification synchronously and returns { "ok": true } on success or { "ok": false, "error": "…" } on failure (HTTP 502).
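The min_severity filter amounts to a rank comparison. A minimal sketch follows, assuming the ordering info < warning < critical implied by the text; channelAccepts is a hypothetical name.

```typescript
type Severity = "info" | "warning" | "critical";

// Assumed ordering: info is the lowest rank, critical the highest.
const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, critical: 2 };

// A channel accepts an event when the event's rank is at or above the channel's minimum.
function channelAccepts(eventSeverity: Severity, minSeverity: Severity = "info"): boolean {
  return SEVERITY_RANK[eventSeverity] >= SEVERITY_RANK[minSeverity];
}
```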

Per-type config shape:

  • ntfy: { "url": "<topic url>", "priority"?: "min|low|default|high|urgent", "topic"?: "..." }
  • discord: { "url": "<webhook url>" }
  • webhook: { "url": "<post endpoint>" }
  • telegram: { "bot_token": "<BotFather token>", "chat_id": "<numeric or @username>", "parse_mode"?: "HTML|Markdown|MarkdownV2" }

On GET/list, the telegram bot_token is masked to **********; PUT with the mask preserves the stored token, PUT with a real value rotates it.
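The mask round-trip can be illustrated as follows. This is a sketch of the described behavior, not the hub's implementation; maskForRead and mergeOnPut are hypothetical names.

```typescript
const TOKEN_MASK = "**********"; // the mask value shown on GET/list

type TelegramConfig = { bot_token: string; chat_id: string; parse_mode?: string };

// What a GET/list response would expose: everything except the real token.
function maskForRead(cfg: TelegramConfig): TelegramConfig {
  return { ...cfg, bot_token: TOKEN_MASK };
}

// On PUT, the mask means "keep the stored token"; any other value replaces it.
function mergeOnPut(stored: TelegramConfig, incoming: TelegramConfig): TelegramConfig {
  const keepStored = incoming.bot_token === TOKEN_MASK;
  return { ...incoming, bot_token: keepStored ? stored.bot_token : incoming.bot_token };
}
```

A client can therefore fetch a channel, edit chat_id, and PUT the object back unchanged otherwise without losing the stored token.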

Alerts — events

GET /api/alerts/events?state=firing|all&limit=100

Returns the persisted alert history newest-first. state=firing returns only currently-firing events; state=all includes resolved ones. limit defaults to 100, caps at 500.

Alerts — deliveries

GET /api/alerts/deliveries?status=pending|inflight|sent|failed|dropped&channel_id=N&severity=critical|warning|info&limit=100

Returns the persisted notification dispatch log newest-first. Every (event × channel) attempt is one row. Empty filter values = no filter.

Each row:

{
  "id": 42,
  "event_id": 17,
  "channel_id": 1,
  "channel_name": "pager",
  "channel_type": "ntfy",
  "severity": "critical",
  "status": "sent",
  "attempts": 1,
  "http_status": 200,
  "error": null,
  "next_retry_at": null,
  "payload": { /* the Notification JSON dispatched */ },
  "created_at": "2026-05-29T08:30:00Z",
  "sent_at": "2026-05-29T08:30:01Z"
}

POST /api/alerts/deliveries/{id}/retry

Resets a failed or dropped row back to pending with attempts=0 and next_retry_at=NULL. The next dispatcher tick picks it up. Returns 404 if the id is not in a retryable state.
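The retry transition can be modeled as a pure function over a delivery row. The sketch below is illustrative: retryDelivery is a hypothetical name, only the fields relevant to the reset are included, and a null result stands in for the 404 case.

```typescript
type DeliveryStatus = "pending" | "inflight" | "sent" | "failed" | "dropped";

type DeliveryRow = {
  id: number;
  status: DeliveryStatus;
  attempts: number;
  next_retry_at: string | null;
};

// Returns the reset row, or null when the row is not in a retryable state (the 404 case).
function retryDelivery(row: DeliveryRow): DeliveryRow | null {
  if (row.status !== "failed" && row.status !== "dropped") return null;
  return { ...row, status: "pending", attempts: 0, next_retry_at: null };
}
```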

Agent policy

GET /api/agent/policy
Authorization: Bearer lum_REPLACE_ME
→ 200 {"collection_interval":"5s"}

Agents call this endpoint after successful ticks to pick up runtime policy from the hub. Today the policy contains the effective collection interval from Settings → Runtime. The agent keeps its env/YAML interval as the bootstrap default, then applies this hub policy without a redeploy.

A 401 means the token is invalid or rotated. The response is intentionally small so agents can poll it cheaply.

WebSocket — /api/stream

Open with the session cookie attached:

const ws = new WebSocket("wss://lumen.example.lan/api/stream");
ws.onmessage = (e) => {
  const snapshots = JSON.parse(e.data); // HostSnapshot[]
};

The hub pushes a snapshot every LUMEN_HUB_STREAM_INTERVAL (default 5 s). Each frame is the entire array of currently-known hosts (subject to the subscription filter — see below).

HostSnapshot shape

type HostSnapshot = {
  host: string;
  ts: string; // RFC3339
  cpu_pct: number;
  cpu_per_core?: number[]; // live-only
  ram_pct: number;
  swap_pct: number;
  disk_pct: number;
  load1: number; load5: number; load15: number;
  net_rx_bps: number; net_tx_bps: number;
  disk_r_bps: number; disk_w_bps: number;
  temp_c: number;
  containers?: ContainerInfo[]; // live-only
  cpu_series?: number[]; // last ~120 CPU% values, oldest first
};
type ContainerInfo = {
  id: string; name: string; image: string; state: string;
  cpu_pct: number;
  mem_used_bytes: number; mem_limit_bytes: number; mem_pct: number;
};

Subscribe (control frame)

By default a connection receives every host snapshot. Send a control frame to narrow:

// Filter to one host (used by the detail view)
ws.send(JSON.stringify({type: "subscribe", hosts: ["webA"]}));
// Revert to firehose
ws.send(JSON.stringify({type: "subscribe", hosts: ["*"]}));

Empty list or no frame ever sent = firehose (Phase 1 dashboard behavior). Unknown control types are ignored.
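The filter semantics can be sketched from the receiving side as two pure functions. This is not the hub's code: nextFilter and applyFilter are hypothetical names, and treating malformed JSON or a subscribe frame without a hosts array as a no-op is an assumption.

```typescript
type Snapshot = { host: string };

// Updates the per-connection filter from a control frame; unknown frames leave it unchanged.
function nextFilter(current: string[], frameText: string): string[] {
  let frame: unknown;
  try {
    frame = JSON.parse(frameText);
  } catch (_e) {
    return current; // assumption: malformed frames are ignored
  }
  if (typeof frame !== "object" || frame === null) return current;
  const f = frame as { type?: unknown; hosts?: unknown };
  if (f.type !== "subscribe") return current; // unknown control types are ignored
  if (!Array.isArray(f.hosts)) return current; // assumption: missing hosts is a no-op
  return f.hosts.filter((h): h is string => typeof h === "string");
}

// An empty filter or a "*" entry means firehose; otherwise keep only the named hosts.
function applyFilter<T extends Snapshot>(snapshots: T[], filter: string[]): T[] {
  if (filter.length === 0 || filter.includes("*")) return snapshots;
  return snapshots.filter((s) => filter.includes(s.host));
}
```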

Error format

Every 4xx / 5xx response from /api/* is JSON:

{"error": "human-readable message"}

Validation errors include the offending field name when possible (e.g. "error":"cpu_pct out of [0,100]"). 401 / 403 messages are intentionally terse to avoid information leaks.
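A client can lean on this format to surface error messages uniformly. The sketch below uses a hypothetical helper name (apiErrorMessage); the fallback to a generic "HTTP <status>" string for non-JSON bodies, for example from a reverse proxy in front of the hub, is an assumption.

```typescript
// Returns null for non-error statuses, the "error" field when present, else a generic label.
function apiErrorMessage(status: number, bodyText: string): string | null {
  if (status < 400) return null;
  try {
    const body: unknown = JSON.parse(bodyText);
    if (typeof body === "object" && body !== null) {
      const message = (body as { error?: unknown }).error;
      if (typeof message === "string") return message;
    }
  } catch (_e) {
    // Not JSON: fall through to the generic message below.
  }
  return `HTTP ${status}`;
}
```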

Versioning

Pre-v0.1.0 the API is /api/* and can break between commits. At v0.1.0 the stable surface moves to /api/v1/* and the unversioned paths become aliases for one release. After v0.2.0, unversioned paths are removed.