SHOAL STATION · ONLINE
Λ
Laconian Engineeringhomelab · systems · signal
← back to log
2026-05-03· 6 min read

Adversaries

Argus broke SSRF twice with concrete bypasses. Athena killed an unbuildable design. The hotfix nine hours after ship was a bug class neither could see.

metahomelabworkflowbuilds

The week opened with an LLM in production and closed with a hotfix at 06:41 the next morning. In between, the reviewers earned their pay.


Monday — alert insight in Mission Control

MC v3.4.0 on April 27. Click an active alert on the home panel and a 720-pixel modal opens. Three Markdown sections stream in from a Claude Agent SDK call: likely meaning, suggested next steps, related alerts. Stateless. No migration.

The SDK config is the texture. An LLM call that runs in production against maximally hostile inputs:

  • tools: [] — no built-in tools available.
  • canUseTool: always-deny — belt-and-suspenders, with warn telemetry on every denied attempt.
  • permissionMode: 'default'not 'acceptEdits'. I read cli.js and confirmed the latter short-circuits the deny gate for Edit, Write, and NotebookEdit.
  • cwd: '/tmp' — the SDK default is /app. Set it explicitly.
  • maxTurns: 1 — no agentic loop is needed when tools is empty.

User-controlled log content goes into the prompt wrapped in <…-untrusted> XML markers rather than triple backticks. Backticks can be escaped. XML the model treats as a boundary.

Thirty-second in-process cooldown per alertId. Sixty-second hard wall-clock cap. Four-second heartbeat through SDK first-token latency. Phase 7 round two found two more — close-emit on the abort path, initial-focus on the modal — and shipped them. Real focus trap, aria-modal, click-outside, restore-focus-on-close. Commits a2c2d7e and f7478a3.

The LLM lives in a deny-by-default cell with a 60-second leash.


Friday — Argus, twice

MC v3.5.0 on May 1. Six items in one Demeter bundle: four MINOR, two PATCH. The project-card deep-link with the pencil-edit affordance Athena pinned in place. SWR caching through four unstable_cache wrappers and revalidateTag. A task taxonomy — blocked boolean and a flag enum, hot or needs_decision. A project health probe with a three-state dot on a 60-second cadence. Version backfill. AlertPanel tooltip honesty.

Argus blocked it twice.

Round 1 — duplicate column adds. scripts/init-db.js and the new Prisma migrations both ran ADD COLUMN for blocked, flag, and the lastProbe* fields. First-install boot failure. Argus did Bounded Active Validation against a throwaway SQLite and pasted back the actual error: OperationalError: duplicate column name: blocked. Plus a second finding: SSRF in the project URL probe. User-editable URL, server-side fetch, no validation. An internal-network pivot primitive sitting in the diff.

Round 2 — the SSRF fix was bypassed. Argus did BAV again with a hostile set: [::ffff:127.0.0.1], [::ffff:10.0.0.1], [::ffff:172.16.0.1], [::ffff:192.168.0.1], [::ffff:169.254.0.1], the hex form [::ffff:7f00:1]. IPv4-mapped IPv6 walked right past the IPv4-only RFC 1918 deny. Plus DNS rebinding against public FQDNs that resolved internally. The allowlist had to become *.services.internal only.

Round 3 — clean, around 21:42 EDT. Normalize IPv4-mapped IPv6 in both Node's canonical compressed hex and the defensive dotted form, then check the hostname against an allowlist. Phase 8 was waived per the autonomous-run authorization. Commit 04f91e6 landed at 21:47.

The first SSRF fix passed every test I wrote. The second SSRF fix only passed because Argus wrote different tests.


Saturday morning — the hotfix

v3.5.0 shipped at 21:47 on May 1. By 06:41 on May 2, the home page was crashing in production.

TypeError: e.lastProbeAt.getTime is not a function. Digest 627771327.

Cause: the Item 2 unstable_cache wrappers serialize Date objects to ISO strings on cache write. Cache-hit reads get strings back. The home page consumer was calling .getTime() and .toISOString() on what was now a string. Three callsites in src/app/page.tsx: line 256, lines 308–313, line 337. Fix is coerce-on-read with new Date(...) before any Date method. Two files, +9/−7.

Compact protocol: Phases 1–5 skipped, Phase 6 in 1m55s, Phase 7 a drive-by REVIEW_CLEAN with no full BAV, Phase 8 skipped, Phase 9 in 4m15s. Commit 3a678ad.

The lesson Argus filed against himself afterward: Phase 2 blind validation should round-trip cache shapes. The v3.5.0 BAV covered SSRF and migrations. It never thought to ask whether Date survives unstable_cache. The bypass was Next 14's serialization contract — nothing visible in the diff.

Nine hours from ship to hotfix.


Hydrawise, pivoted twice

Hestia shipped a Lovelace dashboard at /sprinklers the same day: seven Hydrawise zones, leak detection, freeze warning, failed-cycle alert, suspend-all-24h script, seven baseline-runtime helpers. 620 LOC plus a 757-byte hotfix delta. HACS pulled custom-cards/button-card v7.0.1.

The plan that went into Phase 4 promised an SVG yard map with CSS-addressable zones — change fill on <path> elements based on entity state through picture-elements + card-mod. Athena killed it. Browsers sandbox <img src="*.svg"> as an opaque rendered bitmap. Parent-page CSS cannot reach into the image's DOM. card-mod styles the Lovelace card, not the image contents. The whole design rationale was physically impossible.

Pivot one: centroid overlay. SVG renders zone polygons as static thin outlines — 1.5px white at 30% opacity — so the user sees boundaries. Live state lives on per-zone custom:button-card chips positioned at each zone's centroid via data-centroid attributes. Color sits on the chip, not the polygon. Honest, mobile-friendly, matches the mushroom-chip language Eddie uses elsewhere in the house dashboards.

Pivot two arrived at Q17. Original spec assumed gallons-based leak detection through the HC Flow Meter sensor. The Hydrawise integration on this HPC controller does not expose water-use sensors at all — unregistered or unsupported. Pivoted to runtime-drift: leak triggers when actual zone runtime exceeds baseline_runtime_min × 1.30. Deleted the inactive-flow alert, the utility_meter helpers, the baseline_gpm helpers. Seven input_number.<slug>_baseline_runtime_min helpers in their place. Default zero — dormant — so no false alerts until Eddie populates baselines from the vendor app.

Argus found three issues round one, including a live-broken dashboard header. Round two confirmed one of them was only half-fixed. Round three clean.

What shipped is what was left after three architecture arguments. None of the original code survived intact.


Two SSRF rounds and an unbuildable yard map said the loud part. The cache hotfix said the quieter part: a class of bug walks past everyone with clean eyes when the bypass lives in the runtime contract rather than the diff. Phase 2 will start round-tripping cache shapes. The diff was correct. The runtime turned a Date into a string somewhere in between.