Anton, chapter 3: Domains widen, browser hardens
The week opens on a piece of unfinished business. Doctolib search works. Detail enrichment does not. I want to drill into the appointment detail page from the search results so Anton can tell the family what's actually available, not just that something exists. The first attempt clicks back into the result. The second switches to direct navigation to avoid stale handles. The third adds diagnostic logging, then a 60-second budget, then per-page timeouts. Each fix surfaces the next failure. By mid-morning the answer is unambiguous: Cloudflare is fingerprinting the browser as a bot and blocking the navigation entirely. I disable detail enrichment and file the issue. The cleverness has to move somewhere else.
Persistent profiles
The unlock comes from changing the question. Instead of asking "how do I get into the detail page", I ask "what does a real browser look like that this one doesn't". The answer is persistent profiles. A real Chromium, not the bundled headless one, running in a profile directory that keeps cookies, history, and the thousand small fingerprints that accumulate over time. Once the browser is allowed to act like a browser, Cloudflare stops flagging it as a bot. Anti-detection flags help at the margins, but the real fix is identity continuity: a session the site recognizes as a returning human, not a fresh anonymous request from nowhere.
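A minimal sketch of the persistent-profile idea, assuming Playwright's Python bindings (the text does not say which automation library or language Anton uses). The profile root, the executable path, and the single flag shown are illustrative assumptions, not the project's actual settings.

```python
from pathlib import Path

# Hypothetical profile root; the real mount point is not given in the text.
PROFILE_ROOT = Path("/data/browser-profiles")


def profile_dir(site: str, root: Path = PROFILE_ROOT) -> Path:
    """Return (and create) the directory that holds one site's long-lived profile."""
    path = root / site
    path.mkdir(parents=True, exist_ok=True)
    return path


def launch_options(site: str, root: Path = PROFILE_ROOT) -> dict:
    """Keyword arguments for Playwright's launch_persistent_context."""
    return {
        "user_data_dir": str(profile_dir(site, root)),
        # Assumed path to an installed Chromium, instead of the bundled headless build.
        "executable_path": "/usr/bin/chromium",
        "headless": False,
        # Marginal help only; the continuity of the profile does the real work.
        "args": ["--disable-blink-features=AutomationControlled"],
    }


def open_context(playwright, site: str):
    """Open a context whose cookies and history survive restarts.

    `playwright` is the object yielded by playwright.sync_api.sync_playwright().
    """
    return playwright.chromium.launch_persistent_context(**launch_options(site))
```

Because the directory is reused on every launch, each run adds to the same cookies and history rather than starting from a blank identity.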
Lifting work to the LLM
Then the second move, which I like more. I rewrite enrichment to skip the detail page entirely. The list page already contains most of what we want, in unstructured text. So I feed the list page text to a single LLM call and ask for the structured fields back. One call. Cheaper than per-card DOM traversal, faster, and it doesn't fight the site. The lesson worth keeping: when the deterministic path costs you a battle with the host, lift the work up one level and let the model read the raw text. The LLM is the cheapest unit of work I have. I should be using it where it pays.
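The single-call enrichment described above could be sketched roughly like this. The field names and prompt wording are assumptions, and `complete` stands in for whatever model client is actually used; only the shape (one call, raw text in, structured rows out) comes from the text.

```python
import json
from typing import Callable

# Assumed field names; the text does not list which fields are extracted.
FIELDS = ("practitioner", "specialty", "address", "next_slot")

PROMPT = (
    "Extract every search result from the page text below. "
    "Reply with a JSON array of objects with exactly these keys: {keys}. "
    "Use null for anything the text does not state.\n\n{page_text}"
)


def enrich_from_list_page(page_text: str, complete: Callable[[str], str]) -> list[dict]:
    """One model call for the whole list page instead of one DOM walk per card.

    `complete` is any function that sends a prompt to a model and returns its text reply.
    """
    reply = complete(PROMPT.format(keys=", ".join(FIELDS), page_text=page_text))
    rows = json.loads(reply)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of results")
    # Keep only the known keys so a chatty reply cannot add unexpected fields.
    return [{key: row.get(key) for key in FIELDS} for row in rows if isinstance(row, dict)]
```

Passing the model in as a plain callable keeps the sketch testable with a stub and leaves the choice of provider open.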
The browser also needs to live somewhere. By midweek it lands in its own service container, dedicated, isolated, with the persistent profile mounted as a volume. CDP wiring takes a handful of commits to settle (a TCP proxy because Chromium binds where it likes, an ESM import detail, the WebSocket URL, stale lock cleanup). When it's done, every browser-touching domain (Doctolib, the syndic site, the consulate, generic web) shares the same hardened browser. One container, one profile per site, one place to fix things.
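Two of the wiring details above, stale lock cleanup and the WebSocket URL, can be sketched as small helpers. This assumes the profile directory holds Chromium's usual `Singleton*` marker files and that the URL Chromium reports needs its host and port swapped for the proxy's; the proxy host and port used in the usage note are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlsplit

# Files Chromium leaves in a profile directory to mark it as in use.
LOCK_FILES = ("SingletonLock", "SingletonCookie", "SingletonSocket")


def clear_stale_locks(profile: Path) -> list[str]:
    """Remove leftover lock files from an unclean shutdown; return the names removed.

    Only safe when no browser is currently running against this profile.
    """
    removed = []
    for name in LOCK_FILES:
        target = profile / name
        # SingletonLock is often a dangling symlink, so exists() alone would miss it.
        if target.is_symlink() or target.exists():
            target.unlink()
            removed.append(name)
    return removed


def proxied_ws_url(reported_url: str, proxy_host: str, proxy_port: int) -> str:
    """Rewrite the WebSocket URL Chromium reports so it points at the TCP proxy."""
    parts = urlsplit(reported_url)
    return parts._replace(netloc=f"{proxy_host}:{proxy_port}").geturl()
```

For example, `proxied_ws_url("ws://127.0.0.1:9222/devtools/browser/abc", "browser", 9223)` keeps the path and swaps only the address a client in another container would connect to.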
Alongside the browser saga, the media subsystem grows the features that turn it from a toy into something the family actually uses. The bag of media tools gets redesigned around an intent-driven request shape with four tools (status, library, watch, search). The watchlist gets follow and unfollow and a triage job that scans the catalog and tells you what's worth attention. The validator learns to respect the scope of a request so it stops retrying things that were never in scope. Show triage starts as "latest season, 14-day lookback" and then becomes "the last episode Plex actually has", which is a small but characteristic move: stop reasoning from arbitrary windows, reason from state.
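The shift from a lookback window to library state can be illustrated with a short sketch. The `Episode` type and the empty-library behaviour are assumptions made for the example; the real triage job presumably reads both lists from Plex and the catalog.

```python
from typing import NamedTuple


class Episode(NamedTuple):
    season: int
    number: int


def new_since_library(catalog: list[Episode], in_library: list[Episode]) -> list[Episode]:
    """Episodes in the catalog that come after the newest one the library holds."""
    if not in_library:
        # Assumed behaviour: with nothing in the library, everything counts as new.
        return sorted(catalog)
    last_owned = max(in_library)  # tuples compare by season, then episode number
    return sorted(ep for ep in catalog if ep > last_owned)
```

No date arithmetic is involved: the answer depends only on what the library already contains, so a show that was ignored for a month is triaged the same way as one watched yesterday.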
Movie night
The headline media feature is movie night. A scheduled job, Friday at six, that picks two or three movies for the family and posts them to the group. It's the first proactive message Anton sends. Not a reply, not an answer to a question, but an unsolicited suggestion at a fixed time. Six iterations of prompt refinement in two hours to land the tone. The early drafts close with "want me to download any of these?", which nobody asked for, and which makes the message read like a salesman. The scheduled-output rules go into the system prompt that same afternoon: no follow-up suggestions, no filler, follow the spec. A scheduled message arrives on its own terms or it doesn't arrive at all.
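As a rough sketch of the schedule itself, assuming "Friday at six" means 18:00 in the household's local time (in cron syntax that would be `0 18 * * 5`), the next run can be computed like this. The function name is hypothetical.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday(): Monday is 0


def next_movie_night(now: datetime) -> datetime:
    """Return the next Friday 18:00 strictly after `now`."""
    days_ahead = (FRIDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=18, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # It is already Friday 18:00 or later, so roll over to next week.
        candidate += timedelta(days=7)
    return candidate
```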
Pluggable domains
The 13th's evening commit is the one I'm proudest of structurally. Domain tools get refactored into pluggable modules. Each domain registers itself with a small definition shape; the parent agent's tool surface is built dynamically from whatever modules are present. The parent stops knowing about specific domains. It just knows it has tools. Two days later the syndic domain (the building management portal) gets registered as a new domain in three lines. Three. That's the kind of moment that tells you the abstraction was the right one. When the marginal cost of a new domain drops to nothing, you've found the seam.
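The registration pattern described above might look something like the following. The definition shape, the registry, and the `syndic` tool are all hypothetical; the text only says that each domain registers a small definition and that the parent's tool surface is built from whatever is registered.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class DomainDefinition:
    """The small shape each domain module hands to the registry."""
    name: str
    description: str
    tools: dict[str, Callable[..., str]] = field(default_factory=dict)


_REGISTRY: dict[str, DomainDefinition] = {}


def register_domain(definition: DomainDefinition) -> None:
    """Called by each domain module when it is loaded."""
    if definition.name in _REGISTRY:
        raise ValueError(f"domain already registered: {definition.name}")
    _REGISTRY[definition.name] = definition


def build_tool_surface() -> dict[str, Callable[..., str]]:
    """What the parent agent sees: a flat set of tools, with no domain-specific code."""
    surface: dict[str, Callable[..., str]] = {}
    for domain in _REGISTRY.values():
        for tool_name, handler in domain.tools.items():
            surface[f"{domain.name}.{tool_name}"] = handler
    return surface


# Adding a domain is a registration, not a change to the parent agent.
register_domain(DomainDefinition(
    name="syndic",
    description="Building management portal",
    tools={"list_notices": lambda: "no new notices"},
))
```

The parent only ever calls `build_tool_surface()`, which is why a new domain costs a few lines in its own module and nothing anywhere else.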
Collections substrate
Then the collections substrate. Wines landed earlier as a typed table, the first user-facing collection. By the 15th it is clear the pattern is going to repeat: contacts, books, restaurants, anything else the family wants to remember. Rather than write a typed table per collection forever, I land a generic collections table backed by JSONB items, with one set of tools (add, search, update) that works for everything. The trick that makes generic tools actually usable is putting the collection's field schema into the tool description itself. The LLM reads the description, knows what shape "wine" items take versus "restaurant" items, and adapts. Wines get migrated as the first use case. The typed table goes away. One substrate, many collections.
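A sketch of the schema-in-the-description trick, under assumed schemas and helper names. The real field lists are not given in the text, and the JSON string here stands in for the value that would go into the JSONB column.

```python
import json

# Hypothetical schemas: field name -> short type hint shown to the model.
SCHEMAS = {
    "wine": {"name": "string", "vintage": "integer", "region": "string"},
    "restaurant": {"name": "string", "city": "string", "cuisine": "string"},
}


def add_tool_description(schemas: dict[str, dict[str, str]]) -> str:
    """Render every collection's fields into the description of the generic add tool."""
    lines = ["Add an item to a collection. Item fields by collection:"]
    for collection, fields in sorted(schemas.items()):
        rendered = ", ".join(f"{name} ({kind})" for name, kind in fields.items())
        lines.append(f"- {collection}: {rendered}")
    return "\n".join(lines)


def to_row(collection: str, item: dict, schemas: dict = SCHEMAS) -> tuple[str, str]:
    """Reject unknown fields, then serialize the item for the items column."""
    unknown = set(item) - set(schemas[collection])
    if unknown:
        raise ValueError(f"unknown fields for {collection}: {sorted(unknown)}")
    return collection, json.dumps(item, sort_keys=True)
```

The tool signature never changes when a collection is added; only the description text grows, and that text is what the model reads to pick the right item shape.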
Skills v0 lands the same week. A skills table in the database, an admin command set, a small UI listing, seed data. At this point "skill" means saved prompt: a reusable command template like "weekly review" or "write a Google Doc with this structure". The point is to stop pasting the same long instructions into chat over and over and start treating them as named, reusable, edit-in-the-database artifacts. There's an unfortunate terminology overlap with the typed-function skills package, which I'll have to clean up. For now the value is real: a prompt I want to keep is a row I can edit, not a string I have to track down.
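To make "a prompt is a row" concrete, here is a minimal sketch using an in-memory SQLite table as a stand-in for the real database. The column names and the `$person` placeholder syntax are assumptions; the text does not say whether saved prompts take parameters at all.

```python
import sqlite3
from string import Template


def open_skills_db() -> sqlite3.Connection:
    """Create a throwaway skills table with one seeded row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE skills (name TEXT PRIMARY KEY, prompt TEXT NOT NULL)")
    conn.execute(
        "INSERT INTO skills VALUES (?, ?)",
        ("weekly review", "Summarize the week for $person: wins, open items, next steps."),
    )
    return conn


def render_skill(conn: sqlite3.Connection, name: str, **values: str) -> str:
    """Look up a saved prompt by name and fill in its placeholders."""
    row = conn.execute("SELECT prompt FROM skills WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    return Template(row[0]).substitute(**values)
```

Editing the skill is then an ordinary `UPDATE` on the row, and the next call to `render_skill` picks up the new wording without a deploy.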
Smaller things that earn their place: a weather skill backed by Open-Meteo, because asking Anton about the weather should not require a detour through web search. Google Docs creation with auto-sharing, because the family already lives in Drive. Collection lifecycle tests added to the quality suite, because everything that ships now ships with regression coverage (the rule from last weekend has become a habit). And a Clara-specific tone in the system prompt: simpler responses, escalate to me when needed. The first user-aware customization. Anton talks differently to different people in the same household, which is what any decent assistant should do.
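For the weather skill, a direct call to Open-Meteo's forecast endpoint could be prepared roughly as follows. The choice of `current` variables and the summary format are assumptions, and the sketch stops short of the HTTP request itself so it stays independent of any particular client.

```python
from urllib.parse import urlencode

FORECAST_URL = "https://api.open-meteo.com/v1/forecast"


def forecast_url(latitude: float, longitude: float) -> str:
    """Build a request URL for current temperature and wind at a location."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "current": "temperature_2m,wind_speed_10m",
        "timezone": "auto",
    }
    # Keep the comma readable instead of percent-encoding it.
    return f"{FORECAST_URL}?{urlencode(params, safe=',')}"


def summarize(payload: dict) -> str:
    """Turn the decoded JSON response into one short line for chat."""
    current = payload["current"]
    units = payload.get("current_units", {})
    temperature = f"{current['temperature_2m']}{units.get('temperature_2m', '')}"
    wind = f"{current['wind_speed_10m']} {units.get('wind_speed_10m', '')}".strip()
    return f"{temperature}, wind {wind}"
```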
The week closes with the architecture lighter than it started. The browser is its own container with a real profile and anti-detection that actually works. Domains are pluggable, and adding one is a registration, not a fork. Collections are generic. Skills are data. The instinct underneath all of it is the same one I keep coming back to: when something is going to repeat, make it a substrate, not a special case. Every time I've made that choice this week, the system got smaller and the next feature got cheaper. That's the trade I want to keep making.