# Gotchas

GROWS every migration. One entry per surprise. Also the home for any **written exception** to the scorecard ratchet (a metric that regressed for a documented reason).

---

## Stale `$refs` / `tippy` after a swap

A whole-form swap can run an Alpine event handler *before* the component re-initialises, so a handler that dereferences `$refs.input.__x_tippy` or calls `tippy.show()` throws. **Always null-guard:** `$refs.input?.__x_tippy?.hide()`, `tippy?.show()`. The `transaction-edit-swap.spec.ts` `trackErrors()` helper fails the test on any `pageerror` or `console.error`, which is exactly how a stale-ref throw surfaces.

## Let the server value win — don't preserve Alpine state across a server-driven change

When a server change should update a component (e.g. choosing a vendor sets its default account), rebuild that section fresh on the swap so the server-provided value lands without keying tricks. The bug this prevents: "changing the vendor a *second* time doesn't update the account" because preserved Alpine state shadowed the new server value. If you *must* preserve a component, key it by value so a change forces re-init: `(assoc attrs :key (str id "--" current-value))`.

## Focus dies if the typed input is inside its own swapped region

The single most important invariant. Amount field → swap a sibling tbody, not the row. Memo → swap nothing. If a caret test (`sameNode`) fails, the input is in its own swap region — re-target to a sibling/ancestor that excludes it.

## Faked cursors breed duplicate render fns

A `with-cursor`/`MapCursor` re-root to fake a deep start forces a `*-no-cursor*` twin. Removing the fake lets you delete the twin. Don't "fix" a faked cursor in place — top-root it and collapse to one render fn. (See `render-functions.md`.)

## Edit Clojure with clojure-mcp tools, not the file editor

`clojure_edit` / `clojure_edit_replace_sexp`. If a file won't compile: `clj-paren-repair` the file, then retry; if still broken, `lein cljfmt check`.
Run tests via `clojure-eval` / `clj-nrepl-eval -p PORT`, never `lein test` (slow, last resort).

## Solr/typeahead in tests

Account/vendor search is backed by Solr, unavailable in tests. To drive a typeahead in e2e: type under the 3-char threshold, then inject a result into Alpine state (`Alpine.$data(el).elements = [{value, label}]`) and click it — the real click handler, `tippy.hide()`, Alpine reactivity, and the HTMX swap all run as in production. Entity ids come from `GET /test-info`.

---

## UI-only control fields must be stripped before a Datomic upsert

The wizard snapshot/step-params carry UI control fields that are **not** schema attributes — `:action`, `:amount-mode`, and (added by the simple/advanced work) `:mode`. The `:manual` save handler stripped `:action`/`:amount-mode` but not `:mode`, so every *advanced* manual save passed `:mode "advanced"` into `:upsert-transaction` and 500'd with `:db.error/not-an-entity :mode`. Lesson: when a save derives its tx-data from the form snapshot, **strip every non-schema control key** before transacting. The session-backed wizard engine (Phase 6) avoids this class of bug by storing per-step *validated* data only — UI control fields never enter the combined data. This was a real production bug surfaced by the e2e gate, not a test artifact.

## E2E helpers must use the Alpine **v3** API, not the v2 `__x` internal

The app loads Alpine v3 (`cdn.jsdelivr.net/npm/alpinejs@3.x.x`). The v2 internal `el.__x.$data` is **gone** — `el.__x` is `undefined`, so any helper that pokes it silently no-ops. A stale `selectAccountFromTypeahead` did this and left the posted account empty (account-controlled by `x-model`, so the raw DOM `.value` you set is overwritten from Alpine's empty state). Drive components the real way instead: `window.Alpine.$data(el)`, open the tippy dropdown, inject `elements`, click the result — exactly as `transaction-edit-swap.spec.ts` does.
Probe with `{ hasLegacy__x: !!el.__x, hasAlpineData: !!window.Alpine.$data(el) }`.

## Diagnosing a "modal won't close after save"

The edit modal closes on an `hx-trigger: modalclose` from a *successful* save; a validation failure re-renders the `#wizard-form` (200), and a server exception returns 500 (caught by `wrap-error`). To find which: capture POST responses in Playwright (`page.on('response', …)`) and read the `edit-submit` body. A re-rendered `#wizard-form` means a validation re-render; a `#error {…}` stack means a 500. Then serialize the form right before save (`new FormData(document.querySelector('#wizard-form'))`) to see exactly what posts. This is how the `:mode` 500 and the empty-account bugs above were isolated.

## De-faking a cursor is not a drop-in — `with-field-default` mutates

Tempting fix for a faked deep cursor (`with-cursor` + synthetic `MapCursor` at index 0): replace it with `(fc/with-field-default 0 {})` to advance naturally. **It broke the simple-mode swap** (`transaction-edit-swap` test 1 threw). `with-field-default` calls `cursor/transact!` — it *mutates the form cursor* (assoc-ing the default row) as a render side effect, which changes simple-mode behavior. The read-only synthetic `MapCursor` did not. Lesson: removing a faked cursor on these modals is **not** a one-liner — it's part of the larger render-fn extraction (render the row from explicit data, construct field names directly, look up errors explicitly), done when the simple/advanced rows are reworked into pure render fns / Selmer. Don't swap one cursor primitive for another and assume parity; verify against the swap spec, and expect the de-fake to come with the render-fn rewrite.

## Snapshot operations read stale state and drop live form values (heuristic 2)

The whole-form operation handlers (`apply-new-account`, `apply-remove-account`, `apply-toggle-amount-mode`) rebuild the account rows from the **decoded `:snapshot`** (the hidden EDN field), not from the live posted `:step-params`. So any value the user has typed but that hasn't been re-serialised into the snapshot yet — e.g. an amount typed right before clicking "New account" — is **silently lost** when the operation re-renders. This is the snapshot round-trip fragility the migration removes (heuristic 2: → 0 merges; state should ride in the form, not a parallel snapshot).
It bit the percentage-split e2e: typing 50% then adding a row reverted the first row to its snapshot value, yielding a 66.67/33.33 split. Two ways it shows up and how to handle until the snapshot is gone:

**Fixed (Stage 1):** the operation handlers read the live `:step-params` rows (already schema-decoded by `mm/wrap-wizard`) so typed values survive add/remove/toggle.

**Done (Stage 2 — the snapshot round-trip is gone).** The EDN `snapshot` hidden field + custom readers + `merge-multi-form-state` are removed. A `db/id` hidden rides in the form; `wrap-derive-state` rebuilds `:multi-form-state` per request from `entity ∪ step-params`, and `EditWizard.render-wizard` renders a plain form (no snapshot/edit-path/current-step hiddens). The ~34 `:snapshot` reads still work — `:snapshot` is now a derived map, not a round-tripped blob.

**Trap that cost hours — derive `entity ∪ step-params` correctly.** First cut was `(merge base step-params)`. Bug: `base` always carries the entity's *persisted* accounts, so after the user removes every row (step-params has no accounts key) the merge falls back to base → the persisted accounts **resurrect** on the next operation. Fix: editable fields (accounts, vendor, memo, approval, action, mode, amount-mode) come **only** from the live form (absent = cleared); only entity-only fields (`db/id`, client, amount, description, status, type) come from the entity. Lesson: with a posted form, "field absent" means *cleared*, not "use the persisted value" — never merge the entity's editable fields back in.

**Verify the snapshot removal on a FRESH server, and don't trust a long-lived in-process test server.** Protocol/defrecord (`EditWizard.render-wizard`) and middleware reloads do **not** fully take in a running REPL — the server kept rendering the old snapshot field after `:reload`, and an in-process server that isn't reseeded between `npx playwright` invocations accumulates state that makes order-dependent tests flake.
Both produced hours of phantom failures. Restart the REPL clean (or reseed) before trusting an e2e result; CI boots a fresh server per run, so the fresh-server number (38 pass / 1 unrelated) is the real one.

## Characterization tests rot against table order and removed wizard chrome

Two stale-test traps surfaced once the masking failure was fixed (a `mode: 'serial'` file hides every test after the first failure, so fixing one unmasks the next):

- **Hard-coded amounts per table row index** (`openEditModal(page, 3)` then `expect(amount).toBeCloseTo(400)`) break because same-date seed transactions have no pinned row order. Read the actual value (e.g. the grid's `.account-grand-total-row`) instead of hard-coding.
- **Helpers that navigate the old multi-step wizard** (`click('button:has-text("Transaction Actions")')`) hang once the modal is single-page. Drop the navigation; the action tabs are present immediately.

## Flat decode leaks stray form fields into the saved entity (the `method` 500)

Dropping the wizard's `step-params[...]` field-name prefix and decoding posted params **straight into the form schema** means the decode now captures **every** posted field, not just the namespaced ones. A single stray field breaks the save:

- The tab switcher is `(com/button-group {:name "method"} …)`, which emits a form field named `method`. Under the wizard, `method` lived *outside* `step-params[...]` so it never entered the decoded map. After the rename it decodes to `:method ""` (malli `:map` is open and passes unknown keys), rides into `snapshot` → `tx-data`, and `:upsert-transaction` rejects it → **HTTP 500 on save**.
- Symptom: the save POST fires (confirm with a `println` in the submit handler) but the modal never closes, because the 500 trips `htmx:response-error`. The server error may go to mulog, not stdout — an empty stdout log does **not** mean "no error." Reproduce the exact POST with `curl` (add/remove one field) to isolate the offender fast.

**Fix:** strip the decoded map to the schema's known top-level keys before threading on (`select-keys decoded edit-form-keys`); keep that allowlist next to the schema. Nested account sub-maps decode fine — only the top level needs the guard.

## REPL reload does not refresh a running jetty's routes — restart the JVM

`handler/match->handler-lookup` is a top-level `def` capturing `(merge ssr/key->handler …)` at load, through a chain of module-level `def`s (`edit` → `ssr.transaction` → `ssr.core` → `handler`). Reloading the leaf `edit.clj` updates it but **not** the captured merges, and a jetty started `(run-jetty app …)` holds a static `app` that doesn't re-deref the lookup per request. Net: after a handler/route/record change, an already-running dev server keeps serving the **old** code — `curl` shows the pre-change response (e.g. the old wizard transitioner) while your REPL renders the new one. **Restart in a fresh JVM** for route/record/middleware changes. For e2e, the Playwright test server (`lein run -m auto-ap.test-server`) is a fresh JVM compiling from disk — but kill any stale `:3333` first (`reuseExistingServer` reuses it), and kill **by port** (`ss -tlnp | grep :3333`), never `pkill -f test-server` (it matches its own command line).

## Full-suite e2e flakes are shared-seed interference

The test server seeds once at boot; edit tests **save** (mutate) those seed transactions. Run in parallel, workers race the same rows and earlier saves pollute later reads → phantom failures that pass in isolation. **Proper fix (landed on `staging`, adopted at the rebase):** a `/test-reset` endpoint (`test_server.clj` → `reset-test-data!` recreates + re-seeds the in-memory db) called from a `test.beforeEach` in each spec, plus `fullyParallel: false` + `workers: 1` in `playwright.config.ts`. Every test starts from the same deterministic dataset regardless of run order.
This **supersedes** the earlier `--workers=1`-only workaround (which kept order dependence; it merely serialized the races instead of eliminating cross-test state). Post-adoption baseline is **39 pass / 0 fail** — the previously-flaky `transaction-navigation.spec.ts` date-range test is now green, because `/test-reset` removes the residual mutation it was tripping over.

## A value-bound typeahead hidden goes stale across a whole-form swap unless keyed

A typeahead (`sc/typeahead`) posts its value through a hidden input whose DOM `.value` is set by Alpine, not by the server-rendered static `value` attr. After a **whole-form `outerHTML` swap** that re-renders the typeahead, Alpine may preserve the *previous* component's empty `.value` instead of binding the new server value — so the field posts blank on the next submit. Fix: pass **`:id`** to `sc/typeahead` (the account typeahead already does). `:id` makes the wrapper emit `:key (str id "--" value)`, and the value-keyed `:key` forces a clean Alpine re-init that lands the server value. The bulk-code *vendor* typeahead hit this (account rows didn't, because they pass `:id`) — symptom: "vendor not preserved on a validation re-render." Note the testing trap: reading the hidden's `.value` in isolation (`inputValue()` / `toHaveValue`) is an unreliable probe — it lags Alpine. Assert what the form **actually posts** instead: `new FormData(form).get('vendor')` (wrap in `expect.poll`).

## Round-trip a multi-row selection as `ids[]`, not as an EDN/filter snapshot

A bulk modal acts on a *selection* of N entities (bulk-code: the checked transactions), the analog of a single modal's one `db/id`. The wizard stashed the whole search-params blob (filters + `selected` + `all-selected`) in the EDN snapshot and re-ran the filter query on every post. Don't carry that forward.
Instead **resolve the selection to a concrete id vector once at open** (`selected->ids` → the not-locked set) and ride it back in hidden `ids[0..n]` fields; re-read it on each post (`[:vector {:coerce? true} entity-id]` + the `coerce-vector` transformer turns the `{"0" "123"}` index-map into `[123]`). No snapshot, no filter round-trip, and it's *more* correct — you code exactly the rows the user saw, immune to data changing between open and submit. This is heuristic 2 → 0 for a multi-select modal.

## No parity gate? Build one first — seed + characterization spec, before touching code

A modal with **no e2e coverage** (and no test-server seed for its domain) cannot be migrated safely — "behavior parity is proven by tests, not by reading" is the skill's #1 non-negotiable. Phase 4 (POS Sales Summary) had zero coverage. The fix: (1) seed a representative entity in `test_server.clj`'s `seed-test-data` and surface its id via `/test-info`; (2) write a characterization spec against the **unmodified** modal and confirm it green; (3) commit the gate *separately, ahead of the rewrite*. Reach the modal the real way (grid → row's edit button), not a direct fragment URL. To discover the actual rendered structure (field names, ids, swap targets) — especially when the code has dead/buggy render fns — dump the live modal HTML with a throwaway spec first; assert against what *renders*, not what the code looks like.

## Characterize before you fix; never assert a bug as working

Writing the gate often surfaces pre-existing bugs (Phase 4: a "New Summary Item" button that threw `newRowIndex is not defined`, and a totals display whose malformed Hiccup discarded its own labels). Do **not** assert the broken behavior as if it works, and do **not** silently "fix" it mid-refactor — surface it and let the user decide fix-vs-preserve.
If they choose *fix*: the spec first documents the break (a passing test of the *current* inert behavior or an explicit note), then is rewritten to assert the *fixed* behavior as part of the migration commit.

## htmx `keyup`-triggered inputs need real keystrokes in tests

A money/text input wired `hx-trigger="keyup changed delay:300ms"` does **not** fire on Playwright `.fill()` + `dispatchEvent('change')` — `fill` sets the value without keyup events. Use `.click()` then `.pressSequentially('500')` (types char-by-char, firing keyup) so the targeted swap actually triggers. (A `change`-triggered control is the opposite — `dispatchEvent('change')` is fine there.)

## clojure-mcp structural edits reformat the whole file — use text Edit in big shared files

`clojure_edit` / `clojure_edit_replace_sexp` re-emit the **entire file** through the formatter. In a small single-modal file that's fine (cljfmt-clean output). In a **large multi-modal file** (Phase 5: `invoices.clj`, 1812 lines) a one-line require addition produced a **650-line spurious whitespace diff** that buries the real change and makes review impossible. For a surgical migration inside a big shared file, use the **text-based Edit tool** (exact-string match — no reformat); this is the AGENTS.md "edit Clojure with file tools only when absolutely necessary" carve-out. Verify with `load-file` (compile) + `lein cljfmt check`, not by eyeballing. Confirm the diff is contained with `git diff -U0 | grep '^@@'` — the hunks should cluster only where you edited (requires + the modal region), nothing else.

## Wiring a modal onto the wizard2 engine — use the engine's primitives, don't re-roll them

Phase 6's first migration (Transaction Rule) hit three traps; an adversarial review pointed out the engine had the information to prevent all three, so **the engine now absorbs them**.
A consumer is just a config map + the step `:render` fns — reach for these instead of re-implementing them (and re-hitting the bug):

- **Nav fields are stripped for you.** `handle-step-submit` `dissoc`s its own `wizard-id`/`current-step`/`direction` from `:form-params` before calling a step's `:decode` (`wizard2.clj`), so your decode sees only real fields and they can't ride into the saved entity. (The old failure was a **500 on save** — `:db.error/not-an-entity :current-step` — because an open `:map` decode kept them. No allowlist needed anymore.)
- **`wizard2/open-wizard` owns the modal wrap.** Give the config an `:open-response` fn (e.g. `(fn [form] (modal-response [:div#transitioner.flex-1 form]))`); then the new/edit routes are literally `(partial wizard2/open-wizard config)`. Don't hand-roll `create!/render/wrap/thread` — that boilerplate was duplicating engine internals.
- **Add rows with `wizard2/blank-row`.** It supplies a temp `:db/id` (so a row schema requiring `[:db/id [:or entity-id temp-id]]` validates and the step actually advances — the old symptom was "the Next/Test button does nothing") plus `:new?` for the appear animation: `(wizard2/blank-row :foo/location "Shared")`.
- **Footer with `wizard2/nav-footer`.** It emits the `direction` submit buttons (Back / primary advance / Save), marks the advance/save button `data-primary`, and the form's Enter guard (`wizard2/wizard-form`) triggers `data-primary` — so Enter and Back/Save aren't left to per-consumer convention. (Testing note that survives: Back and Save are *both* `type=submit`, so target a save button by its text, not `button[type=submit]`.)
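
To make the first bullet concrete, here is a minimal sketch of the nav-field stripping it describes. It is not the code in `wizard2.clj`: the helper name `strip-nav-fields` and the assumption that `:form-params` uses string keys are illustrative only.

```clojure
;; Hypothetical illustration; the real logic lives in handle-step-submit.
(def nav-keys
  "Engine-owned fields named in the bullet above (string keys are an assumption)."
  ["wizard-id" "current-step" "direction"])

(defn strip-nav-fields
  "Remove the engine's navigation fields so a step's :decode sees only real fields."
  [form-params]
  (apply dissoc form-params nav-keys))

(strip-nav-fields {"wizard-id"    "abc"
                   "current-step" "details"
                   "direction"    "next"
                   "memo"         "rent"})
;; => {"memo" "rent"}
```

Because the stripping happens before `:decode`, an open `:map` schema never gets the chance to pass these keys through to the saved entity.
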
## Scorecard exceptions (ratchet violations with a reason)

**Heuristic 4 (LOC net ↓) — exception (Phase 3, Transaction Bulk Code: 420→506).** When the modal's wizard was a *thin* shell that delegated almost everything to `mm/*` defaults (`default-render-step`, `default-render-wizard`, `submit-handler`, `open-wizard-handler`), ripping the wizard out moves that previously-shared plumbing **into the file** as explicit render/decode/submit/handler code, so the single-file LOC rises even though total system complexity drops. This is the opposite of a fat wizard (edit went 1608→1548). The trade is intended and every other heuristic improved sharply (mm coupling 19→0, snapshot merges 4→0, wizard records 3→0, routes 4→3, `find *`→explicit-id swap). Watch for it on the small "single-step wearing a wizard costume" modals — LOC is the wrong headline metric there; the mm-coupling / snapshot / route counts are.

**Heuristic 9 (Hiccup in render path) — partial exception (Phase 2-final).** The post-save `com/success-modal` confirmation dialogs in `save-handler` keep ~6 `[:p …]` Hiccup lines. They are terminal responses (shown after the form closes), reuse a shared dialog component, and sit outside the form's interactive render path. Migrating them means porting the shared `success-modal` to Selmer — a Phase 11 cross-cutting task, not a single-modal one.

## Keep wizard session data EDN-safe (the cookie store has no custom readers)

The session-backed engine stores per-step data + context in the Ring session, and this app's session store is a **cookie-store** (`ring.middleware.session.cookie`) that serializes with `pr-str` and reads back with plain `clojure.edn/read-string` — **no custom tag readers**. So anything you put in a wizard's `:context` or that a step `:decode` returns (which `put-step` persists) must round-trip through bare EDN.
A `clj-time` `DateTime` does not: it `pr-str`s as `#clj-time/date-time "…"` and the read side 500s with **"No reader function for tag clj-time/date-time"** on the *next* request that reads the cookie. This first bit Invoice Pay (Phase 7), whose context defaulted `:handwritten-date (time/now)`. Rules of thumb:

- **Context**: store only EDN-safe primitives (numbers, strings, keywords, vectors, maps, `#inst`/`java.util.Date`). Compute clj-time defaults in the *render* fn, not in context.
- **Step data**: a `clj-time` value decoded by a step is fine *in memory* on the terminal (`:done`) path — `get-all` reads it before `forget` clears the wizard, so it never reaches the cookie. It only bites if a clj-time value survives in a step that gets re-persisted (a non-terminal `put-step`). When in doubt, decode dates to `#inst` or keep them as strings until the done-fn.
- The old `mm` wizard dodged this because it read its EDN snapshot with `clojure.edn/read-string {:readers clj-time.coerce/data-readers}` (see `multi_modal.clj`) — the cookie store has no such readers.

(A durable/typed session backend would remove this constraint; until then, EDN-safe is the rule. See `form-vs-wizard.md` open question.)

## A bare `[:map …]` query-schema 500s on empty query-params (the `{}`→nil trap)

`auto-ap.ssr.utils/main-transformer` includes `parse-empty-as-nil`, whose **`:map` decoder turns any map with no truthy values into `nil`** (`(if (seq (filter identity (vals m))) m nil)`). So `(mc/coerce [:map [:k {:optional true} …]] {} main-transformer)` decodes `{}` → `nil`, then validates `nil` against `[:map …]` → `:malli.core/invalid-type` → **500**. Ring's `wrap-params` sets `:query-params` to `{}` (not nil) for a request with no query string. So **any handler wrapped with `wrap-schema-enforce :query-schema [:map …]` 500s on a PUT/POST that carries no `?query`** — `(and query-schema query-params)` is truthy for `{}`, so the coercion runs and blows up.
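
The collapse can be reproduced without malli. The sketch below lifts the quoted expression into a stand-alone function; the name `empty-map->nil` is hypothetical, and `:some-schema` is only a placeholder for a non-nil query-schema.

```clojure
(defn empty-map->nil
  "Stand-in for the :map decoder expression quoted above (hypothetical name)."
  [m]
  (if (seq (filter identity (vals m))) m nil))

(empty-map->nil {})                ;; => nil
(empty-map->nil {:to nil})         ;; => nil
(empty-map->nil {:to "accounts"})  ;; => {:to "accounts"}

;; An empty map is truthy in Clojure, so a guard shaped like
;; (and query-schema query-params) does not skip the coercion for {}.
(boolean (and :some-schema {}))    ;; => true
```
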
This is exactly why the pre-migration New Invoice basic-details "Save" was broken: its button `hx-put`s `/invoice/new/navigate` (no `?to`), and `mm/next-handler`'s `[:to {:optional true} …]` query-schema 500d every time (the `CustomNext`/308-to-submit logic never even ran).

- A `[:maybe [:map …]]` query-schema survives (`nil` is valid) — that's why the *grid* query-schema, hit by the same empty POST, doesn't throw.
- **The engine sidesteps this entirely**: `handle-step-submit` is a POST with **no** query-schema, so empty query-params never reach a `[:map]` coercion. Migrating a wizard off the `mm` navigate route *removes* the bug; you don't need to fix the old route.

## Keep wizard dates as `#inst`, not clj-time, in step-data

Reinforcing the EDN-safety rule above: a new+edit wizard that stores dates across a non-terminal step (New Invoice: `basic-details` holds `:invoice/date` while you visit `accounts`) must keep them **EDN-safe**. Decode them to `java.util.Date` (`coerce/to-date`) before they land in step-data, and coerce back to clj-time only for display (`coerce/from-date` → `atime/unparse-local`). A helper that maps over the date keys (`->edn-safe-dates`) right after `mc/decode` is the clean seam — both the step `:decode` and the edit `:init-fn` run the posted/persisted map through it. Datomic's upsert wants `java.util.Date` anyway, so the done-fn needs no extra conversion.

## The `{}`→nil trap has a THIRD face: empty-step decode → validation "invalid type"

Beyond query-params (Phase 8) and route-params (Phase 9's `/navigat`), the same `parse-empty-as-nil` `:map` decoder bites a wizard step whose fields are all blank: an all-empty step posts only blank inputs → the decoded all-nil map collapses to `nil`. If that `nil` then flows into a `:validate` that does `(mc/validate step-schema data)`, validation fails with `[invalid type]` (nil isn't a map) and the step can never advance — even though every field is optional.
The legal/address steps (all-optional) hit this. Fix at the seam: have the step `:decode` coerce nil back to `{}`:

```clojure
(defn- decode-with [schema request]
  (or (mc/decode schema (... nested form-params ...) main-transformer)
      {}))
```

Now an optional-only step validates `{}` (passes, advances) while a required-field step (e.g. account needs `:vendor/default-account`) still fails on the *missing key*, not on a spurious nil. Don't "fix" it by skipping validation when data is nil — that lets a genuinely empty required step through.

## A new (db/id-less) nested entity with all-nil fields → datomic "tempid used only as value"

The empty Address step decodes to `{:vendor/address {:address/street1 nil, …}}` — a map of nils with no `:db/id`. `:upsert-entity` mints a tempid for that nested map but, since every attribute is nil, the address entity has nothing transacted, so the tempid is referenced as a ref value but never defined → `:db.error/tempid-not-an-entity … used only as value`. Drop such blank nested maps before the upsert:

```clojure
(defn- blank-address? [a]
  (and (map? a)
       (not (:db/id a))
       (every? nil? (vals a))))
```

This is the nested-entity analogue of "don't create empty rows"; the engine's `blank-row` gives *added* rows a tempid, but a never-touched optional nested entity must be elided.
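
A minimal sketch of the elision itself, reusing the `blank-address?` predicate from this section. The helper name `drop-blank-address`, the `:vendor/name` and `:address/city` keys, and the sample entities are hypothetical; where the call belongs in the save path is not shown in these notes.

```clojure
(defn- blank-address?
  "True for a nested map with no :db/id whose values are all nil."
  [a]
  (and (map? a) (not (:db/id a)) (every? nil? (vals a))))

(defn drop-blank-address
  "Hypothetical helper: remove :vendor/address when it is a blank new map."
  [entity]
  (cond-> entity
    (blank-address? (:vendor/address entity)) (dissoc :vendor/address)))

;; A never-touched address step: the nested map is dropped.
(drop-blank-address {:vendor/name    "Acme"
                     :vendor/address {:address/street1 nil :address/city nil}})
;; => {:vendor/name "Acme"}

;; An address that already has a :db/id is left in place.
(drop-blank-address {:vendor/name    "Acme"
                     :vendor/address {:db/id 42 :address/street1 nil}})
;; => {:vendor/name "Acme", :vendor/address {:db/id 42, :address/street1 nil}}
```

The `:db/id` check is what limits the elision to new nested maps: an existing address keeps its identity even when every field has been cleared.
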