integreat/.claude/skills/ssr-form-migration/reference/gotchas.md

# Gotchas

GROWS every migration. One entry per surprise. Also the home for any **written exception**
to the scorecard ratchet (a metric that regressed for a documented reason).

---

## Stale `$refs` / `tippy` after a swap

A whole-form swap can run an Alpine event handler *before* the component re-initialises,
so a handler that dereferences `$refs.input.__x_tippy` or calls `tippy.show()` throws.
**Always null-guard:** `$refs.input?.__x_tippy?.hide()`, `tippy?.show()`. The
`transaction-edit-swap.spec.ts` `trackErrors()` helper fails the test on any `pageerror`
or `console.error`, which is exactly how a stale-ref throw surfaces.

## Let the server value win — don't preserve Alpine state across a server-driven change

When a server change should update a component (e.g. choosing a vendor sets its default
account), rebuild that section fresh on the swap so the server-provided value lands
without keying tricks. The bug this prevents: "changing the vendor a *second* time doesn't
update the account" because preserved Alpine state shadowed the new server value. If you
*must* preserve a component, key it by value so a change forces re-init:
`(assoc attrs :key (str id "--" current-value))`.

## Focus dies if the typed input is inside its own swapped region

The single most important invariant. Amount field → swap a sibling tbody, not the row.
Memo → swap nothing. If a caret test (`sameNode`) fails, the input is in its own swap
region — re-target to a sibling/ancestor that excludes it.

## Faked cursors breed duplicate render fns

A `with-cursor`/`MapCursor` re-root to fake a deep start forces a `*-no-cursor*` twin.
Removing the fake lets you delete the twin. Don't "fix" a faked cursor in place — top-root
it and collapse to one render fn. (See `render-functions.md`.)

## Edit Clojure with clojure-mcp tools, not the file editor

`clojure_edit` / `clojure_edit_replace_sexp`. If a file won't compile: `clj-paren-repair`
the file, then retry; if still broken, `lein cljfmt check`. Run tests via `clojure-eval` /
`clj-nrepl-eval -p PORT`, never `lein test` (slow, last resort).

## Solr/typeahead in tests

Account/vendor search is backed by Solr, unavailable in tests. To drive a typeahead in
e2e: type under the 3-char threshold, then inject a result into Alpine state
(`Alpine.$data(el).elements = [{value, label}]`) and click it — the real click handler,
`tippy.hide()`, Alpine reactivity, and the HTMX swap all run as in production. Entity ids
come from `GET /test-info`.

---

## UI-only control fields must be stripped before a Datomic upsert

The wizard snapshot/step-params carry UI control fields that are **not** schema
attributes — `:action`, `:amount-mode`, and (added by the simple/advanced work) `:mode`.
The `:manual` save handler stripped `:action`/`:amount-mode` but not `:mode`, so every
*advanced* manual save passed `:mode "advanced"` into `:upsert-transaction` and 500'd with
`:db.error/not-an-entity :mode`. Lesson: when a save derives its tx-data from the form
snapshot, **strip every non-schema control key** before transacting. The session-backed
wizard engine (Phase 6) avoids this class of bug by storing per-step *validated* data
only — UI control fields never enter the combined data. This was a real production bug
surfaced by the e2e gate, not a test artifact.

## E2E helpers must use the Alpine **v3** API, not the v2 `__x` internal

The app loads Alpine v3 (`cdn.jsdelivr.net/npm/alpinejs@3.x.x`). The v2 internal
`el.__x.$data` is **gone** — `el.__x` is `undefined`, so any helper that pokes it silently
no-ops. A stale `selectAccountFromTypeahead` did this and left the posted account empty
(account-controlled by `x-model`, so the raw DOM `.value` you set is overwritten from
Alpine's empty state). Drive components the real way instead: `window.Alpine.$data(el)`,
open the tippy dropdown, inject `elements`, click the result — exactly as
`transaction-edit-swap.spec.ts` does. Probe with
`{ hasLegacy__x: !!el.__x, hasAlpineData: !!window.Alpine.$data(el) }`.

## Diagnosing a "modal won't close after save"

The edit modal closes on an `hx-trigger: modalclose` from a *successful* save; a
validation failure re-renders the `#wizard-form` (200), and a server exception returns 500
(caught by `wrap-error`). To find which: capture POST responses in Playwright
(`page.on('response', …)`), read the `edit-submit` body — a `<form id="wizard-form">` means
validation re-render; a `#error {…}` stack means a 500. Then serialize the form right
before save (`new FormData(document.querySelector('#wizard-form'))`) to see exactly what
posts. This is how the `:mode` 500 and the empty-account bugs above were isolated.

## De-faking a cursor is not a drop-in — `with-field-default` mutates

Tempting fix for a faked deep cursor (`with-cursor` + synthetic `MapCursor` at index 0):
replace it with `(fc/with-field-default 0 {})` to advance naturally. **It broke the
simple-mode swap** (`transaction-edit-swap` test 1 threw). `with-field-default` calls
`cursor/transact!` — it *mutates the form cursor* (assoc-ing the default row) as a render
side effect, which changes simple-mode behavior. The read-only synthetic `MapCursor` did
not. Lesson: removing a faked cursor on these modals is **not** a one-liner — it's part of
the larger render-fn extraction (render the row from explicit data, construct field names
directly, look up errors explicitly), done when the simple/advanced rows are reworked into
pure render fns / Selmer. Don't swap one cursor primitive for another and assume parity;
verify against the swap spec, and expect the de-fake to come with the render-fn rewrite.

## Snapshot operations read stale state and drop live form values (heuristic 2)

The whole-form operation handlers (`apply-new-account`, `apply-remove-account`,
`apply-toggle-amount-mode`) rebuild the account rows from the **decoded `:snapshot`** (the
hidden EDN field), not from the live posted `:step-params`. So any value the user has typed
but that hasn't been re-serialised into the snapshot yet — e.g. an amount typed right
before clicking "New account" — is **silently lost** when the operation re-renders. This is
the snapshot round-trip fragility the migration removes (heuristic 2: → 0 merges; state
should ride in the form, not a parallel snapshot). It bit the percentage-split e2e: typing
50% then adding a row reverted the first row to its snapshot value, yielding a 66.67/33.33
split. Two ways it shows up and how to handle until the snapshot is gone:

**Fixed (Stage 1):** the operation handlers read the live `:step-params` rows (already
schema-decoded by `mm/wrap-wizard`) so typed values survive add/remove/toggle.

**Done (Stage 2 — the snapshot round-trip is gone).** The EDN `snapshot` hidden field +
custom readers + `merge-multi-form-state` are removed. A `db/id` hidden rides in the form;
`wrap-derive-state` rebuilds `:multi-form-state` per request from `entity ∪ step-params`,
and `EditWizard.render-wizard` renders a plain form (no snapshot/edit-path/current-step
hiddens). The ~34 `:snapshot` reads still work — `:snapshot` is now a derived map, not a
round-tripped blob.

**Trap that cost hours — derive `entity ∪ step-params` correctly.** First cut was
`(merge base step-params)`. Bug: `base` always carries the entity's *persisted* accounts,
so after the user removes every row (step-params has no accounts key) the merge falls back
to base → the persisted accounts **resurrect** on the next operation. Fix: editable fields
(accounts, vendor, memo, approval, action, mode, amount-mode) come **only** from the live
form (absent = cleared); only entity-only fields (`db/id`, client, amount, description,
status, type) come from the entity. Lesson: with a posted form, "field absent" means
*cleared*, not "use the persisted value" — never merge the entity's editable fields back in.

**Verify the snapshot removal on a FRESH server, and don't trust a long-lived in-process
test server.** Protocol/defrecord (`EditWizard.render-wizard`) and middleware reloads do
**not** fully take in a running REPL — the server kept rendering the old snapshot field
after `:reload`, and an in-process server that isn't reseeded between `npx playwright`
invocations accumulates state that makes order-dependent tests flake. Both produced hours
of phantom failures. Restart the REPL clean (or reseed) before trusting an e2e result; CI
boots a fresh server per run, so the fresh-server number (38 pass / 1 unrelated) is the real one.

## Characterization tests rot against table order and removed wizard chrome

Two stale-test traps surfaced once the masking failure was fixed (a `mode: 'serial'` file
hides every test after the first failure, so fixing one unmasks the next):

- **Hard-coded amounts per table row index** (`openEditModal(page, 3)` then
  `expect(amount).toBeCloseTo(400)`) break because same-date seed transactions have no
  pinned row order. Read the actual value (e.g. the grid's `.account-grand-total-row`)
  instead of hard-coding.
- **Helpers that navigate the old multi-step wizard** (`click('button:has-text("Transaction
  Actions")')`) hang once the modal is single-page. Drop the navigation; the action tabs
  are present immediately.

## Flat decode leaks stray form fields into the saved entity (the `method` 500)

Dropping the wizard's `step-params[...]` field-name prefix and decoding posted params
**straight into the form schema** means the decode now captures **every** posted field, not
just the namespaced ones. A single stray field breaks the save:

- The tab switcher is `(com/button-group {:name "method"} …)`, which emits
  `<input type="hidden" name="method">`. Under the wizard, `method` lived *outside*
  `step-params[...]` so it never entered the decoded map. After the rename it decodes to
  `:method ""` (malli `:map` is open and passes unknown keys), rides into `snapshot` →
  `tx-data`, and `:upsert-transaction` rejects it → **HTTP 500 on save**.
- Symptom: the save POST fires (confirm with a `println` in the submit handler) but the
  modal never closes, because the 500 trips `htmx:response-error`. The server error may go
  to mulog, not stdout — an empty stdout log does **not** mean "no error." Reproduce the
  exact POST with `curl` (add/remove one field) to isolate the offender fast.

**Fix:** strip the decoded map to the schema's known top-level keys before threading on
(`select-keys decoded edit-form-keys`); keep that allowlist next to the schema. Nested
account sub-maps decode fine — only the top level needs the guard.

## REPL reload does not refresh a running jetty's routes — restart the JVM

`handler/match->handler-lookup` is a top-level `def` capturing `(merge ssr/key->handler …)`
at load, through a chain of module-level `def`s (`edit` → `ssr.transaction` → `ssr.core` →
`handler`). Reloading the leaf `edit.clj` updates it but **not** the captured merges, and a
jetty started `(run-jetty app …)` holds a static `app` that doesn't re-deref the lookup per
request. Net: after a handler/route/record change, an already-running dev server keeps
serving the **old** code — `curl` shows the pre-change response (e.g. the old wizard
transitioner) while your REPL renders the new one. **Restart in a fresh JVM** for
route/record/middleware changes. For e2e, the Playwright test server
(`lein run -m auto-ap.test-server`) is a fresh JVM compiling from disk — but kill any stale
`:3333` first (`reuseExistingServer` reuses it), and kill **by port**
(`ss -tlnp | grep :3333`), never `pkill -f test-server` (it matches its own command line).

## Full-suite e2e flakes are shared-seed interference

The test server seeds once at boot; edit tests **save** (mutate) those seed transactions.
Run in parallel, workers race the same rows and earlier saves pollute later reads → phantom
failures that pass in isolation.

**Proper fix (landed on `staging`, adopted at the rebase):** a `/test-reset` endpoint
(`test_server.clj` → `reset-test-data!` recreates + re-seeds the in-memory db) called from a
`test.beforeEach` in each spec, plus `fullyParallel: false` + `workers: 1` in
`playwright.config.ts`. Every test starts from the same deterministic dataset regardless of
run order. This **supersedes** the earlier `--workers=1`-only workaround (which kept order
dependence; it merely serialized the races instead of eliminating cross-test state).
Post-adoption baseline is **39 pass / 0 fail** — the previously-flaky
`transaction-navigation.spec.ts` date-range test is now green, because `/test-reset` removes
the residual mutation it was tripping over.

## A value-bound typeahead hidden goes stale across a whole-form swap unless keyed

A typeahead (`sc/typeahead`) posts its value through a hidden `<input :value="value.value">`
whose DOM `.value` is set by Alpine, not by the server-rendered static `value` attr. After a
**whole-form `outerHTML` swap** that re-renders the typeahead, Alpine may preserve the *previous*
component's empty `.value` instead of binding the new server value — so the field posts blank
on the next submit. Fix: pass **`:id`** to `sc/typeahead` (the account typeahead already does).
`:id` makes the wrapper emit `:key (str id "--" value)`, and the value-keyed `:key` forces a
clean Alpine re-init that lands the server value. The bulk-code *vendor* typeahead hit this
(account rows didn't, because they pass `:id`) — symptom: "vendor not preserved on a validation
re-render." Note the testing trap: reading the hidden's `.value` in isolation
(`inputValue()` / `toHaveValue`) is an unreliable probe — it lags Alpine. Assert what the form
**actually posts** instead: `new FormData(form).get('vendor')` (wrap in `expect.poll`).

## Round-trip a multi-row selection as `ids[]`, not as an EDN/filter snapshot

A bulk modal acts on a *selection* of N entities (bulk-code: the checked transactions), the
analog of a single modal's one `db/id`. The wizard stashed the whole search-params blob (filters
+ `selected` + `all-selected`) in the EDN snapshot and re-ran the filter query on every post.
Don't carry that forward. Instead **resolve the selection to a concrete id vector once at open**
(`selected->ids` → the not-locked set) and ride it back in hidden `ids[0..n]` fields; re-read it
on each post (`[:vector {:coerce? true} entity-id]` + the `coerce-vector` transformer turns the
`{"0" "123"}` index-map into `[123]`). No snapshot, no filter round-trip, and it's *more* correct
— you code exactly the rows the user saw, immune to data changing between open and submit. This
is heuristic 2 → 0 for a multi-select modal.

## No parity gate? Build one first — seed + characterization spec, before touching code

A modal with **no e2e coverage** (and no test-server seed for its domain) cannot be migrated
safely — "behavior parity is proven by tests, not by reading" is the skill's #1 non-negotiable.
Phase 4 (POS Sales Summary) had zero coverage. The fix: (1) seed a representative entity in
`test_server.clj`'s `seed-test-data` and surface its id via `/test-info`; (2) write a
characterization spec against the **unmodified** modal and confirm it green; (3) commit the gate
*separately, ahead of the rewrite*. Reach the modal the real way (grid → row's edit button), not
a direct fragment URL. To discover the actual rendered structure (field names, ids, swap targets)
— especially when the code has dead/buggy render fns — dump the live modal HTML with a throwaway
spec first; assert against what *renders*, not what the code looks like.

## Characterize before you fix; never assert a bug as working

Writing the gate often surfaces pre-existing bugs (Phase 4: a "New Summary Item" button that
threw `newRowIndex is not defined`, and a totals display whose malformed Hiccup discarded its
own labels). Do **not** assert the broken behavior as if it works, and do **not** silently "fix"
it mid-refactor — surface it and let the user decide fix-vs-preserve. If they choose *fix*: the
spec first documents the break (a passing test of the *current* inert behavior or an explicit
note), then is rewritten to assert the *fixed* behavior as part of the migration commit.

## htmx `keyup`-triggered inputs need real keystrokes in tests

A money/text input wired `hx-trigger="keyup changed delay:300ms"` does **not** fire on Playwright
`.fill()` + `dispatchEvent('change')` — `fill` sets the value without keyup events. Use
`.click()` then `.pressSequentially('500')` (types char-by-char, firing keyup) so the targeted
swap actually triggers. (A `change`-triggered control is the opposite — `dispatchEvent('change')`
is fine there.)

## clojure-mcp structural edits reformat the whole file — use text Edit in big shared files

`clojure_edit` / `clojure_edit_replace_sexp` re-emit the **entire file** through the formatter.
In a small single-modal file that's fine (cljfmt-clean output). In a **large multi-modal file**
(Phase 5: `invoices.clj`, 1812 lines) a one-line require addition produced a **650-line spurious
whitespace diff** that buries the real change and makes review impossible. For a surgical
migration inside a big shared file, use the **text-based Edit tool** (exact-string match — no
reformat); this is the AGENTS.md "edit Clojure with file tools only when absolutely necessary"
carve-out. Verify with `load-file` (compile) + `lein cljfmt check`, not by eyeballing. Confirm the
diff is contained with `git diff -U0 <file> | grep '^@@'` — the hunks should cluster only where you
edited (requires + the modal region), nothing else.

## Wiring a modal onto the wizard2 engine — three traps that cost a debug cycle each

1. **Strip the engine's nav fields in the step `:decode`.** The posted form carries
   `wizard-id` / `current-step` / `direction` alongside the real fields. If the step schema is
   an open `:map` (most are), `mc/decode` keeps them, they ride into `get-all`, and the save's
   `:upsert-entity` dies with `:db.error/not-an-entity ... :current-step`. Fix: `select-keys`
   the decode to the schema's known top-level keys (the same allowlist trick as the flat-form
   migrations). Symptom is a **500 on save**, not a validation message.
2. **New repeated-row needs a temp `:db/id` or the step can't advance.** If the row schema
   requires `[:db/id [:or entity-id temp-id]]`, an added row with no id fails per-step
   validation, so the engine re-renders the *same* step instead of advancing — looks like "the
   Next/Test button does nothing." Give new rows `(str (java.util.UUID/randomUUID))`.
3. **Nav is a `direction` field, and Back/Save are both submit buttons.** The footer buttons
   are plain `<button type="submit" name="direction" value="next|back|submit">`; the clicked
   one's value rides in the POST and the engine branches on it. In tests, a selector like
   `button:has-text("Save"), button[type=submit]` also matches **Back** (also a submit) and
   `.first()` clicks Back — target the button by its text/value precisely.

## Scorecard exceptions (ratchet violations with a reason)

**Heuristic 4 (LOC net ↓) — exception (Phase 3, Transaction Bulk Code: 420→506).** When the
modal's wizard was a *thin* shell that delegated almost everything to `mm/*` defaults
(`default-render-step`, `default-render-wizard`, `submit-handler`, `open-wizard-handler`),
ripping the wizard out moves that previously-shared plumbing **into the file** as explicit
render/decode/submit/handler code, so the single-file LOC rises even though total system
complexity drops. This is the opposite of a fat wizard (edit went 1608→1548). The trade is
intended and every other heuristic improved sharply (mm coupling 19→0, snapshot merges 4→0,
wizard records 3→0, routes 4→3, `find *`→explicit-id swap). Watch for it on the small
"single-step wearing a wizard costume" modals — LOC is the wrong headline metric there;
the mm-coupling / snapshot / route counts are.

**Heuristic 9 (Hiccup in render path) — partial exception (Phase 2-final).** The post-save
`com/success-modal` confirmation dialogs in `save-handler` keep ~6 `[:p …]` Hiccup lines.
They are terminal responses (shown after the form closes), reuse a shared dialog component,
and sit outside the form's interactive render path. Migrating them means porting the shared
`success-modal` to Selmer — a Phase 11 cross-cutting task, not a single-modal one.