# Gotchas

GROWS every migration. One entry per surprise. Also the home for any **written exception**
to the scorecard ratchet (a metric that regressed for a documented reason).

---

## Stale `$refs` / `tippy` after a swap

A whole-form swap can run an Alpine event handler *before* the component re-initialises,
so a handler that dereferences `$refs.input.__x_tippy` or calls `tippy.show()` throws.
**Always null-guard:** `$refs.input?.__x_tippy?.hide()`, `tippy?.show()`. The
`transaction-edit-swap.spec.ts` `trackErrors()` helper fails the test on any `pageerror`
or `console.error`, which is exactly how a stale-ref throw surfaces.

## Let the server value win — don't preserve Alpine state across a server-driven change

When a server change should update a component (e.g. choosing a vendor sets its default
account), rebuild that section fresh on the swap so the server-provided value lands
without keying tricks. The bug this prevents: "changing the vendor a *second* time doesn't
update the account" because preserved Alpine state shadowed the new server value. If you
*must* preserve a component, key it by value so a change forces re-init:
`(assoc attrs :key (str id "--" current-value))`.

## Focus dies if the typed input is inside its own swapped region

The single most important invariant. Amount field → swap a sibling tbody, not the row.
Memo → swap nothing. If a caret test (`sameNode`) fails, the input is in its own swap
region — re-target to a sibling/ancestor that excludes it.

## Faked cursors breed duplicate render fns

A `with-cursor`/`MapCursor` re-root to fake a deep start forces a `*-no-cursor*` twin.
Removing the fake lets you delete the twin. Don't "fix" a faked cursor in place — top-root
it and collapse to one render fn. (See `render-functions.md`.)

## Edit Clojure with clojure-mcp tools, not the file editor

`clojure_edit` / `clojure_edit_replace_sexp`. If a file won't compile: `clj-paren-repair`
the file, then retry; if still broken, `lein cljfmt check`. Run tests via `clojure-eval` /
`clj-nrepl-eval -p PORT`, never `lein test` (slow, last resort).

## Solr/typeahead in tests

Account/vendor search is backed by Solr, unavailable in tests. To drive a typeahead in
e2e: type under the 3-char threshold, then inject a result into Alpine state
(`Alpine.$data(el).elements = [{value, label}]`) and click it — the real click handler,
`tippy.hide()`, Alpine reactivity, and the HTMX swap all run as in production. Entity ids
come from `GET /test-info`.

---

## UI-only control fields must be stripped before a Datomic upsert

The wizard snapshot/step-params carry UI control fields that are **not** schema
attributes — `:action`, `:amount-mode`, and (added by the simple/advanced work) `:mode`.
The `:manual` save handler stripped `:action`/`:amount-mode` but not `:mode`, so every
*advanced* manual save passed `:mode "advanced"` into `:upsert-transaction` and 500'd with
`:db.error/not-an-entity :mode`. Lesson: when a save derives its tx-data from the form
snapshot, **strip every non-schema control key** before transacting. The session-backed
wizard engine (Phase 6) avoids this class of bug by storing per-step *validated* data
only — UI control fields never enter the combined data. This was a real production bug
surfaced by the e2e gate, not a test artifact.

## E2E helpers must use the Alpine **v3** API, not the v2 `__x` internal

The app loads Alpine v3 (`cdn.jsdelivr.net/npm/alpinejs@3.x.x`). The v2 internal
`el.__x.$data` is **gone** — `el.__x` is `undefined`, so any helper that pokes it silently
no-ops. A stale `selectAccountFromTypeahead` did this and left the posted account empty
(account-controlled by `x-model`, so the raw DOM `.value` you set is overwritten from
Alpine's empty state). Drive components the real way instead: `window.Alpine.$data(el)`,
open the tippy dropdown, inject `elements`, click the result — exactly as
`transaction-edit-swap.spec.ts` does. Probe with
`{ hasLegacy__x: !!el.__x, hasAlpineData: !!window.Alpine.$data(el) }`.

## Diagnosing a "modal won't close after save"

The edit modal closes on an `hx-trigger: modalclose` from a *successful* save; a
validation failure re-renders the `#wizard-form` (200), and a server exception returns 500
(caught by `wrap-error`). To find which: capture POST responses in Playwright
(`page.on('response', …)`), read the `edit-submit` body — a `<form id="wizard-form">` means
validation re-render; a `#error {…}` stack means a 500. Then serialize the form right
before save (`new FormData(document.querySelector('#wizard-form'))`) to see exactly what
posts. This is how the `:mode` 500 and the empty-account bugs above were isolated.

## De-faking a cursor is not a drop-in — `with-field-default` mutates

Tempting fix for a faked deep cursor (`with-cursor` + synthetic `MapCursor` at index 0):
replace it with `(fc/with-field-default 0 {})` to advance naturally. **It broke the
simple-mode swap** (`transaction-edit-swap` test 1 threw). `with-field-default` calls
`cursor/transact!` — it *mutates the form cursor* (assoc-ing the default row) as a render
side effect, which changes simple-mode behavior. The read-only synthetic `MapCursor` did
not. Lesson: removing a faked cursor on these modals is **not** a one-liner — it's part of
the larger render-fn extraction (render the row from explicit data, construct field names
directly, look up errors explicitly), done when the simple/advanced rows are reworked into
pure render fns / Selmer. Don't swap one cursor primitive for another and assume parity;
verify against the swap spec, and expect the de-fake to come with the render-fn rewrite.

## Snapshot operations read stale state and drop live form values (heuristic 2)

The whole-form operation handlers (`apply-new-account`, `apply-remove-account`,
`apply-toggle-amount-mode`) rebuild the account rows from the **decoded `:snapshot`** (the
hidden EDN field), not from the live posted `:step-params`. So any value the user has typed
but that hasn't been re-serialised into the snapshot yet — e.g. an amount typed right
before clicking "New account" — is **silently lost** when the operation re-renders. This is
the snapshot round-trip fragility the migration removes (heuristic 2: → 0 merges; state
should ride in the form, not a parallel snapshot). It bit the percentage-split e2e: typing
50% then adding a row reverted the first row to its snapshot value, yielding a 66.67/33.33
split. Two ways it shows up and how to handle until the snapshot is gone:

**Fixed (Stage 1):** the operation handlers read the live `:step-params` rows (already
schema-decoded by `mm/wrap-wizard`) so typed values survive add/remove/toggle.

**Done (Stage 2 — the snapshot round-trip is gone).** The EDN `snapshot` hidden field +
custom readers + `merge-multi-form-state` are removed. A `db/id` hidden rides in the form;
`wrap-derive-state` rebuilds `:multi-form-state` per request from `entity ∪ step-params`,
and `EditWizard.render-wizard` renders a plain form (no snapshot/edit-path/current-step
hiddens). The ~34 `:snapshot` reads still work — `:snapshot` is now a derived map, not a
round-tripped blob.

**Trap that cost hours — derive `entity ∪ step-params` correctly.** First cut was
`(merge base step-params)`. Bug: `base` always carries the entity's *persisted* accounts,
so after the user removes every row (step-params has no accounts key) the merge falls back
to base → the persisted accounts **resurrect** on the next operation. Fix: editable fields
(accounts, vendor, memo, approval, action, mode, amount-mode) come **only** from the live
form (absent = cleared); only entity-only fields (`db/id`, client, amount, description,
status, type) come from the entity. Lesson: with a posted form, "field absent" means
*cleared*, not "use the persisted value" — never merge the entity's editable fields back in.

**Verify the snapshot removal on a FRESH server, and don't trust a long-lived in-process
test server.** Protocol/defrecord (`EditWizard.render-wizard`) and middleware reloads do
**not** fully take in a running REPL — the server kept rendering the old snapshot field
after `:reload`, and an in-process server that isn't reseeded between `npx playwright`
invocations accumulates state that makes order-dependent tests flake. Both produced hours
of phantom failures. Restart the REPL clean (or reseed) before trusting an e2e result; CI
boots a fresh server per run, so the fresh-server number (38 pass / 1 unrelated) is the real one.

## Characterization tests rot against table order and removed wizard chrome

Two stale-test traps surfaced once the masking failure was fixed (a `mode: 'serial'` file
hides every test after the first failure, so fixing one unmasks the next):

- **Hard-coded amounts per table row index** (`openEditModal(page, 3)` then
  `expect(amount).toBeCloseTo(400)`) break because same-date seed transactions have no
  pinned row order. Read the actual value (e.g. the grid's `.account-grand-total-row`)
  instead of hard-coding.
- **Helpers that navigate the old multi-step wizard** (`click('button:has-text("Transaction
  Actions")')`) hang once the modal is single-page. Drop the navigation; the action tabs
  are present immediately.

## Flat decode leaks stray form fields into the saved entity (the `method` 500)

Dropping the wizard's `step-params[...]` field-name prefix and decoding posted params
**straight into the form schema** means the decode now captures **every** posted field, not
just the namespaced ones. A single stray field breaks the save:

- The tab switcher is `(com/button-group {:name "method"} …)`, which emits
  `<input type="hidden" name="method">`. Under the wizard, `method` lived *outside*
  `step-params[...]` so it never entered the decoded map. After the rename it decodes to
  `:method ""` (malli `:map` is open and passes unknown keys), rides into `snapshot` →
  `tx-data`, and `:upsert-transaction` rejects it → **HTTP 500 on save**.
- Symptom: the save POST fires (confirm with a `println` in the submit handler) but the
  modal never closes, because the 500 trips `htmx:response-error`. The server error may go
  to mulog, not stdout — an empty stdout log does **not** mean "no error." Reproduce the
  exact POST with `curl` (add/remove one field) to isolate the offender fast.

**Fix:** strip the decoded map to the schema's known top-level keys before threading on
(`select-keys decoded edit-form-keys`); keep that allowlist next to the schema. Nested
account sub-maps decode fine — only the top level needs the guard.

## REPL reload does not refresh a running jetty's routes — restart the JVM

`handler/match->handler-lookup` is a top-level `def` capturing `(merge ssr/key->handler …)`
at load, through a chain of module-level `def`s (`edit` → `ssr.transaction` → `ssr.core` →
`handler`). Reloading the leaf `edit.clj` updates it but **not** the captured merges, and a
jetty started `(run-jetty app …)` holds a static `app` that doesn't re-deref the lookup per
request. Net: after a handler/route/record change, an already-running dev server keeps
serving the **old** code — `curl` shows the pre-change response (e.g. the old wizard
transitioner) while your REPL renders the new one. **Restart in a fresh JVM** for
route/record/middleware changes. For e2e, the Playwright test server
(`lein run -m auto-ap.test-server`) is a fresh JVM compiling from disk — but kill any stale
`:3333` first (`reuseExistingServer` reuses it), and kill **by port**
(`ss -tlnp | grep :3333`), never `pkill -f test-server` (it matches its own command line).

## Full-suite e2e flakes are shared-seed interference

The test server seeds once at boot; edit tests **save** (mutate) those seed transactions.
Run in parallel, workers race the same rows and earlier saves pollute later reads → phantom
failures that pass in isolation.

**Proper fix (landed on `staging`, adopted at the rebase):** a `/test-reset` endpoint
(`test_server.clj` → `reset-test-data!` recreates + re-seeds the in-memory db) called from a
`test.beforeEach` in each spec, plus `fullyParallel: false` + `workers: 1` in
`playwright.config.ts`. Every test starts from the same deterministic dataset regardless of
run order. This **supersedes** the earlier `--workers=1`-only workaround (which kept order
dependence; it merely serialized the races instead of eliminating cross-test state).
Post-adoption baseline is **39 pass / 0 fail** — the previously-flaky
`transaction-navigation.spec.ts` date-range test is now green, because `/test-reset` removes
the residual mutation it was tripping over.

## A value-bound typeahead hidden goes stale across a whole-form swap unless keyed

A typeahead (`sc/typeahead`) posts its value through a hidden `<input :value="value.value">`
whose DOM `.value` is set by Alpine, not by the server-rendered static `value` attr. After a
**whole-form `outerHTML` swap** that re-renders the typeahead, Alpine may preserve the *previous*
component's empty `.value` instead of binding the new server value — so the field posts blank
on the next submit. Fix: pass **`:id`** to `sc/typeahead` (the account typeahead already does).
`:id` makes the wrapper emit `:key (str id "--" value)`, and the value-keyed `:key` forces a
clean Alpine re-init that lands the server value. The bulk-code *vendor* typeahead hit this
(account rows didn't, because they pass `:id`) — symptom: "vendor not preserved on a validation
re-render." Note the testing trap: reading the hidden's `.value` in isolation
(`inputValue()` / `toHaveValue`) is an unreliable probe — it lags Alpine. Assert what the form
**actually posts** instead: `new FormData(form).get('vendor')` (wrap in `expect.poll`).

## Round-trip a multi-row selection as `ids[]`, not as an EDN/filter snapshot

A bulk modal acts on a *selection* of N entities (bulk-code: the checked transactions), the
analog of a single modal's one `db/id`. The wizard stashed the whole search-params blob (filters
+ `selected` + `all-selected`) in the EDN snapshot and re-ran the filter query on every post.
Don't carry that forward. Instead **resolve the selection to a concrete id vector once at open**
(`selected->ids` → the not-locked set) and ride it back in hidden `ids[0..n]` fields; re-read it
on each post (`[:vector {:coerce? true} entity-id]` + the `coerce-vector` transformer turns the
`{"0" "123"}` index-map into `[123]`). No snapshot, no filter round-trip, and it's *more* correct
— you code exactly the rows the user saw, immune to data changing between open and submit. This
is heuristic 2 → 0 for a multi-select modal.

## No parity gate? Build one first — seed + characterization spec, before touching code

A modal with **no e2e coverage** (and no test-server seed for its domain) cannot be migrated
safely — "behavior parity is proven by tests, not by reading" is the skill's #1 non-negotiable.
Phase 4 (POS Sales Summary) had zero coverage. The fix: (1) seed a representative entity in
`test_server.clj`'s `seed-test-data` and surface its id via `/test-info`; (2) write a
characterization spec against the **unmodified** modal and confirm it green; (3) commit the gate
*separately, ahead of the rewrite*. Reach the modal the real way (grid → row's edit button), not
a direct fragment URL. To discover the actual rendered structure (field names, ids, swap targets)
— especially when the code has dead/buggy render fns — dump the live modal HTML with a throwaway
spec first; assert against what *renders*, not what the code looks like.

## Characterize before you fix; never assert a bug as working

Writing the gate often surfaces pre-existing bugs (Phase 4: a "New Summary Item" button that
threw `newRowIndex is not defined`, and a totals display whose malformed Hiccup discarded its
own labels). Do **not** assert the broken behavior as if it works, and do **not** silently "fix"
it mid-refactor — surface it and let the user decide fix-vs-preserve. If they choose *fix*: the
spec first documents the break (a passing test of the *current* inert behavior or an explicit
note), then is rewritten to assert the *fixed* behavior as part of the migration commit.

## htmx `keyup`-triggered inputs need real keystrokes in tests

A money/text input wired `hx-trigger="keyup changed delay:300ms"` does **not** fire on Playwright
`.fill()` + `dispatchEvent('change')` — `fill` sets the value without keyup events. Use
`.click()` then `.pressSequentially('500')` (types char-by-char, firing keyup) so the targeted
swap actually triggers. (A `change`-triggered control is the opposite — `dispatchEvent('change')`
is fine there.)

## clojure-mcp structural edits reformat the whole file — use text Edit in big shared files

`clojure_edit` / `clojure_edit_replace_sexp` re-emit the **entire file** through the formatter.
In a small single-modal file that's fine (cljfmt-clean output). In a **large multi-modal file**
(Phase 5: `invoices.clj`, 1812 lines) a one-line require addition produced a **650-line spurious
whitespace diff** that buries the real change and makes review impossible. For a surgical
migration inside a big shared file, use the **text-based Edit tool** (exact-string match — no
reformat); this is the AGENTS.md "edit Clojure with file tools only when absolutely necessary"
carve-out. Verify with `load-file` (compile) + `lein cljfmt check`, not by eyeballing. Confirm the
diff is contained with `git diff -U0 <file> | grep '^@@'` — the hunks should cluster only where you
edited (requires + the modal region), nothing else.

## Wiring a modal onto the wizard2 engine — use the engine's primitives, don't re-roll them

Phase 6's first migration (Transaction Rule) hit three traps; an adversarial review pointed
out the engine had the information to prevent all three, so **the engine now absorbs them**.
A consumer is just a config map + the step `:render` fns — reach for these instead of
re-implementing them (and re-hitting the bug):

- **Nav fields are stripped for you.** `handle-step-submit` `dissoc`s its own
  `wizard-id`/`current-step`/`direction` from `:form-params` before calling a step's
  `:decode` (`wizard2.clj`), so your decode sees only real fields and they can't ride into
  the saved entity. (The old failure was a **500 on save** — `:db.error/not-an-entity
  :current-step` — because an open `:map` decode kept them. No allowlist needed anymore.)
- **`wizard2/open-wizard` owns the modal wrap.** Give the config an `:open-response` fn
  (e.g. `(fn [form] (modal-response [:div#transitioner.flex-1 form]))`); then the
  new/edit routes are literally `(partial wizard2/open-wizard config)`. Don't hand-roll
  `create!/render/wrap/thread` — that boilerplate was duplicating engine internals.
- **Add rows with `wizard2/blank-row`.** It supplies a temp `:db/id` (so a row schema
  requiring `[:db/id [:or entity-id temp-id]]` validates and the step actually advances —
  the old symptom was "the Next/Test button does nothing") plus `:new?` for the appear
  animation: `(wizard2/blank-row :foo/location "Shared")`.
- **Footer with `wizard2/nav-footer`.** It emits the `direction` submit buttons (Back /
  primary advance / Save), marks the advance/save button `data-primary`, and the form's
  Enter guard (`wizard2/wizard-form`) triggers `data-primary` — so Enter and Back/Save
  aren't left to per-consumer convention. (Testing note that survives: Back and Save are
  *both* `type=submit`, so target a save button by its text, not `button[type=submit]`.)

## Scorecard exceptions (ratchet violations with a reason)

**Heuristic 4 (LOC net ↓) — exception (Phase 3, Transaction Bulk Code: 420→506).** When the
modal's wizard was a *thin* shell that delegated almost everything to `mm/*` defaults
(`default-render-step`, `default-render-wizard`, `submit-handler`, `open-wizard-handler`),
ripping the wizard out moves that previously-shared plumbing **into the file** as explicit
render/decode/submit/handler code, so the single-file LOC rises even though total system
complexity drops. This is the opposite of a fat wizard (edit went 1608→1548). The trade is
intended and every other heuristic improved sharply (mm coupling 19→0, snapshot merges 4→0,
wizard records 3→0, routes 4→3, `find *`→explicit-id swap). Watch for it on the small
"single-step wearing a wizard costume" modals — LOC is the wrong headline metric there;
the mm-coupling / snapshot / route counts are.

**Heuristic 9 (Hiccup in render path) — partial exception (Phase 2-final).** The post-save
`com/success-modal` confirmation dialogs in `save-handler` keep ~6 `[:p …]` Hiccup lines.
They are terminal responses (shown after the form closes), reuse a shared dialog component,
and sit outside the form's interactive render path. Migrating them means porting the shared
`success-modal` to Selmer — a Phase 11 cross-cutting task, not a single-modal one.

## Keep wizard session data EDN-safe (the cookie store has no custom readers)

The session-backed engine stores per-step data + context in the Ring session, and this app's
session store is a **cookie-store** (`ring.middleware.session.cookie`) that serializes with
`pr-str` and reads back with plain `clojure.edn/read-string` — **no custom tag readers**. So
anything you put in a wizard's `:context` or that a step `:decode` returns (which `put-step`
persists) must round-trip through bare EDN. A `clj-time` `DateTime` does not: it `pr-str`s as
`#clj-time/date-time "…"` and the read side 500s with **"No reader function for tag
clj-time/date-time"** on the *next* request that reads the cookie.

This first bit Invoice Pay (Phase 7), whose context defaulted `:handwritten-date (time/now)`.
Rules of thumb:
- **Context**: store only EDN-safe primitives (numbers, strings, keywords, vectors, maps,
  `#inst`/`java.util.Date`). Compute clj-time defaults in the *render* fn, not in context.
- **Step data**: a `clj-time` value decoded by a step is fine *in memory* on the terminal
  (`:done`) path — `get-all` reads it before `forget` clears the wizard, so it never reaches
  the cookie. It only bites if a clj-time value survives in a step that gets re-persisted
  (a non-terminal `put-step`). When in doubt, decode dates to `#inst` or keep them as strings
  until the done-fn.
- The old `mm` wizard dodged this because it read its EDN snapshot with
  `clojure.edn/read-string {:readers clj-time.coerce/data-readers}` (see `multi_modal.clj`) —
  the cookie store has no such readers. (A durable/typed session backend would remove this
  constraint; until then, EDN-safe is the rule. See `form-vs-wizard.md` open question.)

## A bare `[:map …]` query-schema 500s on empty query-params (the `{}`→nil trap)

`auto-ap.ssr.utils/main-transformer` includes `parse-empty-as-nil`, whose **`:map` decoder
turns any map with no truthy values into `nil`** (`(if (seq (filter identity (vals m))) m nil)`).
So `(mc/coerce [:map [:k {:optional true} …]] {} main-transformer)` decodes `{}` → `nil`,
then validates `nil` against `[:map …]` → `:malli.core/invalid-type` → **500**.

Ring's `wrap-params` sets `:query-params` to `{}` (not nil) for a request with no query
string. So **any handler wrapped with `wrap-schema-enforce :query-schema [:map …]` 500s on a
PUT/POST that carries no `?query`** — `(and query-schema query-params)` is truthy for `{}`,
so the coercion runs and blows up. This is exactly why the pre-migration New Invoice
basic-details "Save" was broken: its button `hx-put`s `/invoice/new/navigate` (no `?to`), and
`mm/next-handler`'s `[:to {:optional true} …]` query-schema 500d every time (the
`CustomNext`/308-to-submit logic never even ran).

- A `[:maybe [:map …]]` query-schema survives (`nil` is valid) — that's why the *grid*
  query-schema, hit by the same empty POST, doesn't throw.
- **The engine sidesteps this entirely**: `handle-step-submit` is a POST with **no**
  query-schema, so empty query-params never reach a `[:map]` coercion. Migrating a wizard
  off the `mm` navigate route *removes* the bug; you don't need to fix the old route.

## Keep wizard dates as `#inst`, not clj-time, in step-data

Reinforcing the EDN-safety rule above: a new+edit wizard that stores dates across a
non-terminal step (New Invoice: `basic-details` holds `:invoice/date` while you visit
`accounts`) must keep them **EDN-safe**. Decode them to `java.util.Date` (`coerce/to-date`)
before they land in step-data, and coerce back to clj-time only for display
(`coerce/from-date` → `atime/unparse-local`). A helper that maps over the date keys
(`->edn-safe-dates`) right after `mc/decode` is the clean seam — both the step `:decode` and
the edit `:init-fn` run the posted/persisted map through it. Datomic's upsert wants
`java.util.Date` anyway, so the done-fn needs no extra conversion.

## The `{}`→nil trap has a THIRD face: empty-step decode → validation "invalid type"

Beyond query-params (Phase 8) and route-params (Phase 9's `/navigat`), the same
`parse-empty-as-nil` `:map` decoder bites a wizard step whose fields are all blank: an
all-empty step posts only blank inputs → the decoded all-nil map collapses to `nil`. If that
`nil` then flows into a `:validate` that does `(mc/validate step-schema data)`, validation
fails with `[invalid type]` (nil isn't a map) and the step can never advance — even though
every field is optional. The legal/address steps (all-optional) hit this.

Fix at the seam: have the step `:decode` coerce nil back to `{}`:
```clojure
(defn- decode-with [schema request]
  (or (mc/decode schema (... nested form-params ...) main-transformer) {}))
```
Now an optional-only step validates `{}` (passes, advances) while a required-field step
(e.g. account needs `:vendor/default-account`) still fails on the *missing key*, not on a
spurious nil. Don't "fix" it by skipping validation when data is nil — that lets a genuinely
empty required step through.

## A new (db/id-less) nested entity with all-nil fields → datomic "tempid used only as value"

The empty Address step decodes to `{:vendor/address {:address/street1 nil, …}}` — a map of
nils with no `:db/id`. `:upsert-entity` mints a tempid for that nested map but, since every
attribute is nil, the address entity has nothing transacted, so the tempid is referenced as
a ref value but never defined → `:db.error/tempid-not-an-entity … used only as value`. Drop
such blank nested maps before the upsert:
```clojure
(defn- blank-address? [a] (and (map? a) (not (:db/id a)) (every? nil? (vals a))))
```
This is the nested-entity analogue of "don't create empty rows"; the engine's `blank-row`
gives *added* rows a tempid, but a never-touched optional nested entity must be elided.