integreat/AUTOMATION_NOTES.md
2026-05-21 11:51:29 -07:00

# Automation Notes
Findings from investigating intermittent dialog-open failures on `/pos/summaries` (and likely other grid pages) when driven by `agent-browser`. Most of these apply equally to any browser automation — Playwright, Selenium, manual rapid-click testing.
## TL;DR
The reported "sometimes the dialog opens, sometimes it doesn't" was a server-side bug: `icon-button-` rendered as `<button>` with the HTML default `type="submit"`. Inside a `<form>` (every row in `grid_page_helper`), the click raced HTMX. If form submission won, the browser navigated to `/pos/summaries?id=…` and the modal request was canceled.
The fix is in `src/clj/auto_ap/ssr/components/buttons.clj`: `icon-button-` now defaults to `:type "button"`. Verified with 30/30 rapid open/close cycles with random close delays spanning the entire 300 ms transition window.
## Modal lifecycle (for reference)
1. User clicks pencil → htmx `GET /pos/summaries/:id` (the edit-wizard route).
2. Server returns response with headers `hx-trigger: modalopen`, `hx-retarget: #modal-content`, `hx-reswap: innerHTML`. See `modal-response` in `src/clj/auto_ap/ssr/utils.clj:41`.
3. htmx swaps innerHTML of `#modal-content`, then dispatches a `modalopen` document event.
4. Alpine handler on `#modal-holder` (`src/clj/auto_ap/ssr/ui.clj:84`) sets `open=true`.
5. `x-show="open"` triggers a 300 ms enter transition on two nested divs (backdrop + content).
6. Closing dispatches `modalclose`, sets `open=false`, runs the 300 ms leave transition.
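Steps 2 through 4 and step 6 can be modelled in a few lines of plain JavaScript. This is a sketch only, not htmx or Alpine source: `documentEvents`, `dom`, `modalHolder`, and `applyResponse` are hypothetical names, and the model ignores the 300 ms transitions.

```javascript
// Simplified model of the modal lifecycle; runs in Node or a browser console.
const documentEvents = new EventTarget();   // stands in for `document`
const dom = { "#modal-content": "" };       // selector -> innerHTML
const modalHolder = { open: false };        // Alpine state on #modal-holder

// Steps 4 and 6: the Alpine handlers flip `open`.
documentEvents.addEventListener("modalopen", () => { modalHolder.open = true; });
documentEvents.addEventListener("modalclose", () => { modalHolder.open = false; });

// Steps 2 and 3: honour hx-retarget / hx-reswap, then fire the hx-trigger event.
function applyResponse(headers, body) {
  const target = headers["hx-retarget"];
  if (target && headers["hx-reswap"] === "innerHTML") {
    dom[target] = body;
  }
  if (headers["hx-trigger"]) {
    documentEvents.dispatchEvent(new Event(headers["hx-trigger"]));
  }
}

applyResponse(
  {
    "hx-trigger": "modalopen",
    "hx-retarget": "#modal-content",
    "hx-reswap": "innerHTML",
  },
  "<form>edit wizard</form>"
);
console.log(modalHolder.open, dom["#modal-content"]);
```

The point of the model is the ordering: the content is swapped before `modalopen` fires, so by the time `open` becomes true the modal body is already in place.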
## Root cause of the reported flakiness
`grid_page_helper.clj:58-61` wraps each row's action buttons in a `<form>` with a hidden `id` field:
```clojure
(com/data-grid-right-stack-cell {}
  (into [:form.flex.space-x-2
         [:input {:type :hidden :name "id" :value ((:id-fn gridspec) entity)}]]
        ((:row-buttons gridspec) request entity)))
```
The buttons in `:row-buttons` come from `icon-button-`, which rendered `<button>` with no explicit type. HTML default: `type="submit"`. When the pencil is clicked:
- htmx normally intercepts via `hx-get` and calls `preventDefault()`.
- If anything (large DOM, htmx still initializing other elements, agent-browser issuing the click in a busy frame) delays htmx's listener relative to the form's submit handler, the form submits.
- Form submission triggers a same-page navigation to `/pos/summaries?id=<value>`, which cancels the in-flight XHR. The modal request never lands.
The race is non-deterministic, which is why it was intermittent. Browser automation makes it more visible because clicks fire faster than a human's, hitting moments when htmx might not yet have fully registered.
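The race can be illustrated with a small model built on a cancelable event. It is a deliberate simplification: the "htmx not ready" case is modelled as the listener simply being absent at click time, and `clickRowButton` and its options are hypothetical names, not part of htmx or the application.

```javascript
// Model of a click on a row button inside a <form>.
// Returns true when the browser would go on to submit the form.
function clickRowButton({ type, htmxReady }) {
  const button = new EventTarget();

  if (htmxReady) {
    // htmx's hx-get listener cancels the default action.
    button.addEventListener("click", (event) => event.preventDefault());
  }

  // dispatchEvent returns false when a listener called preventDefault().
  const notCancelled = button.dispatchEvent(
    new Event("click", { cancelable: true })
  );

  // Only a submit button has a default action that submits the form.
  return type === "submit" && notCancelled;
}

console.log(clickRowButton({ type: "submit", htmxReady: true }));
console.log(clickRowButton({ type: "submit", htmxReady: false }));
console.log(clickRowButton({ type: "button", htmxReady: false }));
```

With `type: "submit"` the outcome depends on whether the listener ran in time; with `type: "button"` there is no default action left to race against, which is why the fix removes the flakiness rather than narrowing the window.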
**Fix:** `icon-button-` now does `(merge {:type "button"} params)`, so a call site can still override the default by passing its own `:type`. The same fix should be applied prophylactically to any other button helper used inside a row form: `button-`, `a-button-` (less relevant, since `<a>` doesn't submit), `navigation-button-`. `group-button-` already sets `type="button"`. `validated-save-button-` correctly stays `submit`.
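One way to find remaining buttons that rely on the HTML default is to query a rendered page for form buttons with no `type` attribute. The helper below is a hypothetical sketch (`findImplicitSubmitButtons` is not an existing function in the codebase); it accepts any object exposing `querySelectorAll`, so it can be pasted into a browser console or an `agent-browser eval` script and called with `document`.

```javascript
// Returns every <button> inside a <form> that has no type attribute
// and therefore falls back to the HTML default, type="submit".
function findImplicitSubmitButtons(root) {
  return Array.from(root.querySelectorAll("form button:not([type])"));
}

// In a browser: findImplicitSubmitButtons(document).map((b) => b.outerHTML)
```

The selector only catches a missing attribute. A button with an empty or invalid `type` value also behaves as a submit button, so treat an empty result as a good sign, not as proof.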
## Other findings (cosmetic — not causing failures)
### Duplicate `x-trap` directive
`src/clj/auto_ap/ssr/ui.clj:99-100`:
```clojure
"x-trap.inert.noscroll" "open"
"x-trap.inert" "open"
```
Both bound to the same expression. Alpine de-duplicates by directive name, so this is dead code. Drop the second line.
### Mixed `bg-opacity` and `opacity` in inner-modal transitions
`src/clj/auto_ap/ssr/ui.clj:103-107`:
```clojure
"x-transition:enter-start" "!bg-opacity-0 !translate-y-32"
"x-transition:enter-end" "!bg-opacity-100 !translate-y-0"
"x-transition:leave-start" "!opacity-100 !translate-y-0"
"x-transition:leave-end" "!opacity-0 !translate-y-32"
```
The inner div has no background color, so `bg-opacity-*` does nothing during enter. The leave correctly animates `opacity`. Net effect: enter is a translate-only animation while leave is translate-plus-fade. Asymmetric but works. Should be `opacity` on both sides for consistency.
### `_x_hidePromise` lingering flag
After rapid close→open cycles, Alpine's internal `_x_hidePromise` property remains truthy on the inner div even when the element is fully visible. It looks alarming when inspecting state but does not block subsequent transitions. Verified empirically with 30 trials.
## Browser-automation specifics
Things that aren't bugs but bite agent-browser scripts:
### Refs go stale on every HTMX swap
The `@eN` refs from a `snapshot` are valid only until the page changes. HTMX swaps innerHTML of `#modal-content` on every modal open, so any ref pointing inside the modal — or even refs pointing to row elements after the grid refreshes — silently breaks. Re-snapshot before each interaction; the docs explicitly warn about this.
### Default click is too fast for a busy frame
`agent-browser click @eN` dispatches a synthetic click via CDP without waiting for the page to settle. For htmx-driven interactions, the safe pattern is:
1. Click.
2. Wait for the observable side-effect, not for time.
For modal opens specifically:
```bash
agent-browser click @e_pencil
agent-browser wait --fn "document.querySelector('#modal-holder')?._x_dataStack?.[0]?.open && document.querySelector('#modal-content').children.length > 0"
agent-browser snapshot -i
```
For modal closes:
```bash
# After clicking a Save button that returns hx-trigger: modalclose
agent-browser wait --fn "!document.querySelector('#modal-holder')?._x_dataStack?.[0]?.open"
```
For grid refreshes after filter changes:
```bash
agent-browser wait --fn "!document.querySelector('.htmx-request')"
```
(htmx adds the `.htmx-request` class to elements during in-flight requests.)
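The three `--fn` predicates above can also be written as named functions, which makes them easier to reuse and to unit-test outside the browser. This is a sketch: `modalState`, `isModalOpen`, `isModalClosed`, and `isHtmxIdle` are hypothetical names, and `_x_dataStack` is Alpine-internal state that may change between Alpine versions.

```javascript
// Alpine component state attached to #modal-holder, or undefined.
const modalState = (doc) =>
  doc.querySelector("#modal-holder")?._x_dataStack?.[0];

// True once the modal is open AND its content has been swapped in.
function isModalOpen(doc) {
  const content = doc.querySelector("#modal-content");
  return Boolean(
    modalState(doc)?.open && content && content.children.length > 0
  );
}

// True when the modal is closed (or the holder is not on the page).
function isModalClosed(doc) {
  return !modalState(doc)?.open;
}

// True when no element currently carries the .htmx-request class.
function isHtmxIdle(doc) {
  return !doc.querySelector(".htmx-request");
}
```

Unlike the inline open check, `isModalOpen` returns `false` instead of throwing when `#modal-content` is missing, which is usually what a polling wait wants.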
### CDP screenshot timeouts
`agent-browser screenshot` occasionally returns `CDP command timed out: Page.captureScreenshot`. This is a Chromium/CDP issue, not application code. Workarounds:
- Don't rely on screenshots for state verification. Read state via `agent-browser eval` directly.
- If you need an image, retry once after a small wait.
### Reading Alpine state for diagnostics
Useful one-liners when debugging modal state:
```bash
agent-browser eval --stdin <<'EOF'
(()=>{
  const h = document.querySelector('#modal-holder');
  const c = document.querySelector('#modal-content');
  const inner = c?.parentElement;
  return {
    open: h?._x_dataStack?.[0]?.open,
    unexpectedError: h?._x_dataStack?.[0]?.unexpectedError,
    contentChildren: c?.children.length,
    innerDisplay: inner ? getComputedStyle(inner).display : null,
    innerOpacity: inner ? getComputedStyle(inner).opacity : null,
    hxRequest: !!document.querySelector('.htmx-request')
  };
})()
EOF
```
## Patterns that improve reliability
When adding new interactive components:
- **Every `<button>` inside a form must declare `:type`**. Default to `"button"` for icon/utility buttons; only the actual submit needs `"submit"`. Either the component helper sets it or the call site does — never rely on the HTML default inside a form.
- **Don't dispatch `modalclose` and `modalopen` in the same tick.** They share `open` state and the result depends on order. If a flow needs to swap modals, use `modal-replace-response` (which sets `hx-trigger: modalswap` — see `src/clj/auto_ap/ssr/utils.clj:51`) so the swap goes through the `@modalswap.document` handler that explicitly sequences with `$nextTick`.
- **Prefer waiting on observable DOM/state over fixed delays.** `wait --fn` with an Alpine state check is faster and more reliable than `wait 500`.
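The order-dependence behind the `modalclose`/`modalopen` rule can be shown with a minimal model of two handlers writing to the same flag. It is a sketch under simplifying assumptions: `finalOpenState` is a hypothetical helper, and the real Alpine handlers also drive transitions that this model leaves out.

```javascript
// Dispatches the given events synchronously (same tick) and
// returns the final value of the shared `open` flag.
function finalOpenState(eventsInOrder) {
  const doc = new EventTarget();   // stands in for `document`
  const state = { open: false };   // shared modal state

  doc.addEventListener("modalopen", () => { state.open = true; });
  doc.addEventListener("modalclose", () => { state.open = false; });

  for (const name of eventsInOrder) {
    doc.dispatchEvent(new Event(name));
  }
  return state.open;
}

console.log(finalOpenState(["modalclose", "modalopen"])); // true
console.log(finalOpenState(["modalopen", "modalclose"])); // false
```

The same two events leave the modal open or closed depending purely on which one fires last, which is why swapping modals should go through a single, explicitly sequenced event instead.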