feat/sales-parquet-migration #1

Closed
notid wants to merge 8 commits from feat/sales-parquet-migration into master
Owner
No description provided.
notid added 8 commits 2026-04-25 08:02:12 -07:00
- Add new PDF template for Bonanza Produce vendor
- Template uses phone number 530-544-4136 as unique identifier
- Extracts invoice number, date, customer identifier, and total
- Includes passing test for invoice 03881260

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Extract customer name in customer-identifier field
- Extract street address in account-number field
- Use non-greedy regex with lookahead to capture clean values
- Update test to verify both name and address extraction

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Extract city/state/zip in location field
- Customer address now split across 3 fields:
  - customer-identifier: customer name
  - account-number: street address
  - location: city, state zip
- All components verified in test

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- customer-identifier field: customer name (e.g., 'NICK THE GREEK')
- account-number field: street address (e.g., '600 VISTA WAY')
- Combined they provide full customer identification with address
- Updated test to verify both fields and their concatenation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
New repository-based skill at .claude/skills/invoice-template-creator/:
- SKILL.md: Complete guide for creating invoice parsing templates
- references/examples.md: Common patterns and template examples
- Covers vendor identification, regex patterns, field extraction
- Includes testing strategies and common pitfalls

Updated AGENTS.md with reference to the new skill.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Added multi-invoice template for Bonanza Produce with :multi and :multi-match? flags
- Template uses keywords for statement header to identify multi-invoice format
- Extracts invoice-number, date, customer-identifier (from RETURN line), and total
- Parses 4 invoices from statement PDF 13595522.pdf
- All tests pass (29 assertions, 0 failures, 0 errors)

- Added test: parse-bonanza-produce-statement-13595522
- Updated invoice-template-creator skill: emphasized test-first approach
- Add DuckDB/S3 parquet storage layer (auto-ap.storage.parquet)
- Add sales_to_parquet migration script for historical data
- Add cleanup_sales for post-migration Datomic cleanup
- Add sales_orders_new.clj with DuckDB read layer for SSR views
- Add test scaffolding for parquet storage
- Add plan document for move-detailed-sales-to-parquet
- U3: Square production (upsert) now buffers to parquet via flatten-order-to-parquet!
- U3: EzCater core import-order now buffers to parquet instead of Datomic transact
- U3: EzCater XLS upload-xls now buffers to parquet instead of audit-transact
- U4: Rewrite sales_orders.clj to read from DuckDB via pq/get-sales-orders
- U5: Rewrite sales_summaries to use parquet aggregation functions
  - get-payment-items-parquet, get-discounts-parquet, get-refund-items-parquet
  - get-tax-parquet, get-tip-parquet, get-sales-parquet
- Add sum-* aggregation functions to storage/sales_summaries.clj
  - sum-discounts, sum-refunds-by-type, sum-taxes, sum-tips, sum-sales-by-category
notid closed this pull request 2026-04-25 08:28:35 -07:00

Pull request closed

Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: notid/integreat#1