SPREAD's deployment at the customer is Ticket Analyzer, sitting at the warranty-fraud surface. It ingests warranty claim tickets, configuration data, repair histories, and the auxiliary signals needed to score a claim for fraud likelihood. Output: a reviewer queue ordered by fraud likelihood, with supporting signals attached. 256,000 claims processed. Live in production.
What lives at production scale
The customer manages warranty claims at scale across the US service network. The warranty-fraud problem is structural in any large automotive aftersales operation: legitimate claims and fraudulent claims share the same form factor, and claim volume exceeds human review capacity by an order of magnitude. The economics are unambiguous. Each undetected fraudulent claim is a direct dollar leak. Each falsely flagged legitimate claim is a customer-experience cost.
What changes for the reviewer
Before Ticket Analyzer, the reviewer scans a first-in-first-out queue: claims arrive randomly mixed, fraudulent claims are not visually distinguishable from legitimate ones, and the reviewer's hours spread evenly across the homogeneous stream. With Ticket Analyzer, the same queue is ranked by fraud likelihood, with the supporting signals attached.
For the warranty reviewer, the change is concrete: the queue is no longer first-in-first-out; it is ranked, and the reviewer's hours compress to the higher-yield end of the distribution.
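The shape of that shift fits in a few lines. A minimal Python sketch, assuming a hypothetical schema (`claim_id`, `fraud_score`, `signals`); Ticket Analyzer's actual data model is not public:

```python
# Minimal sketch: FIFO review stream -> queue ranked by fraud likelihood.
# Field names are hypothetical, not Ticket Analyzer's actual schema.
from dataclasses import dataclass, field

@dataclass
class ScoredClaim:
    claim_id: str
    fraud_score: float                                 # model output in [0, 1]
    signals: list[str] = field(default_factory=list)   # supporting evidence shown to the reviewer

def rank_queue(claims: list[ScoredClaim]) -> list[ScoredClaim]:
    """Order the review queue by fraud likelihood instead of arrival time."""
    return sorted(claims, key=lambda c: c.fraud_score, reverse=True)

queue = rank_queue([
    ScoredClaim("W-1041", 0.07),
    ScoredClaim("W-1042", 0.91, ["repair-history mismatch", "duplicate part claim"]),
    ScoredClaim("W-1043", 0.34),
])
print([c.claim_id for c in queue])  # ['W-1042', 'W-1043', 'W-1041']
```

The claims themselves are untouched; only the order in which reviewer hours meet them changes.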
The value math behind the figures
What 256,000 claims means in scale terms, using industry-benchmark figures (the customer's specific warranty spend is not public):
Industry-typical fraud rates on automotive warranty claims run roughly 3–10%, depending on segment, dealer network maturity, and detection rigor. At this scale, a one-percentage-point improvement in fraud detection equates to roughly $2.5M annually in recovered or avoided payouts.
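The arithmetic behind that figure, as a sketch; the average payout per claim is an assumption chosen to sit in the industry-benchmark range, not a customer number:

```python
# Back-of-envelope math for the 1pp figure. avg_payout is assumed
# (industry-typical average warranty claim value), not customer data.
claims_per_year = 256_000
avg_payout = 1_000        # USD per claim, assumed
detection_lift = 0.01     # one percentage point more fraud caught

recovered = claims_per_year * detection_lift * avg_payout
print(f"${recovered:,.0f} per year")  # $2,560,000 -> "roughly $2.5M"
```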
Reviewer-time math sits behind the queue ranking: 256K claims at an industry-typical 5–15 minutes per manual review would be roughly 21,000–64,000 reviewer-hours per year at parity. Production scale means most claims aren't manually reviewed; the queue ranking is what makes selective review economic. Compressing reviewer attention to the top decile of fraud likelihood means the same headcount produces materially more recovered fraud per shift.
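The parity-hours range works out the same way, using the industry-typical per-review times from the paragraph above:

```python
# Reviewer-hours at parity: every claim manually reviewed at
# industry-typical per-claim review times.
claims_per_year = 256_000
minutes_per_review = (5, 15)   # industry-typical manual review time

low, high = (claims_per_year * m / 60 for m in minutes_per_review)
print(f"{low:,.0f} to {high:,.0f} reviewer-hours per year")
# 21,333 to 64,000 reviewer-hours per year
```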
Numbers are illustrative to industry benchmarks; SPREAD does not publish the customer's specific fraud-detection efficacy figures.
The bug the customer surfaced
The customer's warranty reviewer raised a UX bug on 29 April 2026 in the customer's feedback channel:
"I noticed that the filters still aren't working right so like I'll go in and I'll filter to unreviewed and o…" (truncated in capture)Warranty reviewer · April 2026
The specific issues: filters in the unreviewed-claims view don't reset properly; reviewed items reappear; the total claim count in the header doesn't update when filters are applied. None of these are model-quality issues. All of them are reviewer-trust issues, the kind of UX defects that erode reviewer trust daily.
What's next on the deliverable list
The substantive engineering work in flight is the reviewer-UX surface: the filter behavior in the unreviewed-claims view, the broader queue ergonomics, and the jobs-to-be-done assessment that determines which UX defects to fix first. Model quality is not the issue; reviewer trust in the interface is.
Where this engagement sits
A production-scale deployment of a fraud-ranking pipeline against 256,000 claims, with the model producing the value the engagement was scoped against. The next deliverable is the reviewer-facing UX: closing the filter and queue-state defects the customer's warranty reviewer raised, so the ranked queue earns the same trust as the ranking model.
Program shape
| Program | Warranty Fraud Detection · NAM Aftersales |
|---|---|
| Claims processed | 256,000 in production |
| Workflow shift | FIFO claim queue → ranked queue |
| Value at stake | ~$2.5M annually per 1pp fraud-detection lift (industry-benchmark math) |
| Reviewer headcount | Unchanged; attention compresses to top-decile fraud likelihood |
| Next deliverable | Reviewer-UX surface (filter behavior, queue-state) |