From Mixpanel Anomaly to LLM Extraction with HITL
How a non-obvious data signal reframed the problem I thought I was solving. Building AI Upload at OPENLANE.
Setup
What I saw in the data, and where the data stopped making sense.
In late January 2026 I was looking at usage data for OPENLANE (Canada's largest wholesale vehicle auction platform) inventory intake, with a hypothesis going in that dealers were uploading vehicles in bulk and we needed to make the bulk upload UI faster. Mixpanel (our product analytics tool for tracking feature usage and session behavior across the dealer surface) told a different story: bulk upload was effectively unused, even though dealers in those same sessions were adding inventory one vehicle at a time, sometimes ten or twenty in a row.
Discoverability was not the issue. The bulk upload button sat prominently next to single-vehicle upload in the same toolbar. Dealers were seeing it and choosing to ignore it, which meant the friction lived somewhere outside the UI itself.
Confirmation arrived sideways. Sales reps had started collecting dealer inventory extracts and sending them to me, asking if I could get their dealers uploaded. Independent dealers, who lacked clean DMS connections (a DMS, or Dealer Management System, is the operational software that franchise dealers use to run their dealership, tracking vehicles, deals, parts, and service), were handing off their internal spreadsheets to sales because the CSV template was too much work. I had effectively become the human ETL layer for the segment, and that was the signal that told me to stop looking at the upload button and start looking at the format the dealer arrived with.
Options I considered
Four directions on the table once the data forced a reframe.
1. Redesign the CSV template UI. Bulk upload required dealers to download a CSV template, fill in the right columns in the right format, then upload it, so the template itself might have been the problem and an inline editor or smarter validation might have unblocked adoption.
2. Push harder on DMS integrations. OPENLANE has ongoing conversations with DMS vendors (dealer management systems, the operational software franchised dealers run on) to ingest inventory directly, and if those land, bulk upload becomes irrelevant. But DMS integrations are slow, partnership-dependent, and uneven across segments.
3. Leave bulk upload alone. Treat it as a power-user feature, accept that most dealers will use single uploads, and move on to the next problem.
4. Use LLM extraction to handle messy dealer inputs at ingestion. Stop requiring the CSV template format and accept whatever the dealer has (an internal sheet, a vendor extract, a PDF), letting an LLM map it to the canonical schema (the internal data structure our inventory system expects: VIN, make, model, year, mileage, condition, cost, and related fields, normalized so downstream pricing and analytics work on a consistent shape).
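To make the "canonical schema" in option 4 concrete, here is a minimal sketch of the target shape and of normalizing one messy row into it. This is not the LLM step and not OPENLANE's actual schema: the `CanonicalVehicle` class, the `HEADER_ALIASES` table, and the field names are all assumptions for illustration, with the hand-written alias table standing in for the mapping an LLM would infer.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CanonicalVehicle:
    # Hypothetical canonical shape; the real schema has more fields.
    vin: str
    make: str
    model: str
    year: int
    mileage: Optional[int] = None
    condition: Optional[str] = None
    cost: Optional[float] = None


# Assumed header variants a dealer sheet might use, mapped to canonical names.
HEADER_ALIASES = {
    "vin": "vin", "vin #": "vin", "vin number": "vin",
    "make": "make", "model": "model",
    "year": "year", "yr": "year", "model year": "year",
    "mileage": "mileage", "odometer": "mileage", "km": "mileage",
    "condition": "condition",
    "cost": "cost", "acquisition cost": "cost",
}


def _to_number(text: str) -> Optional[float]:
    # Keep digits and the decimal point; drop "$", commas, and units like "km".
    cleaned = re.sub(r"[^\d.]", "", text)
    return float(cleaned) if cleaned else None


def normalize_row(raw: dict) -> CanonicalVehicle:
    # Rename known headers and discard blank cells.
    fields = {}
    for header, value in raw.items():
        key = HEADER_ALIASES.get(header.strip().lower())
        if key and str(value).strip():
            fields[key] = str(value).strip()

    mileage = _to_number(fields.get("mileage", ""))
    # Raises KeyError if VIN, make, model, or year is missing from the row.
    return CanonicalVehicle(
        vin=fields["vin"].upper(),
        make=fields["make"],
        model=fields["model"],
        year=int(fields["year"]),
        mileage=int(mileage) if mileage is not None else None,
        condition=fields.get("condition"),
        cost=_to_number(fields.get("cost", "")),
    )
```

A row such as `{"VIN #": "2hgfc2f59kh512345", "Make": "Honda", "Model": "Civic", "Yr": "2019", "Odometer": "82,450 km"}` comes out with an uppercase VIN, an integer year, and an integer mileage. The point of the sketch is the consistent output shape, whatever produces the mapping.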
The decision and why
Why AI Upload won, and how I knew quickly.
A one-hour prompt experiment confirmed option 4 was viable. The friction was never the bulk upload UI itself; it was the input normalization tax dealers had to pay before they could use bulk upload at all. Dealers were not refusing bulk upload; they were refusing to do the off-platform cleanup work it required. Single uploads were friction too, but at least that friction stayed inside the product.
The validation experiment itself was almost embarrassingly fast. I took three different dealer extracts (a messy Excel export from one of our independents, a multi-tab spreadsheet from a franchise group, and a PDF from an auction house) and ran them through an OPENLANE-approved LLM with a simple extraction prompt, and got clean canonical output across all three in about an hour. I immediately shared the output with the sales reps who had been forwarding me dealer files, so they could use the same pattern themselves to help dealers upload in the interim while I scoped and built the real solution.
I originally called this feature "Upload Anything" in the spec, and a senior engineering leader flagged the malicious-prompt-injection risk in review. He was right, so I changed the name to "AI Upload" and tightened the input handling scope, because the original name had been writing a check the safety design could not yet cash.
The right move was option 4: AI Upload, a multi-format intake that accepts whatever dealers actually have (Excel, PDF, image, and dealer IMS exports, where IMS refers to the inventory management system dealers use to track vehicles on their lot, though many independents run on a spreadsheet), uses LLM extraction to produce structured rows, and walks the dealer through inline resolution before anything commits to inventory.
- The LLM removes the formatting tax. Dealers stop having to clean data before uploading. Whatever format they have gets normalized on the way in.
- Column mapping keeps the dealer in control. After extraction, unmapped columns surface for the dealer to assign or exclude. Nothing gets silently dropped or misclassified.
- The inline resolution step catches what the LLM misses. Rows with incomplete data flag in an outcome stack. Dealers fix them inline before the import commits. No separate review surface, no context switching.
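The column mapping and inline resolution steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the shipped implementation: `REQUIRED_FIELDS`, `split_columns`, and `build_outcome_stack` are hypothetical names, and the rule that a row needs VIN, make, model, and year to be complete is an assumption.

```python
# Assumed minimum for a row to count as complete.
REQUIRED_FIELDS = ("vin", "make", "model", "year")


def _is_blank(value) -> bool:
    return value is None or not str(value).strip()


def split_columns(headers, mapping):
    """Separate headers the extraction mapped from those it could not.

    Unmapped headers are returned rather than dropped, so the dealer can
    assign or exclude each one explicitly.
    """
    mapped = {h: mapping[h] for h in headers if h in mapping}
    unmapped = [h for h in headers if h not in mapping]
    return mapped, unmapped


def build_outcome_stack(rows, mapped):
    """Sort rows into ready-to-commit and needs-attention piles."""
    ready, needs_attention = [], []
    for index, row in enumerate(rows):
        canonical = {field: row.get(header) for header, field in mapped.items()}
        missing = [f for f in REQUIRED_FIELDS if _is_blank(canonical.get(f))]
        if missing:
            # Keep the partial data so the dealer can fix it inline.
            needs_attention.append(
                {"row": index, "missing": missing, "data": canonical}
            )
        else:
            ready.append(canonical)
    return ready, needs_attention
```

Nothing commits from the needs-attention pile until the dealer fills in the listed fields, which mirrors the "fix inline before the import commits" behavior described above.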
The longer-term case for this feature goes beyond the upload flow. Independents who do not pay for a DMS and have no integration on our roadmap will always need a way to keep their inventory current. AI Upload is how that segment stays active on the platform day to day. For many of them, it is the practical alternative to entry-by-entry upload they have always lacked.
Figures are conceptual and illustrative, shown to communicate design logic rather than visual branding.
The trade-offs I accepted
One real cost and what I was honest about.
The honest trade-off is that this feature is a bridge for one segment and a destination for another. Long-term, the cleanest outcome is DMS integrations for every dealer and no file uploads required, and for franchise dealers and groups that path is real and already underway. For independents, the road is likely much longer and many will never have a DMS integration worth building, which means AI Upload is not a stopgap for them but probably the permanent solution. I named that upfront so the team understood that "replace this with a DMS integration later" was only true for part of the customer base.
The LLM itself has held up well across a wide range of extract formats. We make an LLM call for every upload that is not rejected outright at intake, and accuracy on mapping to canonical fields has been solid enough that the extraction problem turned out to be more tractable than I expected going in.
The consequence
What success looks like, and the secondary signal worth tracking.
AI Upload shipped to dealers in early May 2026. The data is still early: sales cycles peak in April and May, so most of the target cohort's inventory was already uploaded before the feature launched. Dealers who have used it report the process is meaningfully easier, and intake volume is rising in that cohort.
The two signals we are tracking in Mixpanel to confirm the reframe worked:
- Bulk intake volume rising. If dealers are using AI Upload, multi-vehicle upload counts go up. The more meaningful version of this is not the raw count but whether dealers return to the flow.
- Clustered single-vehicle uploads falling. If the new intake works, the pattern of doing twenty single uploads in a row declines. Dealers were doing that because they had no better option. That is the behavior this feature replaces.
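As a rough sketch of how the second signal could be computed from exported event data: the function below takes plain `(session_id, event_name)` pairs and reports the share of sessions with a cluster of single uploads. The event name `single_vehicle_upload` and the threshold of ten are assumptions for illustration, and this does not call any Mixpanel API.

```python
from collections import Counter

# Hypothetical event name; the real tracking plan may differ.
SINGLE_UPLOAD_EVENT = "single_vehicle_upload"


def clustered_session_share(events, min_uploads=10):
    """Share of sessions containing at least `min_uploads` single uploads.

    `events` is an iterable of (session_id, event_name) pairs.
    """
    singles = Counter()
    sessions = set()
    for session_id, name in events:
        sessions.add(session_id)
        if name == SINGLE_UPLOAD_EVENT:
            singles[session_id] += 1

    if not sessions:
        return 0.0
    clustered = sum(1 for count in singles.values() if count >= min_uploads)
    return clustered / len(sessions)
```

Comparing this share for the target cohort before and after launch would show whether the twenty-in-a-row pattern is actually declining, rather than inferring it from total upload counts.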
The intake problem is solved. The next question is data quality: getting cost data to flow through dealer extracts so it feeds pricing model accuracy and Market Guide signal quality downstream. The better the intake, the better every tool that touches that inventory.
The outcome stack was deliberately designed without a staging area for human review before vehicles commit to inventory. We made this choice because OPENLANE has a separate quality gate downstream. When a dealer attempts to list a vehicle for auction, it goes through an inspection and data review process before it is accepted. Vehicles with bad data get rejected at that point with specific context on what needs fixing. Because that gate exists, we could give dealers flexibility in the inventory management surface rather than forcing a review step at upload. Inline editing in the table handles the rest.
A secondary signal worth tracking is financial field completion on uploaded inventory. When dealer extracts arrive with cost data attached, MyLot (OPENLANE's inventory management surface for dealers, the place where dealers manage, appraise, and list vehicles for auction or retail) becomes more useful, switching costs rise, and the pricing model and Market Guide signal both benefit from richer inputs. Better intake compounds through every tool downstream.
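Financial field completion can be measured very simply. The sketch below assumes uploaded vehicles are available as dictionaries with an optional `cost` key; both the function name and the record shape are hypothetical.

```python
def cost_completion_rate(vehicles):
    """Fraction of uploaded vehicles that arrived with a cost value."""
    if not vehicles:
        return 0.0
    # A missing key, None, or an empty string all count as "no cost data".
    with_cost = sum(1 for v in vehicles if v.get("cost") not in (None, ""))
    return with_cost / len(vehicles)
```

Tracked per upload or per dealer, this rate would show whether extracts are starting to carry cost data.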
What I would do differently
Where I lost time, and what I would do the same again.
I would have validated the LLM extraction path a month earlier than I did. The Mixpanel data was already there, the clustered-single-upload pattern was visible, and the validation experiment that ultimately took an hour could have been run on any quiet afternoon. That month was the cost of waiting for permission I did not actually need.
I would also have been more explicit upfront about which dealer segments AI Upload is durable for. It is most valuable for independents who may never get clean DMS integrations, while for franchises it is a bridge until DMS integrations land, and naming that segment-by-segment value would have sharpened the conversation with sales and made the prioritization debate easier.
The pattern worth carrying forward is this: when usage data contradicts the story you are telling yourself about how the product works, the data is the story. Reframe before you redesign.