FoxBurrowAI | AI data services for procurement

Why this matters

Why businesses benefit from clean, well-categorised spend data

Most organisations have years of invoices, POs, and supplier payments - but very few have a clear view of what that money is actually buying. When thousands of lines sit in generic, inconsistent, or overly broad categories, opportunities remain hidden. Good categorisation turns raw transactions into usable insight.

Spot genuine savings opportunities by seeing exactly where spend is going, not just broad buckets like “services” or “facilities”.
Identify price variation and over-charging - even small inconsistencies compound across hundreds of suppliers and thousands of lines.
Strengthen supplier negotiations with transparent, comparable, like-for-like category-level data.
Make budgets more reliable by understanding cost drivers rather than relying on historical spend alone.
Enable dashboards and analytics that uncover trends, cost spikes, waste, and consolidation opportunities.

Quick takeaway

Why organisations invest in this

Clean categorisation often pays for itself quickly: on a multi-million-pound spend base, even a 1–3% saving uncovered by better visibility is a material sum, alongside reduced waste and better decision-making.

Without it, organisations end up running procurement with blurred vision - and miss the very areas where efficiencies and improved value lie.
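
As a rough illustration of that payback, the sketch below compares a 1–3% saving with the cost of categorising the lines behind it. It reads the 1–3% as a saving rate, which is an assumption, and the spend and line-count figures are hypothetical. The only number taken from this page is the indicative price of around £0.075 per line.

```python
def categorisation_payback(annual_spend, saving_rate, line_count, price_per_line=0.075):
    """Return (estimated saving, categorisation cost, net benefit) in pounds.

    price_per_line defaults to the indicative "from around" price quoted on
    this page; every other input is supplied by the caller.
    """
    saving = annual_spend * saving_rate
    cost = line_count * price_per_line
    return saving, cost, saving - cost


# Hypothetical inputs: £5m of annual spend spread over 40,000 invoice lines.
for rate in (0.01, 0.03):
    saving, cost, net = categorisation_payback(5_000_000, rate, 40_000)
    print(f"{rate:.0%} saving: £{saving:,.0f} saved, £{cost:,.0f} cost, £{net:,.0f} net")
```

Swap in your own spend, line count, and quoted price to see where break-even sits.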

Comparisons

FoxBurrowAI vs human analysts vs “just ask an LLM”

Categorisation is only useful if it holds up under real scrutiny — especially at deeper levels like L3/L4 where mistakes are easy to hide. Below are two practical comparisons.

FoxBurrowAI vs human

Competitive with careful human work

On a real public-sector invoice dataset with existing manual categories, FoxBurrowAI produced results that were comparable to — and in places more consistent than — a diligent but non-specialist human analyst working line-by-line.

Almost every line landed in a sensible category even when original coding was inconsistent.
Remaining issues were mostly driven by ambiguous or generic line descriptions (input quality).
Explanations can be attached to each line so reviewers can validate decisions and adjust quickly.

In practice, this is the difference between “months of effort” and “review and refine”.

FoxBurrowAI vs direct LLM prompting

Why “type it into an AI chat box” isn’t enough

A useful baseline is what happens when you rely on a general-purpose model, used through its chat interface with prompt tweaks, to categorise items. Reported results in the literature show that while L1 can be passable, accuracy collapses as you go deeper, which is exactly where procurement teams need precision.

Comparison	L1 match	L2 match	L3 match	L4 match
Prompt-engineered GPT-4 (cleaned dataset)	54.59%	40.31%	29.01%	10.8%

At L4, a match rate of around 10% is not fit for purpose: it forces heavy manual correction, destroys trust, and makes dashboards misleading. FoxBurrowAI is built to behave like an analyst, producing consistent, reviewable decisions rather than a one-shot guess.
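
To make the table concrete, here is a minimal sketch of how a match rate per level can be computed. It assumes a line only counts as an L3 match when L1, L2, and L3 all agree with the reference; published benchmarks may define matching differently. The function name and the sample category paths are illustrative, not real results.

```python
def level_match_rates(predicted, reference, depth=4):
    """Share of lines whose predicted path agrees with the reference down to each level.

    predicted and reference are equal-length lists of category paths,
    each path a tuple such as (L1, L2, L3, L4).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("need at least one line to score")
    rates = []
    for level in range(1, depth + 1):
        # Compare the first `level` elements, so deeper matches require shallower ones.
        hits = sum(p[:level] == r[:level] for p, r in zip(predicted, reference))
        rates.append(hits / len(predicted))
    return rates


# Made-up example: four lines that should all be fresh peas.
reference = [("Food", "Vegetables", "Fresh", "Peas")] * 4
predicted = [
    ("Food", "Vegetables", "Fresh", "Peas"),
    ("Food", "Vegetables", "Fresh", "Green beans"),
    ("Food", "Vegetables", "Frozen", "Peas"),
    ("Facilities", "Cleaning", "Supplies", "Detergent"),
]
rates = level_match_rates(predicted, reference)
for level, rate in enumerate(rates, start=1):
    print(f"L{level} match: {rate:.0%}")
```

Because each level builds on the ones above it, the rate can only stay flat or fall as you go deeper, which is the pattern the table shows.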

Data realism

Cleaned data can inflate results — real exports are messier

Many published tests use cleaned, cherry-picked, or “best case” rows. In real procurement exports, descriptions are inconsistent, columns go missing, and suppliers use vague language. That’s why a production approach needs guardrails, consistency, and review support — not just a clever prompt.

Why this matters in practice

The peas that were categorised as “green beans”

In one public dataset, a line labelled by the original team as “green beans” turned out – on closer inspection of the description – to be garden peas.

Whoever categorised this row was most likely either mistaken or short of time. That is not uncommon in procurement classification, especially when the task is to go all the way to L4. Across thousands of rows these shortcuts add up, particularly at deeper category levels, making it impossible to know whether you are overpaying for a specific vegetable.

FoxBurrowAI focuses on the actual text on the line – “peas”, not just “fresh veg” – and works systematically. That means you can finally see where spend is really going:

Which specific vegetables are being bought
Which items are creeping up in price over time
Where suppliers might be over-charging on a narrow slice of spend
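
Once lines are categorised down to the item, spotting price creep becomes a simple calculation. The sketch below is one hedged way to do it: it compares each item's earliest and latest unit price. The function name, the row layout, and the prices are all made up for illustration.

```python
from collections import defaultdict


def price_change_by_item(rows):
    """Return {item: fractional change in unit price from earliest to latest month}.

    rows is an iterable of (item, month, unit_price), with month written as
    'YYYY-MM' so that plain text sorting is also chronological.
    """
    history = defaultdict(list)
    for item, month, unit_price in rows:
        history[item].append((month, unit_price))

    changes = {}
    for item, points in history.items():
        points.sort()  # earliest month first
        first_price, last_price = points[0][1], points[-1][1]
        if first_price > 0:  # skip items with no usable starting price
            changes[item] = (last_price - first_price) / first_price
    return changes


# Hypothetical unit prices per kilogram.
rows = [
    ("garden peas", "2023-01", 1.00),
    ("garden peas", "2023-06", 1.25),
    ("green beans", "2023-01", 2.00),
    ("green beans", "2023-06", 2.00),
]
changes = price_change_by_item(rows)
for item, change in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item}: {change:+.0%}")
```

If peas are filed under green beans, a calculation like this blends two different price histories and the signal disappears.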

Approach

Reasoning not regurgitation

Unlike traditional automated categorisation scripts or methodologies, we do not rely entirely on past categorisations and vague matches. If very normal errors, like those seen in the Californian Government’s efforts, are propagated through your future processes, your categorisation accuracy will quickly drop.

This is also a major concern for any overly specific machine learning technique trained on large existing datasets. Many companies see large data histories as an invaluable resource, but unless you are entirely comfortable with the quality of that data, the “garbage in, garbage out” effect will quickly apply to AI too.

Advanced reasoning at the point of discovery proves to be far more effective on unseen data than simply copying old patterns forward.

Choices & explanations

Ranked options instead of a single guess

Many descriptions have more than one reasonable home. In those cases, FoxBurrowAI can provide a ranked set of category options per line rather than a single forced decision.

The workflow can propose several plausible paths through the hierarchy and attach a brief justification for each. Review teams then see:

The primary category FoxBurrowAI recommends
Alternative categories that would also make sense
Short, human-readable explanations for each choice

This keeps control firmly with you while removing the heavy lifting of thinking up categories from scratch.
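
One way to picture that output is as a small record per line. The sketch below is a hypothetical data shape, not FoxBurrowAI’s actual schema: the class, field names, scores, and sample categories are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CategoryOption:
    path: Tuple[str, ...]  # category path from L1 down to the deepest level
    score: float           # hypothetical confidence score, higher is better
    explanation: str       # short, human-readable justification


def rank_options(options, max_options=3):
    """Return the highest-scoring options, best first (ties keep input order)."""
    return sorted(options, key=lambda option: option.score, reverse=True)[:max_options]


# Made-up options for a line described as "Garden peas 2.5kg frozen".
options = [
    CategoryOption(("Food", "Vegetables", "Fresh", "Peas"), 0.12,
                   "Names peas, but the description says frozen."),
    CategoryOption(("Food", "Vegetables", "Frozen", "Peas"), 0.82,
                   "Names peas and states the product is frozen."),
    CategoryOption(("Food", "Vegetables", "Frozen", "Mixed vegetables"), 0.06,
                   "Frozen vegetable, but only peas are mentioned."),
]
ranked = rank_options(options)
primary, alternatives = ranked[0], ranked[1:]

print("Recommended:", " > ".join(primary.path), "-", primary.explanation)
for option in alternatives:
    print("Alternative:", " > ".join(option.path), "-", option.explanation)
```

A reviewer can accept the recommended path, pick an alternative, or override both; the explanation travels with each option either way.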

Security & deployment

Online or entirely offline – your choice

Some organisations are happy for data to leave their network; others aren’t. FoxBurrowAI can support both:

A hosted workflow for standard projects, managed by FoxBurrowAI.
A fully offline deployment option is available depending on your requirements. Your data will not be used to train models either way and will be securely stored.

Either way, the goal is the same: reliable categorisation that your teams can trust, with appropriate guardrails and auditability.

Working with your data

Built to cope with the reality of Excel exports

Real procurement data rarely comes as a neat, standardised table. You get multi-sheet workbooks, varying column names, different date formats, free-text descriptions, and the occasional “mystery” column that still turns out to matter.

FoxBurrowAI’s pipeline can automatically detect key fields (supplier, description, amount, date) across a wide range of Excel layouts.
You don’t need to pre-map everything to a fixed template first – the workflow is designed to perform that time-intensive work for you.
For particularly complex files, a light touch of one-off setup makes subsequent refreshes straightforward.
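
To give a feel for what field detection involves, here is a deliberately simple sketch that matches column headers against lists of known aliases. It is not FoxBurrowAI’s pipeline: the alias lists, function names, and sample headers are assumptions chosen for illustration.

```python
import re

# Hypothetical alias lists; a real export will need many more variants.
FIELD_ALIASES = {
    "supplier": {"supplier", "supplier name", "vendor", "vendor name", "payee"},
    "description": {"description", "line description", "item description", "narrative"},
    "amount": {"amount", "net amount", "invoice amount", "value", "total"},
    "date": {"date", "invoice date", "payment date", "posting date"},
}


def normalise(header):
    """Lower-case a header and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", str(header).lower()).strip()


def detect_fields(headers):
    """Map each key field to the first header whose normalised name is a known alias."""
    found = {}
    for header in headers:
        name = normalise(header)
        for field, aliases in FIELD_ALIASES.items():
            if field not in found and name in aliases:
                found[field] = header
    return found


# Headers as they might appear in one messy export.
headers = ["Vendor_Name", "LINE DESCRIPTION", "Net  Amount", "Invoice-Date", "Cost Centre"]
detected = detect_fields(headers)
missing = sorted(set(FIELD_ALIASES) - set(detected))
print(detected)
print("Fields still to map by hand:", missing)
```

A production approach would likely also look at the values in each column, for example whether they parse as dates or amounts; that step is left out here.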

Want to see how your own data behaves? Share a sample file and we’ll run it through the process, returning a categorised extract and an example dashboard.

Dashboards

Custom dashboards with no logins or subscriptions

The dashboards on this site are built with Plotly Dash and designed from the ground up for procurement use. They support:

Time trends by month and year
Hierarchical category drill-down (L1–L4)
Key supplier views and summary metrics
One-click export of whatever you’re currently viewing

As part of a paid plan, FoxBurrowAI can provide self-contained, offline dashboards tailored to your data. They open in a web browser, require no user accounts, and don’t depend on shared cloud servers.

That makes them ideal for sharing with non-technical stakeholders, or for environments where traditional BI tools and user management are a constant source of friction.

Explore the example dashboards →

Next step

Want to burrow deeper into your data?

Whether it’s a year of invoices or a multi-year, multi-entity dataset, FoxBurrowAI can turn it into clean, sensible categories and ready-to-use dashboards. Pricing typically starts from around £0.075 per line for categorisation, with options for one-off projects or ongoing refreshes.

Start a conversation
View example dashboards

AI categorisation that reasons like a careful analyst.