Inside AI Categorization: 95% accuracy with your own taxonomy

A Maison Orion ad scanned by AI — jewelry, material and model attributes detected with full reasoning attached.

Why off-the-shelf taxonomies don't work

Every brand team we work with has its own way of slicing the world. A telco doesn't want "Phones" — it wants "Devices", "Tariffs", "Data Campaigns", "TV Services" and "Mobile Services". A fragrance house doesn't want "Beauty" — it wants "Heritage Fragrance", "Limited Edition", "Gifting" and "Influencer Hero". Forcing every advertiser into a generic taxonomy is the fastest way to make competitive intelligence useless.

So AI Categorization in Ginjer.ai starts from a different premise: you tell us your categories, in your words, with your inclusions and exclusions, and our models sort every ad into them. No one on the team has to be an ML engineer.

How the model works, briefly

Every creative — image, video, carousel, search ad copy — runs through a vision-and-language model fine-tuned on advertising content. The model produces a structured output:

The likely categories from your taxonomy.
A High / Medium / Low confidence tag per category.
Free-text reasoning explaining why those categories matched (and others did not).
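To make the three parts of that output concrete, here is a minimal Python sketch of how such a result could be represented. The class names, field names and the `ad_id` value are our own illustration, not Ginjer.ai's actual schema; the sample values reuse the O2 ad described later in this post.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass(frozen=True)
class CategoryMatch:
    category: str            # a category name taken from the customer's taxonomy
    confidence: Confidence   # High / Medium / Low tag for this category


@dataclass(frozen=True)
class CategorizationResult:
    ad_id: str                          # hypothetical identifier for the creative
    matches: tuple[CategoryMatch, ...]  # the likely categories
    reasoning: str                      # free-text explanation from the model


def unknown_categories(result: CategorizationResult, taxonomy: set[str]) -> list[str]:
    """Return matched categories that are not part of the customer's taxonomy."""
    return [m.category for m in result.matches if m.category not in taxonomy]


# Assumed taxonomy, using the telco categories mentioned earlier in the post.
telco_taxonomy = {"Devices", "Tariffs", "Data Campaigns", "TV Services", "Mobile Services"}

example = CategorizationResult(
    ad_id="ad-001",  # made-up id for illustration
    matches=(
        CategoryMatch("TV Services", Confidence.HIGH),
        CategoryMatch("Data Campaigns", Confidence.LOW),
    ),
    reasoning=(
        "The ad promotes Oneplay, a service for watching live sports. "
        "It does not focus on data plans, devices or mobile services."
    ),
)
```

Keeping the reasoning on the same record as the categories means a reviewer never sees a label without its justification.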

That last item is the one no one talks about, and it matters most. A category without reasoning is a guess. A category with reasoning is something your team can audit, correct and trust.

"High confidence means you can ship the chart. Medium means an analyst should look at it. Low means we should not have guessed at all."

Confidence is a feature, not a number

We chose three confidence levels deliberately. A scalar probability (0.83, 0.91…) gives the false impression that confidence is fine-grained, when in fact what users need is a clear decision rule:

High — accept automatically. The dashboard renders it as-is.
Medium — surface in the review queue. A human approves or corrects.
Low — not categorized by default. Visible under "Unknown" until reviewed.
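The decision rule above is simple enough to write down as a lookup. The sketch below is an illustration under our own naming (the `Route` values and the `triage` helper are hypothetical, not product API); it only shows why three levels map cleanly onto three actions.

```python
from __future__ import annotations

from enum import Enum


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class Route(Enum):
    AUTO_ACCEPT = "auto_accept"    # rendered on the dashboard as-is
    REVIEW_QUEUE = "review_queue"  # a human approves or corrects
    UNKNOWN = "unknown"            # shown under "Unknown" until reviewed


# One action per confidence level: no thresholds to tune.
ROUTE_BY_CONFIDENCE = {
    Confidence.HIGH: Route.AUTO_ACCEPT,
    Confidence.MEDIUM: Route.REVIEW_QUEUE,
    Confidence.LOW: Route.UNKNOWN,
}


def triage(
    suggestions: list[tuple[str, str, Confidence]],
) -> dict[Route, list[tuple[str, str]]]:
    """Group (ad_id, category, confidence) suggestions by the action they trigger."""
    buckets: dict[Route, list[tuple[str, str]]] = {route: [] for route in Route}
    for ad_id, category, confidence in suggestions:
        buckets[ROUTE_BY_CONFIDENCE[confidence]].append((ad_id, category))
    return buckets
```

With a scalar probability, the same function would need cut-off values that someone has to choose and defend; with three levels the mapping is the whole policy.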

Approvals and corrections flow back into the training set. Over the first three months of a customer relationship, that human-in-the-loop feedback typically lifts accuracy from 90% on day one to 95–98% on day ninety.

The everyday workflow

From the Ad Campaigns view, the small pencil icon on a category column opens a single-ad review modal. The team sees the creative, the AI reasoning, and the suggested categories with confidence pills. One click to approve, two clicks to correct.

For bulk work, the pencil in the table header opens the same modal in a queue. Analysts can clear a hundred ads in twenty minutes, and the model does not ask again about an ad they have already reviewed.
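The "ask once" behaviour can be pictured as a queue that remembers decisions. This is a simplified sketch with invented names (`ReviewQueue`, `approve`, `correct`), assuming decisions are keyed by ad; it is not a description of how the product stores them.

```python
from __future__ import annotations


class ReviewQueue:
    """Remembers analyst decisions so a reviewed ad is never queued again."""

    def __init__(self) -> None:
        self._decisions: dict[str, str] = {}  # ad_id -> confirmed category

    def pending(self, suggestions: list[tuple[str, str]]) -> list[tuple[str, str]]:
        """Return the (ad_id, suggested_category) pairs that still need review."""
        return [(ad_id, cat) for ad_id, cat in suggestions if ad_id not in self._decisions]

    def approve(self, ad_id: str, suggested_category: str) -> None:
        """One click: keep the suggested category."""
        self._decisions[ad_id] = suggested_category

    def correct(self, ad_id: str, corrected_category: str) -> None:
        """Two clicks: replace the suggestion with the analyst's choice."""
        self._decisions[ad_id] = corrected_category

    def decision(self, ad_id: str) -> str | None:
        """Return the confirmed category, or None if the ad is not yet reviewed."""
        return self._decisions.get(ad_id)
```

Both approvals and corrections end up as the same kind of record, which is what lets either one feed back into training.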

One concrete example

Here's a real ad an O2 dashboard categorized last December. The creative shows a Premier League promotion through Oneplay, bundled with O2 Spolu.

The AI surfaced two candidate categories: "Data Campaigns" (low) and "TV Services" (high).
Its reasoning: "The ad promotes Oneplay, a service for watching live sports. It does not focus on data plans, devices or mobile services."
One click to approve. The category stuck. The model never re-asked.

When AI is not enough

For e-shops and retailers, the actual product taxonomy is often baked into the URL and feed. AI Categorization can pull that signal, but it's overkill there, so we offer a parallel Keyword Matching engine for those cases. The two systems live side by side, and you choose which one runs per category.
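As a rough picture of per-category engine selection, the sketch below matches keywords against the landing-page URL path for categories configured to use keyword matching. The category names, keywords, URLs and the `ENGINE_BY_CATEGORY` structure are all made up for illustration; the real configuration lives in the product, not in code like this.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Hypothetical per-category configuration: which engine handles which category.
ENGINE_BY_CATEGORY = {
    "Running Shoes": "keyword",
    "Trail Shoes": "keyword",
    "Brand Campaign": "ai",
}

# Hypothetical keywords for the keyword-matched categories.
KEYWORDS_BY_CATEGORY = {
    "Running Shoes": ("running-shoes",),
    "Trail Shoes": ("trail-shoes", "trail-running"),
}


def keyword_match(landing_url: str, keywords: tuple[str, ...]) -> bool:
    """True if any keyword appears in the URL path (case-insensitive)."""
    path = urlparse(landing_url).path.lower()
    return any(keyword in path for keyword in keywords)


def categories_by_keyword(landing_url: str) -> list[str]:
    """Return the keyword-engine categories whose keywords match the URL."""
    return [
        category
        for category, engine in ENGINE_BY_CATEGORY.items()
        if engine == "keyword"
        and keyword_match(landing_url, KEYWORDS_BY_CATEGORY.get(category, ()))
    ]
```

Categories left on the "ai" engine are simply skipped here and would go through the model instead.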

Why this matters

Competitive intelligence is only as good as its taxonomy. A "Skincare" tag is fine if you sell shampoo and that's your only competitive set. A luxury fragrance house comparing 14 houses, each with 9 product lines and 4 campaign archetypes, needs roughly 50 categories — and they need to be theirs. AI Categorization is what makes that scale possible.

If you want to see your own taxonomy plugged in live, request a demo.