Optimizing metadata quality, taxonomy, and discoverability
RED HAT | METADATA ASSISTANT
Video demo: AI Metadata Assistant in action.
As part of the Content Team that developed Red Hat’s Metadata Assistant, I evaluated AI-generated metadata outputs, scored response quality, and contributed feedback used to improve taxonomy accuracy and model performance.
THE CHALLENGE
Manual metadata creation was inconsistent and time-consuming, and it hurt discoverability and SEO.
THE APPROACH
Develop an AI Metadata Assistant and use human-in-the-loop evaluation workflows to improve the accuracy, usability, and consistency of AI-generated metadata outputs.
THE IMPACT
- Better metadata accuracy
- Reduced overtagging
- Improved information architecture
- Stronger SEO
AI Content Systems + Human Feedback
Step-by-step walkthrough from input to AI suggestions and human feedback.
ABOUT THE METADATA ASSISTANT
Created by the Content Team, the Metadata Assistant is a generative AI tool that suggests metadata and taxonomy tags for Red Hat marketing content and web pages.
It uses NLP and machine learning to generate titles, meta descriptions, summaries, and taxonomy tags aligned to the MIST (Metadata Initiative for Structured Taxonomy) guidelines.
Our goal is to reduce overtagging, improve consistency, and save creators time—while maintaining high-quality metadata.
- Enhanced discoverability, personalization, and SEO
- Improved metadata accuracy and taxonomy consistency
- Human + AI feedback loop drives continuous improvement
- Streamlined workflows and reduced manual burden
- Reduced overtagging and duplicate tags
WHAT I DID (KEY OUTCOMES)
HOW I DID IT (SIMPLIFIED)
1. Input content: Paste text or a URL from a webpage or PDF.
2. AI generates: Suggestions for titles, descriptions, summaries, and taxonomy tags.
3. Human eval: Score quality against MIST guidelines, adding feedback as needed.
4. Feedback loop: Qualitative feedback improves the models' performance and future suggestions.
5. Better metadata: More accurate tags, stronger AI, better SEO and content performance.
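The five steps can be pictured as a small data flow. The sketch below is illustrative only. It assumes a 1 to 5 score scale and invents every name in it (`Suggestion`, `Evaluation`, `generate_suggestions`, `collect_feedback`). The real Metadata Assistant's schema and model calls are not described here, so the generation step is a stand-in.

```python
from dataclasses import dataclass
from statistics import mean

# All names and the 1-5 scale are assumptions made for illustration.

@dataclass
class Suggestion:
    field_name: str  # e.g. "meta_title"
    value: str

@dataclass
class Evaluation:
    field_name: str
    score: int       # assumed scale: 1 (poor) to 5 (excellent)
    feedback: str = ""

def generate_suggestions(content: str) -> list[Suggestion]:
    """Stand-in for step 2; a real system would call the model here."""
    first_sentence = content.split(".")[0].strip()
    return [
        Suggestion("meta_title", first_sentence[:60]),
        Suggestion("summary", first_sentence),
    ]

def collect_feedback(evaluations: list[Evaluation], threshold: int = 3) -> dict:
    """Steps 3-4: average the scores and keep written feedback for low scores."""
    low = [e for e in evaluations if e.score < threshold]
    return {
        "mean_score": mean(e.score for e in evaluations),
        "needs_work": {e.field_name: e.feedback for e in low},
    }

# Step 1: input content (hypothetical sample text).
suggestions = generate_suggestions(
    "Red Hat OpenShift runs containers at scale. More text."
)
# Step 3: a human scores each suggested field.
evaluations = [
    Evaluation("meta_title", 4),
    Evaluation("summary", 2, "Too generic; name the product earlier."),
]
# Step 4: the aggregated report is what would feed back into model tuning.
report = collect_feedback(evaluations)
print(report)
```

Keeping the written feedback next to the numeric score is the point of the design: the score says which field fell short, and the comment says why.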
Without accurate metadata
A user searching for VMware virtualization help got 358 results. Adobe Target served AI content to someone who needed virtualization docs. Personalization became noise, not help.
With accurate metadata
Taxonomy tags powered search filters, personalization affinity scoring, content recommendations, and audience segmentation — all dependent on human evaluation keeping the model accurate.
WHY METADATA ACCURACY MATTERED AT THIS SCALE
Bad metadata wasn't just an SEO problem; it broke personalization, flooded users with irrelevant content, and damaged brand trust. The human evaluation layer existed to prevent that.
ENTERPRISE SCALE—WHAT THE EVALUATION WORK WAS PROTECTING
The Metadata Assistant operated across redhat.com's entire content ecosystem. The top 20% of collateral alone drove the majority of business value—accurate metadata was what kept that content findable.
89% of all site interactions driven by top 20% of content, dependent on accurate taxonomy.
~100% of Google search clicks to collateral came from top-tagged content.
$100M+ in opportunity value tied to content discoverability powered by accurate taxonomy.
WHAT I EVALUATED: FIELDS AND SCORING CRITERIA
Each evaluation session involved scoring AI-generated outputs across multiple metadata fields against MIST taxonomy guidelines. Knowing the rules well enough to score outputs accurately required a deep understanding of the prompt architecture, even as an evaluator rather than a prompt author.
Fields:
- Meta title
- Meta description
- Summary
- Primary product tag
- Topic tag
- Industry tag
- Business challenge
- Product line
- Partners tag
- Services tag
Scoring criteria:
- less is more
- vs MIST guidelines
- most specific wins
- near the beginning
- why it failed
- when wrong
- per field
- recurring errors
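One way to surface recurring errors per field is to count failure reasons across evaluation records. The sketch below is a minimal illustration and does not show the team's actual tooling. The record layout, the field identifiers, the failure reasons, and the `recurring_errors` helper are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical evaluation records: (field, passed, reason_if_failed).
records = [
    ("topic_tag", False, "too broad"),
    ("topic_tag", False, "too broad"),
    ("meta_title", True, ""),
    ("industry_tag", False, "not in source content"),
    ("topic_tag", True, ""),
]

def recurring_errors(records, min_count=2):
    """Count failure reasons per field; keep reasons seen at least min_count times."""
    by_field = defaultdict(Counter)
    for field_name, passed, reason in records:
        if not passed:
            by_field[field_name][reason] += 1
    return {
        name: {r: n for r, n in counts.items() if n >= min_count}
        for name, counts in by_field.items()
        if any(n >= min_count for n in counts.values())
    }

result = recurring_errors(records)
print(result)
```

A one-off failure is dropped from the summary, so the output highlights only the patterns worth raising as feedback.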
TOOL OUTCOMES—WHAT THE HUMAN FEEDBACK LOOP PRODUCED
User survey and evaluation data from the Metadata Assistant rollout — team-level outcomes that the human evaluation workflow made possible.
100% of users would recommend the tool to another Red Hatter.
65% rated AI-generated metadata as excellent or good.
10-20 minutes saved per tagging task across the content team.