Patronus AI Launches Industry-First Multimodal LLM-as-a-Judge for Image Evaluation
E-commerce giant Etsy already leveraging technology to reduce AI hallucinations in product image captions
The new Judge-Image tool, powered by Google Gemini, allows AI engineers to iteratively measure and improve the quality of their multimodal AI applications by scanning for text presence, grid structure, spatial orientation, and object identification.
"Our mission has always been to advance scalable oversight of AI," said
The Judge-Image tool offers several out-of-box evaluation criteria, including:
- Caption hallucination detection (standard and strict)
- Primary and non-primary object description verification
- Object location accuracy
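As a rough illustration of how criteria like these might be applied across a batch of image-caption pairs, here is a minimal Python sketch. It does not use the Patronus API: the `JudgeResult` type, the criterion identifiers, and `stub_judge` are hypothetical stand-ins, and a real setup would replace the stub with a call to a multimodal judge.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class JudgeResult:
    criterion: str
    passed: bool
    explanation: str


# Hypothetical identifiers, named after the criteria listed above.
CRITERIA = [
    "caption-hallucination",
    "caption-hallucination-strict",
    "primary-object-description",
    "non-primary-object-description",
    "object-location",
]

# A judge takes (image_path, caption, criterion) and returns a verdict.
Judge = Callable[[str, str, str], JudgeResult]


def stub_judge(image_path: str, caption: str, criterion: str) -> JudgeResult:
    # Placeholder logic only: a real judge would send the image and caption
    # to a multimodal model. Here a caption fails if it mentions "unicorn".
    passed = "unicorn" not in caption.lower()
    return JudgeResult(criterion, passed, "stub decision")


def pass_rates(
    samples: List[Tuple[str, str]],
    judge: Judge,
    criteria: List[str] = CRITERIA,
) -> Dict[str, float]:
    """Return the fraction of samples that pass each criterion."""
    passed_counts = {c: 0 for c in criteria}
    for image_path, caption in samples:
        for c in criteria:
            if judge(image_path, caption, c).passed:
                passed_counts[c] += 1
    n = len(samples)
    return {c: (passed_counts[c] / n if n else 0.0) for c in criteria}


samples = [
    ("mug.jpg", "A blue ceramic mug"),
    ("mug2.jpg", "A unicorn holding a mug"),
]
rates = pass_rates(samples, stub_judge)
```

Tracking a pass rate per criterion, rather than one overall score, makes it easier to see which kind of caption error changes between iterations.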
Beyond validating image caption correctness, Judge-Image can test OCR extraction accuracy for tabular data, AI-generated brand asset accuracy, and scene description validity.
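For OCR extraction from tabular data, a deterministic baseline can be a useful companion to a judge-based score. The sketch below is not how Judge-Image works; it is a hypothetical cell-level accuracy check, assuming a hand-labeled reference table is available, and the function names and sample data are illustrative only.

```python
from typing import List

Table = List[List[str]]


def normalize(cell: str) -> str:
    # Collapse whitespace and ignore case so trivial differences do not count.
    return " ".join(cell.split()).lower()


def cell_accuracy(extracted: Table, reference: Table) -> float:
    """Fraction of reference cells reproduced at the same row and column."""
    total = sum(len(row) for row in reference)
    if total == 0:
        return 0.0
    correct = 0
    for r, ref_row in enumerate(reference):
        # Missing rows or cells in the extraction count as errors.
        ext_row = extracted[r] if r < len(extracted) else []
        for c, ref_cell in enumerate(ref_row):
            if c < len(ext_row) and normalize(ext_row[c]) == normalize(ref_cell):
                correct += 1
    return correct / total


reference = [["Item", "Price"], ["Mug", "$12.00"], ["Bowl", "$18.50"]]
extracted = [["Item", "Price"], ["Mug", "$12.00"], ["Bowl", "$13.50"]]
score = cell_accuracy(extracted, reference)  # one of six cells is wrong
```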
Prior research suggests that Google Gemini can serve as a more reliable multimodal LLM (MLLM) judge than alternatives such as OpenAI's GPT-4V, exhibiting less egocentricity and a more equitable approach to judgment. On Patronus AI's internal evaluation datasets, the Gemini backbone likewise outperformed the other multimodal LLMs tested.
Patronus AI plans to expand its multimodal evaluation capabilities to include audio and vision features in future releases.
Customer Use Case
Etsy, the leading technology marketplace for independent sellers, has already implemented Patronus AI's MLLM-as-a-Judge to detect and mitigate caption hallucinations in its product images. The Etsy AI team uses this tool and the broader Patronus platform to optimize its multimodal AI system.
For more information, visit the Patronus AI documentation at https://docs.patronus.ai/docs/multimodal_evals/base.
About Patronus AI
Patronus AI develops AI evaluation and optimization to help companies build top-tier AI products confidently. The company was founded by machine learning experts
View original content: https://www.prnewswire.com/news-releases/patronus-ai-launches-industry-first-multimodal-llm-as-a-judge-for-image-evaluation-302401213.html
SOURCE Patronus AI