Haorui Yu
Evaluating Vision-Language Models for Cultural Understanding in Heritage and Arts Contexts
My research focuses on developing rigorous evaluation frameworks for Vision-Language Models (VLMs) in culturally situated tasks, particularly within digital cultural heritage and arts domains. I investigate how these AI systems interpret, describe, and critique cultural imagery, identifying systematic failure modes related to cultural bias, contextual misunderstanding, and interpretive limitations.
My work has produced several key contributions:
- VULCA-Bench, a comprehensive benchmark for assessing VLM cultural understanding across diverse artistic traditions (under review at ACL 2026);
- a cross-cultural art critique evaluation framework examining how models engage with artworks from different cultural backgrounds;
- an ongoing tutorial survey on Intangible Cultural Heritage (ICH) that provides end-to-end engineering guidance, including data governance frameworks, system architecture patterns, and deployable evaluation checklists.
This research bridges computational approaches with humanities perspectives, aiming to make AI systems more culturally aware when deployed in heritage institutions, museums, and educational contexts. The evaluation methodologies I develop emphasise explanation quality, bias detection, and risk-sensitive failure-mode identification, supporting responsible AI deployment in cultural domains.
Supervisors:
- Primary Supervisor: Professor Natasha Lushetich
- Collaborator, NLP/Argumentation: Dr Ramon Ruiz-Dolz