EgoNormia

Can large models make normative decisions in physical-social embodied situations?

Verified Split: A high-quality subset of 200 videos with full agreement on the correct answers among 5 independent annotators.
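The selection rule for the verified split (keep an item only when all 5 annotators agree) can be sketched as a simple filter. This is an illustration under stated assumptions, not the benchmark's actual pipeline: the function name `unanimous_items`, the dictionary layout, and the clip ids are hypothetical.

```python
from typing import Dict, List


def unanimous_items(labels: Dict[str, List[str]], n_annotators: int = 5) -> List[str]:
    """Return ids of items that have exactly n_annotators labels, all identical."""
    kept = []
    for item_id, answers in labels.items():
        # Require a complete set of annotations and a single distinct answer.
        if len(answers) == n_annotators and len(set(answers)) == 1:
            kept.append(item_id)
    return kept


# Hypothetical annotations: one answer choice per annotator.
example = {
    "clip_a": ["B", "B", "B", "B", "B"],  # full agreement, kept
    "clip_b": ["B", "B", "C", "B", "B"],  # one dissent, dropped
    "clip_c": ["A", "A", "A", "A"],       # only 4 annotators, dropped
}

verified = unanimous_items(example)
print(verified)
```

Requiring exactly `n_annotators` labels, rather than at least that many, is a design choice made here so that incompletely annotated items never enter the subset.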

Input Modality Types:

  • Video: Models receive the video (frames sampled at 1 fps and concatenated into a single image) together with the questions
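The video preprocessing described above (sample at 1 fps, then merge the frames into one image) might look roughly like the following sketch. It uses nested lists in place of real image data so that it runs without external libraries. The names `sample_indices` and `concat_horizontally` are hypothetical, and the side-by-side layout is an assumption: the page does not say how the frames are arranged in the final image.

```python
from typing import List

Frame = List[List[int]]  # rows of pixel values (grayscale, for simplicity)


def sample_indices(n_frames: int, native_fps: float, target_fps: float = 1.0) -> List[int]:
    """Indices of the frames to keep so that about target_fps frames per second remain."""
    if target_fps <= 0 or target_fps > native_fps:
        raise ValueError("target_fps must be in (0, native_fps]")
    step = native_fps / target_fps
    indices = []
    k = 0
    while int(k * step) < n_frames:
        indices.append(int(k * step))
        k += 1
    return indices


def concat_horizontally(frames: List[Frame]) -> Frame:
    """Place equally tall frames side by side (assumed layout)."""
    if not frames:
        raise ValueError("need at least one frame")
    height = len(frames[0])
    if any(len(f) != height for f in frames):
        raise ValueError("all frames must have the same height")
    # Row r of the result is row r of every frame, joined left to right.
    return [[px for f in frames for px in f[r]] for r in range(height)]


# Hypothetical 3-second clip at 30 fps; each 2x2 frame is filled with its own index.
video = [[[i, i], [i, i]] for i in range(90)]
kept = sample_indices(len(video), native_fps=30.0)
image = concat_horizontally([video[i] for i in kept])
print(kept)
print(image)
```

With real video, the same two steps would typically be done with a decoding library and an image library; the index arithmetic stays the same.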

| Model | Organization | Modality | Both | Act | Jus | Sen | Date |
|---|---|---|---|---|---|---|---|
| 🥇 Gemini 2.5 Pro (05-06-2025 Preview) | Google | Video | 67.8 | 74.4 | 68.9 | 56.7 | 2025-05-20 |
| 🥈 Gemini 2.5 Flash (04-17-2025 Preview) | Google | Video | 58.4 | 69.7 | 59.6 | 58.9 | 2025-05-20 |
| 🥉 o4-mini | OpenAI | Video | 58.3 | 66.7 | 66.7 | 64.6 | 2025-05-20 |
| Gemini 2.0 Thinking | Google | Video | 50.0 | 70.6 | 50.0 | 56.1 | 2025-05-20 |
| Gemini 1.5 Pro | Google | Video | 49.0 | 56.5 | 50.5 | 61.8 | 2025-05-20 |
| Gemini 1.5 Flash | Google | Video | 48.0 | 53.0 | 50.5 | 56.8 | 2025-05-20 |
| Qwen2.5 VL (72B) | Alibaba | Video | 47.0 | 57.5 | 48.0 | 68.2 | 2025-05-20 |
| GPT-4.1 | OpenAI | Video | 46.4 | 50.0 | 50.0 | 57.7 | 2025-05-20 |
| GPT-4o | OpenAI | Video | 45.5 | 53.0 | 50.0 | 62.7 | 2025-05-20 |
| QWQ-32B | Alibaba | Video | 37.5 | 37.5 | 37.5 | 39.6 | 2025-05-20 |
| Claude 3.7 Sonnet | Anthropic | Video | 33.3 | 40.0 | 41.7 | 40.8 | 2025-05-20 |
| Claude 3.5 Sonnet | Anthropic | Video | 22.7 | 27.3 | 27.3 | 47.7 | 2025-05-20 |
| InternVL 2.5 | Shanghai AI Lab | Video | 13.0 | 16.5 | 15.0 | 52.1 | 2025-05-20 |
| Llama 3.2 | Meta | Video | 4.0 | 18.0 | 10.5 | 55.6 | 2025-05-20 |