Model | LongVideoBench | MLVU | TimeScope | LongTimeScope | Tomato | VSIBench |
---|---|---|---|---|---|---|
Qwen2.5-VL-7B | 58.7 | 51.2 | 81.0 | 40.7 | 22.6 | 29.7 |
Dense-SFT | 57.8 (-1.5%) | 51.2 (+0.0%) | 76.8 (-5.2%) | 40.2 (-1.2%) | 21.7 (-4.0%) | 30.6 (+2.1%) |
Dense-NSA | 56.1 (-4.4%) | 51.6 (+0.8%) | 83.0 (+2.5%) | 40.9 (+0.5%) | 23.4 (+3.5%) | 33.1 (+10.7%) |
VideoNSA | 59.4 (+1.1%) | 51.8 (+1.2%) | 82.7 (+2.1%) | 44.4 (+9.1%) | 26.2 (+15.9%) | 36.1 (+20.3%) |
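
The parenthesized percentages appear to be relative changes with respect to the Qwen2.5-VL-7B row. The sketch below shows that calculation under the assumed formula (score - baseline) / baseline × 100. The helper name `relative_change` is illustrative and does not come from the source. Since the table reports scores rounded to one decimal place, values recomputed this way may differ slightly from the printed ones.

```python
def relative_change(score: float, baseline: float) -> float:
    """Percent change of `score` relative to `baseline` (assumed formula)."""
    return (score - baseline) / baseline * 100.0


# Scores copied from the table: Qwen2.5-VL-7B (baseline) vs. VideoNSA.
baseline = {"LongTimeScope": 40.7, "Tomato": 22.6}
videonsa = {"LongTimeScope": 44.4, "Tomato": 26.2}

for bench, base in baseline.items():
    # Format with an explicit sign and one decimal, matching the table style.
    print(f"{bench}: {relative_change(videonsa[bench], base):+.1f}%")
# LongTimeScope: +9.1%
# Tomato: +15.9%
```

The same function applies to any row: pass that row's score and the baseline score from the same column.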