Dongxu Li

Papers: 11

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

GTA1: GUI Test-time Scaling Agent

arXiv 2025

2025

ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks

arXiv 2025

2025

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions

arXiv 2024

2024

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

CVPR 2025

2024

Aria: An Open Multimodal Native Mixture-of-Experts Model

arXiv 2024

2024

Aria-UI: Visual Grounding for GUI Instructions

arXiv 2024

2024

LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding

arXiv 2024

2024

cosFormer: Rethinking Softmax in Attention

2022

The Devil in Linear Transformer

arXiv 2022

2022

Align and Prompt: Video-and-Language Pre-training with Entity Prompts

CVPR 2022

2021

Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison

arXiv 2019

2019

Affiliations

No known affiliations.

Frequent co-authors

from 11 papers

Junnan Li

Bei Chen

HaoNing Wu

Ziyang Luo

Caiming Xiong

Hongdong Li

Lingpeng Kong

Liyuan Pan

Weixuan Sun

Yan Yang