Guangtao Zhai

Papers
35


Attribution

Affiliations & profile
Semantic Scholar

Authored papers

35

Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

arXiv 2025

2025

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

arXiv 2025

2025

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

arXiv 2025

2025

Teaching LMMs for Image Quality Scoring and Interpreting

arXiv 2025

2025

LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs

ICCV 2025

2025

Redundancy Principles for MLLMs Benchmarks

arXiv 2025

2025

Omni$^2$: Unifying Omnidirectional Image Generation and Editing in an Omni Model

arXiv 2025

2025

FVQ: A Large-Scale Dataset and A LMM-based Method for Face Video Quality Assessment

arXiv 2025

2025

PriceSeer: Evaluating Large Language Models in Real-Time Stock Prediction

arXiv 2025

2025

G$^2$RPO: Granular GRPO for Precise Reward in Flow Models

arXiv 2025

2025

Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content

CVPR 2025 1

2025

R-Bench: Are your Large Multimodal Model Robust to Real-world Corruptions?

arXiv 2024

2024

AIM 2024 Challenge on Video Saliency Prediction: Methods and Results

arXiv 2024

2024

Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey and Benchmark

arXiv 2024

2024

Q-Bench+: A Benchmark for Multi-modal Foundation Models on Low-level Vision from Single Images to Pairs

arXiv 2024

2024

A-Bench: Are LMMs Masters at Evaluating AI-generated Images?

arXiv 2024

2024

Dual-Branch Network for Portrait Image Quality Assessment

arXiv 2024

2024

ASCIIEval: Benchmarking Models' Visual Perception in Text Strings via ASCII Art

arXiv 2024

2024

LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models

arXiv 2024

2024

AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results

arXiv 2024

2024

Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model

arXiv 2024

2024

Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric

arXiv 2024

2024

VQA$^2$: Visual Question Answering for Video Quality Assessment

arXiv 2024

2024

On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection

arXiv 2024

2024

Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

arXiv 2024

2024

GAIA: Rethinking Action Quality Assessment for AI-Generated Videos

arXiv 2024

2024

THQA: A Perceptual Quality Assessment Database for Talking Heads

arXiv 2024

2024

CMC-Bench: Towards a New Paradigm of Visual Signal Compression

arXiv 2024

2024

Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

arXiv 2024

2024

Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

arXiv 2023

2023

AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment

arXiv 2023

2023

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

CVPR 2024 1

2023

AccFlow: Backward Accumulation for Long-Range Optical Flow

ICCV 2023 1

2023

AIGCIQA2023: A Large-scale Image Quality Assessment Database for AI Generated Images: from the Perspectives of Quality, Authenticity and Correspondence

arXiv 2023

2023

Exploring the Naturalness of AI-Generated Images

arXiv 2023

2023

Affiliations

No known affiliations.

Frequent co-authors

10

from 35 papers