Wengang Zhou

Papers
23

Authored papers

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

arXiv 2025

2025

Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning

arXiv 2025

2025

Uni-Sign: Toward Unified Sign Language Understanding at Scale

arXiv 2025

2025

Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

arXiv 2025

2025

Robust Multimodal Large Language Models Against Modality Conflict

arXiv 2025

2025

ROOT: VLM based System for Indoor Scene Understanding and Beyond

arXiv 2024

2024

BoolQuestions: Does Dense Retrieval Understand Boolean Logic in Language?

arXiv 2024

2024

Sinkhorn Distance Minimization for Knowledge Distillation

arXiv 2024

2024

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding

arXiv 2024

2024

DeepEraser: Deep Iterative Context Mining for Generic Text Eraser

arXiv 2024

2024

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

arXiv 2024

2024

TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding

arXiv 2024

2024

EG4D: Explicit Generation of 4D Object without Score Distillation

arXiv 2024

2024

Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning

arXiv 2024

2024

DIRE for Diffusion-Generated Image Detection

ICCV 2023

2023

Hybrid and Collaborative Passage Reranking

arXiv 2023

2023

Masked Motion Predictors are Strong 3D Action Representation Learners

ICCV 2023

2023

Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs

arXiv 2023

2023

Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection

ICCV 2023

2023

Semantic Image Synthesis via Diffusion Models

arXiv 2022

2022

Geometric Representation Learning for Document Image Rectification

arXiv 2022

2022

DocScanner: Robust Document Image Rectification with Progressive Learning

arXiv 2021

2021

Uformer: A General U-Shaped Transformer for Image Restoration

CVPR 2022

2021

Affiliations

No known affiliations.

Frequent co-authors

10, from 23 papers