Shitian Zhao

Papers: 12


Attribution

Affiliations & profile: Semantic Scholar



Authored papers

PyVision-RL: Forging Open Agentic Vision Models via RL

arXiv 2026

Sekai: A Video Dataset towards World Exploration

arXiv 2025

Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning

arXiv 2025

OmniCaptioner: One Captioner to Rule Them All

arXiv 2025

PyVision: Agentic Vision with Dynamic Tooling

arXiv 2025

TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning

arXiv 2025

IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models

arXiv 2025

LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis

arXiv 2025

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models

arXiv 2024

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

arXiv 2024

Unleashing the Potentials of Likelihood Composition for Multi-modal Language Models

arXiv 2024

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

arXiv 2024

Affiliations

No known affiliations.

Frequent co-authors

from 12 papers

Kaipeng Zhang

Peng Gao

Ming Li

Hongsheng Li

Yu Qiao

Dongyang Liu

Haoquan Zhang

Le Zhuo

Renrui Zhang

Shaoheng Lin