Yiyuan Zhang

Papers: 12

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Seed1.5-VL Technical Report

arXiv 2025

2025

Native-Resolution Image Synthesis

arXiv 2025

2025

Multimodal Long Video Modeling Based on Temporal Dynamic Context

arXiv 2025

2025

OneThinker: All-in-one Reasoning Model for Image and Video

arXiv 2025

2025

Transition Models: Rethinking the Generative Learning Objective

arXiv 2025

2025

Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

CVPR 2024 1

2024

Explore the Limits of Omni-modal Pretraining at Scale

arXiv 2024

2024

Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines

arXiv 2024

2024

Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations

arXiv 2024

2024

InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions

arXiv 2024

2024

OneLLM: One Framework to Align All Modalities with Language

CVPR 2024 1

2023

Meta-Transformer: A Unified Framework for Multimodal Learning

arXiv 2023

2023

Affiliations

No known affiliations.

Frequent co-authors

from 12 papers

Xiangyu Yue

Xiaohan Ding

Kaixiong Gong

Wanli Ouyang

Jiaming Han

Kaipeng Zhang

Lei Bai

Yu Qiao

Zhixin Zhang

Zidong Wang