Wenhui Wang
10 papers
Authored papers
VibeVoice Technical Report
arXiv 2025
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
arXiv 2025
BitNet Distillation
arXiv 2025
Multimodal Latent Language Modeling with Next-Token Diffusion
arXiv 2024
You Only Cache Once: Decoder-Decoder Architectures for Language Models
arXiv 2024
Multi-Head Mixture-of-Experts
arXiv 2024
Kosmos-2: Grounding Multimodal Large Language Models to the World
arXiv 2023
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
arXiv 2022
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
arXiv 2021
Distilled Dual-Encoder Model for Vision-Language Understanding
arXiv 2021
Affiliations
No known affiliations.