Xinyu Zhou

Papers
25

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

25

Attention Residuals

arXiv 2026

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

arXiv 2026

GigaWorld-Policy: An Efficient Action-Centered World–Action Model

arXiv 2026

WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models

arXiv 2026

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

arXiv 2026

Towards Pixel-Level VLM Perception via Simple Points Prediction

arXiv 2026

Kimi K2.5: Visual Agentic Intelligence

arXiv 2026

YuE: Scaling Open Foundation Models for Long-Form Music Generation

arXiv 2025

Muon is Scalable for LLM Training

arXiv 2025

Kimi-Audio Technical Report

arXiv 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

arXiv 2025

Kimi-VL Technical Report

arXiv 2025

Kimi k1.5: Scaling Reinforcement Learning with LLMs

arXiv 2025

G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning

arXiv 2025

MoonCast: High-Quality Zero-Shot Podcast Generation

arXiv 2025

VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

arXiv 2025

Kimi Linear: An Expressive, Efficient Attention Architecture

arXiv 2025

The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation

ICCV 2025

LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters

arXiv 2024

Me LLaMA: Foundation Large Language Models for Medical Applications

arXiv 2024

TagOOD: A Novel Approach to Out-of-Distribution Detection via Vision-Language Representations and Class Center Learning

arXiv 2024

NBMOD: Find It and Grasp It in Noisy Background

arXiv 2023

Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model

arXiv 2023

Few shot font generation via transferring similarity guided global style and quantization local style

ICCV 2023

DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients

arXiv 2016

Affiliations

No known affiliations.

Frequent co-authors

10

from 25 papers