Xinyu Zhang
- Papers: 42
Authored papers
Qwen3-TTS Technical Report
arXiv 2026
Qwen3-ASR Technical Report
arXiv 2026
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
arXiv 2026
Learning Visual Feature-Based World Models via Residual Latent Action
arXiv 2026
LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
arXiv 2026
Spectral Condition for μP under Width-Depth Scaling
arXiv 2026
Qwen3-Omni Technical Report
arXiv 2025
Qwen3 Technical Report
preprint
MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving
arXiv 2025
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
arXiv 2025
Scaling Diffusion Transformers Efficiently via μP
arXiv 2025
Motion Blender Gaussian Splatting for Dynamic Scene Reconstruction
arXiv 2025
Beyond the Surface: Measuring Self-Preference in LLM Judgments
arXiv 2025
Qwen3Guard Technical Report
arXiv 2025
Co-MTP: A Cooperative Trajectory Prediction Framework with Multi-Temporal Fusion for Autonomous Driving
arXiv 2025
Qwen2 Technical Report
arXiv 2024
V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception
arXiv 2024
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
arXiv 2024
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture
arXiv 2024
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection
arXiv 2024
Evaluation of Text-to-Video Generation Models: A Dynamics Perspective
arXiv 2024
Security Attacks on LLM-based Code Completion Tools
arXiv 2024
EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding
arXiv 2024
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
arXiv 2024
A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers
arXiv 2024
Backdoor Contrastive Learning via Bi-level Trigger Optimization
arXiv 2024
Detect Everything with Few Examples
arXiv 2023
HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception
Lenna: Language Enhanced Reasoning Detection Assistant
arXiv 2023
Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face
arXiv 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration
arXiv 2023
Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models
arXiv 2023
What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations
arXiv 2023
HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution
arXiv 2023
Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification
EFLNet: Enhancing Feature Learning for Infrared Small Target Detection
arXiv 2023
"Knowing When You Don't Know": A Multilingual Relevance Assessment Dataset for Robust Retrieval-Augmented Generation
arXiv 2023
Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages
arXiv 2022
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
NAACL 2022
Contrastive Learning of User Behavior Sequence for Context-Aware Document Ranking
arXiv 2021
Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need
arXiv 2021
Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval
EMNLP (MRL) 2021
Frequent co-authors
10 from 42 papers