Xinyu Zhang

Papers
42

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

42

Qwen3-TTS Technical Report

arXiv 2026

2026

Qwen3-ASR Technical Report

arXiv 2026

2026

GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

arXiv 2026

2026

Learning Visual Feature-Based World Models via Residual Latent Action

arXiv 2026

2026

LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models

arXiv 2026

2026

Spectral Condition for μP under Width-Depth Scaling

arXiv 2026

2026

Qwen3-Omni Technical Report

arXiv 2025

2025

Qwen3 Technical Report

preprint

2025

MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving

arXiv 2025

2025

SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL

arXiv 2025

2025

Scaling Diffusion Transformers Efficiently via μP

arXiv 2025

2025

Motion Blender Gaussian Splatting for Dynamic Scene Reconstruction

arXiv 2025

2025

Beyond the Surface: Measuring Self-Preference in LLM Judgments

arXiv 2025

2025

Qwen3Guard Technical Report

arXiv 2025

2025

Co-MTP: A Cooperative Trajectory Prediction Framework with Multi-Temporal Fusion for Autonomous Driving

arXiv 2025

2025

Qwen2 Technical Report

arXiv 2024

2024

V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception

arXiv 2024

2024

MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

arXiv 2024

2024

FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture

arXiv 2024

2024

LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection

arXiv 2024

2024

Evaluation of Text-to-Video Generation Models: A Dynamics Perspective

arXiv 2024

2024

Security Attacks on LLM-based Code Completion Tools

arXiv 2024

2024

EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding

arXiv 2024

2024

LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection

arXiv 2024

2024

A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers

arXiv 2024

2024

Backdoor Contrastive Learning via Bi-level Trigger Optimization

arXiv 2024

2024

Detect Everything with Few Examples

arXiv 2023

2023

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception

2023

Lenna: Language Enhanced Reasoning Detection Assistant

arXiv 2023

2023

Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face

arXiv 2023

2023

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration

arXiv 2023

2023

Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models

arXiv 2023

2023

What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations

arXiv 2023

2023

HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution

arXiv 2023

2023

Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification

2023

EFLNet: Enhancing Feature Learning for Infrared Small Target Detection

arXiv 2023

2023

"Knowing When You Don't Know": A Multilingual Relevance Assessment Dataset for Robust Retrieval-Augmented Generation

arXiv 2023

2023

Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages

arXiv 2022

2022

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

NAACL 2022

2021

Contrastive Learning of User Behavior Sequence for Context-Aware Document Ranking

arXiv 2021

2021

Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need

arXiv 2021

2021

Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval

EMNLP (MRL) 2021

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 42 papers