Shuai Wang

Papers
39

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

39

SAM3-LiteText: An Anatomical Study of the SAM3 Text Encoder for Efficient Vision-Language Segmentation

arXiv 2026

2026

Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens

arXiv 2026

2026

Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making

arXiv 2026

2026

DiffRetriever: Parallel Representative Tokens for Retrieval with Diffusion Language Models

arXiv 2026

2026

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

arXiv 2026

2026

DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion

arXiv 2025

2025

LeVo: High-Quality Song Generation with Multi-Preference Alignment

arXiv 2025

2025

SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement

arXiv 2025

2025

DDT: Decoupled Diffusion Transformer

2025

OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale

arXiv 2025

2025

SoK: Evaluating Jailbreak Guardrails for Large Language Models

arXiv 2025

2025

ACEBench: Who Wins the Match Point in Tool Learning?

arXiv 2025

2025

SocialEval: Evaluating Social Intelligence of Large Language Models

arXiv 2025

2025

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

arXiv 2025

2025

Flowing Backwards: Improving Normalizing Flows via Reverse Representation Alignment

arXiv 2025

2025

ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization

arXiv 2025

2025

PixNerd: Pixel Neural Field Diffusion

arXiv 2025

2025

LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing

arXiv 2025

2025

Differentiable Solver Search for Fast Diffusion Sampling

arXiv 2025

2025

HiconAgent: History Context-aware Policy Optimization for GUI Agents

arXiv 2025

2025

Advances in Speech Separation: Techniques, Challenges, and Future Trends

arXiv 2025

2025

MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation

arXiv 2025

2025

GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents

arXiv 2025

2025

DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging

arXiv 2025

2025

CHOP: Mobile Operating Assistant with Constrained High-frequency Optimized Subtask Planning

arXiv 2025

2025

NAMET: Robust Massive Model Editing via Noise-Aware Memory Optimization

arXiv 2025

2025

DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction

arXiv 2025

2025

Symbolic Learning Enables Self-Evolving Agents

arXiv 2024

2024

SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation

arXiv 2024

2024

AI PERSONA: Towards Life-long Personalization of LLMs

arXiv 2024

2024

StepTool: A Step-grained Reinforcement Learning Framework for Tool Learning in LLMs

arXiv 2024

2024

WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark

arXiv 2024

2024

OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models

arXiv 2024

2024

Starbucks: Improved Training for 2D Matryoshka Embeddings

arXiv 2024

2024

Tackling Data Heterogeneity in Federated Learning via Loss Decomposition

arXiv 2024

2024

Deep Equilibrium Object Detection

ICCV 2023

2023

Parsing is All You Need for Accurate Gait Recognition in the Wild

arXiv 2023

2023

Towards Open-Vocabulary Video Instance Segmentation

ICCV 2023

2023

MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 39 papers