Xin Wang

Papers
50

Authored papers

50

A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5

arXiv 2026

2026

AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer

arXiv 2026

2026

AIDABench: AI Data Analytics Benchmark

arXiv 2026

2026

OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

arXiv 2026

2026

ShowUI-Aloha: Human-Taught GUI Agent

arXiv 2026

2026

QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models

arXiv 2026

2026

MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

arXiv 2026

2026

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

arXiv 2025

2025

Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs

arXiv 2025

2025

Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning

arXiv 2025

2025

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

arXiv 2025

2025

Safety at Scale: A Comprehensive Survey of Large Model Safety

arXiv 2025

2025

Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards

CVPR 2025

2025

RLMiniStyler: Light-weight RL Style Agent for Arbitrary Sequential Neural Style Generation

arXiv 2025

2025

EmbRACE-3K: Embodied Reasoning and Action in Complex Environments

arXiv 2025

2025

V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models

arXiv 2025

2025

BackdoorVLM: A Benchmark for Backdoor Attacks on Vision-Language Models

arXiv 2025

2025

ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning

arXiv 2025

2025

Technical Report of TeleChat2, TeleChat2.5 and T1

arXiv 2025

2025

Post-training for Deepfake Speech Detection

arXiv 2025

2025

SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution

arXiv 2025

2025

Robust AI-Generated Face Detection with Imbalanced Data

arXiv 2025

2025

Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

arXiv 2024

2024

VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding

arXiv 2024

2024

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

arXiv 2024

2024

When Do We Not Need Larger Vision Models?

arXiv 2024

2024

ID-Animator: Zero-Shot Identity-Preserving Human Video Generation

arXiv 2024

2024

AUITestAgent: Automatic Requirements Oriented GUI Function Testing

arXiv 2024

2024

BurstGPT: A Real-world Workload Dataset to Optimize LLM Serving Systems

arXiv 2024

2024

MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations

arXiv 2024

2024

Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models

arXiv 2024

2024

UU-Mamba: Uncertainty-aware U-Mamba for Cardiovascular Segmentation

arXiv 2024

2024

A Multimodal Knowledge-enhanced Whole-slide Pathology Foundation Model

arXiv 2024

2024

ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals

arXiv 2024

2024

Towards Multi-Source Retrieval-Augmented Generation via Synergizing Reasoning and Preference-Driven Retrieval

arXiv 2024

2024

NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls

arXiv 2024

2024

Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles

arXiv 2024

2024

Texture, Shape and Order Matter: A New Transformer Design for Sequential DeepFake Detection

arXiv 2024

2024

MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation

arXiv 2024

2024

Efficient Large Language Models: A Survey

arXiv 2023

2023

DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation

arXiv 2023

2023

Understanding Zero-Shot Adversarial Robustness for Large-Scale Models

arXiv 2022

2022

Doubly Right Object Recognition: A Why Prompt for Visual Rationales

CVPR 2023

2022

PanGu-Coder: Program Synthesis with Function-Level Language Modeling

arXiv 2022

2022

Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances

arXiv 2021

2021

Automated Machine Learning on Graphs: A Survey

arXiv 2021

2021

VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images

arXiv 2020

2020

Self-Supervised Learning for Contextualized Extractive Summarization

2019

BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

2018

SkipNet: Learning Dynamic Routing in Convolutional Networks

2017

Affiliations

No known affiliations.

Frequent co-authors

10

from 50 papers