Jian Wang
- Papers: 34
Authored papers (34)
HandX: Scaling Bimanual Motion and Interaction Generation
arXiv 2026
Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models
arXiv 2026
ClinAlign: Scaling Healthcare Alignment from Clinician Preference
arXiv 2026
LiveClin: A Live Clinical Benchmark without Leakage
arXiv 2026
Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning
arXiv 2025
Seed1.5-VL Technical Report
arXiv 2025
SPA-RL: Reinforcing LLM Agents via Stepwise Progress Attribution
arXiv 2025
Evolving Diagnostic Agents in a Virtual Clinical Environment
arXiv 2025
GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning
arXiv 2025
Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation
arXiv 2025
Latent Chain-of-Thought for Visual Reasoning
arXiv 2025
4KAgent: Agentic Any Image to 4K Super-Resolution
arXiv 2025
EHR-R1: A Reasoning-Enhanced Foundational Language Model for Electronic Health Record Analysis
arXiv 2025
STeCa: Step-level Trajectory Calibration for LLM Agent Learning
arXiv 2025
KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception
CVPR 2025
Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors
arXiv 2025
RobustSAM: Segment Anything Robustly on Degraded Images
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection
arXiv 2024
ControlNeXt: Powerful and Efficient Control for Image and Video Generation
arXiv 2024
M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining
arXiv 2024
SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding
arXiv 2024
POA: Pre-training Once for Models of All Sizes
arXiv 2024
Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue
arXiv 2024
Self-Detoxifying Language Models via Toxification Reversal
arXiv 2023
HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception
Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation
arXiv 2023
Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation
ICCV 2023
Taiyi: A Bilingual Fine-Tuned Large Language Model for Diverse Biomedical Tasks
arXiv 2023
Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification
Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue
arXiv 2023
Medical Dialogue Generation via Dual Flow Modeling
arXiv 2023
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
ICCV 2023
Follow Me: Conversation Planning for Target-driven Recommendation Dialogue Systems
arXiv 2022
Improving Knowledge-aware Dialogue Generation via Knowledge Base Question Answering
arXiv 2019
Affiliations
Frequent co-authors
10 from 34 papers