Yanfeng Wang

Authored papers

59

Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

arXiv 2026

2026

EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale

arXiv 2026

2026

Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach

arXiv 2026

2026

AgentEHR: Advancing Autonomous Clinical Decision-Making via Retrospective Summarization

arXiv 2026

2026

SpotSound: Enhancing Large Audio-Language Models with Fine-Grained Temporal Grounding

arXiv 2026

2026

Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

arXiv 2026

2026

Multi-Agent System for Comprehensive Soccer Understanding

arXiv 2025

2025

ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification

arXiv 2025

2025

SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding

arXiv 2025

2025

One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution

arXiv 2025

2025

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

arXiv 2025

2025

Evolving Diagnostic Agents in a Virtual Clinical Environment

arXiv 2025

2025

AWorld: Orchestrating the Training Recipe for Agentic AI

arXiv 2025

2025

SciMaster: Towards General-Purpose Scientific AI Agents, Part I. X-Master as Foundation: Can We Lead on Humanity's Last Exam?

arXiv 2025

2025

EHR-R1: A Reasoning-Enhanced Foundational Language Model for Electronic Health Record Analysis

arXiv 2025

2025

End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning

arXiv 2025

2025

Rethinking Whole-Body CT Image Interpretation: An Abnormality-Centric Approach

arXiv 2025

2025

RARE: Retrieval-Augmented Reasoning Modeling

arXiv 2025

2025

VocalNet: Speech LLM with Multi-Token Prediction for Faster and High-Quality Generation

arXiv 2025

2025

MedS³: Towards Medical Small Language Models with Self-Evolved Slow Thinking

arXiv 2025

2025

WanJuanSiLu: A High-Quality Open-Source Webtext Dataset for Low-Resource Languages

arXiv 2025

2025

FedMABench: Benchmarking Mobile Agents on Decentralized Heterogeneous User Data

arXiv 2025

2025

VocalBench: Benchmarking the Vocal Conversational Abilities for Speech Interaction Models

arXiv 2025

2025

MatchTime: Towards Automatic Soccer Game Commentary Generation

arXiv 2024

2024

RaTEScore: A Metric for Radiology Report Generation

arXiv 2024

2024

Towards Universal Soccer Video Understanding

CVPR 2025 1

2024

LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant

CVPR 2025 1

2024

Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents

CVPR 2024 1

2024

MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

arXiv 2024

2024

An Extensible Framework for Open Heterogeneous Collaborative Perception

arXiv 2024

2024

Towards Evaluating and Building Versatile Large Language Models for Medicine

arXiv 2024

2024

MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation

arXiv 2024

2024

Underwater Camouflaged Object Tracking Meets Vision-Language SAM2

arXiv 2024

2024

MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts

arXiv 2024

2024

A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis

arXiv 2024

2024

Towards Building Multilingual Language Model for Medicine

arXiv 2024

2024

ReMamber: Referring Image Segmentation with Mamba Twister

arXiv 2024

2024

MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities

ICCV 2025

2024

Low-Rank Knowledge Decomposition for Medical Foundation Models

CVPR 2024 1

2024

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

arXiv 2024

2024

CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios

arXiv 2024

2024

Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models

arXiv 2024

2024

MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception

arXiv 2024

2024

Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data

arXiv 2023

2023

PMC-LLaMA: Towards Building Open-source Language Models for Medicine

arXiv 2023

2023

PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering

arXiv 2023

2023

One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts

arXiv 2023

2023

DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration

CVPR 2023 1

2023

FedDisco: Federated Learning with Discrepancy-Aware Collaboration

arXiv 2023

2023

Zero-shot Composed Text-Image Retrieval

arXiv 2023

2023

AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation

2023

Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models

arXiv 2023

2023

PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents

arXiv 2023

2023

LibriSQA: A Novel Dataset and Framework for Spoken Question Answering with Large Language Models

arXiv 2023

2023

Auxiliary Tasks Benefit 3D Skeleton-based Human Motion Prediction

ICCV 2023 1

2023

Joint-Relation Transformer for Multi-Person Motion Prediction

ICCV 2023 1

2023

Boost Video Frame Interpolation via Motion Adaptation

arXiv 2023

2023

K-Space Transformer for Undersampled MRI Reconstruction

arXiv 2022

2022

Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 59 papers