Yuyin Zhou
- Papers: 33
Authored papers: 33
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration
arXiv 2026
ClinSeekAgent: Automating Multimodal Evidence Seeking for Agentic Clinical Reasoning
arXiv 2026
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild
arXiv 2026
From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models
arXiv 2026
ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents
arXiv 2026
Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development
arXiv 2026
VecGlypher: Unified Vector Glyph Generation with Language Models
arXiv 2026
VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation
arXiv 2026
Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows
arXiv 2026
Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw
arXiv 2026
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
arXiv 2025
$\texttt{Complex-Edit}$: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark
arXiv 2025
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
arXiv 2025
Discrete Diffusion Models with MLLMs for Unified Medical Multimodal Generation
arXiv 2025
Exploring the Vulnerabilities of Federated Learning: A Deep Dive into Gradient Inversion Attacks
arXiv 2025
m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models
arXiv 2025
A Survey on Latent Reasoning
arXiv 2025
AHELM: A Holistic Evaluation of Audio-Language Models
arXiv 2025
MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs
arXiv 2025
OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning
arXiv 2025
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
arXiv 2025
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
arXiv 2025
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization
arXiv 2024
A New Federated Learning Framework Against Gradient Inversion Attacks
arXiv 2024
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
arXiv 2024
What If We Recaption Billions of Web Images with LLaMA-3?
arXiv 2024
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
arXiv 2024
Tackling Data Heterogeneity in Federated Learning via Loss Decomposition
arXiv 2024
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks
arXiv 2023
Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
arXiv 2022
Masked Autoencoders Enable Efficient Knowledge Distillers
CVPR 2023
Unleashing the Power of Visual Prompting At the Pixel Level
arXiv 2022
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
arXiv 2021
Frequent co-authors
10 from 33 papers