Peng Xia

Papers
24

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

24

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

arXiv 2026

SimpleMem: Efficient Lifelong Memory for LLM Agents

arXiv 2026

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

arXiv 2026

AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

arXiv 2026

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

arXiv 2026

ClawArena: Benchmarking AI Agents in Evolving Information Environments

arXiv 2026

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding

arXiv 2025

ChemMLLM: Chemical Multimodal Large Language Model

arXiv 2025

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

arXiv 2025

Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

arXiv 2025

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

arXiv 2025

Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch

arXiv 2025

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

arXiv 2025

WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

arXiv 2025

Multiplayer Nash Preference Optimization

arXiv 2025

RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

arXiv 2024

MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models

arXiv 2024

MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization

arXiv 2024

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

arXiv 2024

Generalizing to Unseen Domains in Diabetic Retinopathy with Disentangled Representations

arXiv 2024

CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

arXiv 2024

OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding

arXiv 2024

LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition

arXiv 2023

HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding

2023

Affiliations

No known affiliations.

Frequent co-authors

10

from 24 papers