Cihang Xie

Authored papers

34

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

arXiv 2026

2026

ClinSeekAgent: Automating Multimodal Evidence Seeking for Agentic Clinical Reasoning

arXiv 2026

2026

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

arXiv 2026

2026

SimpleMem: Efficient Lifelong Memory for LLM Agents

arXiv 2026

2026

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

arXiv 2026

2026

From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models

arXiv 2026

2026

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

arXiv 2026

2026

In-Context Reinforcement Learning for Tool Use in Large Language Models

arXiv 2026

2026

Target-Oriented Pretraining Data Selection via Neuron-Activated Graph

arXiv 2026

2026

ClawArena: Benchmarking AI Agents in Evolving Information Environments

arXiv 2026

2026

Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows

arXiv 2026

2026

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

arXiv 2026

2026

VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation

arXiv 2026

2026

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

ICCV 2025

2025

Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark

arXiv 2025

2025

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

arXiv 2025

2025

AHELM: A Holistic Evaluation of Audio-Language Models

arXiv 2025

2025

Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails

arXiv 2025

2025

Safety at Scale: A Comprehensive Survey of Large Model Safety

arXiv 2025

2025

SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards

arXiv 2025

2025

Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More

arXiv 2025

2025

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

arXiv 2025

2025

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

arXiv 2024

2024

What If We Recaption Billions of Web Images with LLaMA-3?

arXiv 2024

2024

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

arXiv 2024

2024

Autoregressive Pretraining with Mamba in Vision

arXiv 2024

2024

M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation

arXiv 2024

2024

CLIPA-v2: Scaling CLIP Training with 81.1% Zero-shot ImageNet Accuracy within a $10,000 Budget; An Extra $4,000 Unlocks 81.8% Accuracy

arXiv 2023

2023

Rejuvenating image-GPT as Strong Visual Representation Learners

arXiv 2023

2023

How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs

arXiv 2023

2023

Unleashing the Power of Visual Prompting At the Pixel Level

arXiv 2022

2022

Masked Autoencoders Enable Efficient Knowledge Distillers

CVPR 2023

2022

iBOT: Image BERT Pre-Training with Online Tokenizer

arXiv 2021

2021

Adversarial Attacks and Defences Competition

arXiv 2018

2018

Affiliations

No known affiliations.
