Huan Sun

Papers
37

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

37

QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks

arXiv 2026

2026

LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

arXiv 2026

2026

Emergent Social Intelligence Risks in Generative Multi-Agent Systems

arXiv 2026

2026

When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents

arXiv 2026

2026

Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation

arXiv 2026

2026

When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents

arXiv 2026

2026

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

arXiv 2025

2025

An Illusion of Progress? Assessing the Current State of Web Agents

arXiv 2025

2025

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

arXiv 2025

2025

RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

arXiv 2025

2025

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

arXiv 2025

2025

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

arXiv 2025

2025

Beyond Clicking: A Step Towards Generalist GUI Grounding via Text Dragging

arXiv 2025

2025

Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure

arXiv 2025

2025

Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

arXiv 2024

2024

AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs

arXiv 2024

2024

GPT-4V(ision) is a Generalist Web Agent, if Grounded

arXiv 2024

2024

ChemToolAgent: The Impact of Tools on Language Agents for Chemistry Problem Solving

arXiv 2024

2024

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

arXiv 2024

2024

EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

arXiv 2024

2024

Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents

arXiv 2024

2024

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

arXiv 2024

2024

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs

arXiv 2024

2024

eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data

arXiv 2024

2024

When is Tree Search Useful for LLM Planning? It Depends on the Discriminator

arXiv 2024

2024

A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

arXiv 2024

2024

AttributionBench: How Hard is Automatic Attribution Evaluation?

arXiv 2024

2024

Mind2Web: Towards a Generalist Agent for the Web

2023

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

CVPR 2024

2023

AgentBench: Evaluating LLMs as Agents

arXiv 2023

2023

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

NeurIPS 2023

2023

Biomedical Language Models are Robust to Sub-optimal Tokenization

arXiv 2023

2023

Automatic Evaluation of Attribution by Large Language Models

arXiv 2023

2023

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

arXiv 2022

2022

Iteratively Prompt Pre-trained Language Models for Chain of Thought

arXiv 2022

2022

TURL: Table Understanding through Representation Learning

arXiv 2020

2020

StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow

arXiv 2018

2018

Affiliations

No known affiliations.

Frequent co-authors

10

from 37 papers