Yu Su

Papers
46

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

46

QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks

arXiv 2026

2026

Automatic Image-Level Morphological Trait Annotation for Organismal Images

arXiv 2026

2026

When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents

arXiv 2026

2026

From RAG to Memory: Non-Parametric Continual Learning for Large Language Models

arXiv 2025

2025

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

arXiv 2025

2025

An Illusion of Progress? Assessing the Current State of Web Agents

arXiv 2025

2025

ARM: Adaptive Reasoning Model

arXiv 2025

2025

Prompt-CAM: A Simpler Interpretable Transformer for Fine-Grained Analysis

arXiv 2025

2025

SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

arXiv 2025

2025

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

arXiv 2025

2025

BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning

arXiv 2025

2025

RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

arXiv 2025

2025

MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools

arXiv 2025

2025

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

arXiv 2025

2025

Is Extending Modality The Right Path Towards Omni-Modality?

arXiv 2025

2025

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

arXiv 2025

2025

Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

arXiv 2024

2024

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

arXiv 2024

2024

RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics

CVPR 2025 1

2024

GPT-4V(ision) is a Generalist Web Agent, if Grounded

arXiv 2024

2024

LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error

arXiv 2024

2024

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

arXiv 2024

2024

When is Tree Search Useful for LLM Planning? It Depends on the Discriminator

arXiv 2024

2024

Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments

arXiv 2024

2024

VLM4Bio: A Benchmark Dataset to Evaluate Pretrained Vision-Language Models for Trait Discovery from Biological Images

arXiv 2024

2024

A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

arXiv 2024

2024

Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning

arXiv 2024

2024

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

arXiv 2024

2024

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

arXiv 2024

2024

MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

arXiv 2024

2024

Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents

arXiv 2024

2024

Mind2Web: Towards a Generalist Agent for the Web

2023

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

CVPR 2024 1

2023

Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

arXiv 2023

2023

A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis

arXiv 2023

2023

Automatic Evaluation of Attribution by Large Language Models

arXiv 2023

2023

Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors

arXiv 2023

2023

Reviving the Context: Camera Trap Species Classification as Link Prediction on Multimodal Knowledge Graphs

arXiv 2023

2023

AgentBench: Evaluating LLMs as Agents

arXiv 2023

2023

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

NeurIPS 2023 11

2023

BioCLIP: A Vision Foundation Model for the Tree of Life

CVPR 2024 1

2023

Biomedical Language Models are Robust to Sub-optimal Tokenization

arXiv 2023

2023

Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments

arXiv 2022

2022

A Retrieve-and-Read Framework for Knowledge Graph Link Prediction

arXiv 2022

2022

A Systematic Investigation of KB-Text Embedding Alignment at Scale

ACL 2021 5

2021

Logical Natural Language Generation from Open-Domain Tables

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 46 papers