Yu Gu

Papers
21

Authored papers

21

Lang2Act: Fine-Grained Visual Reasoning through Self-Emergent Linguistic Toolchains

arXiv 2026

2026

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

arXiv 2025

2025

Magma: A Foundation Model for Multimodal AI Agents

CVPR 2025 1

2025

Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts

arXiv 2025

2025

SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

arXiv 2025

2025

X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains

arXiv 2025

2025

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

arXiv 2025

2025

RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts

arXiv 2025

2025

LLM-QE: Improving Query Expansion by Aligning Large Language Models with Ranking Preferences

arXiv 2025

2025

HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization

arXiv 2025

2025

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

arXiv 2024

2024

Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation

arXiv 2024

2024

STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment

arXiv 2024

2024

Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents

arXiv 2024

2024

Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments

arXiv 2024

2024

Mind2Web: Towards a Generalist Agent for the Web

2023

KoLA: Carefully Benchmarking World Knowledge of Large Language Models

arXiv 2023

2023

Reviving the Context: Camera Trap Species Classification as Link Prediction on Multimodal Knowledge Graphs

arXiv 2023

2023

AgentBench: Evaluating LLMs as Agents

arXiv 2023

2023

Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments

arXiv 2022

2022

A Systematic Investigation of KB-Text Embedding Alignment at Scale

ACL 2021 5

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 21 papers