Yichi Zhang

Papers
36

Authored papers

36

The Python Simulations of Chemistry Framework: 10 years of an open-source quantum chemistry project

arXiv 2026

2026

AcademiClaw: When Students Set Challenges for AI Agents

arXiv 2026

2026

Kimi K2.5: Visual Agentic Intelligence

arXiv 2026

2026

HippoCamp: Benchmarking Contextual Agents on Personal Computers

arXiv 2026

2026

Proactive Assistant Dialogue Generation from Streaming Egocentric Videos

arXiv 2025

2025

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

arXiv 2025

2025

AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies

arXiv 2025

2025

OntoTune: Ontology-Driven Self-training for Aligning Large Language Models

arXiv 2025

2025

Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors

arXiv 2025

2025

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

arXiv 2025

2025

STAIR: Improving Safety Alignment with Introspective Reasoning

arXiv 2025

2025

Towards Hierarchical Rectified Flow

arXiv 2025

2025

Improve Representation for Imbalanced Regression through Geometric Constraints

CVPR 2025 1

2025

Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures

arXiv 2025

2025

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

arXiv 2024

2024

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

arXiv 2024

2024

PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain

arXiv 2024

2024

MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation

arXiv 2024

2024

Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking

arXiv 2024

2024

PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling

arXiv 2024

2024

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine

arXiv 2024

2024

Autonomous Evaluation and Refinement of Digital Agents

arXiv 2024

2024

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

arXiv 2024

2024

ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code

arXiv 2023

2023

PINNacle: A Comprehensive Benchmark of Physics-Informed Neural Networks for Solving PDEs

arXiv 2023

2023

Making Large Language Models Perform Better in Knowledge Graph Completion

arXiv 2023

2023

Diffusion Noise Feature: Accurate and Fast Generated Image Detection

arXiv 2023

2023

MACO: A Modality Adversarial and Contrastive Framework for Modality-missing Multi-modal Knowledge Graph Completion

arXiv 2023

2023

Rethinking Model Ensemble in Transfer-based Adversarial Attacks

arXiv 2023

2023

Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering

arXiv 2023

2023

Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?

arXiv 2023

2023

Tele-Knowledge Pre-training for Fault Analysis

arXiv 2022

2022

MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality Hybrid

arXiv 2022

2022

DANLI: Deliberative Agent for Following Natural Language Instructions

arXiv 2022

2022

Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring

Findings (ACL) 2021 8

2021

A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning

EMNLP 2020 11

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 36 papers