Yichi Zhang
- Papers: 36
Authored papers (36)
The Python Simulations of Chemistry Framework: 10 years of an open-source quantum chemistry project
arXiv 2026
AcademiClaw: When Students Set Challenges for AI Agents
arXiv 2026
Kimi K2.5: Visual Agentic Intelligence
arXiv 2026
HippoCamp: Benchmarking Contextual Agents on Personal Computers
arXiv 2026
Proactive Assistant Dialogue Generation from Streaming Egocentric Videos
arXiv 2025
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
arXiv 2025
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies
arXiv 2025
OntoTune: Ontology-Driven Self-training for Aligning Large Language Models
arXiv 2025
Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors
arXiv 2025
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
arXiv 2025
STAIR: Improving Safety Alignment with Introspective Reasoning
arXiv 2025
Towards Hierarchical Rectified Flow
arXiv 2025
Improve Representation for Imbalanced Regression through Geometric Constraints
CVPR 2025 1
Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures
arXiv 2025
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
arXiv 2024
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
arXiv 2024
PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
arXiv 2024
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
arXiv 2024
Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking
arXiv 2024
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
arXiv 2024
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
arXiv 2024
Autonomous Evaluation and Refinement of Digital Agents
arXiv 2024
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
arXiv 2024
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code
arXiv 2023
PINNacle: A Comprehensive Benchmark of Physics-Informed Neural Networks for Solving PDEs
arXiv 2023
Making Large Language Models Perform Better in Knowledge Graph Completion
arXiv 2023
Diffusion Noise Feature: Accurate and Fast Generated Image Detection
arXiv 2023
MACO: A Modality Adversarial and Contrastive Framework for Modality-missing Multi-modal Knowledge Graph Completion
arXiv 2023
Rethinking Model Ensemble in Transfer-based Adversarial Attacks
arXiv 2023
Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
arXiv 2023
Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?
arXiv 2023
Tele-Knowledge Pre-training for Fault Analysis
arXiv 2022
MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality Hybrid
arXiv 2022
DANLI: Deliberative Agent for Following Natural Language Instructions
arXiv 2022
Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring
Findings (ACL) 2021 8
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning
EMNLP 2020 11
Frequent co-authors
10 from 36 papers