Yan Zhang

Papers
39

Authored papers

39

SCOPE: Simulating Cross-game Operations in Playable Environments for FPS World Models

arXiv 2026

2026

UniVBench: Towards Unified Evaluation for Video Foundation Models

arXiv 2026

2026

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

arXiv 2025

2025

Baichuan-M1: Pushing the Medical Capability of Large Language Models

arXiv 2025

2025

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

arXiv 2025

2025

MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning

arXiv 2025

2025

Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy

arXiv 2025

2025

VidText: Towards Comprehensive Evaluation for Video Text Understanding

arXiv 2025

2025

Generating $π$-Functional Molecules Using STGG+ with Active Learning

arXiv 2025

2025

Baichuan-Omni-1.5 Technical Report

arXiv 2025

2025

UniFit: Towards Universal Virtual Try-on with MLLM-Guided Semantic Alignment

arXiv 2025

2025

MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks

arXiv 2025

2025

Graph Retrieval-Augmented Generation: A Survey

arXiv 2024

2024

Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal

arXiv 2024

2024

CFBench: A Comprehensive Constraints-Following Benchmark for LLMs

arXiv 2024

2024

Retrieval Augmented Instruction Tuning for Open NER with Large Language Models

arXiv 2024

2024

Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models

arXiv 2024

2024

Graph Neural Networks for Learning Equivariant Representations of Neural Networks

arXiv 2024

2024

SysBench: Can Large Language Models Follow System Messages?

arXiv 2024

2024

Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level

arXiv 2024

2024

MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts

arXiv 2024

2024

Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives

CVPR 2024 1

2024

UniQA: Unified Vision-Language Pre-training for Image Quality and Aesthetic Assessment

arXiv 2024

2024

Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues

arXiv 2024

2024

M-MAD: Multidimensional Multi-Agent Debate Framework for Fine-grained Machine Translation Evaluation

arXiv 2024

2024

MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark

arXiv 2024

2024

DCR: Divide-and-Conquer Reasoning for Multi-choice Question Answering with LLMs

arXiv 2024

2024

Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views

ICCV 2023 1

2023

Is ChatGPT a Good Recommender? A Preliminary Study

arXiv 2023

2023

Object-centric architectures enable efficient causal representation learning

arXiv 2023

2023

MMICT: Boosting Multi-Modal Fine-Tuning with In-Context Examples

arXiv 2023

2023

Allies: Prompting Large Language Model with Beam Search

arXiv 2023

2023

Improving Large Language Models in Event Relation Logical Prediction

arXiv 2023

2023

Unlocking Slot Attention by Changing Optimal Transport Costs

arXiv 2023

2023

A Conversation is Worth A Thousand Recommendations: A Survey of Holistic Conversational Recommender Systems

arXiv 2023

2023

T5-SR: A Unified Seq-to-Seq Decoding Strategy for Semantic Parsing

arXiv 2023

2023

IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks

ACL 2022 5

2022

Pay More Attention to History: A Context Modelling Strategy for Conversational Text-to-SQL

arXiv 2021

2021

ENT-DESC: Entity Description Generation by Exploring Knowledge Graph

EMNLP 2020 11

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 39 papers