Ruiyi Zhang

Authored papers

16

BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models

arXiv 2026

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding

arXiv 2025

GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding

arXiv 2025

MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models

arXiv 2025

Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models

arXiv 2025

VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding

arXiv 2025

TextLap: Customizing Language Models for Text-to-Layout Planning

arXiv 2024

Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints

arXiv 2024

Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models

arXiv 2024

DynaSaur: Large Language Agents Beyond Predefined Actions

arXiv 2024

Towards Building the Federated GPT: Federated Instruction Tuning

arXiv 2023

Bias and Fairness in Large Language Models: A Survey

arXiv 2023

Learning Navigational Visual Representations with Semantic Map Supervision

ICCV 2023

Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach

arXiv 2023

LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding

arXiv 2023

LAFITE: Towards Language-Free Training for Text-to-Image Generation

arXiv 2021

Affiliations

No known affiliations.

Frequent co-authors

10, from 16 papers