Ruiyi Zhang
- Papers: 16
Authored papers (16)
BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models
arXiv 2026
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
arXiv 2025
GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding
arXiv 2025
MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models
arXiv 2025
Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models
arXiv 2025
VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding
arXiv 2025
TextLap: Customizing Language Models for Text-to-Layout Planning
arXiv 2024
Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints
arXiv 2024
Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models
arXiv 2024
DynaSaur: Large Language Agents Beyond Predefined Actions
arXiv 2024
Towards Building the Federated GPT: Federated Instruction Tuning
arXiv 2023
Bias and Fairness in Large Language Models: A Survey
arXiv 2023
Learning Navigational Visual Representations with Semantic Map Supervision
ICCV 2023
Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach
arXiv 2023
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding
arXiv 2023
LAFITE: Towards Language-Free Training for Text-to-Image Generation
arXiv 2021
Frequent co-authors
10 from 16 papers