Ming Zhang
- Papers: 32
Authored papers: 32
AI Can Learn Scientific Taste
arXiv 2026
CL-bench: A Benchmark for Context Learning
arXiv 2026
Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control
arXiv 2026
SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents
arXiv 2026
LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening
arXiv 2026
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment
arXiv 2026
Can Deep Research Agents Find and Organize? Evaluating the Synthesis Gap with Expert Taxonomies
arXiv 2026
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
arXiv 2025
Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning
arXiv 2025
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation
arXiv 2025
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification
arXiv 2025
PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts
arXiv 2025
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs
arXiv 2025
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
arXiv 2025
From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents
arXiv 2025
SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation
arXiv 2024
MouSi: Poly-Visual-Expert Vision-Language Models
arXiv 2024
Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning
arXiv 2024
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
arXiv 2024
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
arXiv 2024
TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities
arXiv 2024
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies
arXiv 2024
Preparing Lessons for Progressive Training on Language Models
arXiv 2024
Measuring Vision-Language STEM Skills of Neural Models
arXiv 2024
A Survey of Reasoning with Foundation Models
arXiv 2023
RJUA-QA: A Comprehensive QA Dataset for Urology
arXiv 2023
FIMO: A Challenge Formal Dataset for Automated Theorem Proving
arXiv 2023
The Rise and Potential of Large Language Model Based Agents: A Survey
arXiv 2023
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models
arXiv 2023
Partial FC: Training 10 Million Identities on a Single Machine
arXiv 2020
AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks
arXiv 2018
LINE: Large-scale Information Network Embedding
arXiv 2015
Frequent co-authors
10 from 32 papers