Zhangyue Yin
- Papers: 17
Authored papers (17)
CL-bench: A Benchmark for Context Learning
arXiv 2026
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning
arXiv 2026
LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening
arXiv 2026
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment
arXiv 2026
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions
arXiv 2026
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?
arXiv 2025
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
arXiv 2025
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
arXiv 2025
A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond
arXiv 2024
Can AI Assistants Know What They Don't Know?
arXiv 2024
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models
arXiv 2024
A Survey of Reasoning with Foundation Models
arXiv 2023
The Rise and Potential of Large Language Model Based Agents: A Survey
arXiv 2023
Evaluating Hallucinations in Chinese Large Language Models
arXiv 2023
Do Large Language Models Know What They Don't Know?
arXiv 2023
Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
arXiv 2023
Exchange-of-Thought: Enhancing Large Language Model Capabilities through Cross-Model Communication
arXiv 2023
Frequent co-authors
10 from 17 papers