Hao Peng
- Papers
- 51
Authored papers (51)
GLM-5: from Vibe Coding to Agentic Engineering
arXiv 2026
SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety
arXiv 2026
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning
arXiv 2026
WildReward: Learning Reward Models from In-the-Wild Human Interactions
arXiv 2026
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
arXiv 2025
Adaptation of Agentic AI
arXiv 2025
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
arXiv 2025
Process Reinforcement through Implicit Rewards
arXiv 2025
VerIF: Verification Engineering for Reinforcement Learning in Instruction Following
arXiv 2025
Process Reward Models That Think
arXiv 2025
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
arXiv 2025
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
arXiv 2025
Kwai Keye-VL 1.5 Technical Report
arXiv 2025
From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones
arXiv 2025
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
arXiv 2025
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training
arXiv 2025
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
arXiv 2025
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
arXiv 2024
Data Engineering for Scaling Language Models to 128K Context
arXiv 2024
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
arXiv 2024
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
arXiv 2024
Advancing LLM Reasoning Generalists with Preference Trees
arXiv 2024
Retrieval Head Mechanistically Explains Long-Context Factuality
arXiv 2024
Free Process Rewards without Process Labels
arXiv 2024
SOLO: A Single Transformer for Scalable Vision-Language Modeling
arXiv 2024
Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent
arXiv 2024
DAMe: Personalized Federated Social Event Detection with Dual Aggregation Mechanism
arXiv 2024
Towards Effective, Efficient and Unsupervised Social Event Detection in the Hyperbolic Space
arXiv 2024
Hyperbolic Geometric Latent Diffusion Model for Graph Generation
arXiv 2024
ADELIE: Aligning Large Language Models on Information Extraction
arXiv 2024
CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification
arXiv 2024
Source-Aware Training Enables Knowledge Attribution in Language Models
arXiv 2024
Eliminating Position Bias of Language Models: A Mechanistic Approach
arXiv 2024
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
arXiv 2024
IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning
arXiv 2024
Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance
arXiv 2023
Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
arXiv 2023
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models
arXiv 2023
Catwalk: A Unified Language Model Evaluation Framework for Many Datasets
arXiv 2023
Specializing Smaller Language Models towards Multi-Step Reasoning
arXiv 2023
MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation
arXiv 2023
FiLM: Fill-in Language Models for Any-Order Generation
arXiv 2023
TRAM: Bridging Trust Regions and Sharpness Aware Minimization
arXiv 2023
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets
arXiv 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
arXiv 2023
Modeling Context With Linear Attention for Scalable Document-Level Translation
arXiv 2022
Transparency Helps Reveal When Language Models Learn Meaning
arXiv 2022
COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
arXiv 2022
Tailor: Generating and Perturbing Text with Semantic Controls
ACL 2022 5
KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning
arXiv 2020
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
Affiliations
Frequent co-authors
10 from 51 papers