Hao Peng

Papers
51

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

51

GLM-5: from Vibe Coding to Agentic Engineering

arXiv 2026

2026

SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety

arXiv 2026

2026

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

arXiv 2026

2026

WildReward: Learning Reward Models from In-the-Wild Human Interactions

arXiv 2026

2026

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

arXiv 2025

2025

Adaptation of Agentic AI

arXiv 2025

2025

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

arXiv 2025

2025

Process Reinforcement through Implicit Rewards

arXiv 2025

2025

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following

arXiv 2025

2025

Process Reward Models That Think

arXiv 2025

2025

AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

arXiv 2025

2025

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

arXiv 2025

2025

Kwai Keye-VL 1.5 Technical Report

arXiv 2025

2025

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

arXiv 2025

2025

The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning

arXiv 2025

2025

Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training

arXiv 2025

2025

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

arXiv 2025

2025

OpenHands: An Open Platform for AI Software Developers as Generalist Agents

arXiv 2024

2024

Data Engineering for Scaling Language Models to 128K Context

arXiv 2024

2024

Scaling Diffusion Language Models via Adaptation from Autoregressive Models

arXiv 2024

2024

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

arXiv 2024

2024

Advancing LLM Reasoning Generalists with Preference Trees

arXiv 2024

2024

Retrieval Head Mechanistically Explains Long-Context Factuality

arXiv 2024

2024

Free Process Rewards without Process Labels

arXiv 2024

2024

SOLO: A Single Transformer for Scalable Vision-Language Modeling

arXiv 2024

2024

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

arXiv 2024

2024

DAMe: Personalized Federated Social Event Detection with Dual Aggregation Mechanism

arXiv 2024

2024

Towards Effective, Efficient and Unsupervised Social Event Detection in the Hyperbolic Space

arXiv 2024

2024

Hyperbolic Geometric Latent Diffusion Model for Graph Generation

arXiv 2024

2024

ADELIE: Aligning Large Language Models on Information Extraction

arXiv 2024

2024

CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification

arXiv 2024

2024

Source-Aware Training Enables Knowledge Attribution in Language Models

arXiv 2024

2024

Eliminating Position Bias of Language Models: A Mechanistic Approach

arXiv 2024

2024

Constraint Back-translation Improves Complex Instruction Following of Large Language Models

arXiv 2024

2024

IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

arXiv 2024

2024

Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance

arXiv 2023

2023

Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

arXiv 2023

2023

LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models

arXiv 2023

2023

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

arXiv 2023

2023

Specializing Smaller Language Models towards Multi-Step Reasoning

arXiv 2023

2023

MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation

arXiv 2023

2023

FiLM: Fill-in Language Models for Any-Order Generation

arXiv 2023

2023

TRAM: Bridging Trust Regions and Sharpness Aware Minimization

arXiv 2023

2023

CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets

arXiv 2023

2023

KoLA: Carefully Benchmarking World Knowledge of Large Language Models

arXiv 2023

2023

Modeling Context With Linear Attention for Scalable Document-Level Translation

arXiv 2022

2022

Transparency Helps Reveal When Language Models Learn Meaning

arXiv 2022

2022

COPEN: Probing Conceptual Knowledge in Pre-trained Language Models

arXiv 2022

2022

Tailor: Generating and Perturbing Text with Semantic Controls

ACL 2022

2021

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning

arXiv 2020

2020

Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 51 papers