Peng Li
- Papers: 45
Authored papers
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
arXiv 2026
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks
arXiv 2026
UniSH: Unifying Scene and Human Reconstruction in a Feed-Forward Pass
arXiv 2026
GigaWorld-Policy: An Efficient Action-Centered World-Action Model
arXiv 2026
Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models
arXiv 2026
FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions
arXiv 2026
Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors
arXiv 2026
U-Net-Like Spiking Neural Networks for Single Image Dehazing
arXiv 2025
YuE: Scaling Open Foundation Models for Long-Form Music Generation
arXiv 2025
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
arXiv 2025
LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents
ICCV 2025
MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding
arXiv 2025
Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration
arXiv 2025
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization
CVPR 2025 1
Visual Abstract Thinking Empowers Multimodal Reasoning
arXiv 2025
DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms
arXiv 2025
Enhancing Language Multi-Agent Learning with Multi-Agent Credit Re-Assignment for Interactive Environment Generalization
arXiv 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
arXiv 2025
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
arXiv 2025
TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
arXiv 2025
AIGS: Generating Science from AI-Powered Automated Falsification
arXiv 2024
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models
arXiv 2024
Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion
arXiv 2024
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models
arXiv 2024
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
arXiv 2024
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization
arXiv 2024
Model Composition for Multimodal Large Language Models
arXiv 2024
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration
arXiv 2023
Statler: State-Maintaining Language Models for Embodied Reasoning
arXiv 2023
Failures Pave the Way: Enhancing Large Language Models through Tuning-free Rule Accumulation
arXiv 2023
Plug-and-Play Knowledge Injection for Pre-trained Language Models
arXiv 2023
Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf
arXiv 2023
EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models
CVPR 2024 1
CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors
arXiv 2023
Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models
arXiv 2023
An Extensible Plug-and-Play Method for Multi-Aspect Controllable Text Generation
arXiv 2022
Packed Levitated Marker for Entity and Relation Extraction
ACL 2022 5
Fully Hyperbolic Neural Networks
ACL 2022 5
MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
Findings (ACL) 2022 5
Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples
arXiv 2021
RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models
EMNLP 2021 11
Coreferential Reasoning Learning for Language Representation
EMNLP 2020 11
CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models
arXiv 2020
DocRED: A Large-Scale Document-Level Relation Extraction Dataset
FewRel 2.0: Towards More Challenging Few-Shot Relation Classification
Affiliations
Frequent co-authors
10 from 45 papers