Peng Li

Papers
45

Authored papers

45

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

arXiv 2026

2026

SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks

arXiv 2026

2026

UniSH: Unifying Scene and Human Reconstruction in a Feed-Forward Pass

arXiv 2026

2026

GigaWorld-Policy: An Efficient Action-Centered World-Action Model

arXiv 2026

2026

Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models

arXiv 2026

2026

FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions

arXiv 2026

2026

Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors

arXiv 2026

2026

U-Net-Like Spiking Neural Networks for Single Image Dehazing

arXiv 2025

2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation

arXiv 2025

2025

Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control

arXiv 2025

2025

LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents

ICCV 2025

2025

MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding

arXiv 2025

2025

Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration

arXiv 2025

2025

AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization

CVPR 2025 1

2025

Visual Abstract Thinking Empowers Multimodal Reasoning

arXiv 2025

2025

DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms

arXiv 2025

2025

Enhancing Language Multi-Agent Learning with Multi-Agent Credit Re-Assignment for Interactive Environment Generalization

arXiv 2025

2025

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

arXiv 2025

2025

Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks

arXiv 2025

2025

TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels

arXiv 2025

2025

AIGS: Generating Science from AI-Powered Automated Falsification

arXiv 2024

2024

ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models

arXiv 2024

2024

Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion

arXiv 2024

2024

StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models

arXiv 2024

2024

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding

arXiv 2024

2024

Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization

arXiv 2024

2024

Model Composition for Multimodal Large Language Models

arXiv 2024

2024

A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration

arXiv 2023

2023

Statler: State-Maintaining Language Models for Embodied Reasoning

arXiv 2023

2023

Failures Pave the Way: Enhancing Large Language Models through Tuning-free Rule Accumulation

arXiv 2023

2023

Plug-and-Play Knowledge Injection for Pre-trained Language Models

arXiv 2023

2023

Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf

arXiv 2023

2023

EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models

CVPR 2024 1

2023

CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

arXiv 2023

2023

Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

arXiv 2023

2023

An Extensible Plug-and-Play Method for Multi-Aspect Controllable Text Generation

arXiv 2022

2022

Packed Levitated Marker for Entity and Relation Extraction

ACL 2022 5

2021

Fully Hyperbolic Neural Networks

ACL 2022 5

2021

MoEfication: Transformer Feed-forward Layers are Mixtures of Experts

Findings (ACL) 2022 5

2021

Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples

arXiv 2021

2021

RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models

EMNLP 2021 11

2021

Coreferential Reasoning Learning for Language Representation

EMNLP 2020 11

2020

CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models

arXiv 2020

2020

DocRED: A Large-Scale Document-Level Relation Extraction Dataset

2019

FewRel 2.0: Towards More Challenging Few-Shot Relation Classification

2019

Affiliations

No known affiliations.

Frequent co-authors

10

from 45 papers