Yongbin Li

Papers
43

Authored papers

P-GenRM: Personalized Generative Reward Model with Test-time User-based Scaling

arXiv 2026

2026

Think Anywhere in Code Generation

arXiv 2026

2026

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction

arXiv 2025

2025

OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis

arXiv 2025

2025

EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

arXiv 2025

2025

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

arXiv 2025

2025

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

arXiv 2025

2025

Adaptive Thinking via Mode Policy Optimization for Social Language Agents

arXiv 2025

2025

Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute

arXiv 2025

2025

TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence

arXiv 2025

2025

RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization

arXiv 2025

2025

StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

arXiv 2024

2024

Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA

arXiv 2024

2024

Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration

arXiv 2024

2024

How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States

arXiv 2024

2024

Self-Retrieval: End-to-End Information Retrieval with One Large Language Model

arXiv 2024

2024

Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer

CVPR 2024

2024

Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models

arXiv 2024

2024

DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling

arXiv 2024

2024

Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment

arXiv 2024

2024

Transferable Post-training via Inverse Value Learning

arXiv 2024

2024

Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement

arXiv 2024

2024

On the Role of Attention Heads in Large Language Model Safety

arXiv 2024

2024

Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion?

arXiv 2024

2024

DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories

arXiv 2024

2024

Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement

arXiv 2024

2024

Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining

arXiv 2024

2024

SoFA: Shielded On-the-fly Alignment via Priority Rule Following

arXiv 2024

2024

Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning

arXiv 2024

2024

PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts

arXiv 2023

2023

SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents

2023

Preference Ranking Optimization for Human Alignment

arXiv 2023

2023

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

arXiv 2023

2023

One-Shot Learning as Instruction Data Prospector for Large Language Models

arXiv 2023

2023

Improving Question Generation with Multi-level Content Planning

arXiv 2023

2023

UniSA: Unified Generative Framework for Sentiment Analysis

2023

Iterative Forward Tuning Boosts In-Context Learning in Language Models

arXiv 2023

2023

SPRING: Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph

arXiv 2023

2023

Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use

arXiv 2023

2023

UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition

arXiv 2022

2022

Aligning Logits Generatively for Principled Black-Box Knowledge Distillation

CVPR 2024

2022

Multi-View Active Fine-Grained Recognition

arXiv 2022

2022

GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection

arXiv 2021

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 43 papers