Yongbin Li
- Papers: 43
Authored papers
P-GenRM: Personalized Generative Reward Model with Test-time User-based Scaling
arXiv 2026
Think Anywhere in Code Generation
arXiv 2026
OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction
arXiv 2025
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis
arXiv 2025
EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning
arXiv 2025
ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents
arXiv 2025
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking
arXiv 2025
Adaptive Thinking via Mode Policy Optimization for Social Language Agents
arXiv 2025
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute
arXiv 2025
TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence
arXiv 2025
RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization
arXiv 2025
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
arXiv 2024
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
arXiv 2024
Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration
arXiv 2024
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
arXiv 2024
Self-Retrieval: End-to-End Information Retrieval with One Large Language Model
arXiv 2024
Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer
CVPR 2024
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
arXiv 2024
DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling
arXiv 2024
Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
arXiv 2024
Transferable Post-training via Inverse Value Learning
arXiv 2024
Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement
arXiv 2024
On the Role of Attention Heads in Large Language Model Safety
arXiv 2024
Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion?
arXiv 2024
DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories
arXiv 2024
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement
arXiv 2024
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
arXiv 2024
SoFA: Shielded On-the-fly Alignment via Priority Rule Following
arXiv 2024
Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning
arXiv 2024
PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts
arXiv 2023
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents
Preference Ranking Optimization for Human Alignment
arXiv 2023
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
arXiv 2023
One-Shot Learning as Instruction Data Prospector for Large Language Models
arXiv 2023
Improving Question Generation with Multi-level Content Planning
arXiv 2023
UniSA: Unified Generative Framework for Sentiment Analysis
Iterative Forward Tuning Boosts In-Context Learning in Language Models
arXiv 2023
SPRING: Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph
arXiv 2023
Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use
arXiv 2023
UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition
arXiv 2022
Aligning Logits Generatively for Principled Black-Box Knowledge Distillation
CVPR 2024
Multi-View Active Fine-Grained Recognition
arXiv 2022
GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection
arXiv 2021