Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models

A pipeline for constructing datasets and Direct Preference Optimization with curriculum learning enhance LLMs' ability to follow soft constraints.

Year
2025
Venue
arXiv 2025
Authors
8
Hosting
Abstract only (ARXIV-DEFAULT)

Attribution

Abstract & full text
arxiv.org/abs/2501.04945v2 (ARXIV-DEFAULT)
TL;DR
Semantic Scholar

Abstract

It is crucial for large language models (LLMs) to follow instructions that involve multiple constraints. However, soft constraints are semantically related and difficult to verify through automated methods, so they remain a significant challenge for LLMs. To enhance the ability of LLMs to follow soft constraints, we first design a pipeline to obtain high-quality outputs automatically. Then, to fully utilize the acquired data, we introduce a training paradigm based on curriculum learning. We experimentally evaluate the effectiveness of our methods in improving LLMs' soft constraint following ability and analyze the factors driving the improvements. The datasets and code are publicly available at https://github.com/Rainier-rq/FollowSoftConstraints.
