Question 1

What is Lhaw Rlm?

Accepted Answer

LHAW RLM is an evaluation environment built on the ScaleAI/lhaw dataset. It gives the model underspecified prompts, lets it request clarification from a simulated user through an ask_user tool, and scores the result with an LLM judge.
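To make the interaction pattern concrete, here is a minimal toy sketch of such a loop. It is not the environment's actual API: only the name ask_user comes from the description above, while Episode, toy_judge, run_episode, and the booking example are hypothetical stand-ins. In particular, a keyword-matching function replaces the simulated user and a string-overlap check replaces the LLM judge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Episode:
    """Hypothetical container for one underspecified task."""
    prompt: str                       # what the model sees
    hidden_details: Dict[str, str]    # facts only the simulated user knows
    transcript: List[str] = field(default_factory=list)


def ask_user(episode: Episode, question: str) -> str:
    """Toy simulated user: answers by keyword match against hidden details."""
    answer = "I don't know."
    for key, value in episode.hidden_details.items():
        if key in question.lower():
            answer = value
            break
    episode.transcript.append(f"Q: {question} | A: {answer}")
    return answer


def toy_judge(final_answer: str, episode: Episode) -> float:
    """Stand-in for an LLM judge: fraction of hidden details in the answer."""
    text = final_answer.lower()
    hits = sum(1 for v in episode.hidden_details.values() if v.lower() in text)
    return hits / len(episode.hidden_details)


def run_episode(episode: Episode, policy: Callable[[Episode], str]) -> float:
    """Run a policy on one episode and return the judge's score."""
    return toy_judge(policy(episode), episode)


def make_episode() -> Episode:
    # Assumed example task; not taken from the ScaleAI/lhaw dataset.
    return Episode(
        prompt="Book a table for dinner.",
        hidden_details={"time": "7pm", "party size": "4"},
    )


def clarifying_policy(ep: Episode) -> str:
    # Asks for each missing detail before committing to an answer.
    time = ask_user(ep, "What time would you like?")
    size = ask_user(ep, "What is the party size?")
    return f"Booked a table for {size} at {time}."


def guessing_policy(ep: Episode) -> str:
    # Never asks; fills the gaps with guesses.
    return "Booked a table for 2 at 6pm."


if __name__ == "__main__":
    print(run_episode(make_episode(), clarifying_policy))
    print(run_episode(make_episode(), guessing_policy))
```

In this toy setup, the policy that asks before answering recovers both hidden details, while the guessing policy recovers neither. How the real environment weighs clarification against guessing depends on its own judge and is not specified here.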

Question 2

What is the current top score on Lhaw Rlm?

Accepted Answer

The top reported score is 43.8%, achieved by GPT-4.1 Mini. Only one model has reported a score so far, and it comes from a frontier lab.

Question 3

How can a model improve its Lhaw Rlm score?

Accepted Answer

Tools linked to Lhaw Rlm on Sophon include LHAW RLM RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.

Question 4

What license is Lhaw Rlm under?

Accepted Answer

The license for Lhaw Rlm is listed as unknown.

Lhaw Rlm

Top models