VimGolf: Evaluating LLMs in Vim Editing Proficiency

Active

A benchmark that evaluates LLMs on their ability to operate the Vim editor and complete editing challenges. Unlike common computer-use agent (CUA) benchmarks, it focuses on Vim-specific editing capabilities.

Domain: Reasoning
License: MIT
Published: Oct 2025
Notable for: Benchmark for evaluating Reasoning.

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is VimGolf: Evaluating LLMs in Vim Editing Proficiency?
A benchmark that evaluates LLMs on their ability to operate the Vim editor and complete editing challenges. Unlike common computer-use agent (CUA) benchmarks, it focuses on Vim-specific editing capabilities.
How can a model improve its VimGolf: Evaluating LLMs in Vim Editing Proficiency score?
The tools linked to VimGolf: Evaluating LLMs in Vim Editing Proficiency on Sophon are RL environments, datasets, and scaffolds that target this eval. They include Vimgolf PIT RL Env (Community).
What license is VimGolf: Evaluating LLMs in Vim Editing Proficiency under?
VimGolf: Evaluating LLMs in Vim Editing Proficiency is available under the MIT license.