Attribution
FEABench: Evaluating Language Models on Multiphysics Reasoning Ability
arXiv 2025
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics
arXiv 2024
from 3 papers
HAO CUI
Nayantara Mudur
Paul Raccuglia
Peter Norgaard
Subhashini Venugopalan
Amil Merchant
Brian Rohr
Chenfei Jiang
Corey Wang
Dan Morris