Philipp Mondorf
3 papers
Authored papers
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks
arXiv 2024
Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models
arXiv 2024
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 3 papers