0

MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses

LLMs can rediscover novel and valid chemistry research hypotheses given a research background and extensive chemistry literature, using a multi-agent framework.

Year
2024
Venue
arXiv 2024
Authors
9
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2410.07076v5ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

Scientific discovery plays a pivotal role in advancing human society, and recent progress in large language models (LLMs) suggests their potential to accelerate this process. However, it remains unclear whether LLMs can autonomously generate novel and valid hypotheses in chemistry. In this work, we investigate whether LLMs can discover high-quality chemistry hypotheses given only a research background-comprising a question and/or a survey-without restriction on the domain of the question. We begin with the observation that hypothesis discovery is a seemingly intractable task. To address this, we propose a formal mathematical decomposition grounded in a fundamental assumption: that most chemistry hypotheses can be composed from a research background and a set of inspirations. This decomposition leads to three practical subtasks-retrieving inspirations, composing hypotheses with inspirations, and ranking hypotheses - which together constitute a sufficient set of subtasks for the overall scientific discovery task. We further develop an agentic LLM framework, MOOSE-Chem, that is a direct implementation of this mathematical decomposition. To evaluate this framework, we construct a benchmark of 51 high-impact chemistry papers published and online after January 2024, each manually annotated by PhD chemists with background, inspirations, and hypothesis. The framework is able to rediscover many hypotheses with high similarity to the groundtruth, successfully capturing the core innovations-while ensuring no data contamination since it uses an LLM with knowledge cutoff date prior to 2024. Finally, based on LLM's surprisingly high accuracy on inspiration retrieval, a task with inherently out-of-distribution nature, we propose a bold assumption: that LLMs may already encode latent scientific knowledge associations not yet recognized by humans.

Authors

9