
AI ‘scientist’ re-discovers scientific equations using data


An artificial intelligence tool has "re-discovered" a series of groundbreaking equations, including Kepler’s third law of planetary motion, Einstein’s relativistic time-dilation law, and Langmuir’s equation of gas adsorption.

Researchers at IBM Research, Samsung AI and the University of Maryland, Baltimore County (UMBC) have built an 'AI scientist' able to combine theory and data to discover scientific equations. 

The tool, dubbed 'AI-Descartes' by the researchers, aims to speed up scientific discovery by leveraging symbolic regression, a technique that searches for equations that fit a set of data.

Given basic operators, such as addition, multiplication, and division, symbolic regression systems can generate hundreds to millions of candidate equations, searching for the ones that most accurately describe the relationships in the data.
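As a rough illustration of that search, and not the researchers' own code, the Python sketch below enumerates a handful of candidate power laws and keeps the one that best fits approximate orbital data for six planets. The candidate family (p = c * d^k with small rational k), the function names and the error measure are all assumptions made for this example; real symbolic regression tools explore a far larger space of expressions.

```python
from fractions import Fraction
from itertools import product

# Approximate semi-major axis d (AU) and orbital period p (years).
ORBITS = [
    (0.387, 0.241),    # Mercury
    (0.723, 0.615),    # Venus
    (1.000, 1.000),    # Earth
    (1.524, 1.881),    # Mars
    (5.203, 11.862),   # Jupiter
    (9.537, 29.457),   # Saturn
]


def candidate_exponents(max_int=3):
    """Enumerate the distinct exponents m/n built from small integers (assumed search space)."""
    return sorted({Fraction(m, n) for m, n in product(range(1, max_int + 1), repeat=2)})


def fit_scale(exponent, data):
    """Fit c in p = c * d**exponent by least squares on the ratio predicted/observed.

    Returns (c, mean squared relative error).
    """
    ratios = [d ** float(exponent) / p for d, p in data]
    c = sum(ratios) / sum(r * r for r in ratios)
    error = sum((c * r - 1.0) ** 2 for r in ratios) / len(ratios)
    return c, error


def search(data):
    """Score every candidate and return (exponent, c, error) for the best fit."""
    scored = []
    for k in candidate_exponents():
        c, err = fit_scale(k, data)
        scored.append((err, k, c))
    err, k, c = min(scored)
    return k, c, err


best_exponent, best_scale, best_error = search(ORBITS)
print(f"best candidate: p = {best_scale:.3f} * d^({best_exponent})")
```

With these data the search settles on the exponent 3/2, which is Kepler's third law in the form p² ∝ d³.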

Using this technique, the AI tool has been able to re-discover fundamental equations by itself, including Kepler’s third law of planetary motion, Einstein’s relativistic time-dilation law, and Langmuir’s equation of gas adsorption.
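For reference, the three laws are commonly written as below. The notation is the conventional textbook one and is an assumption here, not taken from the researchers' paper: p is the orbital period, d the distance between two bodies of masses m_1 and m_2, and G the gravitational constant; Δt_0 is the time interval measured by a clock at rest, Δt the interval seen by an observer moving at relative speed v, and c the speed of light; θ is the fraction of surface sites occupied by gas at pressure P, and K is the adsorption equilibrium constant.

```latex
% Kepler's third law of planetary motion
p^{2} = \frac{4\pi^{2}\, d^{3}}{G\,(m_{1} + m_{2})}

% Relativistic time dilation
\Delta t = \frac{\Delta t_{0}}{\sqrt{1 - v^{2}/c^{2}}}

% Langmuir adsorption isotherm
\theta = \frac{K P}{1 + K P}
```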

Compared to other similar systems, AI-Descartes' biggest strength is its ability to logically reason, the researchers said. If there are multiple candidate equations that fit the data well, the system identifies which equations fit best with background scientific theory.
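The idea of discarding well-fitting equations that contradict background theory can be sketched in a few lines of Python. This is a simplified stand-in rather than the researchers' method: the article describes logical reasoning against background theory, whereas the sketch below swaps in simple numeric checks. The candidate formulas, their coefficients and the three constraints (no adsorption at zero pressure, loading never falls as pressure rises, loading levels off at high pressure) are hypothetical and chosen only for illustration.

```python
# Hypothetical candidates for gas loading q as a function of pressure p.
# All coefficients are made up for illustration.
CANDIDATES = {
    "q = a*p / (1 + b*p)": lambda p: 2.0 * p / (1.0 + 0.5 * p),
    "q = a*p": lambda p: 1.1 * p,
    "q = a*p / (1 + b*p) + c": lambda p: 2.0 * p / (1.0 + 0.5 * p) + 0.3,
    "q = a*p - b*p**2": lambda p: 1.8 * p - 0.2 * p ** 2,
}


def zero_at_zero_pressure(f):
    """No gas pressure should mean no adsorbed gas."""
    return abs(f(0.0)) < 1e-9


def never_decreases(f):
    """Loading should not fall as pressure rises (checked on a coarse grid)."""
    grid = [0.5 * i for i in range(21)]  # pressures 0.0 to 10.0
    return all(f(b) >= f(a) for a, b in zip(grid, grid[1:]))


def saturates(f):
    """Loading should level off once the surface is full."""
    high, higher = f(1e6), f(1e9)
    return abs(higher - high) <= 1e-3 * abs(higher)


CONSTRAINTS = [zero_at_zero_pressure, never_decreases, saturates]


def consistent_with_theory(candidates):
    """Keep only the candidates that pass every background-theory check."""
    return [
        name
        for name, f in candidates.items()
        if all(check(f) for check in CONSTRAINTS)
    ]


survivors = consistent_with_theory(CANDIDATES)
print(survivors)
```

Only the Langmuir-shaped candidate passes all three checks. The straight line never saturates, the offset version predicts adsorption at zero pressure, and the quadratic eventually turns downward.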

The ability to reason also distinguishes the system from generative AI programs such as ChatGPT, whose large language model has limited logical skills and sometimes gets even basic maths wrong.

“In our work, we are merging a first-principles approach, which has been used by scientists for centuries to derive new formulas from existing background theories, with a data-driven approach that is more common in the machine-learning era,” said lead researcher Cristina Cornelio. “This combination allows us to take advantage of both approaches and create more accurate and meaningful models for a wide range of applications.”

The system works particularly well on noisy, real-world data, which can trip up traditional symbolic regression programs that might overlook the real signal in an effort to find formulas that capture every errant zig and zag of the data. It also handles small data sets well, even finding reliable equations when fed as few as ten data points, the team stated. 

Looking ahead, the team is working on creating new datasets that contain both real measurement data and an associated background theory to refine their system and test it on new terrain.

“In this work, we needed human experts to write down, in formal computer-readable terms, what the axioms of the background theory are, and if the human missed any or got any of those wrong, the system won't work,” said co-author Tyler Josephson. “In the future, we'd like to automate this part of the work as well, so we can explore many more areas of science and engineering.”

Ultimately, the team hopes that AI-Descartes, like the philosopher it is named after, may inspire a productive new approach to science.

“One of the most exciting aspects of our work is the potential to make significant advances in scientific research,” Cornelio said.

The AI has been named after 17th-century mathematician and philosopher René Descartes, who argued that the natural world could be described by a few fundamental physical laws and that logical deduction played a key role in scientific discovery.

The team's findings have been published in the journal Nature Communications.
