Neural networks identify natural history specimens with almost perfect accuracy
Artificial neural networks could prove helpful in curating vast collections of natural history specimens, after a pilot project demonstrated that deep neural networks can distinguish between similar-looking plants with up to 99 per cent accuracy.
Universities and museums around the world are in the process of digitising their natural history collections. Many valuable specimens have sat forgotten for decades in pots of formaldehyde in unopened cabinets, but digitisation now allows these specimens to be catalogued and searchable datasets to be generated.
These digitisation efforts can take years, and require the work of researchers, curators and members of the public volunteering their time. London’s Natural History Museum alone claims to have 80 million specimens of minerals, plants, insects and other extinct and extant animals.
Now, a pilot project reported in Biodiversity Data Journal has suggested that combining these vast digital records with sophisticated deep learning systems could be the key to new insights.
Researchers from the Smithsonian Department of Botany, Data Science Lab and Digitisation Program Office came together with Nvidia, which manufactures graphics processing units, for the project. It is among the first in the world to use deep learning to assist with the analysis of digitised natural history samples.
The study used deep convolutional neural networks to study a herbarium: a collection of preserved plant specimens and their associated data. A total of 1.2 million specimens – the entire digitised section of the US National Herbarium – were analysed by the neural networks.
Deep convolutional neural networks are layered machine learning systems loosely modelled on the activity of the brain, and are commonly used to analyse visual information with high efficiency and accuracy.
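To give a sense of what a single "convolutional" layer does, the following is a minimal Python sketch and not the networks used in the study. It slides a small filter over a toy image and keeps only the positive responses. The image, the filter values and the function names `convolve2d` and `relu` are all invented for illustration; real networks learn their filters from training data and stack many such layers.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped,
    so strictly speaking this is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


# Toy 4x4 "image": dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(convolve2d(image, kernel))
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the middle column, where the dark and bright halves of the toy image meet. A deep network chains many layers of this kind, so that later layers respond to progressively more complex shapes.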
The team used two different neural networks: one was trained to recognise herbarium sheets stained with mercury crystals (used by early collectors to protect the plants from insects), and the other was trained to differentiate between two families of plants which look outwardly very similar.
“The results can be leveraged both to improve curation and unlock new avenues of research,” the researchers commented.
The neural networks were capable of performing with 90 and 96 per cent accuracy respectively, or 94 and 99 per cent when adjusted to discard extremely challenging samples.
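The article does not say how the most challenging samples were identified. One common approach, assumed here purely for illustration, is to set aside predictions whose confidence falls below a threshold and to report accuracy on the rest. The Python sketch below uses made-up toy data, placeholder family labels "A" and "B", and a hypothetical function name, `accuracy_with_rejection`; none of these come from the study.

```python
def accuracy_with_rejection(predictions, threshold=0.0):
    """Return (accuracy, coverage) after discarding low-confidence predictions.

    `predictions` is a list of (confidence, predicted_label, true_label) tuples.
    Coverage is the fraction of samples kept. Accuracy is None if nothing is kept.
    """
    kept = [(pred, true) for conf, pred, true in predictions if conf >= threshold]
    if not kept:
        return None, 0.0
    correct = sum(1 for pred, true in kept if pred == true)
    return correct / len(kept), len(kept) / len(predictions)


# Invented toy data: (model confidence, predicted family, true family).
samples = [
    (0.99, "A", "A"),
    (0.97, "B", "B"),
    (0.95, "A", "A"),
    (0.91, "B", "B"),
    (0.88, "A", "A"),
    (0.85, "B", "A"),  # confident but wrong
    (0.80, "A", "A"),
    (0.75, "B", "B"),
    (0.55, "A", "B"),  # uncertain and wrong
    (0.52, "B", "B"),  # uncertain but right
]

print(accuracy_with_rejection(samples))                 # every sample counted
print(accuracy_with_rejection(samples, threshold=0.7))  # hardest samples set aside
```

On this toy data, accuracy rises from 0.8 to 0.875 once the two least confident predictions are set aside, at the cost of leaving 20 per cent of the samples unclassified. In a curation setting, the discarded samples would be natural candidates for human review.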
“This research paper is a wonderful proof of concept,” said Dr Laurence Dorr, co-author and chair of the Smithsonian Department of Botany. “We now know that we can apply machine learning to digitised natural history specimens to solve curatorial and identification problems.”
“The future will be using these tools combined with large shared data sets to test fundamental hypotheses about the evolution and distribution of plants and animals.”