
Military bots could become teammates with real-time conversational AI

Image credit: 1st Lt. Angelo Mejia

Researchers from the US Army Research Laboratory and their collaborators have developed a flexible approach for autonomous systems to interpret and respond to ‘Soldier intent’, derived from spoken dialogue.

Spoken dialogue is a natural, human way for people to interact; the ability of autonomous systems (such as military robots) to listen to speech and carry out relevant commands could allow them to become more like teammates than tools. This could be an essential aspect of future military environments.

Army Research Laboratory researchers have been working with colleagues from Devcom and the University of Southern California’s Institute for Creative Technologies to develop an approach that lets autonomous systems derive intent from speech and respond appropriately. This technology is the primary dialogue-processing component of the laboratory’s Joint Understanding and Dialogue Interface (JUDI), a prototype tool that enables two-way conversation between soldiers and machines.

“We employed a statistical classification technique for enabling conversational AI using state-of-the-art natural language understanding and dialogue management technologies,” said Dr Felix Gervits of the Army Research Laboratory. “The statistical language classifier enables autonomous systems to interpret the intent of a soldier by recognising the purpose of the communication and performing actions to realise the underlying intent.”

For instance, if a robot were told to “turn 45 degrees and send a picture”, it could interpret the instruction and carry out this task without being explicitly programmed to do so.

The researchers trained their classifier on a labelled dataset of human-robot dialogue generated during a search-and-rescue task. The classifier learned to recognise patterns linking verbal commands, responses and actions, allowing it to respond appropriately to new commands. They then incorporated the classifier into a broader dialogue management system that includes techniques for determining when to request extra information.
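To make the idea concrete, the following is a minimal sketch of a statistical intent classifier paired with a simple rule for asking for clarification. It is not the JUDI implementation: the article does not specify the classifier, so naive Bayes is used here as a stand-in, and the training utterances, intent labels (`turn`, `send_image`, `move`), the `respond` function and the 0.6 confidence threshold are all assumptions made for illustration. It also assigns only one intent per utterance, so compound commands are out of scope.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: the real corpus and label set are not given in the article.
TRAINING = [
    ("turn left ninety degrees", "turn"),
    ("turn 45 degrees", "turn"),
    ("rotate right a little", "turn"),
    ("send a picture", "send_image"),
    ("take a photo and send it", "send_image"),
    ("send me an image", "send_image"),
    ("move forward two metres", "move"),
    ("go forward to the door", "move"),
    ("drive ahead three metres", "move"),
]


def tokenize(text):
    """Lower-case the utterance and split it on whitespace."""
    return text.lower().split()


class IntentClassifier:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def __init__(self, examples):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of examples
        self.vocab = set()
        for text, label in examples:
            tokens = tokenize(text)
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def posteriors(self, text):
        """Return a dict mapping each intent label to its posterior probability."""
        # Words never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            log_scores[label] = score
        # Normalise in log space to avoid numerical underflow.
        peak = max(log_scores.values())
        exps = {label: math.exp(s - peak) for label, s in log_scores.items()}
        z = sum(exps.values())
        return {label: v / z for label, v in exps.items()}


def respond(classifier, utterance, threshold=0.6):
    """Act on the most likely intent, or ask for more information if unsure."""
    probs = classifier.posteriors(utterance)
    label = max(probs, key=probs.get)
    if probs[label] < threshold:
        return ("clarify", None)
    return ("execute", label)


if __name__ == "__main__":
    clf = IntentClassifier(TRAINING)
    for utterance in ("please send a picture", "turn 45 degrees", "hello there"):
        print(utterance, "->", respond(clf, utterance))
```

In this sketch, an utterance containing only words the classifier has never seen yields equal probabilities for every intent, so the system falls back to a clarification request rather than guessing. A production dialogue manager would presumably use richer signals than a single threshold.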

It is hoped that this technology could be applied to combat vehicles and other autonomous systems to enable natural-feeling real-time conversations between soldiers and their machines, with no processing delay in the conversation.

“By creating a natural speech interface to these complex autonomous systems, researchers can support hands-free operation to improve situational awareness and give our soldiers the decisive edge,” Gervits said. “Interacting with such conversational agents requires limited to no training for soldiers, since speech is a natural and intuitive interface for humans and there is no requirement to change what they could say.

“A key benefit is that the system also excels at handling 'noisy speech', which includes pauses, fillers and disfluencies – all features that one would expect in a normal conversation with humans.”

The approach is distinctive in three respects: it does not require large, expensive training datasets; it can cold-start in new environments; and it is tailored to search-and-rescue operations. Classification also makes the system’s decisions more transparent, which is essential in a military setting where ethical concerns demand accountability from autonomous systems.

“With the tactical environment of the future likely to involve mixed soldier-agent teams, I am optimistic that this technology will have a transformative effect on the future of the army,” Gervits said. “It is highly rewarding for me as a researcher to see such a tangible outcome for my efforts.”
