First contact: how linguists use technology to communicate more effectively

Image credit: REX/Shutterstock

The concept of first contact has been played out in books and on our TV screens for decades, but Hollywood’s most recent interpretation gets the closest to how real-life linguists use technology to communicate.

From the moment the majestic, eerily sinister black obelisks descended from the heavens to our first introduction to Amy Adams’ quietly intense linguist Louise Banks, we knew we were in for the full Hollywood ‘alien first contact’ treatment. While the film Arrival’s portrayal is fantastical, it placed technology centre-stage in the process of communicating with the mysterious heptapods. In doing so it wasn’t far off the basic concepts of how real-life linguists begin to communicate in the field.

The concept of first contact with an alien life-form has been a recurring theme in 20th- and 21st-century science fiction, and the nature of first communication has likewise taken a variety of forms. In Steven Spielberg’s Close Encounters of the Third Kind, contact is made via the medium of musical notes, communicated by means of sign language which our alien friend happily signs back.

I want to include Ridley Scott’s iconic, genre-busting Alien here, but the first the film’s protagonists knew about said alien was after it had burst out of the heaving chest of an unassuming, if a little hungry, John Hurt, with neither the time nor the place, therefore, for a meaningful conversation to develop. In fact, the communication was explicit: the alien’s sole raison d’être was to stalk and kill everyone it could get its grasping alien hands on.

Then there were the large black monoliths in Arthur C. Clarke’s 2001: A Space Odyssey, machines built by an unseen extraterrestrial species. They could be said to communicate with the life-forms that come into contact with them by subconsciously conveying a meaning that prompts those life-forms to progress and develop technologically. Let’s also not forget another Spielberg classic, E.T. the Extra-Terrestrial, in which the cute, doe-eyed alien communicates telepathically and symbiotically with his first contact, ten-year-old Elliott.

So Denis Villeneuve’s Arrival seems to make something of a departure from previous form on the theme of first contact. It appears to be rooted in the ‘how’: the real-life business of how we would begin to converse with aliens. Without a Star Trek universal translator in sight (yet), how does technology help us to decipher an ‘alien’ language? How do linguists and anthropologists begin to communicate with a previously uncontacted community of people?

Larin Adams is a linguist at SIL International (SIL), a non-profit organisation based in Dallas, Texas, serving language communities worldwide, and is currently working at Payap University in Thailand. Well-versed in the documentation techniques and computational tools used in the analysis of unstudied languages, he commented: “In this time in history we probably will never again have a true first contact situation, by which I mean a situation where a linguist meets a people group and there is no one in that group who speaks some language of wider communication.

“Generally, some members of most minority groups have become somewhat bilingual in a regional language and communication will proceed down that path. However, it was not too many years ago that first meaningful contact situations did occur and the intrepid linguist was faced with building communication with no common language.”

The capacity for language is a uniquely human trait. Whilst many creatures across the animal kingdom can communicate effectively to warn each other of approaching predators, to navigate, to show excitement and fear, or to mimic, only humans can speak with an infinite variety of meaning and nuance.

Adams goes on to describe how today, some linguists emulate the experience of first contact through what are called monolingual demonstrations, developed and popularised by Kenneth L. Pike (you can find examples on YouTube). Often in front of an audience, a linguist is presented with a speaker of an unknown language and not allowed to communicate with him/her in anything but that speaker’s language.

“Having done a couple of these (the first one was a bit unnerving), I can say that it’s actually quite rewarding how fast you can proceed. You present the speaker with common objects and events and then begin building up a vocabulary and basic word order.

“Success is usually achieved because (a) both participants want to communicate and are aware of the artificial context; (b) as a linguist you have a certain set of normal expectations about things like nouns, verbs, allowable word orders along with reliable methods for controlling for extraneous data; and (c) linguists have a very useful universal alphabet (IPA) [the International Phonetic Alphabet, an alphabetic system of phonetic notation based primarily on the Latin alphabet] that allows a single symbol to capture a number of oral gestures, air flow mechanisms, and things like pitch. This alphabet means that instantly the linguist can begin to graphically record speech even if they don’t know what is being said.

“In Arrival, (a) could be assumed. To a lesser degree (b) could also have been assumed as language functions only when meaning and symbols are used consistently. However, (c) would have been useless.”

Damien Daspit is a software developer and computational linguist at SIL, also currently working at Payap University. His primary role is to design and develop software that assists linguists with analysing and translating minority languages. Commenting on the film Arrival, Daspit said, “Despite the fact that Arrival is a science-fiction film about an alien encounter, it is one of the few films that I have seen that presents the real challenges that linguists and translators face when attempting to understand an unstudied language.”

Originally known as the Summer Institute of Linguistics, SIL supports language communities around the world through research, translation, training and materials development. Currently, it works alongside speakers of more than 1,600 languages in over 85 countries. Many of these languages are unstudied and some don’t even have a written form.

As Daspit explains: “This presents many challenges to linguists. Native speakers have a difficult time understanding and articulating how their language works. Linguists collect data and then analyse it to discover linguistic processes and structures. SIL has developed software to assist linguists with both the collection and analysis of linguistic data.”

In Arrival, Banks is confronted with a bizarre alien language and is tasked with deciphering it. Spoken, it sounds like the hums and clicks of whale song or dolphin noises, and in written symbols, it’s a circular array of inky patterns known as logograms. Banks starts out by writing down English words and acting out what they mean. The heptapods reply with their logograms and Banks works speedily to find a meaning within the patterns before Armageddon breaks out, or the Oscars come round, whichever happens first.

Technology played a central role in helping Banks and her team to compose words and sentences quickly and easily in the aliens’ complicated ‘alphabet’. Daspit maintains that although the heptapods’ script might seem very different from those used in human languages, there is actually great diversity in how human languages are written and technology plays an important role in helping to render human languages in all of these forms.

“We usually think of human languages as a linear arrangement of discrete symbols, where each symbol represents a sound,” he said. “This is true of many western languages that are based on a Latin script, but is not true of languages from many other parts of the world. Chinese characters are arranged linearly like Latin script, but each symbol represents a word or syllable instead of a sound. In Arabic script, the shape of some symbols changes based on the context. In Burmese script, the vowel symbol for a syllable can occur above, below, before, after or even completely surround the consonant.

“SIL works with companies in the technology industry, such as Google, Microsoft and Apple, to help to build the software to render all of the languages of the world no matter how complex they are. SIL has developed fonts and rendering engines for many of the world’s more complicated languages.”
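Daspit’s point about Arabic can be seen in the Unicode character database itself. The short Python sketch below is only an illustration, not a rendering engine and not anything SIL ships: it looks at the four legacy “presentation form” code points that Unicode keeps for the letter beh and checks that each one folds back to the same underlying letter. The function name `describe_forms` is made up for this example. In practice, text is stored as the base letter and the font and shaping software choose the right shape at display time.

```python
import unicodedata

# The Arabic letter beh as it is normally stored in text.
BEH = "\u0628"

# Legacy Unicode code points for the four positional shapes of beh:
# isolated, final, initial and medial.
PRESENTATION_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]


def describe_forms(forms):
    """Return (character name, folds back to beh?) for each shape."""
    rows = []
    for ch in forms:
        name = unicodedata.name(ch)
        # NFKC normalisation maps a positional shape to its base letter.
        base = unicodedata.normalize("NFKC", ch)
        rows.append((name, base == BEH))
    return rows


for name, same_letter in describe_forms(PRESENTATION_FORMS):
    print(name, "-> same underlying letter:", same_letter)
```

One letter, four shapes: which one appears depends on whether the letter stands alone or sits at the start, middle or end of a joined sequence.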

In the real world, when first studying a language, linguists collect word lists, stories, conversations, songs, and other texts. This data is usually collected in audio form and then transcribed. Software such as SIL’s SayMore and FieldWorks Language Explorer (FLEx) is used to manage and transcribe the data. Linguists then look for consistent patterns in the data at many different levels: phonologically (sound), morphologically (word) and syntactically (sentence).

As Daspit explains: “Sifting through all of the available data can be a daunting task. Our software helps linguists to perform searches, find patterns, and record and test hypotheses. For example, FLEx includes advanced concordancing tools that allow linguists to search for complex syntactic patterns in collected texts. Another SIL application, called Phonology Assistant, helps linguists identify sound patterns, which is useful for developing an alphabet for completely oral languages. Once a linguist finds a pattern, they can start to determine what it means.”
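For readers unfamiliar with concordancing, the idea can be sketched in a few lines of Python. This is a deliberately minimal keyword-in-context search, assuming a plain-text transcription and simple word matching. It is not how FLEx works internally, and the function name `concordance` and the sample sentences are invented for illustration.

```python
import re


def concordance(text, keyword, width=2):
    """Return (left context, keyword, right context) for each match."""
    # Lower-case the text and split it into word tokens.
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


sample = "The dog sees the cat. The cat sees the bird. A bird sees the dog."
for left, kw, right in concordance(sample, "sees"):
    print(f"{left:>12} [{kw}] {right}")
```

Lining up every occurrence of a word with its neighbours makes recurring word orders easy to spot. Real tools layer grammatical annotations and more complex queries on top of the same basic idea.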

Word list comparison is one of the primary steps a linguist takes in determining if a particular language is a new language or a dialect of another. SIL developed the software application Cog to assist language surveyors and linguists in analysing and comparing such word lists.

Daspit describes how Cog automates much of this process by determining if words from two different languages are cognates (derived from the same root word): “Cog has a built-in knowledge of phonetics and can determine the similarity between different sounds. This knowledge helps Cog to find consistent sound changes between languages. These sound changes can be used to determine if two words from two different languages come from the same root. If two languages share many similar words, then they could be related and mutually intelligible to speakers of either language.”
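To give a flavour of what ‘knowledge of phonetics’ can mean in code, here is a rough Python sketch of a phonetically weighted edit distance. It is not Cog’s algorithm. The tiny feature table, the cost scheme, the treatment of each character as one sound, and the names `sound_cost`, `word_distance` and `similarity` are all assumptions made for this illustration.

```python
# Toy feature table: (place, manner, voiced). Hypothetical and far from
# complete; sounds not listed here (vowels included) count as full mismatches.
FEATURES = {
    "p": ("labial", "stop", False), "b": ("labial", "stop", True),
    "f": ("labial", "fricative", False), "v": ("labial", "fricative", True),
    "t": ("coronal", "stop", False), "d": ("coronal", "stop", True),
    "s": ("coronal", "fricative", False), "z": ("coronal", "fricative", True),
    "k": ("dorsal", "stop", False), "g": ("dorsal", "stop", True),
}


def sound_cost(a, b):
    """Cost of swapping sound a for sound b: 0 if identical, 1 if unrelated."""
    if a == b:
        return 0.0
    if a in FEATURES and b in FEATURES:
        differing = sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))
        return differing / 3
    return 1.0


def word_distance(w1, w2):
    """Edit distance in which similar sounds are cheaper to substitute."""
    prev = [float(j) for j in range(len(w2) + 1)]
    for i, a in enumerate(w1, 1):
        cur = [float(i)]
        for j, b in enumerate(w2, 1):
            cur.append(min(prev[j] + 1,          # drop a sound from w1
                           cur[j - 1] + 1,       # add a sound from w2
                           prev[j - 1] + sound_cost(a, b)))
        prev = cur
    return prev[-1]


def similarity(w1, w2):
    """Scale the distance to a score between 0 (unlike) and 1 (identical)."""
    longest = max(len(w1), len(w2))
    return 1.0 if longest == 0 else 1 - word_distance(w1, w2) / longest


# Latin "pater" and Gothic "fadar" both mean "father".
print(similarity("pater", "fadar"))
print(similarity("pater", "kinos"))  # "kinos" is a made-up comparison word
```

Because p and f, or t and d, differ in only one feature, the sketch scores ‘pater’ and ‘fadar’ as much closer than an unrelated pair. A real tool would also check whether the same sound correspondences recur across the whole word list, which a single pairwise score cannot show.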

The process of analysing these word lists is tedious, error-prone and time-consuming if done by hand. By automating the process, software applications like Cog allow linguists to spend more time understanding the deeper sociolinguistic relationships between languages, which express our concepts of identity and culture.

The chances of an alien race ever coming to visit Earth are pretty slim, but it’s not unreasonable to think that there might be something out there in the estimated 100 billion galaxies in the Universe. If they do appear, our tech-savvy linguists will be waiting for them.
