Can AI be music to our ears?
From ‘deepfakes’ of hits by Elvis and Nirvana, to real-time soundtracks for streamers, to otherworldly sounds generated by cutting-edge instruments, AI is transforming the way music is created and experienced.
When AI music researchers Bob L Sturm and Oded Ben-Tal realised they had created algorithms effective enough to imitate traditional Irish music compositions, the pair came up with an ingenious idea for an experiment.
They would hire professional musicians to record an album, ‘Let’s Have Another Gan Ainm’, using material generated by a computer trained on over 23,000 transcriptions of traditional music. UK musician Daren Banarse was drafted to curate the work. In a cunning ruse, CDs were sent to various critics in the US and Europe with an elaborate fabricated backstory printed on the album sleeve, attributing the tunes to the Ó Conaill family, including daughters Caitlín and Una.
Contrary to expectations, the release received almost universal acclaim, says Sturm, an associate professor at KTH Royal Institute of Technology in Sweden: “The responses were very positive; one said: ‘This is really great. You’re doing a great service to Irish traditional music by releasing this album’. The album got radio play in the US and one of the counterfeit band members even received private messages of support.”
The success of the experiment highlights not only how folk music is an excellent sandbox to explore AI music, but the extent to which the technology has advanced in recent years.
The story of artificial intelligence and music dates to the 1950s when Alan Turing, the godfather of computer science, built a machine that generated three simple melodies. Subsequent advances have included the first paper on algorithmic music composition using the Ural-1 mainframe computer, by R Kh Zaripov, and a computer program developed in the 1990s able to write new compositions in the style of Bach.
Progress rapidly accelerated in the 2010s, thanks in part to the work of devoted research teams at universities, investments from major tech companies and knowledge-sharing at machine-learning conferences like NeurIPS.
AI music systems now exploit artificial neural networks, modelled on the human brain, which adapt and learn from past patterns. When a large quantity of audio files or musical scores is fed in, the systems analyse the data, identify patterns such as melodies, drum breaks or chord progressions, and produce new material in a similar style. Musicians and producers then work with the output, editing and enhancing it using conventional techniques to arrive at a final composition.
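The core idea of learning patterns from a corpus and sampling new material in the same style can be illustrated with a deliberately tiny sketch. Real systems use deep neural networks trained on thousands of transcriptions; here a first-order Markov chain stands in for them, and the note sequences are invented for illustration, not real tunes:

```python
import random
from collections import defaultdict

# Toy training data: symbolic melodies as lists of note names (invented).
training_melodies = [
    ["D", "E", "F#", "G", "A", "B", "A", "G", "F#", "E", "D"],
    ["A", "B", "A", "G", "F#", "G", "A", "D", "E", "F#", "G"],
]

# 'Learning' here means counting which note tends to follow which.
transitions = defaultdict(list)
for melody in training_melodies:
    for current, nxt in zip(melody, melody[1:]):
        transitions[current].append(nxt)

def generate(start, length, seed=0):
    """Sample a new melody that follows the learned note-to-note patterns."""
    rng = random.Random(seed)
    melody = [start]
    for _ in range(length - 1):
        choices = transitions.get(melody[-1])
        if not choices:
            break
        melody.append(rng.choice(choices))
    return melody

print(generate("D", 8))
```

A neural network replaces the frequency table with learned weights and can capture far longer-range structure, but the workflow — ingest a corpus, model its statistics, sample something new in the same style — is the same.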
“AI today is getting better at understanding long-term structure,” explains AI researcher and composer Hendrik Vincent Koops, who led the judging panel for the AI Song Contest 2021. “Just as natural language processing is able to generate longer pieces of text that look like they might have been written by a human, music generation can handle much longer structures, where before it could only generate small snippets of a melody or small chord sequences.”
The technology is also driving innovation in audio synthesis. AI models were previously only able to output symbolic representations of music, like sheet music, but thanks to better computational power they can now generate audio directly.
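The gap between the two representations is worth making concrete. A symbolic output is just a list of notes; generating audio directly means producing tens of thousands of numbers per second. This sketch (with assumed frequencies and note lengths) renders a three-note symbolic sequence into raw samples and writes a playable WAV file:

```python
import math
import struct
import wave

SAMPLE_RATE = 22050                       # samples of audio per second
NOTES = [("A4", 440.0), ("C5", 523.25), ("E5", 659.25)]  # symbolic form

# Raw audio form: one amplitude value per sample, a quarter-second per note.
samples = []
for _, freq in NOTES:
    for n in range(SAMPLE_RATE // 4):
        samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))

# Pack the floats into 16-bit integers and write a mono WAV file.
with wave.open("notes.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(SAMPLE_RATE)
    f.writeframes(b"".join(
        struct.pack("<h", int(s * 32767)) for s in samples))
```

Three symbols become over 16,000 numbers; a model that generates audio directly must predict every one of them, which is why the advance needed far more computational power than symbolic generation.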
Many musicians see AI as the jumping-off point for a new era of creativity, able to push music in new and inspiring directions. Collaborations between artists and scientists have spurred the development of more intuitive software and instruments designed to democratise AI and make it more accessible.
On the flip side, others are wary of the implications of such a powerful computational force. AI has already displaced jobs across various industries, and to some the idea that an algorithm could automate such a deeply personal and emotive medium is disturbing.
Will the pop stars of the future be sophisticated algorithms, able to enchant listeners with their far-out creations, while human musicians languish on the sidelines, considered quaint artefacts of the past?
The mass proliferation of content creators on streaming and social media platforms, including YouTube, Twitch and podcasts, has accelerated demand for professional-sounding music that can be played royalty-free without the danger of infringing copyright.
New AI tools allow businesses and individuals without any previous experience of composition or production to configure their own background music in just a few clicks, simultaneously side-stepping the complexities of broadcast rights clearance.
Amper was set up by three former Hollywood film composers to enable users to instantly create entire tracks by specifying the length and structure, tempo, instruments, and mood. Additional editing tools make it possible to tweak tracks to hone the sound.
Through a similar process, London-based tech start-up Jukedeck composes AI tracks in seconds using neural network technology. The company was recently acquired by TikTok’s parent company ByteDance, fuelling rumours that TikTok users will soon be able to add machine-generated music to their videos.
California-based Infinite Album has developed an AI-powered real-time soundtrack for video-game livestreamers. Designed to replace regular in-game music, which has led to frequent copyright ‘strikes’ and Twitch streamers losing their channels, it enables users to set their own parameters for the music, such as style, mood, song/melody, tempo and intensity, then changes and adapts in response to live events in the game.
“If you’re walking through a forest in a game, you might hear happy folk music, but then you get attacked by thieves and the music could change to intense, maybe electronic music,” explains Ryan Groves, founder of Infinite Album and long-term developer of intelligent music systems.
Changes to the music are triggered by instructions from a games-modding (modifying) platform, which relays data on things like the player’s health, whether a fight is starting, or whether the player has killed somebody or died. Onscreen controls also enable both the streamer and viewers to manipulate the sounds in real-time, opening new possibilities for games interaction and engagement.
Infinite Album’s AI exploits natural language processing techniques and probabilistic models, including hidden Markov models. Compositions are generated on-the-fly by synthesised instruments through a process of ‘infinite variation’, whereby any controller change (for example, turning an emotion wheel from happy to sad) triggers the system to recompose the music and sound design.
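One way to picture that recomposition step is a probabilistic model whose transition probabilities swap out when the emotion parameter changes. The chord tables and probabilities below are invented for illustration — Infinite Album’s actual models are far richer — but the mechanism of a controller change steering the generator is the same:

```python
import random

# One chord-to-chord transition table per mood (probabilities invented).
TRANSITIONS = {
    "happy": {"C": {"F": 0.5, "G": 0.5}, "F": {"C": 0.7, "G": 0.3},
              "G": {"C": 0.8, "F": 0.2}},
    "sad":   {"Am": {"F": 0.6, "Em": 0.4}, "F": {"Am": 0.5, "Em": 0.5},
              "Em": {"Am": 0.9, "F": 0.1}},
}
START = {"happy": "C", "sad": "Am"}

def next_chord(mood, current, rng):
    """Sample the next chord from the active mood's transition table."""
    options = TRANSITIONS[mood].get(current)
    if options is None:          # mood just changed: restart in the new key
        return START[mood]
    chords = list(options)
    weights = [options[c] for c in chords]
    return rng.choices(chords, weights=weights)[0]

rng = random.Random(42)
chord, progression = "C", []
for mood in ["happy"] * 4 + ["sad"] * 4:  # emotion wheel turned mid-stream
    chord = next_chord(mood, chord, rng)
    progression.append(chord)
print(progression)
```

Turning the wheel mid-stream simply swaps which table the sampler draws from, so the music pivots without stopping — the real-time equivalent of the ‘infinite variation’ described above.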
“Every time we talk to streamers, they come up with new ways the technology could be interesting,” says Groves. “They can choose a specific track for a map in a game. One guy had the idea that viewers could change the music to children’s music for 30 seconds if he wasn’t paying attention and his character died.”
For AI-sceptics, automated software for music making on-demand represents a worrying trend that presages a world where music is standardised, formulaic and divorced from human emotion.
However, many musicians are harnessing AI to achieve precisely the opposite effect: to push against established musical norms and uncover new forms of expression supported by the latest cutting-edge technology.
“This will be a pivotal time for AI and creativity. The love of old and new technology, waiting to be re-imagined and re-explored,” says Carol Reiley, co-founder of Deepmusic.ai, an organisation set up to connect AI and arts communities to help shape the future of the technology.
Deepmusic.ai encourages artists and computer scientists to experiment together with AI and runs an open-source database of AI creative projects across music generation, public performance and painting, including information on funding sources, how the algorithms work, and the datasets they are trained on.
Contrary to what you might expect, Reiley says human composers (in the realm of instrumental/orchestral music) typically must work hard to collaborate with raw AI output. “They are essentially learning a new instrument of creativity – one without an established linear tradition,” she says.
Where AI teams tend to optimise for listenability, creating clips that adhere to popular motifs or certain keys, she says composers tend to find the ‘failure cases’ that lie outside of those boundaries more interesting.
Using digital technology as a catalyst for new ideas and creative direction is a familiar experience for bands and electronic musicians. Many already exploit advanced software synths and samplers as triggers for inspiration. The AI Song Contest 2021 saw 38 groups from across the world co-create with AI to produce four-minute songs that capture the best of human and machine creativity.
The winning entry, ‘Listen To Your Body Choir’ by M.O.G.I.I.7.E.D., exploited a variety of AI models and plug-ins to produce a haunting pop song with distorted vocals, discordant melodies and a stilted drum-machine beat. The song takes inspiration from ‘Daisy Bell’, the first song ever ‘sung’ by a computer in 1961.
The transformer-based language model GPT-2, developed by OpenAI and trained on a massive text dataset, generated lyrics that ‘continue’ a line from ‘Daisy Bell’. Melody-RNN, a language model for musical notes, extended the song’s melody. Another AI model, trained on an audio recording of the band’s singer Brodie, produced a stream of bits and syllables used to enhance and morph the vocals.
According to band member and music/AI researcher Jon Gillick, the most successful machine-learning output was the lyrics. The song’s title came from GPT-2. “We didn’t expect the lyrics to be of much interest, but when we saw those words in that combination we thought ‘OK, that has to be the chorus’.”
None of the AI tools employed were advanced enough to generate a song in its entirety, he adds: “It was really more about figuring out ways to get inspiration or ideas we wouldn’t have thought of otherwise. If you’re stuck, AI is a very fast way to get out of your own head and generate a bunch of ideas connected to an idea you give it. That’s the interesting part: it shows you the new directions you could take.”
Some artists believe AI will help shape entire new musical genres or even transform the way music is consumed. Back in 2012, researchers at Goldsmiths, University of London, devised a revolutionary new commercial music format, Bronze, that used algorithms to manipulate and transform songs so that every listen is unique. The project demonstrated how recorded music could be freed from its traditionally static existence.
Intrigued by how the format could connect live and recorded music like never before, Venezuelan-born singer and record producer Arca (a previous collaborator with Kanye West, Björk and FKA Twigs) used an updated incarnation of Bronze to turn the track ‘Riquiqui’ into a dynamic, ever-transforming representation of itself.
The continuously morphing soundscape now forms part of a multi-media installation by French artist Philippe Parreno in the lobby of New York’s Museum of Modern Art.
From Mozart’s piano to the early use of synthesisers by jazz legend Herbie Hancock, to the mangled beats of Aphex Twin, musicians have long pioneered new technologies that expand the possibilities of sound design. Now AI and machine learning are helping shape entirely new sounds that empower musicians to express themselves in new ways.
German IDM pioneers Mouse on Mars have dedicated over 20 years to exploring electronic music, and for their latest project the band developed a custom AI, with studio Birds on Mars, that turns speech into a musical instrument. The technology formed part of their headline live performance at the AI and Music Festival in Barcelona in October.
Google Magenta, a research project exploring the role of machine learning as a tool in the creative process, has produced a collection of plugins for music generation built on its open-source tools and models. These can be played by anyone at home on a standard PC.
The musical instrument NSynth Super, developed by Magenta with Google Creative Lab, takes a novel approach to music synthesis. Unlike a traditional synthesiser, which generates audio from components such as oscillators and wavetables, it uses deep neural nets to learn the characteristics of sounds, then creates completely new sounds based on them.
Rather than simply combining or blending sounds, it synthesises entirely new sounds from the acoustic qualities of input samples, such as a flute and a sitar. Musicians can also manipulate the audio using built-in controls and a touch pad.
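The underlying trick — and the ‘latent space’ Rob Clouth refers to below — can be sketched in miniature. Each sound is reduced to a vector of learned features, and new hybrid timbres come from points between two vectors. The feature values here are invented placeholders; NSynth’s real embeddings have thousands of dimensions learned by the network:

```python
# Invented 3-dimensional 'latent' feature vectors for two instruments.
flute = [0.9, 0.1, 0.4]
sitar = [0.2, 0.8, 0.7]

def interpolate(a, b, t):
    """Linearly blend two latent vectors: t=0 gives a, t=1 gives b."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

# A point halfway between the two timbres in latent space.
hybrid = interpolate(flute, sitar, 0.5)
```

Decoding such an in-between point back into audio is what yields a genuinely new sound, rather than two recordings simply layered on top of each other.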
Yet according to Rob Clouth, a musician working at the vanguard of generative music, the goal with AI shouldn’t necessarily be to create entirely original sounds. Speaking at the AI and Music Festival, he said: “If sound design is completely alien, you can’t quite connect with it. For me, an interesting sound is something that relates to something you already know and can locate within your experience. I like the idea of interpolating between different sounds in a latent space because it gives you that ‘uncanny valley’ feeling [the strange juxtaposition and tension between human and machine], which allows you to be more emotionally affected by music.”
Musicians working with AI often see it as a tool to enhance creation and a democratising force, but others worry the technology will become more dominant and ultimately displace musicians from their jobs.
In a podcast interview in 2019, the musician and record producer Grimes made the controversial prediction that we may be nearing the end of human art, saying: “Once there’s actually AGI (artificial general intelligence), they’re gonna be so much better at making art than us.”
Her ideas may have been influenced by then-boyfriend Elon Musk, a co-founder of OpenAI, the lab that went on to launch arguably the world’s most powerful AI music-generation system, Jukebox.
Trained on 1.2 million song recordings scraped from the web, the neural net system generates audio, including rudimentary singing, to mimic various genres and artist styles. It was used to make ‘deepfakes’ of songs by Katy Perry, 2Pac, Elvis, Simon and Garfunkel, Céline Dion, and others.
The results are impressive, though frustrating, with authentic-sounding vocals marred by garbled lyrics and strange shifts in melody and harmony. User comments on SoundCloud for the deepfake track ‘Classic Pop’, in the style of Elvis Presley, humorously convey the listening experience: “This is what Elvis sounds like after he’s dead”; “Haunted data wants to sing.”
Most experts agree AI is still a long way from being able to create original hit songs independently. A significant part of the challenge is the sheer complexity of musical structure and styles and the inability to capture and define them in algorithms.
Hendrik Vincent Koops explains: “Songwriting is very hard even for humans. If you ask somebody who wrote a good song how they did it, they’ll tell you they had some inspiration, or they thought this idea was nice. They can’t really express it in mathematical terms to inform an AI model.”
Songs are also shaped by intangible factors such as human emotion, lived experience and complex socio-cultural influences, which a computer finds incredibly hard to decipher and translate into music.
It’s also logical to ask if people’s own internal ‘programming’ makes them inherently averse to the idea of listening to music produced by a machine.
In an interesting twist in the tale of Bob Sturm’s Irish folk music experiment, a mainstream UK news website published a story criticising the album as a fake made by ‘bot villains’. When Sturm was asked to provide an audio excerpt of machine-generated music to include with the piece, he instead sent a clip of a ‘real’ traditional Irish jig.
Responses from readers in the comments section were revealing, he says: “People said things like ‘it sounds robotic’, ‘there’s not much warmth’, ‘this is against the nature of what true music is’. It was interesting to see the bias that people have to the idea that machines might create music that is beautiful, or in some sense wonderful.”
Whether or not humans are ready to embrace pure AI music is uncertain, but in the meantime, musicians and researchers will continue to push the boundaries in the hope that one day they do.
Development of AI
The intertwining of artificial intelligence and music might appear super-sophisticated and intangible, but in fact it represents the latest technological stage of a long history dating back to the ancient Greeks.
570 BC: The Ionian Greek philosopher Pythagoras discovered that music is deeply mathematical and that melody can be represented using numbers and ratios.
1951: Alan Turing, the godfather of computer science and creator of the Enigma code-breaking machine in the Second World War, built a machine that generated three simple melodies, including ‘Baa, baa black sheep’ and ‘God save the King’.
1960: Russian researcher R Kh Zaripov published the first paper on algorithmic music composing using the Ural-1 mainframe computer.
1965: Inventor Ray Kurzweil premiered a piano piece created by a computer capable of pattern recognition, which was then able to analyse and exploit the patterns to create novel melodies. The computer debuted on Steve Allen’s ‘I’ve Got a Secret’ programme in the US.
1979: The influential electronic synthesiser Fairlight CMI was the first able to record and play back samples at different pitches, coining the term ‘sampling’.
1997: The AI program Experiments in Musical Intelligence (EMI) appeared to outperform a human composer in the task of composing a piece of music imitating the style of Bach.
2018: Music collaborative SKYGGE, led by French pop artist Benoît Carré, released ‘Hello, World’, the first music album composed with the help of AI technology, based on research by François Pachet.
2019: Experimental singer-songwriter Holly Herndon received acclaim for ‘Proto’, an album in which she harmonised with an AI version of herself. The American electro/synth pop band Yacht teamed up with Google Magenta to create the Grammy-nominated AI-assisted album ‘Chaintripping’.
2020: The Jukebox project, run by the research and deployment company OpenAI (co-founded by Elon Musk), generated the first ‘deepfake’ songs, complete with lyrics, seemingly performed by dead stars including Frank Sinatra and Elvis Presley.