The AI technology that generates believable text now produces tunes, too. Both the text and the music models come from OpenAI, a California company dedicated to developing artificial general intelligence. Its newest model is called MuseNet.
OpenAI says MuseNet can generate an original composition up to four minutes long, scored for up to 10 instruments, in many styles. The example to the right is a jazz trio.
The system is given a short passage of music that acts as a ‘prompt’ — a new twist on the old idea of ‘hum a few bars.’ But instead of recalling an old favourite, the algorithm takes its cue from the prompt and generates a completely new piece. For example, a country prompt will trigger a tune with the characteristics of country music. The machine determines the features of country music by assessing distinguishing patterns from thousands of country tunes used to ‘train’ the system.
MuseNet generates music in many different styles. Examples are in this ‘concert’ made by OpenAI and posted on Twitch. The pieces vary in length.
Transforming music and text
The technical term for the AI engine is a ‘large-scale transformer model’. The computer predicts what comes next based on the sequence that came before. Given a set of notes, it predicts the next one, adds it to the sequence, and predicts again, repeating the cycle very quickly as the piece evolves.
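The predict-then-extend loop described above can be sketched in a few lines of Python. This is only an illustration: a simple table of note-to-note counts stands in for the transformer, and the function names, the two training melodies and the prompt are all hypothetical rather than anything taken from MuseNet.

```python
import random
from collections import Counter, defaultdict

def train_next_note_counts(melodies):
    """Count how often each note follows each other note (toy stand-in for a model)."""
    counts = defaultdict(Counter)
    for melody in melodies:
        for current, following in zip(melody, melody[1:]):
            counts[current][following] += 1
    return counts

def generate(counts, prompt, length, rng):
    """Extend the prompt one note at a time, feeding each prediction back in."""
    piece = list(prompt)
    for _ in range(length):
        options = counts.get(piece[-1])
        if not options:  # no known continuation for this note
            break
        notes = list(options.keys())
        weights = list(options.values())
        # Sample the next note in proportion to how often it followed the last one.
        piece.append(rng.choices(notes, weights=weights, k=1)[0])
    return piece

# Hypothetical training data: two short melodies written as note names.
melodies = [
    ["C", "E", "G", "E", "C"],
    ["C", "E", "G", "C"],
]
counts = train_next_note_counts(melodies)
piece = generate(counts, prompt=["C", "E"], length=6, rng=random.Random(0))
print(piece)
```

A real transformer looks at far more than the single previous note, but the outer loop has the same shape: predict, append, repeat.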
The transformer model works the same way whether it is processing notes or words. In February, OpenAI showed its text-generating model, GPT-2. With it came concerns about possible malicious uses of AI-generated text. Artificial music is less contentious, but the underlying concept is the same: generate ‘new’ based on ‘known.’
The model knows nothing about language or music, only their patterns and how they vary according to different styles. For example, news reporting and academic writing are different styles of writing, just as classical and jazz are different styles of music. The system becomes familiar with the characteristic structures and nuances of each by assessing very large collections of examples.
In AI-speak, this is known as unsupervised learning, because the algorithm arrives at an ‘understanding’ on its own, without examples labelled by people.
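One way to picture learning without labels: the raw sequence supplies its own answers, because every note (or word) is the ‘correct’ prediction for the notes before it. The sketch below shows that idea only. The function name and the `context_len` parameter are assumptions made for illustration, not details of OpenAI's training setup.

```python
def make_training_pairs(sequence, context_len):
    """Turn one unlabelled sequence into (context, next_item) examples."""
    pairs = []
    for i in range(1, len(sequence)):
        # The context is up to `context_len` items immediately before position i.
        context = tuple(sequence[max(0, i - context_len):i])
        pairs.append((context, sequence[i]))
    return pairs

# Hypothetical input: a five-note scale fragment. A list of words would work too.
notes = ["C", "D", "E", "F", "G"]
pairs = make_training_pairs(notes, context_len=3)
for context, target in pairs:
    print(context, "->", target)
```

Nobody has to mark up the data by hand, which is why very large collections of music or text can be used as they are.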
The result is chameleon-like. The newly generated material takes on the style it’s given: rapid, short notes for a Bach style, sweeping flourishes for a movie-theme style, and so on.
Styles can be blended, too. The model can be prompted with Bach and then asked to interpolate Lady Gaga. It is not mixing pieces of music, but their patterns. Here is an example of a Bon Jovi style prompted by six notes from Chopin.
As with many AI models, these are early days. The amount that can be generated is limited by algorithmic memory, the extent to which the algorithm can ‘remember’ earlier sequences; beyond that limit, consistency is lost. The company says some combinations are unsuccessful, such as Chopin with drums. Transformer shortcomings are more noticeable in text: repeating words or passages, switching topics unnaturally, or mixing up worlds, for example describing ‘a fire underwater.’
Still, GPT-2 text was sufficiently coherent over multiple paragraphs to prompt concerns about spreading untruths. No similar flags have gone up about its sibling MuseNet, although critics are split on its contribution to music.
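The memory limit can be illustrated with a fixed-size window over the piece so far. The sketch assumes a hypothetical `context_len` setting; the numbers are invented for the example and say nothing about MuseNet's real capacity.

```python
def visible_context(piece, context_len):
    """Return the part of the piece a fixed-memory model can still see."""
    if context_len <= 0:
        return []
    return piece[-context_len:]

# An opening motif (C, E, G) followed by ten later notes.
piece = ["C", "E", "G"] + ["A"] * 10

short_memory = visible_context(piece, context_len=4)
long_memory = visible_context(piece, context_len=16)

# With a short memory the opening motif has dropped out of view,
# so the model has nothing to keep the ending consistent with it.
print("C" in short_memory, "C" in long_memory)
```

Whatever falls outside the window cannot influence the next prediction, which is one reason longer pieces can drift.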
- Computer-generated music is not new: think of all those drum machines or melody tracks for solo performers. This breakthrough is about a machine composing by itself, in many musical styles.
- The bigger advance is demonstrating that the same type of AI model can produce more than one kind of generative output; in other words, the same kind of model can generate either music or text.
- A future phase of AI, when the same machine is capable of many or most things, is known as artificial general intelligence, or AGI. This outcome is not AGI, but it is a step along that path, and it is significant that it comes from a company that directs its research towards achieving AGI.
- Music is safer than text for experimentation. Sour notes are likely the worst outcome when a new trial falls short. With artificial text, the stakes are much higher until there are better methods of detecting fakery.
- There will be several knock-on questions. When an algorithm generates text or music that is original, does it constitute creativity? How is computer-generated text or music treated in copyright law? If it harms or infringes on the works of others, who is responsible?
SEE RELATED STORIES
OpenAI’s MuseNet generates AI music at the push of a button
THE VERGE | April 26, 2019 | by Jon Porter | ‘Lady Gaga’s Poker Face in the style of Mozart? Sure, why not’
This AI-generated musak shows us the limit of artificial creativity
MIT TECHNOLOGY REVIEW | April 26, 2019 | by Will Knight | ‘A powerful AI algorithm can dream up music that echoes Bach or the Beatles, but it isn’t real creativity.’
MuseNet generates original songs in seconds, from Bollywood to Bach (or both)
TECHCRUNCH | April 25, 2019 | by Devin Coldewey | ‘What makes it impressive is that a single model does this reliably across so many types of music.’
OpenAI’s MuseNet AI generates novel 4-minute songs with 10 instruments across a range of genres and styles
VENTUREBEAT | April 25, 2019 | by Kyle Wiggers | ‘…a small but noteworthy step forward in autonomous music generation research…’