AI Futures: how artificial intelligence will shape music production
Already part of some major DAWs, including Logic Pro, AI and machine learning are becoming a staple of the music studio, from assisted mixing and search through to the advent of complete AI DAWs. DJ Mag's digital tech editor explores their impact on music production and engineering in the second part of our three-part AI Futures series
In part one of our AI Futures series, we discussed the looming threats and opportunities around ‘deepfakes’ or style transfers using AI. We spoke to Holly Herndon, a Berlin-based artist who’s been deep in the trenches with AI for many years. We also explored how deepfakes are ushering in a Sampling 2.0 era, and how the mistakes of the past have a chance to be rectified for the future. It’s worth reading part one before you continue.
In this section, we’re exploring the impact of AI in the studio. For producers and songwriters, the idea of an autonomous collaborator that makes suggestions for your music or arrangement, helps you write lyrics, or simply does the job for you is generally met with unease. That’s despite the fact that AI is already part of some major DAWs, including Logic Pro, which uses the tech to automatically detect and create tempo markers as you play. Third-party plugins like Google’s Magenta Studio project use machine learning to generate chords, melodies, drum patterns or even whole bars of an arrangement, based on existing MIDI files and some basic parameters set by the user. There are plenty more AI plugins and tools on the market, including Amper, Rhythmiq, Musico – this list of 13 is a good place to start exploring. For this piece, we’re going to focus on iZotope.
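For the technically curious, here’s a rough sense of what “existing MIDI files and some basic parameters” means under the hood. The sketch below uses Google’s open-source note_seq library, the plumbing beneath the Magenta project, to build a short primer melody and save it as a MIDI file a generative model could continue; the generation step itself differs from model to model, so it’s only hinted at in a comment rather than spelled out.

```python
# A minimal sketch of preparing MIDI for an ML melody generator, using
# Google's open-source note_seq library (pip install note-seq).
# Exact API names can vary slightly between versions.
import note_seq
from note_seq.protobuf import music_pb2

# Build a four-note primer melody (C4, E4, G4, B4) at 120 BPM.
primer = music_pb2.NoteSequence()
primer.tempos.add(qpm=120)
for i, pitch in enumerate([60, 64, 67, 71]):
    primer.notes.add(
        pitch=pitch,
        velocity=90,
        start_time=i * 0.5,        # one note every half second
        end_time=(i + 1) * 0.5,
    )
primer.total_time = 2.0

# Save the primer as a standard MIDI file that any DAW, or a Magenta
# model such as MelodyRNN, can take as its starting material.
note_seq.sequence_proto_to_midi_file(primer, 'primer.mid')

# A Magenta continuation model would now 'listen' to this primer and
# generate the next few bars, steered by user-set parameters such as
# length and 'temperature' (how adventurous the output is allowed to be).
```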
iZotope is one of the world’s leading plugin manufacturers, and their tools are used by everyone from Skrillex to Trent Reznor and Just Blaze. In 2014, they began building a research team that took an unlikely influence for a new ML algorithm designed to help users mix their music: Facebook.
“Facebook was starting to crack the nut of facial recognition and we became interested in how we could use those techniques for audio applications,” says Jonathan Bailey, iZotope’s Chief Technical Officer. Bailey has been with the company since 2011, back when AI “wasn’t very sexy and buzzy”. The resulting technology — Track Assistant, inside a plugin called Neutron — is based on ML and essentially ‘listens’ to your input, identifies the instrument and makes suggestions around mixing techniques.
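iZotope don’t publish how Track Assistant works internally, but the broad recipe Bailey describes (listen to the audio, identify the instrument, suggest a starting point) can be sketched as a toy pipeline. The example below is an illustration of that pattern rather than iZotope’s actual algorithm: it leans on librosa for feature extraction and scikit-learn for the classifier, and the training files and suggestions are placeholders.

```python
# Toy sketch of an 'assistive mixing' pipeline: classify the instrument
# on a track, then suggest a starting move. Not iZotope's algorithm,
# just an illustration of the listen -> identify -> suggest pattern.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def features(path):
    """Summarise a recording as a single MFCC-based feature vector."""
    audio, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# In reality you'd train on thousands of labelled recordings; here the
# training set is a handful of hypothetical (file, label) pairs.
training = [("kick_01.wav", "kick"), ("vocal_01.wav", "vocal"),
            ("bass_01.wav", "bass")]
X = np.array([features(f) for f, _ in training])
labels = [label for _, label in training]
clf = RandomForestClassifier(n_estimators=100).fit(X, labels)

# Placeholder mapping from instrument to a neutral starting suggestion.
SUGGESTIONS = {
    "kick":  "high-pass other tracks around 30 Hz, slight boost near 60 Hz",
    "vocal": "gentle presence lift around 3 kHz, de-ess if needed",
    "bass":  "cut mud around 300 Hz, watch masking against the kick",
}

def track_assistant(path):
    """Return the predicted instrument and a starting suggestion."""
    instrument = clf.predict([features(path)])[0]
    return instrument, SUGGESTIONS[instrument]
```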
“That was really successful for us in a couple of dimensions,” Bailey says. “One, it was a cool technological breakthrough; it was the first example where we really led in the marketing. We branded and positioned this product as powered by machine learning, and that was really interesting to our core audience who represented the nerdier, techier side of the audio community.”
"We have some spirited debates: from a fully manual process to a fully automated process, where is the right set point on that range?" – Jonathan Bailey
MusicTech magazine called it “software that does your mixing”, and the release accelerated conversations that were already bubbling around automation, creativity and AI’s role in the studio. (Automated mastering tools like LANDR and Cloudbounce have been around for a while, but don’t operate within a DAW, and the user has much less control over the output.)
“Candidly, we initially thought of this as being a useful feature for people who had less experience doing music production. It’s an assistant — it’s designed to help them get a good sound, at least a starting point. We have some spirited debates: from a fully manual process to a fully automated process, where is the right set point on that range?”
For producers and engineers, it’s the closest thing to the clichéd ‘robots will steal our jobs’ rhetoric that’s been circulating since robots were first imagined. But is it a legitimate concern? As the line continues to blur between tool and collaborator, we asked Bailey if iZotope felt a sense of responsibility as an industry leader around AI and ML tech for engineers and producers, and what it could mean for the future of music production.
“I get asked this question a lot, and my traditional answer is: as a mix engineer, if you make your living by loading up a session, getting the most generic mix in place and not really applying any of your own creativity and humanity to that and moving on to the next assignment, then I’m sorry, you are going to be replaced by technology. If you offer no spice beyond that, then you are purely solving technical problems, and technology will replace you. But I don’t think there are many people that actually work that way.”
For touring DJs who spend weekends on the road, with limited studio time mid-week, reducing studio admin could be a godsend.
“Even a problem as simple as: you’ve done a recording and opened up your session. The levels are pretty good, there’s not a lot of frequency clashing in the mix — everything kind of generically sounds good. That should be a starting point for creative work, not an ending point,” insists Bailey. “There aren’t too many engineers that I know that wouldn’t love living in that world, where they can just focus on the art part of the job.”
For Bailey, the responsibility lies less with the impact of these tools on creativity and more with the ethics around their misuse across the industry as a whole. Once again, the conversation turns to deepfakes.
"If you make your living by loading up a session, getting the most generic mix in place and not really applying any of your own creativity, then I’m sorry, you are going to be replaced by technology" – Jonathan Bailey
“The bleeding edge right now in research in the world of deep learning is primarily focused on content synthesis,” he continues. “Re-creations and deepfakes would be an example of this, where we can synthesise content that had never existed before. There are some pretty interesting ethical questions that come up on both sides of the equation. Are we using people’s data to create models in an ethical way? That’s really important to me.
“On the flip side, are the applications of our algorithms being used in an ethical way? I don’t know if we can guarantee that,” he admits. “If someone used an iZotope tool to create a Miles Davis solo that never existed, is that ethical or not? Is that good for the world or not? I’m not sure. That’s a challenge as a civilization that we have to confront with the power of these tools.”
Deepfakes may be the most obvious cultural question mark over what’s possible with AI and ML music-making in the coming years, but there are other implications for creative individuals. For those working to tight deadlines, whose ears have grown tired after hours in the studio, or who have been on the road all weekend and need to finish a remix, self-doubt can creep in. Being able to reach for a tool that is, technically, always ‘correct’ (or at least as ‘correct’ as it has been taught to be) could throw up a fascinating creative dichotomy, one that plays on the imposter syndrome many producers feel, be it on their first track or 20 years into their career. Getting a second ‘opinion’ from an AI tool to confirm you’re at least on the right track is a tempting proposition. The algorithm, after all, ‘knows’ everything, and if you’re working in an untreated room on entry-level speakers, without decades of engineering experience behind you, it makes sense that you’d be tempted to take the AI’s word over your own. And while collaborating with a fully-formed human is about bouncing ideas around, spurring each other on and bringing sometimes-contradictory musical influences to the table, it’s also about sharing the burden of putting your art out into the world. Once AI becomes a more competent collaborator, more conceptual questions will need to be asked. For now, though, iZotope’s focus is purely on problem-solving.
“When we design the tools, we don’t see it that way,” says Bailey. “Our mission is to provide tools that enable people to be creative. Audio creation and audio production have gone from a place of being a highly, highly technical discipline and, through the invention of disruptive technology, [have] become less and less about the technical space and more about the creative problem space, thereby allowing more people to participate [in it].
“Audio masking between two audio tracks, that is not a particularly creative problem to have. But asking ‘Should the bass or the vocal be more prominent in the mix during the bridge?’ that’s a creative problem and ultimately the creators should be making those decisions, [not us].”
“One risk of AI is that you end up with a song-generating machine that makes some beautiful compositions but it never credits any of the things it used to make it" – Yotam Mann
Is it inevitable then, that over the next few years, every plugin will have some kind of AI functionality?
“There’s a bit of a split in the market in how plugins are designed,” explains Bailey. “There’s the emulation of analogue hardware, and then there are pure software products like iZotope and FabFilter. I’m aware of companies in both categories that are using deep learning. Even for companies like Universal Audio, whose whole thing is emulating analogue gear, ML gives us techniques to do that more effectively. My opinion is we’re gonna see more and more of it.”
As companies like iZotope and UA embrace the potential of AI both under the hood and as part of the user experience, other companies are taking a more extreme approach, completely re-thinking what AI and ML could mean for the modern DAW and re-writing the rule book on how music is made using computers.
Never Before Heard Sounds was founded in 2020 by Yotam Mann and Chris Deaner. “I felt like there was a huge opportunity for things that were not just showing off the power of AI, but using the power of AI and giving it to actual musicians to make new and interesting music,” says Mann. By putting the musician first, and the tool second, NBHS is aiming to re-think how we interact with AI in the studio and on stage. “As musicians, we’re not interested in automatic music creation technology, which is a lot of the direction AI music production seems to be going in — very few people are trying to build tools to augment musicians instead of replace them.”
As well as collaborating with Holly Herndon on Holly+, they’ve built a website that lets you resynthesise the audio from any YouTube video as either a choir or a string quartet. Try it yourself here. Remarkably, they’ve also built a prototype piece of hardware that can resynthesise any incoming audio through a model of another sound in real time, a world first.
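NBHS haven’t detailed how the box works, but the engineering problem it has to solve (pull audio in, push it through a model, send it back out fast enough to feel instantaneous) boils down to a block-based streaming loop like the one sketched below. The timbre model here is a stand-in, not their technology; the point is the low-latency plumbing around it, shown with the python-sounddevice library.

```python
# Rough sketch of the real-time plumbing behind 'resynthesise incoming
# audio as another sound': audio arrives in small blocks, each block is
# pushed through a (hypothetical) timbre-transfer model, and the result
# is written straight back to the output buffer. Uses python-sounddevice.
import numpy as np
import sounddevice as sd

SAMPLE_RATE = 44100
BLOCK_SIZE = 512  # roughly 12 ms per block; smaller blocks = lower latency

class PassthroughModel:
    """Stand-in for a trained timbre-transfer model (e.g. a neural net).
    Here it simply returns the input unchanged."""
    def render(self, block: np.ndarray) -> np.ndarray:
        return block

timbre_model = PassthroughModel()  # hypothetical: swap in a real model

def callback(indata, outdata, frames, time, status):
    if status:
        print(status)  # report drop-outs, overruns or underruns
    # One block of input becomes one block of resynthesised output.
    outdata[:] = timbre_model.render(indata)

# Full-duplex stream: microphone or instrument in, resynthesised audio out.
with sd.Stream(samplerate=SAMPLE_RATE, blocksize=BLOCK_SIZE,
               channels=1, callback=callback):
    input("Streaming... press Enter to stop.\n")
```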
“The angle we take is not to try and wrap it in magic and say some auto-magical AI creature is going to fix your musical problems for you,” continues Mann, “but to try and make the algorithms as transparent as possible and as useful as possible. That’s why we call our tools instruments. They’re meant to be held and played with, and not apply some kind of magical layer on top of your production.”
NBHS co-founder Chris Deaner, an accomplished drummer for the American band Plus/Minus, adds: “In some ways what we’re doing is a lot less sexy than a tool that does everything for you, which was and is the promise of AI and ML. For us, both being musicians, we’re really interested in the human portion of this.”
The human portion of modelling in particular, as we discussed in part one of this series, brings with it its own ethical challenges.
“One risk of AI is that you end up with a song-generating machine that makes some beautiful compositions but it never credits any of the things it used to make it,” says Mann. “So for us, it was really important to showcase the musicians that went into it. We think of these generative models, not as AI entities themselves, but as a conduit between you, the end musician, and the musicians involved in the modelling.”
“It’s not that the machine came up with this amazing approach to production. This has been someone’s craft they’ve cultivated over decades and now you can explore that craft with your own audio” – Yotam Mann
Modelling is just one step in their vision for a new type of DAW (the digital audio workstation: Ableton, Logic Pro, Pro Tools). DAWs have remained largely static in their fundamental design over the past 20 years, with a linear arrangement page, a mixer section and a grid-based MIDI editor at the core of most of the software. While GUIs differ and each has its own USP, most would still be broadly recognisable to a user of Cubase on the Atari in 1989. As consumer laptops become more powerful and cloud computing allows for complex remote processing, is it time for a wholly re-imagined DAW, built from the ground up around AI and ML?
“We are thinking about that,” admits Deaner. “If I take a step back, the mission of this company is to use AI and ML to create new forms of music creation in general. A DAW lets us point at different types of ML, not just [modelling]. We’re working on that right now — it’s in its early phase.”
The grid that currently dominates modern DAWs may be a thing of the past. “We’re reimagining the interface entirely, trying to think about it in a more ‘fun’ way, less grid-based. We’re trying to figure out ways to interact with everything in a brand new way.”
One of the most exciting aspects of an ML DAW is the ability to transfer a certain producer’s mixing style to your own project.
“If you had a producer whose style you really loved, we’re not talking about instruments or sounds, but across a full range of production,” explains Mann. “You could capture their approach to production and re-create that across your DAW. That could mean tweaking every single knob and slider you have on all your plugins, or it could be a full end-to-end DSP plugin that was a producer-in-a-box.”
“It’s a human, on the other end though,” Deaner is quick to re-iterate. “It’s not just ‘AI: The Producer’.”
“That’s a very important part,” agrees Mann. “It’s not that the machine came up with this amazing approach to production. Actually, this has been someone’s craft they’ve cultivated over decades and now you can explore that craft with your own audio.”
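One way to read Mann’s “tweaking every single knob and slider” idea is as a learning problem: gather examples of how a producer sets their plugin parameters on different material, then predict those settings for a new track. The snippet below is purely speculative (the features, parameters and data are all made up) and is only meant to show what “capturing an approach” might look like in practice.

```python
# Speculative sketch of 'producer style' as learned plugin settings:
# map audio features of a raw track to the parameter values a given
# producer tends to dial in. All names and data here are hypothetical.
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical training data: feature vectors describing raw tracks
# (spectral balance, dynamics, etc.) paired with the parameter values
# the producer actually used on those tracks.
X_raw_track_features = np.random.rand(50, 8)   # stand-in track features
Y_producer_settings = np.random.rand(50, 3)    # e.g. [eq_gain, comp_threshold, reverb_mix]

style_model = Ridge().fit(X_raw_track_features, Y_producer_settings)

# Given a new track's features, predict how 'their' version of the mix
# would set those three controls.
new_track = np.random.rand(1, 8)
eq_gain, comp_threshold, reverb_mix = style_model.predict(new_track)[0]
print(f"Suggested settings -> EQ gain: {eq_gain:.2f}, "
      f"comp threshold: {comp_threshold:.2f}, reverb mix: {reverb_mix:.2f}")
```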
From current tools around AI-assisted mixing and stem removal to full-blown machine learning DAWs, the landscape for producers and engineers could be about to see its biggest shake-up for decades.
In part three of this series, which runs tomorrow (7th October), we’ll take a look at DJing, and how AI is influencing how we choose and organise our music, as well as the inevitable rise of the AI DJ. We’ll also explore how highly-personalised generative music tools are becoming a new form of music listening.