AI in Music: A Nuanced Perspective
Introduction
There is a conversation happening right now about AI and music. It is loud, it is everywhere, and most of it is wrong. Not wrong in the sense that the people having it are stupid, but wrong in the sense that the framing is broken from the start. People are arguing about whether AI is good or bad for music as though that is a question with a single answer. It is not. It never was. And until we fix the framing, the conversation goes nowhere useful.
I am an independent musician. Over the past few years I have released many albums across several genres, all of them self-produced, self-mixed, and self-mastered, funded out of my own pocket, and distributed to every major streaming platform. That entire body of work has earned approximately $8.29 in royalties. I also work full time as a software developer, which means I understand AI not just as a music consumer or commentator but as someone who uses it daily in a professional context. I am not writing this to be contrarian. I am writing it because the debate deserves better than what it is currently getting.
The Framing Problem
The most common version of this debate goes roughly like this. Someone with a large platform, usually a professional musician or audio engineer with decades of experience and an established income, makes a video or writes an article arguing that AI is destroying music. They cite generative tools like Suno, warn about mediocrity flooding streaming platforms, and conclude that AI is a threat to talent, craft, and the soul of the art form. Comments section erupts. Everyone picks a side. Nothing is resolved.
The problem is not that these people are wrong about everything. Some of what they say is accurate. The problem is that they are arguing from a single vantage point and presenting it as a universal truth. They are professional mix engineers worried about their client base, or established artists worried about the devaluation of their craft, and those are legitimate concerns. But they are not the only concerns, and the people who have them are not the only people making music.
The framing collapses the moment you ask a simple question: who is actually using these tools, and what are they using them for? The answer to that question changes the ethical and practical calculation completely.
Where AI Genuinely Helps
Let me start with the honest version of this because it rarely gets said clearly enough.
AI tools have been part of music production for years and nobody had a problem with them until recently. iZotope's RX has been cleaning up audio, removing mouth clicks, repairing stems, and doing things that would have taken hours of manual work in a fraction of the time. Nobody calls that a threat to craft. iZotope's Ozone analyzes your master and gives you a starting point for EQ, compression, and limiting. Professional mix and mastering engineers use it regularly, including many of the same people currently warning about AI destroying the industry. iZotope's Neutron analyzes your full session, detects instruments, suggests levels, cleans up frequency masking between competing elements, and gives you a workable starting point for a mix. These are AI tools. They have been AI tools for years. The conversation about whether AI belongs in music production is already settled. It does. It has for a long time.
The more honest question is which AI tools, used by whom, for what purpose, and with what level of human judgment and craft applied on top.
For an independent artist working alone, often after a full day job, with no budget for professional services and no label backing, these tools are not shortcuts around craft. They are accelerators of it. The grunt work of cleaning up mud in a mix, unmasking competing instruments, and getting a rough level balance across a session requires understanding but not inspiration. Getting it handled efficiently frees up time and attention for the things that actually require creative judgment: the automation that makes a chorus lift, the reverb that puts the vocal in a room, the compression that gives a drum bus its character, the arrangement decisions that make a track feel like it goes somewhere.
This is identical in principle to a software developer using AI tools to write unit tests and handle code reviews. The boilerplate, the repetitive, the formulaic, gets offloaded so the developer can focus on architecture, problem solving, and the creative decisions that actually require expertise. Nobody is arguing that this makes the developer less of a developer. The same logic applies in music production.
The argument that these tools should only be in the hands of experienced professionals ignores the reality of how expertise is actually acquired. Three years ago I did not know what a compressor did. I did not know what a sidechain was, what multiband compression achieved, or how to use a limiter without destroying transients. I learned those things by putting in the time, watching channels like Mixbus TV, Studio Life, and White Sea Studio, reading, experimenting, failing, and trying again. The automated tools I tried early on did not get me where I wanted to go, and that failure was the thing that sent me deeper into the actual craft. In that sense the limitations of the tools were an education in themselves. They showed me the gap between where I was and where I needed to get to, and the only way to close that gap was to actually learn.
That is not a story about AI replacing craft. It is a story about AI being one node in a longer learning journey.
Where AI Genuinely Does Not Help
None of the above applies to generative song creation tools. This is where the debate needs to be much clearer about what it is actually talking about.
Tools like Suno are not AI in the sense of an intelligent collaborator. They are pattern recognition systems trained on existing recordings, optimized to produce outputs that match statistical regularities in their training data. Feed them a prompt, get a song. The song will be competent in the way that a photocopier is competent. It will reproduce the general shape of what it has seen without understanding any of it. The lyrics will be generic. The structure will be predictable. The emotional content will be approximate at best.
The phrase that best describes the output is slop trained on slop. The majority of commercially produced music over the past two decades has been formulaic, producer-driven, and optimized for streaming metrics rather than artistic depth. That is what these systems were trained on. Of course the output reflects it. You cannot blame the photocopier for the quality of what you put in front of it.
This does not make generative tools useless. A B-movie producer who needs thirty seconds of background music for a wide pan shot does not need art. They need functional audio that does not distract from the movie scene. Someone who wants to send a funny song to a friend does not need a Grammy nomination. They need something that makes the joke land. For those use cases, Suno and tools like it are perfectly adequate. The problem is not the tools. The problem is confusing those use cases with music making as a creative act.
The deeper issue with generative tools is that they are not truly collaborative. You cannot have a conversation with them about what the track needs and refine it together through iteration and trial and error. You get a finished output and if you do not like it you start again. That is not a creative workflow. That is a vending machine.
The mediocrity argument also contains a contradiction worth naming. Critics of AI music often argue simultaneously that AI produces mediocrity and that this mediocrity threatens the value of talent. But those two claims do not coexist comfortably. If the output is mediocre it is not competing with genuine talent. It is noise. Real talent has always had to cut through noise. The charts were already full of throwaway pop that nobody will remember in five years before AI generated a single note. AI does not change that problem. It may scale it. But scale is not the same as a fundamental shift in the underlying dynamic.
Not All AI Is Created Equal
There is a distinction worth making explicit here because it gets lost in the noise of the broader debate. The question is not just how AI tools are used but how they were built in the first place.
Tools like Suno were trained on vast catalogues of recorded music without the consent of the artists who made it. The VC backing Suno has openly celebrated the absence of licensing deals as a feature rather than a problem. That is not a grey area. That is strip mining an industry, hoovering up the creative output of generations of musicians, using it to train a system designed to compete with those same musicians, and returning nothing to them in the process. Whatever your view on AI in music generally, that specific model of development is ethically indefensible.
This matters because it colours everything downstream. When Suno generates a track, that track is built on a foundation of unconsented labour. The artists whose work formed the training data have no knowledge of it, no say in it, and no share of whatever value gets extracted from it.
Contrast that with platforms like Voice-Swap, whose Creative Director Benn Jordan is a well known advocate for ethical music industry practices, having consulted for Bandcamp and built an educational platform specifically focused on music industry economics. Voice-Swap is built around artist consent and fair compensation as foundational principles, not afterthoughts. ElevenLabs similarly has built mechanisms for voice artists to voluntarily contribute to and be compensated from their voice library. These are not the same thing as Suno. Treating them as equivalent because they both involve AI is like treating a fair trade cooperative and an operation built on exploited labour as equivalent because they both produce the same product.
The ethical question around AI in music is therefore not simply about the output. It is also about the supply chain. Where did the training data come from? Were the artists who created it compensated or even consulted? Is the tool designed to empower creators or to extract value from them and redistribute it upward?
These are the questions worth asking. And the answers vary enormously depending on which tool you are talking about.
The Flagging Fallacy
There is an argument that surfaces regularly in this debate that AI generated music should be flagged or labeled on streaming platforms, so that listeners know what they are hearing. On the surface this sounds reasonable. On examination it falls apart immediately.
Nobody is proposing that tracks made with VST instruments be flagged to indicate that no real piano was played. Nobody is flagging tracks where the drums came from a sample library rather than a live drummer. Nobody is flagging Auto-Tune, quantization, or Melodyne pitch correction, all of which alter or replace human performance. Nobody is flagging tracks where the bass was played by a software synthesizer instead of a human bassist.
All of these tools simulate, replace, or substantially alter human performance. They have been standard in music production for decades. The reason nobody flagged them is that the argument for flagging them does not hold up: what matters is the artistic intent and the quality of the result, not the specific tools used to achieve it.
Flagging AI generated tracks while not flagging any of these other tools is not a principled position. It is an emotional reaction dressed up as an ethical stance. The question that never gets asked is: what specifically is the listener being protected from by knowing a track was AI assisted? If the answer is mediocrity or slop, then the flag should go on a lot more than just AI music. There are many artists to whom the slop tag could be assigned. You see where this is headed, don't you? Not a good place. Just use your ears. If you like a track, enjoy it for what it is, no matter how it was generated. If you don't like it, move on. Simple.
The Economic Reality
Here is the number that reframes this entire conversation: $8.29.
That is approximately what a new independent artist like me, with a meaningful catalogue of releases distributed across every major streaming platform, can expect to earn over several years of having music available. Spotify's payouts are commonly estimated at between $0.003 and $0.005 per stream. Apple Music pays slightly more. The numbers are so small as to be functionally meaningless for anyone not already operating at significant scale.
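For a rough sense of scale, the per-stream range quoted above can be turned into an implied stream count. This is a back-of-the-envelope sketch, not a royalty statement. It assumes every stream was paid at a single flat rate, which real streaming payouts do not follow, and the function name is purely illustrative.

```python
# Back-of-the-envelope: how many streams would produce a given payout?
# Assumption: one flat per-stream rate, which real streaming payouts do not
# actually follow. The dollar figures are the ones quoted in the text.


def implied_streams(total_royalties: float, rate_per_stream: float) -> float:
    """Return the number of streams needed to earn total_royalties at a flat rate."""
    if rate_per_stream <= 0:
        raise ValueError("rate_per_stream must be positive")
    return total_royalties / rate_per_stream


TOTAL_ROYALTIES = 8.29               # dollars
LOW_RATE, HIGH_RATE = 0.003, 0.005   # dollars per stream

# A lower rate means more streams were needed to reach the same total.
most_streams = implied_streams(TOTAL_ROYALTIES, LOW_RATE)
fewest_streams = implied_streams(TOTAL_ROYALTIES, HIGH_RATE)

print(f"Implied streams: roughly {fewest_streams:,.0f} to {most_streams:,.0f}")
```

Under those assumptions, $8.29 corresponds to somewhere between about 1,650 and 2,800 streams in total, spread across an entire catalogue and several years.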
This matters because the entire debate about AI threatening professional mixing and mastering engineers assumes a world where independent artists are potential clients for those engineers. But an artist earning $8.29 from streaming is not hiring a professional mix engineer. That was never on the table. The choice is not between AI assisted mixing and professional mixing. The choice is between AI assisted mixing and doing it yourself as best you can.
When a professional audio engineer argues that AI mixing tools are a threat to the industry while simultaneously defending their own use of AI powered plugins, thumbnail generators, and productivity tools on the grounds that they save time and money, they are applying a framework that only works from their economic position. From the position of the independent artist with a day job and no budget, the framework inverts completely. The tool that saves the professional time is the same tool that makes the independent artist's project possible at all. The ethical calculation is identical. The user is just different.
This is not an argument against professional mixing and mastering. Those services are genuinely valuable and the expertise behind them is real. It is an argument for intellectual honesty about who is actually in the conversation and what constraints they are actually operating under.
The streaming platform question sits adjacent to this. Platforms like Spotify have extracted enormous value from the music ecosystem while returning almost none of it to the artists at the bottom. An independent artist uploading AI generated music to Spotify is not destroying the music industry. The music industry's own pricing decisions did that particular job years ago. If anything, flooding an extractive platform with low cost content until the economic model collapses seems like a reasonable response to being paid $8.29 for years of work.
A Developer's Perspective
By day I am a software developer. I cannot afford to be a full time musician. If we are talking about which profession is most immediately threatened by AI right now, it is arguably mine rather than music production. Junior developers are already struggling to find work, contracts are drying up, and companies are increasingly expecting their existing developers to do more with AI agents rather than hiring additional headcount. The landscape is shifting fast. The models behind these AI coding tools were trained on billions of lines of publicly available code written by developers like me, drawn from open source projects, Stack Overflow, and public GitHub repositories, which raises its own version of the consent and compensation questions being asked about music. If anyone has standing to be outraged about an industry being strip mined to build AI tools, software developers have a reasonable claim.
And yet I am not here shouting "AI bad." I use AI coding assistants for things like code reviews and unit tests, the repetitive boilerplate that frees me up to focus on architecture and the complex problem solving that actually requires expertise. I am not telling people without coding skills that they shouldn't use AI coding tools because those tools are doing me out of work right now. That would be absurd. Fill your boots if it helps you, saves you money, and makes you more productive while helping you to learn. The same logic applies in music production.
Where This Is Actually Heading
The most interesting conversation about AI and music is not happening in most of these videos. It is not about whether Suno will replace human composers or whether AI mixing tools will put engineers out of work. It is about what a genuinely collaborative AI workflow inside a digital audio workstation could look like.
The current Suno model, prompt in, finished song out, is a dead end creatively. It produces output you cannot meaningfully revise, cannot surgically adjust, cannot use as a starting point for something more specific. That is not a collaboration. That is outsourcing.
The more interesting model already exists in primitive form in tools like Neutron's Mix Assistant. Picture an AI system that lives inside your DAW, that has access to your actual plugins (your EQ, your compressor, your reverb, your limiter), and that can make targeted adjustments to those tools under your direction. You stay in control of the creative decisions. You tell the system what you need. The system adjusts the tools. You evaluate the result, keep what works, override what does not, and continue.
This is not replacing the mix engineer. It is giving the mix engineer, or the independent artist learning to become one, a more intelligent set of hands for the parts of the process that do not require inspiration. Could you automate this vocal so it sits above the mix consistently? Could you clean up the low-mid buildup on the guitar around 300 Hz? Could you tighten the low end relationship between the kick and the bass? These are tasks with learnable rules. An AI assistant that can execute them on your behalf while you focus on the artistic decisions is not a threat to craft. It is the logical extension of every productivity tool that has ever existed in music production.
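Since I write software for a living, here is how I would sketch that direct-and-review loop in Python. Everything in it is hypothetical: the `Suggestion` type, the `review` function, and the parameter names are invented for illustration, and nothing here talks to a real DAW or plugin. The session is modelled as a plain dictionary of parameter values. The only point is the shape of the workflow, in which the assistant proposes and the human decides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical model: a session is a mapping of (plugin, parameter) -> value.
Session = Dict[Tuple[str, str], float]


@dataclass(frozen=True)
class Suggestion:
    plugin: str      # which plugin the assistant wants to touch
    parameter: str   # which control on that plugin
    proposed: float  # the value the assistant proposes
    reason: str      # short explanation shown to the human


# The human's decision: return the value to apply, or None to reject.
Decision = Callable[[Suggestion, float], Optional[float]]


def review(session: Session, suggestions: List[Suggestion], decide: Decision) -> Session:
    """Apply only the changes the human approves; the input session is not mutated."""
    updated = dict(session)
    for suggestion in suggestions:
        key = (suggestion.plugin, suggestion.parameter)
        chosen = decide(suggestion, updated[key])  # human sees the current value
        if chosen is not None:
            updated[key] = chosen
    return updated


session: Session = {
    ("Guitar EQ", "300 Hz gain (dB)"): 0.0,
    ("Vocal", "level (dB)"): -6.0,
    ("Bass", "sidechain amount"): 0.0,
}
suggestions = [
    Suggestion("Guitar EQ", "300 Hz gain (dB)", -3.0, "low-mid buildup"),
    Suggestion("Vocal", "level (dB)", -4.0, "vocal sits under the mix"),
    Suggestion("Bass", "sidechain amount", 0.5, "kick and bass overlap"),
]


def my_ears(suggestion: Suggestion, current: float) -> Optional[float]:
    if suggestion.plugin == "Guitar EQ":
        return suggestion.proposed  # accept as proposed
    if suggestion.plugin == "Vocal":
        return -5.0                 # modify: a smaller move than proposed
    return None                     # reject: leave the bass alone


result = review(session, suggestions, my_ears)
```

The design choice that matters is that `review` never applies anything on its own authority. Every change passes through the `decide` callback, which stands in for the person listening and choosing to accept, modify, or reject.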
The version of this that is further out but clearly coming is something closer to a true collaborator. An AI that can engage with your musical ideas, suggest harmonic options, identify structural problems, propose solutions that you can then accept, modify, or reject. Not generating finished music on your behalf but participating in the process the way a knowledgeable collaborator would. That is a different and much more interesting tool than anything currently on the market. And it will require both musical knowledge to prompt effectively and artistic judgment to evaluate what comes back. Which means it will not replace musicians. It will make better equipped musicians more productive.
Do I Use AI In My Music?
It is a fair question and it deserves a direct answer.
I do not use generative music platforms like Suno or Udio. First of all, they were not trained ethically. In addition, these tools lack the collaborative dimension that would make them useful to me. I cannot have a collaborative, iterative conversation with them about what the track needs. Until a genuinely collaborative AI assistant exists that lives inside my DAW, controls my actual plugins through my direction, and lets me accept, modify, or discard each suggestion on its merits, generative music tools are not part of my workflow. There is also a more basic problem: the audio quality reflects the training data, which is largely material sourced from compressed MP3s. The artifacts alone make it unusable for anything I would want to release.
That said, I do use AI in two specific areas and I want to be transparent about both.
The first is voiceovers. Several of my projects include spoken dialogue and narration as part of the artistic concept. I write all of that dialogue myself. Every word, every character, every line is mine. I then use synthetic voices through ElevenLabs to bring that dialogue to life. I deliberately avoid using voices that approximate any recognisable public figure. I build each voice from scratch to find the right tone for the project. The reason I use synthetic voices rather than hired voice actors is the same reason that runs through everything on this page: my total streaming royalties across several years of releases amount to $8.29. I am not in a position to hire voice talent. My options are synthetic voices or no voices at all. Given that the dialogue serves the artistic vision and I am the one writing and curating every word of it, I do not consider this a compromise of the work.
The second is album artwork. Streaming platforms require album art. There is no option to upload music without it. The minimum specification is 3000 by 3000 pixels. To be completely honest, the artwork is not something I care deeply about as part of my artistic output. The music is the art. If streaming services allowed me to upload a plain coloured square, I would probably do that and move on. In retrospect I sometimes think I should have just assigned a different colour to each album and been done with it. But they require proper artwork, so I use AI image generation to produce something that fits the project without paying an artist hundreds of dollars per release for something that is essentially a platform requirement rather than a genuine artistic statement.
Is that morally problematic? I do not think so, and here is why. All of my music is available for free on Bandcamp. Strictly speaking it is name your price, with a minimum of zero dollars. If you feel like contributing to the cause you can, but there is no expectation for you to do so. No one has chosen to contribute thus far and that's just fine. I am not generating revenue from which an artist could reasonably expect a share. The money simply does not exist. I am not depriving anyone of income they would otherwise have received, because I would not have commissioned artwork at all if AI generation were not an option. The alternative is not a paid artist. The alternative is a plain square or not releasing the music. If I were operating at the scale of a major label artist with significant revenue, the calculation would be different. At $8.29 in total royalties and zero Bandcamp income, it is not.
The broader principle is this. I use AI where it removes a genuine blocker for which I have no realistic, affordable alternative, where the creative decisions remain entirely mine, and where the output serves the work rather than replacing it. I do not use it where the creative act itself is the point, because outsourcing that is not making music. It is ordering music.
A Closing Thought
Music has existed for most of human history outside of any commercial framework. The idea that music is something you make to earn a living is roughly a hundred years old and it has already largely stopped working for the vast majority of people who try it. The golden era of the commercial music industry, roughly 1970 to 1999, was an anomaly, a brief window when the economics of physical media, radio, and live performance aligned in a way that made it possible for large numbers of musicians to sustain themselves from their craft. That window has closed. Streaming finished what Napster started.
What remains, and what has always been the deeper truth about why people make music, is the transmission. The act of making something and sending it out into the world because you have something to say and you want someone to hear it. Not as a transaction. As a transmission.
AI does not change that. The tools change, as they always have. Multitrack recording changed it. The synthesizer changed it. The drum machine changed it. The DAW changed it. Each time, the argument was that something essential was being lost. Each time, the something essential turned out to be more resilient than predicted, because it was never in the tools. It was in the humans using them.
Use the tools that help you make the art you want to make. Reject the ones that do not. Learn the difference between accelerating your workflow and outsourcing your artistic vision. Your vision is important. And the messages in your transmissions should be yours.