Stable Audio 3.0: Open-Weight Music Generation (MindStudio)
MindStudio’s write-up of stable-audio 3.0 (Stability AI). It is this wiki’s founding source for the open-weight corner of audio-music-generation, the music-generation analog of open-weight-tts.
What’s announced
- Open weights under CreativeML Open RAIL-M: download and run locally, fine-tune on custom data, no per-generation API fee. Use restrictions apply, and commercial use at scale may require separate licensing.
- Instrumental audio up to ~6 minutes, stereo at 44.1 kHz; text-to-audio from detailed prompts (genre, tempo, instrumentation, mood), plus audio-to-audio conditioning on reference audio.
- No vocals — a key limitation vs suno/udio.
- Trained on a licensed dataset (AudioSparx) — part of why its ai-music-copyright posture is cleaner than the litigated vocal-song generators.
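To make the prompt and output specs above concrete, here is a minimal sketch that composes a prompt from the four fields the write-up lists and estimates the uncompressed size of a maximum-length clip. It does not call Stable Audio or any real API: `PromptSpec`, `to_prompt`, and `pcm_bytes` are hypothetical names, the prompt wording is only one plausible layout, and the 16-bit sample depth is an assumption that the source does not state.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 44_100   # 44.1 kHz, from the announced specs
CHANNELS = 2              # stereo, from the announced specs
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit PCM; the source gives no bit depth


@dataclass
class PromptSpec:
    """Hypothetical container for the prompt fields named in the write-up."""
    genre: str
    tempo_bpm: int
    instrumentation: list[str]
    mood: str

    def to_prompt(self) -> str:
        # Join the fields into one comma-separated text prompt.
        # The exact phrasing is illustrative, not a documented format.
        instruments = ", ".join(self.instrumentation)
        return f"{self.genre}, {self.tempo_bpm} BPM, {instruments}, {self.mood} mood"


def pcm_bytes(duration_s: float) -> int:
    """Uncompressed PCM size in bytes for a clip of the given duration."""
    frames = int(duration_s * SAMPLE_RATE_HZ)
    return frames * CHANNELS * BYTES_PER_SAMPLE


spec = PromptSpec(
    genre="ambient electronic",
    tempo_bpm=90,
    instrumentation=["analog synth pads", "soft piano"],
    mood="calm",
)
print(spec.to_prompt())
# ambient electronic, 90 BPM, analog synth pads, soft piano, calm mood

print(pcm_bytes(6 * 60))
# 63504000 bytes, i.e. about 63.5 MB for a 6-minute stereo clip at 16 bits
```

Under these assumptions a full-length generation is roughly 63.5 MB of raw audio, which is worth budgeting for when batching local runs or storing fine-tuning data.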
Why it matters here
Stable Audio is the open-weight wedge in music generation: it gives up vocal-song polish and easy onboarding in exchange for local deployment, fine-tuning, longer instrumental output, and a clean license. This is the same open-vs-closed tradeoff this wiki tracks in open-weight-tts and speech-to-text, now appearing in a third branch, music generation. It is best suited to sound design and background scoring, not radio songs.
Related
stable-audio · audio-music-generation · ai-music-copyright · speech-audio-ai · open-weight-tts