Stable Audio
Stability AI’s audio-music-generation model: the open-weight wedge of the branch. Stable Audio 3.0 ships open weights under CreativeML Open RAIL-M, which allows local deployment and fine-tuning subject to the license's use restrictions (see stable-audio-3).
Profile
- Instrumental, up to ~6 min, stereo 44.1 kHz; text-to-audio and audio-to-audio conditioning.
- No vocals — sound design / background scoring focus, not radio songs.
- Trained on a licensed dataset (AudioSparx) → a clean ai-music-copyright posture.
- “Competent, useful” instrumental output; below pro-production polish, but fine-tunable and self-hostable.
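
For a sense of what self-hosting the longest outputs involves, the profile figures above (about 6 min, stereo, 44.1 kHz) can be turned into a rough uncompressed size. This is a back-of-the-envelope sketch, not anything taken from the model's tooling: the 16-bit sample depth and the helper name `pcm_size_bytes` are assumptions made for illustration.

```python
# Rough size of one maximum-length clip stored as uncompressed PCM.
SAMPLE_RATE_HZ = 44_100   # from the profile
CHANNELS = 2              # stereo, from the profile
DURATION_S = 6 * 60       # "up to ~6 min", from the profile
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit PCM; the profile states no bit depth


def pcm_size_bytes(duration_s, sample_rate_hz=SAMPLE_RATE_HZ,
                   channels=CHANNELS, bytes_per_sample=BYTES_PER_SAMPLE):
    """Return the raw PCM byte count for a clip (hypothetical helper)."""
    frames = duration_s * sample_rate_hz          # samples per channel
    return frames * channels * bytes_per_sample   # all channels, in bytes


if __name__ == "__main__":
    frames = DURATION_S * SAMPLE_RATE_HZ
    size = pcm_size_bytes(DURATION_S)
    print(f"{frames:,} frames per channel")   # 15,876,000
    print(f"{size / 1e6:.1f} MB raw PCM")     # 63.5 MB
```

Under those assumptions a full-length clip is roughly 63.5 MB before any compression, which is a useful figure when sizing storage for batch generation or a fine-tuning set.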
Place in the wiki
The music-gen analog of fish-audio-s2-pro/kokoro (TTS) and whisper (STT): the open-weight option that trades top-tier polish/vocals for local control, fine-tuning, longer output, and clean licensing. Confirms the open-vs-closed split that speech-audio-ai sees across all three branches.
Related
audio-music-generation · ai-music-copyright · stable-audio-3 · suno · udio · open-weight-tts · speech-audio-ai