Stable Audio 2
Active · stable-audio-2 · by stability.ai
Context: —
Text → Audio, Audio → Audio
Updated: 2026-03-14
Notes
Normalized from official pricing: Stable Audio 2 starts from 20 credits for up to 3 minutes of audio, with 1 credit = $0.01. That is $0.20 for up to 3 minutes, so the effective normalized rate shown here is $0.0667 per minute.
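The normalization in the note above can be checked with a few lines of arithmetic. This is a minimal sketch: the function name and its parameters are illustrative and not part of any stability.ai API, and it assumes the full 3 minutes are used.

```python
def per_minute_rate(credits: float, minutes: float, usd_per_credit: float) -> float:
    """Return the effective USD cost per minute of generated audio."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return credits * usd_per_credit / minutes


# Values taken from the note: 20 credits, up to 3 minutes, 1 credit = $0.01.
rate = per_minute_rate(credits=20, minutes=3, usd_per_credit=0.01)
print(f"${rate:.4f} per minute")  # prints "$0.0667 per minute"
```

If the 20-credit price is charged regardless of clip length, a shorter clip would cost more per minute than this figure suggests.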
Pricing
Input ($/1M): 0.00
Output ($/1M): 0.07
Total ($/1M): 0.07
Est. total per 10k tokens: $0.0007
Cost calculator
Estimate cost for a single request.
Estimated total: $0.000133
Uses latest $/1M input + output pricing.
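A calculator of this kind can be sketched as a single function over the listed $/1M rates. The function name and parameters below are hypothetical, and the sketch is not the page's actual implementation; it only applies the listed rates linearly. The example reproduces the "per 10k tokens" estimate from the Pricing section.

```python
def estimate_cost(input_units: int, output_units: int,
                  input_rate_per_million: float,
                  output_rate_per_million: float) -> float:
    """Estimate the USD cost of one request from unit counts and $/1M rates."""
    total = (input_units * input_rate_per_million
             + output_units * output_rate_per_million)
    return total / 1_000_000


# Listed rates: Input 0.00 and Output 0.07 ($/1M); 10,000 output units.
cost = estimate_cost(0, 10_000, 0.00, 0.07)
print(f"${cost:.4f}")  # prints "$0.0007"
```

The request size behind the $0.000133 estimate above is not shown on the page, so that figure is not reproduced here.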
Other models from stability.ai
- Stable Diffusion 3.5 Flash (Input 0.00 / Output 0.03)
- Stable Diffusion 3.5 Large (Input 0.00 / Output 0.07)
- Stable Diffusion 3.5 Medium (Input 0.00 / Output 0.04)
- Stable Diffusion 3.5 Turbo (Input 0.00 / Output 0.04)
- Stable Fast 3D (Input 0.00 / Output 0.10)
- Stable Image Core (Input 0.00 / Output 0.03)
- Stable Image Ultra (Input 0.00 / Output 0.08)
- Stable Point Aware 3D (Input 0.00 / Output 0.04)
- Stable SDXL 1.0 (Input 0.00 / Output 0.01)
Similar models (🎧 Audio)
- Alibaba · Qwen TTS (Input 0.01 / Output —)
- Alibaba · Qwen TTS Realtime (Input 0.01 / Output —)
- Alibaba · Qwen3 ASR Flash Filetrans (Input 0.00 / Output —)
- Alibaba · Qwen3 ASR Flash Realtime (Input 0.01 / Output —)
- Alibaba · Qwen3 Livetranslate Flash Realtime (Input 0.01 / Output 0.01)
- ElevenLabs · Dubbing v1 (Input 0.00 / Output 0.30)
- ElevenLabs · Eleven English STS v2 (Input — / Output —)
- ElevenLabs · Eleven Flash v2 (Input — / Output —)
- ElevenLabs · Eleven Flash v2.5 (Input 0.00 / Output 0.01)
- ElevenLabs · Eleven Multilingual STS v2 (Input — / Output —)
- ElevenLabs · Eleven Multilingual TTV v2 (Input — / Output —)
- ElevenLabs · Eleven Multilingual v2 (Input 0.00 / Output 0.03)