FishAudio S2 Pro vs Qwen3-TTS: six short audio tests compare clarity, timing, and prosody. Each test uses the same script so results remain comparable.
Test setup
| Setting | Value |
|---|---|
| Output format | MP3 |
| Sampling | Model default |
| Notes | Same prompt used for both models; FishAudio used inline prosody tags where noted. |
6 Audio Tests
Test 1 — Account digits
Prompt: Hello. Sorry about the issue. To fix this fast, please confirm the last four digits of the account number: 3 4 2 9.
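The prompt writes the account digits with spaces between them ("3 4 2 9"), which is a common way to encourage a TTS model to read digits one at a time rather than as a single number. How the prompt was actually prepared is not stated here, so the following is only a minimal sketch; `space_digits` is a hypothetical helper, not part of either model's tooling.

```python
import re


def space_digits(text: str) -> str:
    """Insert a space between the digits of every run of two or more digits.

    Hypothetical preprocessing helper (an assumption, not taken from either
    model's documentation). Text that is already spaced is left unchanged.
    """
    return re.sub(r"\d{2,}", lambda match: " ".join(match.group()), text)


raw = "please confirm the last four digits of the account number: 3429."
print(space_digits(raw))
```

Whether spacing actually improves digit clarity may differ between models, which is the kind of behavior this test is meant to expose.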
Test 2 — Product ad
Prompt: Quick update. NovaCell Pro just dropped. Ultra thin. No buttons. It unlocks when you look at it. Want to see the colors.
Test 3 — Whisper and atmosphere
Prompt: Tonight the city sounded like rain on glass. The train doors closed. The lights flickered. A message appeared: DO NOT RUN. Nobody moved.
Test 4 — Bilingual short demo
Prompt: Merhaba. Today is a quick demo. First, say hello. Then say: WIRO API. Then add a warm goodbye in Turkish: gorusuruz.
Test 5 — Short dialogue
Prompt: Are we recording. Yes. Keep it short and clear. Got it.
Test 6 — Technical explainer
Prompt: An API gateway sits in front of services. It checks auth. It applies rate limits. It routes traffic. That is it. Keep the rules boring.
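The setup above (six prompts, two models, the same text for both, MP3 output) can be written down as a small job manifest so that no model and test pair is skipped. This is a rough sketch under stated assumptions: the `Job` class, `slugify`, `build_jobs`, and the file naming scheme are all hypothetical, the model strings are the labels used in this article rather than API identifiers, and no synthesis call is made. Any inline prosody tags used for FishAudio would need to be added to the relevant prompts separately.

```python
import re
from dataclasses import dataclass

# Labels as they appear in this article; NOT real API model identifiers.
MODELS = ("FishAudio S2 Pro", "Qwen3-TTS 12Hz 1.7B")

# (test name, prompt) pairs copied from the six tests above.
TESTS = (
    ("Account digits",
     "Hello. Sorry about the issue. To fix this fast, please confirm the "
     "last four digits of the account number: 3 4 2 9."),
    ("Product ad",
     "Quick update. NovaCell Pro just dropped. Ultra thin. No buttons. "
     "It unlocks when you look at it. Want to see the colors."),
    ("Whisper and atmosphere",
     "Tonight the city sounded like rain on glass. The train doors closed. "
     "The lights flickered. A message appeared: DO NOT RUN. Nobody moved."),
    ("Bilingual short demo",
     "Merhaba. Today is a quick demo. First, say hello. Then say: WIRO API. "
     "Then add a warm goodbye in Turkish: gorusuruz."),
    ("Short dialogue",
     "Are we recording. Yes. Keep it short and clear. Got it."),
    ("Technical explainer",
     "An API gateway sits in front of services. It checks auth. It applies "
     "rate limits. It routes traffic. That is it. Keep the rules boring."),
)


@dataclass(frozen=True)
class Job:
    """One synthesis job: a single prompt rendered by a single model."""
    model: str
    test_number: int
    test_name: str
    prompt: str
    output_path: str


def slugify(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_jobs(models=MODELS, tests=TESTS) -> list[Job]:
    """Pair every test prompt with every model, using an MP3 output name."""
    jobs = []
    for number, (name, prompt) in enumerate(tests, start=1):
        for model in models:
            path = f"test{number}-{slugify(name)}__{slugify(model)}.mp3"
            jobs.append(Job(model, number, name, prompt, path))
    return jobs


for job in build_jobs():
    print(job.output_path)
```

Keeping the prompt text in one place and deriving both models' jobs from it is one simple way to make sure the two outputs for a test really come from the same script.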
Quick comparison
| Model | Best for | Notes |
|---|---|---|
| FishAudio S2 Pro | Expressive TTS, emotion tags, voice cloning | Strong prosody control and expressive range |
| Qwen3-TTS 12Hz 1.7B | Fast, multilingual TTS, clean narration | Good for neutral narration and multilingual support |