{"id":2431,"date":"2026-06-09T09:00:00","date_gmt":"2026-06-09T09:00:00","guid":{"rendered":"https:\/\/wiro.ai\/blog\/?p=2431"},"modified":"2026-06-03T00:50:17","modified_gmt":"2026-06-03T00:50:17","slug":"chatterbox-multilingual-voice-cloning-in-23-languages","status":"publish","type":"post","link":"https:\/\/wiro.ai\/blog\/chatterbox-multilingual-voice-cloning-in-23-languages\/","title":{"rendered":"Chatterbox Multilingual: Voice Cloning in 23 Languages"},"content":{"rendered":"<p>Chatterbox Multilingual is most interesting when the same reference voice has to survive language transfer, pacing changes, and expressive settings without falling apart.<\/p>\n<h2>Chatterbox Multilingual: what it does<\/h2>\n<p>Chatterbox Multilingual generates speech in 23 languages and can clone a voice from a short reference clip. It also exposes two knobs that matter in practice: <strong>exaggeration<\/strong> for expressiveness, and <strong>cfg_weight<\/strong> for guidance and pacing. For cross-language transfer, setting cfg_weight to 0 can help reduce the reference accent bleeding into the target language.<\/p>\n<h2>Model<\/h2>\n<ul>\n<li><a href=\"https:\/\/wiro.ai\/models\/resemble-ai\/chatterbox-multilingual\">resemble-ai\/chatterbox-multilingual<\/a><\/li>\n<\/ul>\n<h2>Test rules<\/h2>\n<ul>\n<li>6 short runs<\/li>\n<li>One reference voice clip reused across all tests (voice cloning)<\/li>\n<li>English tests use cfg_weight between 0.3 and 0.5<\/li>\n<li>Non-English tests use cfg_weight=0 (language transfer setting)<\/li>\n<li>Outputs published as-is<\/li>\n<\/ul>\n<h2>Hero image<\/h2>\n<figure>\n  <img decoding=\"async\" src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-multilingual-cover.jpg\" alt=\"Cover image for the Chatterbox Multilingual post\" \/><figcaption>Prompt: Modern blog cover based on the input image as background. Create a clean stylized audio waveform scene with soft bokeh, dark gradient overlay for contrast. Add three lines of left aligned title text centered vertically. Line 1 huge bold serif Chatterbox. Line 2 small thin italic serif Multilingual. Line 3 medium bold serif 23 Languages. White text with drop shadow. No logos. No extra text.<\/figcaption><\/figure>\n<h2>Results (6 tests)<\/h2>\n<h3>Test 1: English support script (neutral)<\/h3>\n<p><strong>Text:<\/strong> Thanks for calling support. Please confirm the last four digits of the order number. Then say the delivery city.<\/p>\n<p><strong>Settings:<\/strong> language=en, exaggeration=0.5, cfg_weight=0.5, temperature=0.8<\/p>\n<figure>\n  <img decoding=\"async\" src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-wave-1.png\" alt=\"Waveform preview for Test 1\" \/><figcaption>Prompt: Thanks for calling support. Please confirm the last four digits of the order number. Then say the delivery city.<\/figcaption><\/figure>\n<p><audio controls preload=\"none\" style=\"width:100%\"><source src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-multilingual-test-1-en-support.mp3\" type=\"audio\/mpeg\" \/><\/audio><\/p>\n<h3>Test 2: English podcast intro pacing<\/h3>\n<p><strong>Text:<\/strong> Welcome back to the show. Today: why latency makes voice apps feel broken. Three quick points: timing, pauses, and turn taking.<\/p>\n<p><strong>Settings:<\/strong> language=en, exaggeration=0.5, cfg_weight=0.4, temperature=0.8<\/p>\n<figure>\n  <img decoding=\"async\" src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-wave-2.png\" alt=\"Waveform preview for Test 2\" \/><figcaption>Prompt: Welcome back to the show. Today: why latency makes voice apps feel broken. Three quick points: timing, pauses, and turn taking.<\/figcaption><\/figure>\n<p><audio controls preload=\"none\" style=\"width:100%\"><source src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-multilingual-test-2-en-podcast.mp3\" type=\"audio\/mpeg\" \/><\/audio><\/p>\n<h3>Test 3: English emotional shift (higher exaggeration)<\/h3>\n<p><strong>Text:<\/strong> I should have called sooner. The silence made things worse. Please pause. Then say this line slowly: I am sorry.<\/p>\n<p><strong>Settings:<\/strong> language=en, exaggeration=1.2, cfg_weight=0.3, temperature=0.9<\/p>\n<figure>\n  <img decoding=\"async\" src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-wave-3.png\" alt=\"Waveform preview for Test 3\" \/><figcaption>Prompt: I should have called sooner. The silence made things worse. Please pause. Then say this line slowly: I am sorry.<\/figcaption><\/figure>\n<p><audio controls preload=\"none\" style=\"width:100%\"><source src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-multilingual-test-3-en-emotion.mp3\" type=\"audio\/mpeg\" \/><\/audio><\/p>\n<h3>Test 4: Turkish localization line (cfg_weight=0)<\/h3>\n<p><strong>Text:<\/strong> Merhaba. Bu test ayni sesi Turkce konusturuyor. Lutfen siparis numarasinin son dort hanesini soyle. Sonra teslimat sehrini soyle.<\/p>\n<p><strong>Settings:<\/strong> language=tr, exaggeration=0.6, cfg_weight=0, temperature=0.8<\/p>\n<figure>\n  <img decoding=\"async\" src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-wave-4.png\" alt=\"Waveform preview for Test 4\" \/><figcaption>Prompt: Merhaba. Bu test ayni sesi Turkce konusturuyor. Lutfen siparis numarasinin son dort hanesini soyle. Sonra teslimat sehrini soyle.<\/figcaption><\/figure>\n<p><audio controls preload=\"none\" style=\"width:100%\"><source src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-multilingual-test-4-tr.mp3\" type=\"audio\/mpeg\" \/><\/audio><\/p>\n<h3>Test 5: Japanese support line (cfg_weight=0)<\/h3>\n<p><strong>Text:<\/strong> \u3053\u3093\u306b\u3061\u306f. \u3053\u306e\u30c6\u30b9\u30c8\u306f\u540c\u3058\u58f0\u3067\u65e5\u672c\u8a9e\u3092\u8a71\u3057\u307e\u3059. \u6ce8\u6587\u756a\u53f7\u306e\u4e0b4\u3051\u305f\u3092\u8a00\u3063\u3066\u304f\u3060\u3055\u3044. \u305d\u308c\u304b\u3089\u914d\u9054\u5148\u306e\u90fd\u5e02\u3092\u8a00\u3063\u3066\u304f\u3060\u3055\u3044.<\/p>\n<p><strong>Settings:<\/strong> language=ja, exaggeration=0.55, cfg_weight=0, temperature=0.8<\/p>\n<figure>\n  <img decoding=\"async\" src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-wave-5.png\" alt=\"Waveform preview for Test 5\" \/><figcaption>Prompt: \u3053\u3093\u306b\u3061\u306f. \u3053\u306e\u30c6\u30b9\u30c8\u306f\u540c\u3058\u58f0\u3067\u65e5\u672c\u8a9e\u3092\u8a71\u3057\u307e\u3059. \u6ce8\u6587\u756a\u53f7\u306e\u4e0b4\u3051\u305f\u3092\u8a00\u3063\u3066\u304f\u3060\u3055\u3044. \u305d\u308c\u304b\u3089\u914d\u9054\u5148\u306e\u90fd\u5e02\u3092\u8a00\u3063\u3066\u304f\u3060\u3055\u3044.<\/figcaption><\/figure>\n<p><audio controls preload=\"none\" style=\"width:100%\"><source src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-multilingual-test-5-ja.mp3\" type=\"audio\/mpeg\" \/><\/audio><\/p>\n<h3>Test 6: Spanish support line (cfg_weight=0)<\/h3>\n<p><strong>Text:<\/strong> Hola. Este test usa la misma voz en espanol. Di los ultimos cuatro digitos del pedido. Luego di la ciudad de entrega.<\/p>\n<p><strong>Settings:<\/strong> language=es, exaggeration=0.55, cfg_weight=0, temperature=0.8<\/p>\n<figure>\n  <img decoding=\"async\" src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-wave-6.png\" alt=\"Waveform preview for Test 6\" \/><figcaption>Prompt: Hola. Este test usa la misma voz en espanol. Di los ultimos cuatro digitos del pedido. Luego di la ciudad de entrega.<\/figcaption><\/figure>\n<p><audio controls preload=\"none\" style=\"width:100%\"><source src=\"https:\/\/wiro.ai\/blog\/wp-content\/uploads\/2026\/04\/chatterbox-multilingual-test-6-es.mp3\" type=\"audio\/mpeg\" \/><\/audio><\/p>\n<h2>Speed snapshot (task elapsed time)<\/h2>\n<table>\n<thead>\n<tr>\n<th>Test<\/th>\n<th>Elapsed (s)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>44<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>10<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>7<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>10<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>9<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>10<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Takeaways<\/h2>\n<ul>\n<li>cfg_weight and exaggeration change the feel quickly. Small adjustments matter.<\/li>\n<li>For cross-language voice transfer, cfg_weight=0 offers a clean baseline to judge accent carryover.<\/li>\n<li>Short scripts make it easier to spot pronunciation issues, pacing drift, and number reading.<\/li>\n<\/ul>\n<h2>Try it<\/h2>\n<p><a href=\"https:\/\/wiro.ai\/models\/resemble-ai\/chatterbox-multilingual\">Run Chatterbox Multilingual on Wiro<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Chatterbox Multilingual is most interesting when the same reference voice has to survive language transfer, pacing changes, and expressive settings without falling&hellip;<\/p>\n","protected":false},"author":4,"featured_media":2418,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[52],"tags":[166,95,207,62,68],"class_list":["post-2431","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-model-reviews","tag-chatterbox-multilingual","tag-multilingual","tag-speech-to-speech","tag-text-to-speech","tag-voice-clone"],"_links":{"self":[{"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/posts\/2431","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/comments?post=2431"}],"version-history":[{"count":2,"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/posts\/2431\/revisions"}],"predecessor-version":[{"id":2883,"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/posts\/2431\/revisions\/2883"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/media\/2418"}],"wp:attachment":[{"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/media?parent=2431"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/categories?post=2431"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wiro.ai\/blog\/wp-json\/wp\/v2\/tags?post=2431"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}