Kling V3 Omni: 3 Sound-On Text-to-Video Tests (720p) - Wiro AI

Kling V3 Omni is a video model that can generate motion and sound from a text prompt and a single first-frame image. In this post, I ran three quick 5-second tests at 720p with sound on, using the same settings each time.

Model

Kling V3 Omni on Wiro

Settings used

Mode	std (720p)
Duration	5 seconds
Ratio	9:16
Sound	on
CFG scale	0.5
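
To keep the settings identical across runs, they can be captured once in code. The snippet below is a minimal sketch, not Wiro's actual API: the field names (`mode`, `duration_seconds`, `aspect_ratio`, `sound`, `cfg_scale`, `first_frame`) and the `build_payload` helper are assumptions made for illustration, and only the values come from the table above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ClipSettings:
    """Settings from this post. Field names are hypothetical, not Wiro's schema."""
    mode: str = "std"            # "std" is the 720p mode used in this post
    duration_seconds: int = 5
    aspect_ratio: str = "9:16"
    sound: bool = True
    cfg_scale: float = 0.5


def build_payload(prompt: str, first_frame: str,
                  settings: ClipSettings = ClipSettings()) -> dict:
    """Combine a prompt and a first-frame image reference with fixed settings."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    payload = asdict(settings)
    payload["prompt"] = prompt
    payload["first_frame"] = first_frame  # path or URL; hypothetical key
    return payload


if __name__ == "__main__":
    # Example: the same settings reused for one of the three tests.
    print(build_payload("Close-up portrait shot. ...", "dog_in_snow.png"))
```

Because the dataclass is frozen, the same defaults apply to every test, and only the prompt and first frame change between runs.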

Results (3 tests)

Test 1: Mountain hike

First-frame input image for Kling V3 Omni test: Mountain hike

Prompt: Wide tracking shot from behind. The hiker from @image carefully hikes down the rocky mountain trail at golden hour. Loose gravel shifts under each step. The camera smoothly follows, slight parallax in foreground rocks, distant town and ocean in soft haze. Audio: crisp footstep crunch on rocks, gentle mountain wind.

Test 2: Convertible drive

First-frame input image for Kling V3 Omni test: Convertible drive

Prompt: Mid-shot tracking profile. The silver vintage convertible from @image drives smoothly along the overpass from right to left. Wheels spin realistically, sunlight glints on chrome, subtle motion blur on the road. Audio: low vintage engine purr, tires rolling on asphalt.

Test 3: Dog in snow

First-frame input image for Kling V3 Omni test: Dog in snow

Prompt: Close-up portrait shot. The golden retriever from @image sits in the snow and slowly tilts its head, blinking once, breath visible in the cold air. Very slow push-in toward the face, shallow depth of field. Audio: soft winter wind, gentle panting and a faint collar jingle.

Notes

Audio behaves best when the prompt names 1-2 clear sources (engine, footsteps, wind).
Keep camera direction simple for 5-second clips (tracking, push-in, slow pan).
If faces drift, reduce motion and avoid fast turns.
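
The prompts in this post follow a loose pattern: a shot description, the action with an `@image` reference, then an `Audio:` clause. The helper below is a rough sketch of that pattern. The function name `build_prompt` and the `max_sources` limit are my own assumptions, and the limit applies the 1-2 source note more strictly than the tests above do (Test 3 names three sounds).

```python
def build_prompt(shot: str, action: str, audio_sources: list[str],
                 max_sources: int = 2) -> str:
    """Assemble 'Shot. Action. Audio: a, b.' from its parts.

    Hypothetical helper: rejects prompts with no audio source or with
    more than max_sources, following the note about naming 1-2 sources.
    """
    if not 1 <= len(audio_sources) <= max_sources:
        raise ValueError(
            f"expected 1 to {max_sources} audio sources, got {len(audio_sources)}"
        )

    def sentence(text: str) -> str:
        # Normalize each part so it ends with exactly one period.
        return text.strip().rstrip(".") + "."

    audio = "Audio: " + ", ".join(s.strip() for s in audio_sources)
    return " ".join([sentence(shot), sentence(action), sentence(audio)])


if __name__ == "__main__":
    print(build_prompt(
        "Mid-shot tracking profile",
        "The silver vintage convertible from @image drives smoothly along the overpass",
        ["low vintage engine purr", "tires rolling on asphalt"],
    ))
```

Raising `max_sources` to 3 would accept a prompt like the one in Test 3.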
