Wiro AI


Explore

Featured Model
2
Video to Video

netflix/void-model

Netflix VOID Model removes objects from videos and rewrites shadows, reflections, and motion after the edit. Use it for VFX cleanup and counterfactual scene edits.

8
Fast Inference

pruna/p-video-avatar

Generate a lip-synced talking avatar video from a single portrait image plus a script or uploaded audio. Pick a voice, language, and 720p or 1080p output.

Image to Video

alibaba/happyhorse 1.0 reference

Edit a short MP4/MOV clip or generate a new 720p/1080p video from up to 9 reference images and a detailed prompt, with optional watermark and seed control.
Point: 5 (17 users)
15
Text to Video

alibaba/happyhorse 1.0

Alibaba’s HappyHorse 1.0 generates 5–15s videos from text or a first-frame image. Choose 720p or 1080p, aspect ratio, seed, and watermark.
Point: 4.4 (19 users)
18
Social Media & Viral

wiro/Wildlife Documentary Effect

Cinematic wildlife documentary videos generated from a single portrait, in the style of nature programming. 13 scenarios across predator hunts, reef encounters and aerial wildlife.
Point: 5 (14 users)
12

Recently Added

Popular Models
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.75 (88 users)
75
Text to Video

klingai/kling-v3

Generate high-quality videos from text prompts using Kling V3. Supports custom frames, duration, and aspect ratios.
Point: 4.86 (71 users)
69
Image to Video

klingai/kling-v2.6-motion-control

Generates videos from images and reference videos with motion control. Supports custom prompts and character orientation settings.
Point: 4.8 (60 users)
61
Text to Video

ByteDance/seedance-pro-v1.5-uncensored

Seedance Pro v1.5 Uncensored by ByteDance generates short videos from text with optional native audio, strong prompt following, and accurate lip-sync for dialogue scenes.
Point: 4.74 (58 users)
60
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.77 (41 users)
37
Fast Inference

google/nano-banana-pro

Google's Gemini 3 Pro Image Preview model, also known as Nano Banana, for text-to-image and image-to-image generation.
Point: 4.84 (25 users)
24
Text to Image

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.
Point: 4.67 (22 users)
16
Text to Video

alibaba/happyhorse 1.0

Alibaba’s HappyHorse 1.0 generates 5–15s videos from text or a first-frame image. Choose 720p or 1080p, aspect ratio, seed, and watermark.
Point: 4.4 (19 users)
18
Image to Video

alibaba/happyhorse 1.0 reference

Edit a short MP4/MOV clip or generate a new 720p/1080p video from up to 9 reference images and a detailed prompt, with optional watermark and seed control.
Point: 5 (17 users)
15
Partner LLM

google/gemini-3-pro

Gemini 3 Pro is Google's advanced AI model designed for complex reasoning and natural language understanding tasks.
Point: 4.83 (15 users)
10
Social Media & Viral

wiro/Wildlife Documentary Effect

Cinematic wildlife documentary videos generated from a single portrait, in the style of nature programming. 13 scenarios across predator hunts, reef encounters and aerial wildlife.
Point: 5 (14 users)
12
Image to Video

alibaba/Wan 2.7 Reference

Generate 2-10s videos in 720p or 1080p from reference images or a short clip. Wan 2.7 R2V keeps subject identity while following your prompt.
Point: 5 (14 users)
13
Text to Video

alibaba/wan 2.7

Generate videos from text prompts using Alibaba's WAN 2.7 model. Supports both text-to-video and image-to-video conversion with customizable duration and resolution.
Point: 5 (14 users)
13
Video to Video

ByteDance/Seedance 2.0 Fast Reference

Seedance 2.0 Fast R2V by ByteDance is the faster, lower-cost variant of Seedance 2.0 R2V. It accepts reference images, videos, and audio at 480p or 720p (1080p not supported).
Point: 5 (13 users)
8
Image to Video

klingai/kling-v3-motion-control

Transform images into dynamic videos with motion control using Kling V3. Adjust character movement and video quality with customizable parameters.
Point: 5 (13 users)
10
Text to Video

ByteDance/Seedance 2.0

Seedance 2.0 by ByteDance generates short MP4 videos from text, with optional image and audio references. It supports synced audio, stable motion, and cinematic camera direction.
Point: 5 (12 users)
12
Image to Video

PixVerse/image-to-video-v5

Access the PixVerse V5 API for fast image to video generation with simple integration and clear pricing. Produce high quality motion, stable scenes, and accurate prompt control from a single image.
Point: 4.67 (12 users)
8
Social Media & Viral

wiro/Instagram Pose Multi

Generate stylish Instagram-style pose images with trendy angles, natural expressions, and a modern aesthetic. Built by Wiro for social content.
Point: 4.8 (11 users)
8
Text to Video

alibaba/wan 2.6

Generates videos from text prompts or images using Alibaba's Wan 2.6 model. Supports multiple modes and customization options.
Point: 5 (11 users)
10
Text to Video

openai/sora-2-pro

OpenAI's Sora 2 Pro model for text-to-video or image-to-video generation.
Point: 4.55 (11 users)
9
Generate Videos
Video to Video

netflix/void-model

Netflix VOID Model removes objects from videos and rewrites shadows, reflections, and motion after the edit. Use it for VFX cleanup and counterfactual scene edits.
Point: 4 (2 users)
2
Fast Inference

pruna/p-video-avatar

Generate a lip-synced talking avatar video from a single portrait image plus a script or uploaded audio. Pick a voice, language, and 720p or 1080p output.
Point: 4.5 (9 users)
8
Image to Video

alibaba/happyhorse 1.0 reference

Edit a short MP4/MOV clip or generate a new 720p/1080p video from up to 9 reference images and a detailed prompt, with optional watermark and seed control.
Point: 5 (17 users)
15
Text to Video

alibaba/happyhorse 1.0

Alibaba’s HappyHorse 1.0 generates 5–15s videos from text or a first-frame image. Choose 720p or 1080p, aspect ratio, seed, and watermark.
Point: 4.4 (19 users)
18
Social Media & Viral

wiro/Wildlife Documentary Effect

Cinematic wildlife documentary videos generated from a single portrait, in the style of nature programming. 13 scenarios across predator hunts, reef encounters and aerial wildlife.
Point: 5 (14 users)
12
Social Media & Viral

wiro/Transformation Effect

Henshin, outfit cycles, animal/object morphs, age progressions and era shifts generated from a single portrait. 121 scenarios.
Point: 5 (2 users)
2
Social Media & Viral

wiro/Supernatural Presence Effect

Analog horror, ghost and paranormal cinematic short videos from a single portrait. 67 scenarios across uncanny domestic, found-footage and liminal spaces.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Superhero Powers Effect

Cinematic superhero power-up, aura and transformation videos generated from a single portrait. 35 scenarios.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Sports Extreme Effect

Cinematic extreme sports videos generated from a single portrait. 27 scenarios across parkour, BMX, surf, ATV, snowboard, MotoGP and lunar skiing.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Scale Shift Effect

Cosmic-to-micro scale-shift videos generated from a single portrait. 31 scenarios across galaxy zooms, ant scale, giant stomp and ocean trench dives.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Retro Period Aesthetic Effect

Era-shift cinematic short videos from a single portrait. 33 retro and period-aesthetic scenarios across a century of cinema styles.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Reality Warp Effect

Portals, dimensional rifts and reality-warp videos generated from a single portrait. 43 scenarios across sci-fi, surreal and fantasy.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Effect

Cinematic product, food and fashion short videos from one photo. 28 scenarios with motion design, slow-mo physics and stylized lighting.
Point: 0 (0 users)
0
Social Media & Viral

wiro/POV FPV Effect

First-person and FPV cinematic adventure videos generated from a single portrait. 61 scenarios across dragon flight, wingsuit, animal POV, parkour, FPV drone racing and lunar handheld footage.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Movie Scene Homage Effect

Iconic movie-scene homage videos generated from a single portrait. 24 scenarios across sci-fi action, war epic, noir, samurai and adventure.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Epic Disaster Effect

Hyperrealistic cinematic disaster videos from a single portrait photo. 14 scenarios across meteor strikes, supervolcanoes, earthquakes, tsunamis and more.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Epic Combat Effect

Star in epic cinematic battle videos generated from a single portrait. 92 scenarios across kaiju, fantasy, mecha, samurai, wuxia and modern action.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Emotional Connection Effect

Cinematic emotional and intimate short films from a single portrait photo. 35 scenarios across cozy, romantic and dramatic moments.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Elemental Command Effect

Bend fire, ice, wind, lightning, time and earth in cinematic VFX shorts generated from a single portrait. 13 scenarios.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Dance Performance Effect

Stage your portrait as a stylized dance or musical performance video. 25 scenarios across global cultures and stage styles.
Point: 0 (0 users)
0
Generate Images
Text to Image

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.
Point: 4.67 (22 users)
16
Text to Image

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.
Point: 3 (9 users)
7
Text to Image

openai/gpt-image-1-5

Generate or edit images using text prompts or image edits with GPT Image 1.5. Supports multiple sizes, formats, and quality settings.
Point: 5 (7 users)
7
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.75 (88 users)
75
Fast Inference

FireRedTeam/FireRed-Image-Edit-1.1

FireRed-Image-Edit-1.1 significantly enhances identity consistency, multi-image conditioning, and domain-specialized editing capabilities, bringing it closer to meeting real-world creative production demands.
Point: 5 (1 user)
1
Text to Image

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.
Point: 3.63 (10 users)
10
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.77 (41 users)
37
Text to Image

ByteDance/seedream-v5-lite

Generate high-quality images using Seedream V5 Lite, supporting both image-to-image and text-to-image transformations with customizable settings.
Point: 5 (2 users)
2
Fast Inference

FireRedTeam/FireRed-Image-Edit

FireRed-Image-Edit is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.
Point: 5 (1 user)
1
Fast Inference

zai-org/GLM-IMAGE

GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM-Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text rendering and knowledge-intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high-fidelity and fine-grained detail generation. In addition to text-to-image generation, GLM-Image also supports a rich set of image-to-image tasks including image editing, style transfer, identity-preserving generation, and multi-subject consistency.
Point: 5 (1 user)
1
Fast Inference

black-forest-labs/FLUX.2-klein-base-9B

FLUX.2 [klein] 9B Base is a 9 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-base-4B

FLUX.2 [klein] 4B Base is a 4 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-4B

FLUX.2 [klein] 4B is a 4 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-9B

FLUX.2 [klein] 9B is a 9 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

wiro/FLUX.2-dev-turbo

FLUX.2 [dev] is a 32 billion parameter rectified flow transformer capable of generating, editing and combining images based on text instructions.
Point: 0 (0 users)
1
Text to Image

meituan-longcat/LongCat-Image-Edit

LongCat-Image-Edit, the image editing version of LongCat-Image.
Point: 5 (1 user)
0
Text to Image

meituan-longcat/LongCat-Image

LongCat-Image is a 6B-parameter model built for high-quality image generation, delivering strong multilingual text rendering, realistic visuals, and efficient deployment.
Point: 0 (0 users)
0
Text to Image

ByteDance/seedream-v4-5

Access the Seedream 4.5 API for fast high resolution image generation and editing. Simple integration, clear pricing, and support for text to image, multi image input, and advanced creative workflows.
Point: 5 (3 users)
3
Fast Inference

wiro/FLUX.2-dev

FLUX.2 [dev] is a 32 billion parameter rectified flow transformer capable of generating, editing and combining images based on text instructions.
Point: 5 (2 users)
4
Text to Image

black-forest-labs/flux-2-flex

Flux 2 Flex model
Point: 4.5 (2 users)
1
Edit Images
Text to Image

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.
Point: 4.67 (22 users)
16
Text to Image

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.
Point: 3 (9 users)
7
Text to Image

openai/gpt-image-1-5

Generate or edit images using text prompts or image edits with GPT Image 1.5. Supports multiple sizes, formats, and quality settings.
Point: 5 (7 users)
7
Social Media & Viral

wiro/Instagram Pose Multi

Generate stylish Instagram-style pose images with trendy angles, natural expressions, and a modern aesthetic. Built by Wiro for social content.
Point: 4.8 (11 users)
8
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.75 (88 users)
75
Fast Inference

FireRedTeam/FireRed-Image-Edit-1.1

FireRed-Image-Edit-1.1 significantly enhances identity consistency, multi-image conditioning, and domain-specialized editing capabilities, bringing it closer to meeting real-world creative production demands.
Point: 5 (1 user)
1
Text to Image

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.
Point: 3.63 (10 users)
10
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.77 (41 users)
37
Text to Image

ByteDance/seedream-v5-lite

Generate high-quality images using Seedream V5 Lite, supporting both image-to-image and text-to-image transformations with customizable settings.
Point: 5 (2 users)
2
Fast Inference

FireRedTeam/FireRed-Image-Edit

FireRed-Image-Edit is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.
Point: 5 (1 user)
1
Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.
Point: 2 (4 users)
1
Fast Inference

zai-org/GLM-IMAGE

GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM-Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text rendering and knowledge-intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high-fidelity and fine-grained detail generation. In addition to text-to-image generation, GLM-Image also supports a rich set of image-to-image tasks including image editing, style transfer, identity-preserving generation, and multi-subject consistency.
Point: 5 (1 user)
1
Fast Inference

black-forest-labs/FLUX.2-klein-base-9B

FLUX.2 [klein] 9B Base is a 9 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-base-4B

FLUX.2 [klein] 4B Base is a 4 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-4B

FLUX.2 [klein] 4B is a 4 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-9B

FLUX.2 [klein] 9B is a 9 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Virtual Try-On-V2

Generate realistic virtual try-on images and videos for fashion products using AI. Supports multiple garment uploads and photography styles.
Point: 4.5 (2 users)
2
Text to Image

ByteDance/seedream-v4-5

Access the Seedream 4.5 API for fast high resolution image generation and editing. Simple integration, clear pricing, and support for text to image, multi image input, and advanced creative workflows.
Point: 5 (3 users)
3
Text to Image

black-forest-labs/flux-2-flex

Flux 2 Flex model
Point: 4.5 (2 users)
1
Text to Image

black-forest-labs/flux-2-pro

Flux 2 Pro model
Point: 3.5 (2 users)
2
Generate Text
Fast Inference

nvidia/parakeet-tdt-0.6b-v3

Multilingual speech-to-text for 25 European languages with auto language detection, punctuation, capitalization, and optional timestamps.
Point: 5 (5 users)
8
Fast Inference

CohereLabs/cohere-transcribe-03-2026

CohereLabs cohere-transcribe-03-2026 is a 2B Conformer speech-to-text model for 14 languages. It creates accurate transcripts for meetings, calls, and audio archives.
Point: 5 (4 users)
7
Fast Inference

Qwen/Qwen3-ASR-1.7B

A lightweight speech-to-text model optimized for fast inference. Converts audio input into text with support for multiple languages.
Point: 0 (0 users)
0
Fast Inference

mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.
Point: 0 (0 users)
0
Speech to Text

nvidia/nemotron

Nemotron-Speech-Streaming-En-0.6b is the first unified model in the Nemotron Speech family, engineered to deliver high-quality English transcription across both low-latency streaming and high-throughput batch workloads. The model natively supports punctuation and capitalization and offers runtime flexibility with configurable chunk sizes, including 80ms, 160ms, 560ms, and 1120ms.
Point: 5 (1 user)
0
Speech to Text

elevenlabs/speech-to-text

Speech to text model from ElevenLabs
Point: 0 (0 users)
0
Image to Text

moondream3-preview/detect

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/point

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/caption

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/query

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Speech to Text

openai/whisper-large-v3-turbo-turkish

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many datasets and domains without the need for fine-tuning.
Point: 5 (1 user)
0
Video to Text

wiro/video-nsfw-detection

NSFW video detection automatically analyzes video content to identify inappropriate or explicit material, ensuring compliance with content policies and a safe viewing environment.
Point: 5 (1 user)
0
Video to Text

DAMO-NLP-SG/VideoLLaMA3-2B

VideoLLaMA3-2B is a model designed for video understanding.
Point: 0 (0 users)
0
Image to Text

DAMO-NLP-SG/VideoLLaMA3-2B-Image

VideoLLaMA3-2B-Image is a model designed for image understanding.
Point: 0 (0 users)
0
Image to Text

wiro/VideoLLaMA3-7B-Image

VideoLLaMA3-7B-Image is a model designed for image understanding.
Point: 0 (0 users)
0
Video to Text

wiro/VideoLLaMA3-7B

VideoLLaMA3-7B is a model designed for video understanding.
Point: 0 (0 users)
0
Image to Text

Salesforce/blip2-flan-t5-xl

BLIP-2 creates captions or detailed descriptions for images. This is the BLIP-2 variant that uses Flan T5-xl as its language model.
Point: 0 (0 users)
0
Image to Text

Salesforce/blip-image-captioning-large

BLIP can perform various multimodal tasks, including visual question answering and image captioning. This is the large BLIP image-captioning model.
Point: 0 (0 users)
0
Image to Text

Salesforce/blip-image-captioning-base

BLIP can perform various multimodal tasks, including visual question answering and image captioning. This is the base BLIP image-captioning model.
Point: 0 (0 users)
0
Video to Text

wiro/ask-video

Convert the video data into textual descriptions.
Point: 4 (1 user)
0
Generate 3D
3D Generation

tencent/HY-World-2.0-World-Reconstruction

Reconstruct a 3D scene from multi-view photos or a short video. Produces 3D Gaussian splats with depth maps and cameras, plus a colored point cloud.
Point: 5 (1 user)
1
3D Generation

microsoft/Trellis-2

Convert images into detailed 3D meshes using Microsoft's Trellis-2 model. Supports various resolutions and customization options.
Point: 5 (1 user)
1
3D Generation

tencent/Hunyuan3D-2.1

Generate 3D models from images using the Hunyuan3D-2.1 AI tool. Transform 2D inputs into detailed 3D assets for design and development.
Point: 5 (2 users)
1
Generate Audio
Fast Inference

openbmb/VoxCPM2

A real-time, multilingual text-to-speech system offering expressive voice design and high-fidelity voice cloning through low-latency streaming inference.
Point: 0 (0 users)
1
Fast Inference

k2-fsa/OmniVoice

OmniVoice by k2-fsa generates 24 kHz speech from text in 600+ languages. Clone a speaker from a short reference clip or design a new voice from attributes.
Point: 0 (0 users)
0
Fast Inference

humeai/tada-3b-ml

TADA is a unified speech-language model that synchronizes speech and text into a single, cohesive stream via 1:1 alignment. By leveraging a novel tokenizer and architectural design, TADA achieves high-fidelity synthesis and generation with a fraction of the computational overhead required by traditional models.
Point: 0 (0 users)
0
Fast Inference

fishaudio/s2-pro

Generates high-quality speech from text using advanced TTS technology with support for voice cloning and multi-speaker synthesis.
Point: 0 (0 users)
0
Fast Inference

nineninesix/kani-tts-2-en

Generates natural-sounding speech from text with support for multi-speaker voice cloning and fast inference capabilities.
Point: 0 (0 users)
0
Text to Speech

resemble-ai/chatterbox-turbo

The fastest open source TTS model without sacrificing quality.
Point: 0 (0 users)
0
Text to Speech

resemble-ai/chatterbox-multilingual

Generate expressive, natural speech in 23 languages. Features instant voice cloning from short audio, emotion control, and seamless cross-language voice transfer.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTSD

MOSS-TTSD is a production long-form dialogue model for expressive multi-speaker conversational audio at scale. It supports long-duration continuity, turn-taking control, and zero-shot voice cloning from short references for podcasts, audiobooks, commentary, dubbing, and entertainment dialogue.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTS-Realtime

Real-time streaming text-to-speech with zero-shot voice cloning. Supports 20 languages including English, Chinese, Japanese, Korean, and more. Audio starts playing immediately — no waiting for full generation. Clone any voice from a short reference clip.
Point: 0 (0 users)
0
Fast Inference

elevenlabs/Realtime Conversational AI

A real-time voice conversation tool using ElevenLabs' AI voice agents. Customize voices, behaviors, and languages for interactive AI experiences.
Point: 5 (1 user)
0
Fast Inference

openai/gpt-realtime-mini

GPT Mini Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 user)
0
Fast Inference

openai/gpt-realtime

GPT Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 user)
0
Fast Inference

Qwen/Qwen3-TTS-12Hz-1.7B

A fast inference text-to-speech model optimized for real-time audio generation with multi-language support.
Point: 0 (0 users)
0
Speech to Speech

nvidia/PersonaPlex-Realtime

Convert speech to speech with customizable voices using PersonaPlex. Supports various audio formats and offers control over text temperature and audio top K settings.
Point: 4 (3 users)
1
Fast Inference

microsoft/VibeVoice-Realtime

VibeVoice-Realtime is a lightweight real‑time text-to-speech model supporting streaming text input and robust long-form speech generation.
Point: 5 (1 user)
1
Text to Speech

openbmb/VoxCPM

Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Point: 0 (0 users)
0
Text to Speech

google/gemini-2.5-tts

Google's Gemini 2.5 Flash Text To Speech Preview model
Point: 5 (1 user)
5
Text to Speech

elevenlabs/text-to-speech

Text to speech model from ElevenLabs
Point: 5 (1 user)
0
Social Media & Viral

wiro/Faceless-Video-Generator

Create professional short videos (30s) for YouTube Shorts, Instagram Reels, TikTok, and X (Twitter) – from a single prompt. Automatically generate speech, captions, and optional talking head avatars using AI. Perfect for content creators, marketers, and educators looking to grow faster with less effort.
Point: 5 (1 user)
1
Speech to Speech

wiro/RVC-Voice-Clone-Youtube

Clone any voice and cover any YouTube song with it. This tool lets you apply a custom-trained RVC voice model to recreate any YouTube song in that voice.
Point: 5 (1 user)
0
Generate Music
Music Generation

tencent-ailab/SongGeneration 2

Song Generation 2 generates full songs with vocals and instrumentals from lyrics — supports 14 genres, reference audio cloning, and separate track output (vocal/bgm/mix).
Point: 5 (3 users)
3
Video to Video

wiro/video-background-music-v2

It turns any video into a cinematic experience by generating AI-powered instrumental soundtracks that match its mood.
Point: 4 (1 user)
3
Music Generation

ACE-Step/text-to-song-ACE-Step1.5

ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer hardware.
Point: 5 (4 users)
2
Social Media & Viral

wiro/Song Frame

SongFrame places you into a cinematic world, pulls the soundtrack directly from your YouTube link, and fuses everything into a polished video — effortless, emotional, and instantly shareable.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Faceless-Video-Generator

Create professional short videos (30s) for YouTube Shorts, Instagram Reels, TikTok, and X (Twitter) – from a single prompt. Automatically generate speech, captions, and optional talking head avatars using AI. Perfect for content creators, marketers, and educators looking to grow faster with less effort.
Point: 5 (1 user)
1
Video to Video

wiro/video-background-music-gen

Takes your video, creates original music to match its vibe, and adds it back as the soundtrack.
Point: 0 (0 users)
0
Music Generation

ACE-Step/image-to-song-ACE-Step-v1-3.5B

Generate high-quality songs from an image in seconds. Whether you're crafting instrumental tracks or full vocal compositions, bring your musical ideas to life with the power of AI. Ideal for artists, producers, and creative minds looking to turn inspiration into sound.
Point: 0 (0 users)
0
Music Generation

ACE-Step/text-to-song-ACE-Step-v1-3.5B

Generate high-quality songs from text prompts in seconds. Whether you're crafting instrumental tracks or full vocal compositions, bring your musical ideas to life with the power of AI. Ideal for artists, producers, and creative minds looking to turn inspiration into sound.
Point: 5 (1 user)
0
Music Generation

wiro/image-to-song-with-reference-YuE

Turn any song into your own with AI. Simply upload a reference track and provide your image — AI will recreate the song in your style, preserving the vocal style, instrumental feel, or both. Perfect for covers, parodies, remixes, or personalized creations.
Point: 0 (0 users)
0
Music Generation

wiro/image-to-song-YuE

Turn your image into a full song with vocals and instrumental music in seconds. Just upload your image and let AI compose and sing it for you. Perfect for creators, musicians, and storytellers.
Point: 0 (0 users)
0
Music Generation

wiro/text-to-song-with-reference-YuE

Turn any song into your own with AI. Simply upload a reference track and provide your custom lyrics — AI will recreate the song in your style, preserving the vocal style, instrumental feel, or both. Perfect for covers, parodies, remixes, or personalized creations.
Point: 0 (0 users)
0
Music Generation

wiro/text-to-song-YuE

Turn your lyrics into a full song with vocals and instrumental music in seconds. Just enter your lyrics and let AI compose and sing it for you. Perfect for creators, musicians, and storytellers.
Point: 0 (0 users)
0
Music Generation

wiro/music_gen

MusicGen is a text-to-music model capable of generating high-quality music samples.
Point: 0 (0 users)
0
Realtime Stream
Fast Inference

openbmb/VoxCPM2

A real-time, multilingual text-to-speech system offering expressive voice design and high-fidelity voice cloning through low-latency streaming inference.
Point: 0 (0 users)
1
Fast Inference

mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTS-Realtime

Real-time streaming text-to-speech with zero-shot voice cloning. Supports 20 languages including English, Chinese, Japanese, Korean, and more. Audio starts playing immediately — no waiting for full generation. Clone any voice from a short reference clip.
Point: 0 (0 users)
0
Fast Inference

elevenlabs/Realtime Conversational AI

A real-time voice conversation tool using ElevenLabs' AI voice agents. Customize voices, behaviors, and languages for interactive AI experiences.
Point: 5 (1 users)
0
Fast Inference

openai/gpt-realtime-mini

GPT Mini Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 users)
0
Fast Inference

openai/gpt-realtime

GPT Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 users)
0
Speech to Speech

nvidia/PersonaPlex-Realtime

Convert speech to speech with customizable voices using PersonaPlex. Supports various audio formats and offers control over text temperature and audio top K settings.
Point: 4 (3 users)
1
LLM & Chat
Chat

Qwen/Qwen3.6-27B

Qwen3.6-27B is a 27B dense vision-language model from Qwen for agentic coding and reasoning. It supports 262K-token context and optional thinking traces in outputs.
Point: 5 (5 users)
6
LLM

xai/grok-4-1-fast

Grok 4.1 Fast by xAI is a long-context chat model built for tool calling and agent workflows. It can analyze images and return grounded text answers with sources.
Point: 5 (5 users)
7
LLM

xai/grok-4-20

Grok 4.20 is xAI’s flagship text model with a 2M-token context window, tool calling, and structured JSON outputs. Attach images for vision Q&A and OCR.
Point: 5 (3 users)
5
Chat

Qwen/Qwen3.5-4B-heretic

Decensored Qwen3.5-4B checkpoint for long-context chat, coding, and analysis. Supports optional thinking traces and sampling controls for output style.
Point: 5 (4 users)
6
Chat

Qwen/Qwen3.5-9B-heretic

A large language model optimized for chat interactions and logical reasoning tasks. Designed for developers and researchers.
Point: 5 (2 users)
3
Chat

Qwen/Qwen3.5-4B

A compact yet capable LLM optimized for chat interactions and logical reasoning tasks. Designed for efficient deployment and accurate responses.
Point: 5 (7 users)
7
Chat

Qwen/Qwen3.5-9B

A dense 9-billion parameter language model optimized for chat and reasoning tasks. Designed for efficient deployment and high-quality responses.
Point: 5 (3 users)
4
Chat

Qwen/Qwen3.5-27B-heretic

A large language model optimized for chat interactions and reasoning tasks, designed for advanced users seeking powerful natural language processing capabilities.
Point: 5 (2 users)
3
LLM

bytedance/seed-v2-mini

A lightweight language model optimized for efficient inference and versatile applications in natural language processing tasks.
Point: 5 (3 users)
5
LLM

bytedance/seed-v2-lite

A lightweight text-to-image generation model optimized for visual content creation using prompts and optional images.
Point: 5 (2 users)
4
Chat

Qwen/Qwen3.5-27B

Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.
Point: 5 (1 users)
1
Partner LLM

google/gemini-3-pro

Gemini 3 Pro is Google's advanced AI model designed for complex reasoning and natural language understanding tasks.
Point: 4.83 (15 users)
10
Chat

zai-org/GLM-4.7-Flash

A chat-based language model optimized for conversational AI tasks with support for custom prompts and session management.
Point: 5 (1 users)
1
Partner LLM

openai/gpt-5-nano

A compact AI model optimized for efficient processing of complex prompts and multi-modal inputs with support for images and structured data.
Point: 4.75 (4 users)
4
Partner LLM

openai/gpt-5.2

AI model for processing text prompts and image inputs with advanced reasoning capabilities. Supports multi-modal inputs and customizable responses.
Point: 3.5 (2 users)
1
Partner LLM

openai/gpt-5-mini

A compact AI model optimized for text generation tasks, designed for efficient processing and accurate responses.
Point: 0 (0 users)
0
Partner LLM

google/gemini-3-flash

gemini-3-flash
Point: 5 (3 users)
3
Partner LLM

google/gemini-2-5-flash

gemini-2-5-flash
Point: 3 (1 users)
0
Chat

wiro/rag-chat-github

Instantly retrieve and analyze content from any GitHub repository. Select your LLM model, extract relevant information from codebases or documentation, and generate context-aware responses with ease!
Point: 0 (0 users)
0
Chat

wiro/rag-chat-youtube

Extract insights directly from YouTube videos by simply providing a URL. Choose your LLM model, access video transcripts or summaries, and create contextually rich conversations effortlessly!
Point: 5 (1 users)
2
AI Models for E-commerce
Social Media & Viral

wiro/ugc creator

Generates custom video content from product images and text for marketing campaigns.
Point: 5 (3 users)
1
Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.
Point: 2 (4 users)
1
Social Media & Viral

wiro/Product Studio

Generate 360° product videos with AI-powered effects. Transform product photos into engaging video content for e-commerce.
Point: 5 (3 users)
0
Social Media & Viral

wiro/Product with Model

Generates dynamic product videos with models in various scenes using image inputs. Designed for fast inference and e-commerce showcases.
Point: 4.8 (5 users)
3
Social Media & Viral

wiro/Virtual Try-On-V2

Generate realistic virtual try-on images and videos for fashion products using AI. Supports multiple garment uploads and photography styles.
Point: 4.5 (2 users)
2
Social Media & Viral

wiro/Animated Logo

Transform static logos into stunning animated videos with 36+ creative presets. Choose from scenes like Times Square billboards, Parisian storefronts, coffee art, neon signs, and luxury showcases.
Point: 5 (1 users)
1
Social Media & Viral

wiro/3D Text Animations

Create stunning 3D animated text videos with 22+ creative presets. Transform any text into balloon letters, neon signs, candy typography, cloud formations, and cinematic motion effects.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads with Caption

Combine product images with custom captions into stunning animated video ads. 42 creative presets featuring sales promotions, seasonal themes, and dynamic text animations.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads with Logo

Combine product images with logos into stunning animated video ads. 12 creative presets featuring storefronts, billboards, city banners, and surreal brand presentations.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads

Transform product images into stunning animated video ads with 100+ creative presets. Choose from effects like water splashes, scene transitions, surreal staging, and seasonal themes.
Point: 5 (3 users)
0
Ecommerce

wiro/camera-angle-editor

wiro/camera-angle-editor is an advanced AI tool that instantly changes the camera perspective and angle of any existing image. Leveraging sophisticated spatial reconstruction, it eliminates the need for reshoots by synthesizing photorealistic new viewpoints, making it the fastest way for creators to maximize the versatility of their visual content.
Point: 0 (0 users)
0
Ecommerce

wiro/Product Photoshoot

Save time and production costs with AI Product Photoshoot. Generate polished product images featuring adaptive lighting, varied angles, and contextual scenes. Ideal for online stores, marketing teams, and agencies looking to accelerate content creation with consistent, high-quality visuals.
Point: 5 (2 users)
2
Social Media & Viral

wiro/Virtual Try-On

Integrate the Wiro Virtual Try-On API to deliver hyper-realistic apparel fitting directly in your web, mobile, or SaaS platform. Generate lifelike visuals of users wearing new garments with precise texture mapping, pose alignment, and fabric simulation — ideal for online retail and fashion tech solutions.
Point: 5 (3 users)
1
Ecommerce

wiro/text-removal

This AI model intelligently removes unwanted text from any image, seamlessly filling in the background.
Point: 0 (0 users)
0
Ecommerce

wiro/remove-background

AI-powered background removal tool that automatically removes image backgrounds. Perfect for e-commerce product photos and quick image editing.
Point: 4.5 (1 users)
0
AI Models for Social Media Creators
Social Media & Viral

wiro/Wildlife Documentary Effect

Cinematic wildlife documentary videos generated from a single portrait, in the style of nature programming. 13 scenarios across predator hunts, reef encounters and aerial wildlife.
Point: 5 (14 users)
12
Social Media & Viral

wiro/Transformation Effect

Henshin, outfit cycles, animal/object morphs, age progressions and era shifts generated from a single portrait. 121 scenarios.
Point: 5 (2 users)
2
Social Media & Viral

wiro/Supernatural Presence Effect

Analog horror, ghost and paranormal cinematic short videos from a single portrait. 67 scenarios across uncanny domestic, found-footage and liminal spaces.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Superhero Powers Effect

Cinematic superhero power-up, aura and transformation videos generated from a single portrait. 35 scenarios.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Sports Extreme Effect

Cinematic extreme sports videos generated from a single portrait. 27 scenarios across parkour, BMX, surf, ATV, snowboard, MotoGP and lunar skiing.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Scale Shift Effect

Cosmic-to-micro scale-shift videos generated from a single portrait. 31 scenarios across galaxy zooms, ant scale, giant stomp and ocean trench dives.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Retro Period Aesthetic Effect

Era-shift cinematic short videos from a single portrait. 33 retro and period-aesthetic scenarios across a century of cinema styles.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Reality Warp Effect

Portals, dimensional rifts and reality-warp videos generated from a single portrait. 43 scenarios across sci-fi, surreal and fantasy.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Effect

Cinematic product, food and fashion short videos from one photo. 28 scenarios with motion design, slow-mo physics and stylized lighting.
Point: 0 (0 users)
0
Social Media & Viral

wiro/POV FPV Effect

First-person and FPV cinematic adventure videos generated from a single portrait. 61 scenarios across dragon flight, wingsuit, animal POV, parkour, FPV drone racing and lunar handheld footage.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Movie Scene Homage Effect

Iconic movie-scene homage videos generated from a single portrait. 24 scenarios across sci-fi action, war epic, noir, samurai and adventure.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Epic Disaster Effect

Hyperrealistic cinematic disaster videos from a single portrait photo. 14 scenarios across meteor strikes, supervolcanoes, earthquakes, tsunamis and more.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Epic Combat Effect

Star in epic cinematic battle videos generated from a single portrait. 92 scenarios across kaiju, fantasy, mecha, samurai, wuxia and modern action.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Emotional Connection Effect

Cinematic emotional and intimate short films from a single portrait photo. 35 scenarios across cozy, romantic and dramatic moments.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Elemental Command Effect

Bend fire, ice, wind, lightning, time and earth in cinematic VFX shorts generated from a single portrait. 13 scenarios.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Dance Performance Effect

Stage your portrait as a stylized dance or musical performance video. 25 scenarios across global cultures and stage styles.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Comedy Effect

Cinematic comedy reveals, twists and pet-life micro-dramas generated from a single portrait photo. 18 scenarios.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Antigravity Effect

Defy gravity in cinematic levitation and zero-gravity videos generated from a single portrait photo. 12 scenarios.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Anime Manga Style Effect

Restyle your portrait into anime, manga, watercolor ink and cel-animation short films. 48 scenarios across action, romance, fantasy and stylized art.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Animated Short Effect

Turn a single portrait photo into a stylized 3D animated short film with cinematic comedy timing and family-film CG aesthetics. 17 scenarios.
Point: 0 (0 users)
0
Wiro AI makes machine learning easily accessible to all in the cloud.

2026 © Wiro.ai | Terms of Service & Privacy Policy