Wiro AI
  • Home
  • Dashboard
  • Models
  • Agents
  • Pricing
  • Blog
  • Documentation


Explore

5
Text to Video

ByteDance/seedance-2.0

Seedance 2.0 by ByteDance generates short MP4 videos from text, with optional image and audio references. It supports synced audio, stable motion, and cinematic camera direction.

Featured Model
0
Text to Image

baidu/ERNIE-Image-Turbo

A high-quality text-to-image model developed by Baidu that supports English, Chinese, and Japanese prompts, with automatic prompt expansion built in.

Fast Inference

nvidia/parakeet-tdt-0.6b-v3

Multilingual speech-to-text for 25 European languages with auto language detection, punctuation, capitalization, and optional timestamps.
Point: 5 (4 users)
7
Text to Image

baidu/ERNIE-Image

A high-quality text-to-image model developed by Baidu that supports English, Chinese, and Japanese prompts, with automatic prompt expansion built in.
Point: 0 (0 users)
0
LLM

xai/grok-4-1-fast

Grok 4.1 Fast by xAI is a long-context chat model built for tool calling and agent workflows. It can analyze images and return grounded text answers with sources.
Point: 5 (5 users)
7

Recently Added

Popular Models
Text to Video

klingai/kling-v3

Generate high-quality videos from text prompts using Kling V3. Supports custom frames, duration, and aspect ratios.
Point: 4.85 (69 users)
68
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.67 (66 users)
49
Image to Video

klingai/kling-v2.6-motion-control

Generates videos from images and reference videos with motion control. Supports custom prompts and character orientation settings.
Point: 4.8 (60 users)
61
Text to Video

ByteDance/seedance-pro-v1.5-uncensored

Seedance Pro v1.5 Uncensored by ByteDance generates short videos from text with optional native audio, strong prompt following, and accurate lip-sync for dialogue scenes.
Point: 4.74 (54 users)
56
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.73 (37 users)
32
Fast Inference

google/nano-banana-pro

Google's Gemini 3 Pro Image Preview model, also known as Nano Banana Pro, for text-to-image and image-to-image generation.
Point: 4.83 (23 users)
22
Partner LLM

google/gemini-3-pro

Gemini 3 Pro is Google's advanced AI model designed for complex reasoning and natural language understanding tasks.
Point: 4.83 (15 users)
10
Text to Video

alibaba/wan 2.7

Generate videos from text prompts using Alibaba's WAN 2.7 model. Supports both text-to-video and image-to-video conversion with customizable duration and resolution.
Point: 5 (14 users)
13
Image to Video

klingai/kling-v3-motion-control

Transform images into dynamic videos with motion control using Kling V3. Adjust character movement and video quality with customizable parameters.
Point: 5 (12 users)
10
Image to Video

PixVerse/image-to-video-v5

Access the PixVerse V5 API for fast image to video generation with simple integration and clear pricing. Produce high quality motion, stable scenes, and accurate prompt control from a single image.
Point: 4.67 (12 users)
8
Social Media & Viral

wiro/Instagram Pose Multi

Generate stylish Instagram-style pose images with trendy angles, natural expressions, and a modern aesthetic. Built by Wiro for social content.
Point: 4.8 (11 users)
8
Text to Video

alibaba/wan 2.6

Generates videos from text prompts or images using Alibaba's Wan 2.6 model. Supports multiple modes and customization options.
Point: 5 (11 users)
10
Text to Video

klingai/kling-v3-omni

Generate high-quality videos from text prompts and images with Kling V3 Omni. Supports 720p and 1080p modes, image references, and customizable video editing.
Point: 4 (10 users)
7
Fast Inference

MiniMax/hailuo-2.3-Fast

Experience the MiniMax Hailuo 2.3 Fast API — optimized for low latency, stable motion, and scalable performance. Generate large animation sets or motion previews in seconds without sacrificing structure.
Point: 4.6 (10 users)
6
Text to Video

openai/sora-2-pro

OpenAI's Sora 2 Pro model for text-to-video or image-to-video generation.
Point: 4.5 (10 users)
8
Text to Image

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.
Point: 3.17 (8 users)
9
Social Media & Viral

wiro/AvatarMotion-Multi

Generate avatars from photos and animate them into engaging videos in one seamless pipeline.
Point: 4.63 (8 users)
5
Fast Inference

google/nano-banana

Google's Gemini 2.5 Flash Image Preview model, also known as Nano Banana, for text-to-image and image-to-image generation.
Point: 4.38 (8 users)
10
Text to Image

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.
Point: 5 (7 users)
7
Image to Video

xai/grok-imagine-video-extension

Edit or extend a short clip with xAI Grok Imagine. Upload a source video and a detailed prompt to generate a continued MP4 sequence.
Point: 5 (7 users)
7
Generate Videos
Text to Video

ByteDance/seedance-2.0

Seedance 2.0 by ByteDance generates short MP4 videos from text, with optional image and audio references. It supports synced audio, stable motion, and cinematic camera direction.
Point: 5 (6 users)
5
Image to Video

xai/grok-imagine-video-extension

Edit or extend a short clip with xAI Grok Imagine. Upload a source video and a detailed prompt to generate a continued MP4 sequence.
Point: 5 (7 users)
7
Image to Video

xai/grok-imagine-r2v

Create short cinematic videos from a text prompt, guided by up to 16 reference images for consistent style and subjects. Built by xAI.
Point: 5 (7 users)
7
Text to Video

alibaba/wan 2.7

Generate videos from text prompts using Alibaba's WAN 2.7 model. Supports both text-to-video and image-to-video conversion with customizable duration and resolution.
Point: 5 (14 users)
13
Text to Video

xai/grok-imagine-video

Grok Imagine Video by xAI generates short MP4 videos with synced audio from a text prompt, with an optional first-frame image. Choose duration, aspect, and 480p or 720p.
Point: 5 (7 users)
8
Video to Video

wiro/video-background-music-v2

It turns any video into a cinematic experience by generating AI-powered instrumental soundtracks that match its mood.
Point: 4 (1 user)
3
Text to Video

ByteDance/seedance-pro-v1.5-uncensored

Seedance Pro v1.5 Uncensored by ByteDance generates short videos from text with optional native audio, strong prompt following, and accurate lip-sync for dialogue scenes.
Point: 4.74 (54 users)
56
Image to Video

lightricks/ ltx-2.3

LTX-2.3 is a significant update to the LTX-2 model, with improved audio and visual quality and enhanced prompt adherence.
Point: 4 (1 user)
2
Text to Video

ByteDance/seedance-v1-pro-fast

Generate videos from text prompts or images using the Seedance V1 Pro Fast model. Supports multiple resolutions and aspect ratios.
Point: 5 (6 users)
5
Image to Video

klingai/kling-v3-motion-control

Transform images into dynamic videos with motion control using Kling V3. Adjust character movement and video quality with customizable parameters.
Point: 5 (12 users)
10
Social Media & Viral

wiro/ugc creator

Generates custom video content from product images and text for marketing campaigns.
Point: 5 (3 users)
1
Text to Video

alibaba/wan 2.6

Generates videos from text prompts or images using Alibaba's Wan 2.6 model. Supports multiple modes and customization options.
Point: 5 (11 users)
10
Text to Video

klingai/kling-v3-omni

Generate high-quality videos from text prompts and images with Kling V3 Omni. Supports 720p and 1080p modes, image references, and customizable video editing.
Point: 4 (10 users)
7
Fast Inference

pruna/p-video

Generate videos from text prompts or images using P-Video AI. Supports various resolutions and customization options.
Point: 4.33 (3 users)
3
Social Media & Viral

wiro/Product Studio

Generate 360° product videos with AI-powered effects. Transform product photos into engaging video content for e-commerce.
Point: 5 (3 users)
0
Text to Video

klingai/kling-v3

Generate high-quality videos from text prompts using Kling V3. Supports custom frames, duration, and aspect ratios.
Point: 4.85 (69 users)
68
Image to Video

ByteDance/DreamActor

Transform images into dynamic videos using ByteDance's DreamActor model. Supports real people, animation, and pets with facial and body movement driving.
Point: 5 (3 users)
0
Social Media & Viral

wiro/Product with Model

Generates dynamic product videos with models in various scenes using image inputs. Designed for fast inference and e-commerce showcases.
Point: 4.8 (5 users)
3
Social Media & Viral

wiro/BabyDanceFlow

Transform any character image into a moving video. Provide a reference image and select a video effect.
Point: 3.67 (3 users)
0
Social Media & Viral

wiro/Virtual Try-On-V2

Generate realistic virtual try-on images and videos for fashion products using AI. Supports multiple garment uploads and photography styles.
Point: 4 (1 user)
0
Generate Images
Text to Image

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.
Point: 5 (7 users)
7
Text to Image

openai/gpt-image-1-5

Generate or edit images using text prompts or image edits with GPT Image 1.5. Supports multiple sizes, formats, and quality settings.
Point: 5 (7 users)
7
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.67 (66 users)
49
Fast Inference

FireRedTeam/FireRed-Image-Edit-1.1

FireRed-Image-Edit-1.1 significantly enhances identity consistency, multi-image conditioning, and domain-specialized editing capabilities, bringing it closer to meeting real-world creative production demands.
Point: 5 (1 user)
1
Text to Image

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.
Point: 3.17 (8 users)
9
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.73 (37 users)
32
Text to Image

ByteDance/seedream-v5-lite

Generate high-quality images using Seedream V5 Lite, supporting both image-to-image and text-to-image transformations with customizable settings.
Point: 5 (2 users)
2
Fast Inference

FireRedTeam/FireRed-Image-Edit

FireRed-Image-Edit is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.
Point: 5 (1 user)
1
Fast Inference

zai-org/GLM-IMAGE

GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM-Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text rendering and knowledge-intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high-fidelity and fine-grained detail generation. In addition to text-to-image generation, GLM-Image also supports a rich set of image-to-image tasks including image editing, style transfer, identity-preserving generation, and multi-subject consistency.
Point: 5 (1 user)
1
Fast Inference

black-forest-labs/FLUX.2-klein-base-9B

FLUX.2 [klein] 9B Base is a 9 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-base-4B

FLUX.2 [klein] 4B Base is a 4 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-4B

FLUX.2 [klein] 4B is a 4 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-9B

FLUX.2 [klein] 9B is a 9 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

wiro/FLUX.2-dev-turbo

FLUX.2 [dev] is a 32 billion parameter rectified flow transformer capable of generating, editing and combining images based on text instructions.
Point: 0 (0 users)
1
Text to Image

meituan-longcat/LongCat-Image-Edit

LongCat-Image-Edit, the image editing version of LongCat-Image.
Point: 5 (1 user)
0
Text to Image

meituan-longcat/LongCat-Image

LongCat-Image is a 6B-parameter model built for high-quality image generation, delivering strong multilingual text rendering, realistic visuals, and efficient deployment.
Point: 0 (0 users)
0
Text to Image

ByteDance/seedream-v4-5

Access the Seedream 4.5 API for fast high resolution image generation and editing. Simple integration, clear pricing, and support for text to image, multi image input, and advanced creative workflows.
Point: 5 (3 users)
3
Fast Inference

wiro/FLUX.2-dev

FLUX.2 [dev] is a 32 billion parameter rectified flow transformer capable of generating, editing and combining images based on text instructions.
Point: 5 (2 users)
4
Text to Image

black-forest-labs/flux-2-flex

Flux 2 Flex model
Point: 4.5 (2 users)
1
Text to Image

black-forest-labs/flux-2-pro

Flux 2 Pro model
Point: 3.5 (2 users)
2
Edit Images
Text to Image

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.
Point: 5 (7 users)
7
Text to Image

openai/gpt-image-1-5

Generate or edit images using text prompts or image edits with GPT Image 1.5. Supports multiple sizes, formats, and quality settings.
Point: 5 (7 users)
7
Social Media & Viral

wiro/Instagram Pose Multi

Generate stylish Instagram-style pose images with trendy angles, natural expressions, and a modern aesthetic. Built by Wiro for social content.
Point: 4.8 (11 users)
8
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.67 (66 users)
49
Fast Inference

FireRedTeam/FireRed-Image-Edit-1.1

FireRed-Image-Edit-1.1 significantly enhances identity consistency, multi-image conditioning, and domain-specialized editing capabilities, bringing it closer to meeting real-world creative production demands.
Point: 5 (1 user)
1
Text to Image

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.
Point: 3.17 (8 users)
9
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.73 (37 users)
32
Text to Image

ByteDance/seedream-v5-lite

Generate high-quality images using Seedream V5 Lite, supporting both image-to-image and text-to-image transformations with customizable settings.
Point: 5 (2 users)
2
Fast Inference

FireRedTeam/FireRed-Image-Edit

FireRed-Image-Edit is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.
Point: 5 (1 user)
1
Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.
Point: 2 (4 users)
1
Fast Inference

zai-org/GLM-IMAGE

GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM-Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text rendering and knowledge-intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high-fidelity and fine-grained detail generation. In addition to text-to-image generation, GLM-Image also supports a rich set of image-to-image tasks including image editing, style transfer, identity-preserving generation, and multi-subject consistency.
Point: 5 (1 user)
1
Fast Inference

black-forest-labs/FLUX.2-klein-base-9B

FLUX.2 [klein] 9B Base is a 9 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-base-4B

FLUX.2 [klein] 4B Base is a 4 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-4B

FLUX.2 [klein] 4B is a 4 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-9B

FLUX.2 [klein] 9B is a 9 billion parameter rectified flow transformer capable of generating images from text descriptions and supports multi-reference editing capabilities.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Virtual Try-On-V2

Generate realistic virtual try-on images and videos for fashion products using AI. Supports multiple garment uploads and photography styles.
Point: 4 (1 user)
0
Text to Image

ByteDance/seedream-v4-5

Access the Seedream 4.5 API for fast high resolution image generation and editing. Simple integration, clear pricing, and support for text to image, multi image input, and advanced creative workflows.
Point: 5 (3 users)
3
Text to Image

black-forest-labs/flux-2-flex

Flux 2 Flex model
Point: 4.5 (2 users)
1
Text to Image

black-forest-labs/flux-2-pro

Flux 2 Pro model
Point: 3.5 (2 users)
2
Image to Image

wiro/Store Image Generator

Create store-ready promo images from your real app screenshots. Pick Apple App Store or Google Play, then control background gradients, font, and text color.
Point: 3.67 (3 users)
1
Generate Text
Fast Inference

nvidia/parakeet-tdt-0.6b-v3

Multilingual speech-to-text for 25 European languages with auto language detection, punctuation, capitalization, and optional timestamps.
Point: 5 (4 users)
7
Fast Inference

CohereLabs/cohere-transcribe-03-2026

CohereLabs cohere-transcribe-03-2026 is a 2B Conformer speech-to-text model for 14 languages. It creates accurate transcripts for meetings, calls, and audio archives.
Point: 5 (4 users)
7
Fast Inference

Qwen/Qwen3-ASR-1.7B

A lightweight speech-to-text model optimized for fast inference. Converts audio input into text with support for multiple languages.
Point: 0 (0 users)
0
Fast Inference

mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.
Point: 0 (0 users)
0
Speech to Text

nvidia/nemotron

Nemotron-Speech-Streaming-En-0.6b is the first unified model in the Nemotron Speech family, engineered to deliver high-quality English transcription across both low-latency streaming and high-throughput batch workloads. The model natively supports punctuation and capitalization and offers runtime flexibility with configurable chunk sizes, including 80ms, 160ms, 560ms, and 1120ms.
Point: 5 (1 user)
0
Speech to Text

elevenlabs/speech-to-text

Speech to text model from ElevenLabs
Point: 0 (0 users)
0
Image to Text

moondream3-preview/detect

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/point

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/caption

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/query

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Speech to Text

openai/whisper-large-v3-turbo-turkish

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many datasets and domains without the need for fine-tuning.
Point: 5 (1 user)
0
Video to Text

wiro/video-nsfw-detection

NSFW video detection automatically analyzes video content to identify inappropriate or explicit material, ensuring compliance with content policies and a safe viewing environment.
Point: 5 (1 user)
0
Video to Text

DAMO-NLP-SG/VideoLLaMA3-2B

VideoLLaMA3-2B is a model designed for video understanding.
Point: 0 (0 users)
0
Image to Text

DAMO-NLP-SG/VideoLLaMA3-2B-Image

VideoLLaMA3-2B-Image is a model designed for image understanding.
Point: 0 (0 users)
0
Image to Text

wiro/VideoLLaMA3-7B-Image

VideoLLaMA3-7B-Image is a model designed for image understanding.
Point: 0 (0 users)
0
Video to Text

wiro/VideoLLaMA3-7B

VideoLLaMA3-7B is a model designed for video understanding.
Point: 0 (0 users)
0
Image to Text

Salesforce/blip2-flan-t5-xl

BLIP-2 creates captions or detailed descriptions for images. This is the BLIP-2 model, leveraging Flan T5-xl.
Point: 0 (0 users)
0
Image to Text

Salesforce/blip-image-captioning-large

BLIP is a model that is able to perform various multi-modal tasks including visual question answering and image captioning. This is the blip image captioning large model.
Point: 0 (0 users)
0
Image to Text

Salesforce/blip-image-captioning-base

BLIP is a model that is able to perform various multi-modal tasks including visual question answering and image captioning. This is the blip image captioning base model.
Point: 0 (0 users)
0
Video to Text

wiro/ask-video

Convert the video data into textual descriptions.
Point: 4 (1 user)
0
Generate 3D
3D Generation

microsoft/Trellis-2

Convert images into detailed 3D meshes using Microsoft's Trellis-2 model. Supports various resolutions and customization options.
Point: 5 (1 user)
1
3D Generation

tencent/Hunyuan3D-2.1

Generate 3D models from images using the Hunyuan3D-2.1 AI tool. Transform 2D inputs into detailed 3D assets for design and development.
Point: 5 (2 users)
1
Generate Audio
Fast Inference

k2-fsa/OmniVoice

OmniVoice by k2-fsa generates 24 kHz speech from text in 600+ languages. Clone a speaker from a short reference clip or design a new voice from attributes.
Point: 0 (0 users)
0
Fast Inference

humeai/tada-3b-ml

TADA is a unified speech-language model that synchronizes speech and text into a single, cohesive stream via 1:1 alignment. By leveraging a novel tokenizer and architectural design, TADA achieves high-fidelity synthesis and generation with a fraction of the computational overhead required by traditional models.
Point: 0 (0 users)
0
Fast Inference

fishaudio/s2-pro

Generates high-quality speech from text using advanced TTS technology with support for voice cloning and multi-speaker synthesis.
Point: 0 (0 users)
0
Fast Inference

nineninesix/kani-tts-2-en

Generates natural-sounding speech from text with support for multi-speaker voice cloning and fast inference capabilities.
Point: 0 (0 users)
0
Text to Speech

resemble-ai/chatterbox-turbo

The fastest open source TTS model without sacrificing quality.
Point: 0 (0 users)
0
Text to Speech

resemble-ai/chatterbox-multilingual

Generate expressive, natural speech in 23 languages. Features instant voice cloning from short audio, emotion control, and seamless cross-language voice transfer.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTSD

MOSS-TTSD is a production long-form dialogue model for expressive multi-speaker conversational audio at scale. It supports long-duration continuity, turn-taking control, and zero-shot voice cloning from short references for podcasts, audiobooks, commentary, dubbing, and entertainment dialogue.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTS-Realtime

Real-time streaming text-to-speech with zero-shot voice cloning. Supports 20 languages including English, Chinese, Japanese, Korean, and more. Audio starts playing immediately — no waiting for full generation. Clone any voice from a short reference clip.
Point: 0 (0 users)
0
Fast Inference

elevenlabs/Realtime Conversational AI

A real-time voice conversation tool using ElevenLabs' AI voice agents. Customize voices, behaviors, and languages for interactive AI experiences.
Point: 5 (1 user)
0
Fast Inference

openai/gpt-realtime-mini

GPT Mini Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 user)
0
Fast Inference

openai/gpt-realtime

GPT Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 user)
0
Fast Inference

Qwen/Qwen3-TTS-12Hz-1.7B

A fast inference text-to-speech model optimized for real-time audio generation with multi-language support.
Point: 0 (0 users)
0
Speech to Speech

nvidia/PersonaPlex-Realtime

Convert speech to speech with customizable voices using PersonaPlex. Supports various audio formats and offers control over text temperature and audio top K settings.
Point: 4 (3 users)
1
Fast Inference

microsoft/VibeVoice-Realtime

VibeVoice-Realtime is a lightweight real‑time text-to-speech model supporting streaming text input and robust long-form speech generation.
Point: 5 (1 user)
1
Text to Speech

openbmb/VoxCPM

Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Point: 0 (0 users)
0
Text to Speech

google/gemini-2.5-tts

Google's Gemini 2.5 Flash Text To Speech Preview model
Point: 5 (1 user)
5
Text to Speech

elevenlabs/text-to-speech

Text to speech model from ElevenLabs
Point: 5 (1 user)
0
Social Media & Viral

wiro/Faceless-Video-Generator

Create professional short videos (30s) for YouTube Shorts, Instagram Reels, TikTok, and X (Twitter) – from a single prompt. Automatically generate speech, captions, and optional talking head avatars using AI. Perfect for content creators, marketers, and educators looking to grow faster with less effort.
Point: 5 (1 user)
1
Speech to Speech

wiro/RVC-Voice-Clone-Youtube

Clone any voice and cover any YouTube song with it. This tool lets you apply a custom-trained RVC voice model to recreate any YouTube song in that voice.
Point: 5 (1 user)
0
Text to Speech

wiro/kokoro_tts

Kokoro is a high-speed text-to-speech model that delivers clean, accurate, and natural-sounding audio.
Point: 5 (2 users)
2
Generate Music
Music Generation

tencent-ailab/SongGeneration 2

Song Generation 2 generates full songs with vocals and instrumentals from lyrics — supports 14 genres, reference audio cloning, and separate track output (vocal/bgm/mix).
Point: 5 (3 users)
3
Video to Video

wiro/video-background-music-v2

It turns any video into a cinematic experience by generating AI-powered instrumental soundtracks that match its mood.
Point: 4 (1 user)
3
Music Generation

ACE-Step/text-to-song-ACE-Step1.5

ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer hardware.
Point: 5 (4 users)
2
Social Media & Viral

wiro/Song Frame

SongFrame places you into a cinematic world, pulls the soundtrack directly from your YouTube link, and fuses everything into a polished video — effortless, emotional, and instantly shareable.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Faceless-Video-Generator

Create professional short videos (30s) for YouTube Shorts, Instagram Reels, TikTok, and X (Twitter) – from a single prompt. Automatically generate speech, captions, and optional talking head avatars using AI. Perfect for content creators, marketers, and educators looking to grow faster with less effort.
Point: 5 (1 user)
1
Video to Video

wiro/video-background-music-gen

It’s a tool that takes your video, creates original music to match its vibe, and seamlessly adds it back as the soundtrack.
Point: 0 (0 users)
0
Music Generation

ACE-Step/image-to-song-ACE-Step-v1-3.5B

Generate high-quality songs from an image in seconds. Whether you're crafting instrumental tracks or full vocal compositions, bring your musical ideas to life with the power of AI. Ideal for artists, producers, and creative minds looking to turn inspiration into sound.
Point: 0 (0 users)
0
Music Generation

ACE-Step/text-to-song-ACE-Step-v1-3.5B

Generate high-quality songs from text prompts in seconds. Whether you're crafting instrumental tracks or full vocal compositions, bring your musical ideas to life with the power of AI. Ideal for artists, producers, and creative minds looking to turn inspiration into sound.
Point: 5 (1 user)
0
Music Generation

wiro/image-to-song-with-reference-YuE

Turn any song into your own with AI. Simply upload a reference track and provide your image — AI will recreate the song in your style, preserving the vocal style, instrumental feel, or both. Perfect for covers, parodies, remixes, or personalized creations.
Point: 0 (0 users)
0
Music Generation

wiro/image-to-song-YuE

Turn your image into a full song with vocals and instrumental music in seconds. Just upload your image and let AI compose and sing it for you. Perfect for creators, musicians, and storytellers.
Point: 0 (0 users)
0
Music Generation

wiro/text-to-song-with-reference-YuE

Turn any song into your own with AI. Simply upload a reference track and provide your custom lyrics — AI will recreate the song in your style, preserving the vocal style, instrumental feel, or both. Perfect for covers, parodies, remixes, or personalized creations.
Point: 0 (0 users)
0
Music Generation

wiro/text-to-song-YuE

Turn your lyrics into a full song with vocals and instrumental music in seconds. Just enter your lyrics and let AI compose and sing it for you. Perfect for creators, musicians, and storytellers.
Point: 0 (0 users)
0
Music Generation

wiro/music_gen

MusicGen is a text-to-music model capable of generating high-quality music samples.
Point: 0 (0 users)
0
Realtime Stream
Fast Inference

mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTS-Realtime

Real-time streaming text-to-speech with zero-shot voice cloning. Supports 20 languages including English, Chinese, Japanese, Korean, and more. Audio starts playing immediately — no waiting for full generation. Clone any voice from a short reference clip.
Point: 0 (0 users)
0
Fast Inference

elevenlabs/Realtime Conversational AI

A real-time voice conversation tool using ElevenLabs' AI voice agents. Customize voices, behaviors, and languages for interactive AI experiences.
Point: 5 (1 users)
0
Fast Inference

openai/gpt-realtime-mini

GPT Mini Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 users)
0
Fast Inference

openai/gpt-realtime

GPT Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 users)
0
Speech to Speech

nvidia/PersonaPlex-Realtime

Convert speech to speech with customizable voices using PersonaPlex. Supports various audio formats and offers control over text temperature and audio top K settings.
Point: 4 (3 users)
1
LLM & Chat
LLM

xai/grok-4-1-fast

Grok 4.1 Fast by xAI is a long-context chat model built for tool calling and agent workflows. It can analyze images and return grounded text answers with sources.
Point: 5 (5 users)
7
LLM

xai/grok-4-20

Grok 4.20 is xAI’s flagship text model with a 2M-token context window, tool calling, and structured JSON outputs. Attach images for vision Q&A and OCR.
Point: 5 (3 users)
5
Chat

Qwen/Qwen3.5-4B-heretic

Decensored Qwen3.5-4B checkpoint for long-context chat, coding, and analysis. Supports optional thinking traces and sampling controls for output style.
Point: 5 (4 users)
6
Chat

Qwen/Qwen3.5-9B-heretic

A large language model optimized for chat interactions and logical reasoning tasks. Designed for developers and researchers.
Point: 5 (2 users)
3
Chat

Qwen/Qwen3.5-4B

A compact yet capable LLM optimized for chat interactions and logical reasoning tasks. Designed for efficient deployment and accurate responses.
Point: 5 (7 users)
7
Chat

Qwen/Qwen3.5-9B

A dense 9-billion parameter language model optimized for chat and reasoning tasks. Designed for efficient deployment and high-quality responses.
Point: 5 (3 users)
4
Chat

Qwen/Qwen3.5-27B-heretic

A large language model optimized for chat interactions and reasoning tasks, designed for advanced users seeking powerful natural language processing capabilities.
Point: 5 (2 users)
3
LLM

bytedance/seed-v2-mini

A lightweight language model optimized for efficient inference and versatile applications in natural language processing tasks.
Point: 5 (3 users)
5
LLM

bytedance/seed-v2-lite

A lightweight text-to-image generation model optimized for visual content creation using prompts and optional images.
Point: 5 (2 users)
4
Chat

Qwen/Qwen3.5-27B

Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.
Point: 5 (1 users)
1
Partner LLM

google/gemini-3-pro

Gemini 3 Pro is Google's advanced AI model designed for complex reasoning and natural language understanding tasks.
Point: 4.83 (15 users)
10
Chat

zai-org/GLM-4.7-Flash

A chat-based language model optimized for conversational AI tasks with support for custom prompts and session management.
Point: 5 (1 users)
1
Partner LLM

openai/gpt-5-nano

A compact AI model optimized for efficient processing of complex prompts and multi-modal inputs with support for images and structured data.
Point: 4.75 (4 users)
4
Partner LLM

openai/gpt-5.2

AI model for processing text prompts and image inputs with advanced reasoning capabilities. Supports multi-modal inputs and customizable responses.
Point: 5 (1 users)
1
Partner LLM

openai/gpt-5-mini

A compact AI model optimized for text generation tasks, designed for efficient processing and accurate responses.
Point: 0 (0 users)
0
Partner LLM

google/gemini-3-flash

gemini-3-flash
Point: 5 (3 users)
3
Partner LLM

google/gemini-2-5-flash

gemini-2-5-flash
Point: 3 (1 users)
0
Chat

wiro/rag-chat-github

Instantly retrieve and analyze content from any GitHub repository. Select your LLM model, extract relevant information from codebases or documentation, and generate context-aware responses with ease!
Point: 0 (0 users)
0
Chat

wiro/rag-chat-youtube

Extract insights directly from YouTube videos by simply providing a URL. Choose your LLM model, access video transcripts or summaries, and create contextually rich conversations effortlessly!
Point: 5 (1 users)
2
Chat

wiro/rag-chat-website

Instantly retrieve and analyze content from any website URL. Select your LLM model, fetch key information from the page, and generate context-aware responses with ease!
Point: 0 (0 users)
0
AI Models for E-commerce
Social Media & Viral

wiro/ugc creator

Generates custom video content from product images and text for marketing campaigns.
Point: 5 (3 users)
1
Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.
Point: 2 (4 users)
1
Social Media & Viral

wiro/Product Studio

Generate 360° product videos with AI-powered effects. Transform product photos into engaging video content for e-commerce.
Point: 5 (3 users)
0
Social Media & Viral

wiro/Product with Model

Generates dynamic product videos with models in various scenes using image inputs. Designed for fast inference and e-commerce showcases.
Point: 4.8 (5 users)
3
Social Media & Viral

wiro/Virtual Try-On-V2

Generate realistic virtual try-on images and videos for fashion products using AI. Supports multiple garment uploads and photography styles.
Point: 4 (1 users)
0
Social Media & Viral

wiro/Animated Logo

Transform static logos into stunning animated videos with 36+ creative presets. Choose from scenes like Times Square billboards, Parisian storefronts, coffee art, neon signs, and luxury showcases.
Point: 5 (1 users)
1
Social Media & Viral

wiro/3D Text Animations

Create stunning 3D animated text videos with 22+ creative presets. Transform any text into balloon letters, neon signs, candy typography, cloud formations, and cinematic motion effects.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads with Caption

Combine product images with custom captions into stunning animated video ads. 42 creative presets featuring sales promotions, seasonal themes, and dynamic text animations.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads with Logo

Combine product images with logos into stunning animated video ads. 12 creative presets featuring storefronts, billboards, city banners, and surreal brand presentations.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads

Transform product images into stunning animated video ads with 100+ creative presets. Choose from effects like water splashes, scene transitions, surreal staging, and seasonal themes.
Point: 5 (3 users)
0
Ecommerce

wiro/camera-angle-editor

wiro/camera-angle-editor is an advanced AI tool that instantly changes the camera perspective and angle of any existing image. Leveraging sophisticated spatial reconstruction, it eliminates the need for reshoots by synthesizing photorealistic new viewpoints, making it the fastest way for creators to maximize the versatility of their visual content.
Point: 0 (0 users)
0
Ecommerce

wiro/Product Photoshoot

Save time and production costs with AI Product Photoshoot. Generate polished product images featuring adaptive lighting, varied angles, and contextual scenes. Ideal for online stores, marketing teams, and agencies looking to accelerate content creation with consistent, high-quality visuals.
Point: 5 (2 users)
2
Social Media & Viral

wiro/Virtual Try-On

Integrate the Wiro Virtual Try-On API to deliver hyper-realistic apparel fitting directly in your web, mobile, or SaaS platform. Generate lifelike visuals of users wearing new garments with precise texture mapping, pose alignment, and fabric simulation — ideal for online retail and fashion tech solutions.
Point: 5 (3 users)
1
Ecommerce

wiro/text-removal

This AI model intelligently removes unwanted text from any image, seamlessly filling in the background.
Point: 0 (0 users)
0
Ecommerce

wiro/remove-background

AI-powered background removal tool that automatically removes image backgrounds. Perfect for e-commerce product photos and quick image editing.
Point: 4.5 (1 users)
0
AI Models for Social Media Creators
Social Media & Viral

wiro/Instagram Pose Multi

Generate stylish Instagram-style pose images with trendy angles, natural expressions, and a modern aesthetic. Built by Wiro for social content.
Point: 4.8 (11 users)
8
Social Media & Viral

wiro/ugc creator

Generates custom video content from product images and text for marketing campaigns.
Point: 5 (3 users)
1
Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.
Point: 2 (4 users)
1
Social Media & Viral

wiro/Product Studio

Generate 360° product videos with AI-powered effects. Transform product photos into engaging video content for e-commerce.
Point: 5 (3 users)
0
Social Media & Viral

wiro/Product with Model

Generates dynamic product videos with models in various scenes using image inputs. Designed for fast inference and e-commerce showcases.
Point: 4.8 (5 users)
3
Social Media & Viral

wiro/BabyDanceFlow

Transform any character image into a moving video. Provide a reference image and select a video effect.
Point: 3.67 (3 users)
0
Social Media & Viral

wiro/Virtual Try-On-V2

Generate realistic virtual try-on images and videos for fashion products using AI. Supports multiple garment uploads and photography styles.
Point: 4 (1 users)
0
Social Media & Viral

wiro/MotionFlow

Transform any character image into a moving video. Provide a reference image and select a video effect.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Animated Logo

Transform static logos into stunning animated videos with 36+ creative presets. Choose from scenes like Times Square billboards, Parisian storefronts, coffee art, neon signs, and luxury showcases.
Point: 5 (1 users)
1
Social Media & Viral

wiro/3D Text Animations

Create stunning 3D animated text videos with 22+ creative presets. Transform any text into balloon letters, neon signs, candy typography, cloud formations, and cinematic motion effects.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads with Caption

Combine product images with custom captions into stunning animated video ads. 42 creative presets featuring sales promotions, seasonal themes, and dynamic text animations.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads with Logo

Combine product images with logos into stunning animated video ads. 12 creative presets featuring storefronts, billboards, city banners, and surreal brand presentations.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads

Transform product images into stunning animated video ads with 100+ creative presets. Choose from effects like water splashes, scene transitions, surreal staging, and seasonal themes.
Point: 5 (3 users)
0
Social Media & Viral

wiro/Song Frame

SongFrame places you into a cinematic world, pulls the soundtrack directly from your YouTube link, and fuses everything into a polished video — effortless, emotional, and instantly shareable.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Dance Flow

Make anyone dance - turn photos into lively, rhythm-synced dance videos in one seamless flow.
Point: 5 (3 users)
1
Social Media & Viral

wiro/Instagram Pose

Generate stylish Instagram-style poses with trendy angles, natural expressions, and a modern aesthetic.
Point: 5 (6 users)
3
Social Media & Viral

wiro/Virtual Try-On

Integrate the Wiro Virtual Try-On API to deliver hyper-realistic apparel fitting directly in your web, mobile, or SaaS platform. Generate lifelike visuals of users wearing new garments with precise texture mapping, pose alignment, and fabric simulation — ideal for online retail and fashion tech solutions.
Point: 5 (3 users)
1
Social Media & Viral

wiro/AvatarMotion-Multi

Generate avatars from photos and animate them into engaging videos in one seamless pipeline.
Point: 4.63 (8 users)
5
Social Media & Viral

wiro/polaroid-effect

Capture life’s spontaneous moments with the dreamy blur and iconic flash of a Polaroid—where every snap feels like a timeless memory in the making.
Point: 5 (4 users)
4
Social Media & Viral

wiro/wan2.2-effects-extra

Generate avatars from photos and animate them into engaging videos in one seamless pipeline.
Point: 5 (2 users)
1
Wiro AI makes machine learning easily accessible to all in the cloud.

2026 © Wiro.ai | Terms of Service & Privacy Policy