## Basic tool info

Tool name: wiro/chat

Tool description: Wiro LLM conversational chat script. Runs any supported LLM through a single tool.

Tool cover: https://cdn.wiro.ai/uploads/models/wiro-chat-cover.jpg

Tool categories:
- llm
- persistent
- tool
- chat

Tool tags:
- conversational
- text-generation-inference
- gemma2-chat
- text generation
- wiro-chat
- question answer
- chat
- llama2-chat
- llama3-chat

Run Task Endpoint (POST): https://api.wiro.ai/v1/Run/wiro/chat

Get Task Detail Endpoint (POST): https://api.wiro.ai/v1/Task/Detail

## Tool Inputs:

- name: selectedModel
  label: select-model
  help: select-model-help
  type: select
  default: "617"
  options:
    - value: "1141"
      label: AlicanKiraz0/SenecaLLM-x-QwQ-32B-Q8_Max-Version
      description: Mixed-dataset security LLM covering Information Security v1.5, Incident Response v1.3.1, Threat Hunting v1.3.2, Ethical Exploit Development v2.0, Purple Team Tactics v1.3, and Reverse Engineering v2.0.
      triggerwords: []
      generatesettings: []
    - value: "757"
      label: Qwen/Qwen2.5-14B-Instruct
      description: Qwen2.5-14B-Instruct is a large language model by Alibaba’s Qwen team, optimized for instruction-following tasks with 14 billion parameters. It offers strong reasoning, multilingual capabilities, and efficient performance, making it suitable for chatbots, content creation, and various AI-driven applications.
      triggerwords: []
      generatesettings: []
    - value: "756"
      label: Qwen/Qwen2.5-32B-Instruct
      description: Qwen2.5-32B-Instruct is a powerful large language model developed by Alibaba’s Qwen team, designed for instruction-following tasks with enhanced reasoning and natural language understanding capabilities. Optimized for efficiency and accuracy, it supports multi-turn conversations and complex queries, making it suitable for applications such as chatbots, content generation, and AI assistants.
      triggerwords: []
      generatesettings: []
    - value: "743"
      label: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
      description: DeepSeek-R1-Distill-Qwen-32B is a distilled version of the DeepSeek-R1 model with 32 billion parameters, leveraging the Qwen architecture to deliver high-quality language understanding and generation. Through knowledge distillation, it retains the strengths of larger models while offering improved efficiency and reduced computational requirements. This model is ideal for large-scale AI applications that demand robust performance with optimized resource utilization.
      triggerwords: []
      generatesettings: []
    - value: "742"
      label: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
      description: DeepSeek-R1-Distill-Qwen-14B is a distilled version of the DeepSeek-R1 model, featuring 14 billion parameters. Built on the Qwen architecture, it utilizes advanced knowledge distillation techniques to achieve a balance between high performance and computational efficiency. This model is well-suited for a wide range of natural language processing tasks, providing accurate and context-aware responses while optimizing resource consumption for deployment in production environments.
      triggerwords: []
      generatesettings: []
    - value: "741"
      label: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
      description: DeepSeek-R1-Distill-Llama-8B is a distilled version of the DeepSeek-R1 model based on the LLaMA architecture, featuring 8 billion parameters. It offers a balance between performance and efficiency by leveraging knowledge distillation techniques to reduce computational costs while maintaining high-quality language processing capabilities. This model is ideal for applications that require powerful text generation and understanding with optimized resource usage.
      triggerwords: []
      generatesettings: []
    - value: "740"
      label: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
      description: DeepSeek-R1-Distill-Qwen-7B is a distilled version of the DeepSeek-R1 model with 7 billion parameters, designed to provide high-quality language understanding while optimizing efficiency. Leveraging advanced knowledge distillation techniques, it retains the core capabilities of larger models with improved speed and lower resource consumption. This model is well-suited for tasks requiring robust natural language processing while maintaining cost-effective deployment.
      triggerwords: []
      generatesettings: []
    - value: "739"
      label: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
      description: DeepSeek-R1-Distill-Qwen-1.5B is a distilled version of the DeepSeek-R1 model, designed to offer a balance between efficiency and performance. With 1.5 billion parameters, it leverages knowledge distillation techniques to retain the capabilities of larger models while optimizing for faster inference and reduced resource consumption. Ideal for applications requiring high-quality language understanding with lower computational overhead.
      triggerwords: []
      generatesettings: []
    - value: "735"
      label: utter-project/EuroLLM-9B-Instruct
      description: utter-project/EuroLLM-9B-Instruct is a 9 billion parameter multilingual AI model developed to understand and generate text across all European Union languages and additional relevant languages. It has been trained on a vast dataset of 4 trillion tokens and further fine-tuned on EuroBlocks to improve its performance in instruction-following and machine translation tasks.
      triggerwords: []
      generatesettings: []
    - value: "734"
      label: utter-project/EuroLLM-1.7B-Instruct
      description: utter-project/EuroLLM-1.7B-Instruct is a 1.7 billion parameter multilingual AI model designed to understand and generate text in all European Union languages and additional relevant languages. It has been trained on 4 trillion tokens from diverse sources and further instruction-tuned on EuroBlocks to enhance its capabilities in general instruction-following and machine translation.
      triggerwords: []
      generatesettings: []
    - value: "730"
      label: m42-health/Llama3-Med42-8B
      description: m42-health/Llama3-Med42-8B is an open-access clinical large language model fine-tuned by M42, based on LLaMA-3, with 8 billion parameters. It is designed to provide high-quality, reliable answers to medical queries, enhancing access to medical knowledge.
      triggerwords: []
      generatesettings: []
    - value: "728"
      label: meta-llama/Llama-3.2-3B-Instruct
      description: The Llama 3.2 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
      triggerwords: []
      generatesettings: []
    - value: "726"
      label: meta-llama/CodeLlama-34b-Instruct-hf
      description: meta-llama/CodeLlama-34B-Instruct-hf is a 34 billion parameter instruction-tuned AI model developed by Meta, designed to provide advanced code generation, completion, and understanding capabilities across various programming languages, offering high accuracy and efficiency for complex coding tasks.
      triggerwords: []
      generatesettings: []
    - value: "725"
      label: meta-llama/CodeLlama-7b-Instruct-hf
      description: meta-llama/CodeLlama-7B-Instruct-hf is a 7 billion parameter instruction-tuned AI model developed by Meta, designed to assist with code generation, completion, and understanding across multiple programming languages with high efficiency and accuracy.
      triggerwords: []
      generatesettings: []
    - value: "720"
      label: internlm/internlm3-8b-instruct
      description: InternLM3 has open-sourced an 8-billion parameter instruction model, InternLM3-8B-Instruct, designed for general-purpose usage and advanced reasoning.
      triggerwords: []
      generatesettings: []
    - value: "719"
      label: CohereForAI/aya-expanse-8b
      description: Aya Expanse 8B is an open-weight research release of a model with highly advanced multilingual capabilities. It focuses on pairing a highly performant pre-trained Command family of models with the result of a year’s dedicated research from Cohere For AI, including data arbitrage, multilingual preference training, safety tuning, and model merging. The result is a powerful multilingual large language model.
      triggerwords: []
      generatesettings: []
    - value: "717"
      label: mistralai/Mistral-Nemo-Instruct-2407
      description: The Mistral-Nemo-Instruct-2407 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-Nemo-Base-2407. Trained jointly by Mistral AI and NVIDIA, it significantly outperforms existing models smaller or similar in size.
      triggerwords: []
      generatesettings: []
    - value: "716"
      label: HuggingFaceTB/SmolLM2-1.7B-Instruct
      description: HuggingFaceTB/SmolLM2-1.7B-Instruct is a 1.7 billion parameter instruction-tuned AI language model designed to provide efficient and accurate responses for a wide range of tasks while maintaining a lightweight and accessible architecture.
      triggerwords: []
      generatesettings: []
    - value: "714"
      label: mistralai/Mathstral-7B-v0.1
      description: mistralai/Mathstral-7B-v0.1 is a 7 billion parameter AI model optimized for mathematical reasoning and problem-solving, designed to deliver precise and efficient solutions across a variety of mathematical domains.
      triggerwords: []
      generatesettings: []
    - value: "713"
      label: deepseek-ai/deepseek-math-7b-instruct
      description: deepseek-ai/deepseek-math-7b-instruct is a 7 billion parameter AI model specifically designed to excel in mathematical reasoning and problem-solving tasks, providing accurate and detailed step-by-step solutions.
      triggerwords: []
      generatesettings: []
    - value: "712"
      label: microsoft/Phi-3.5-mini-instruct
      description: microsoft/Phi-3.5-mini-instruct is a lightweight AI language model optimized for instruction-based tasks, offering efficient performance and high-quality responses with a compact architecture.
      triggerwords: []
      generatesettings: []
    - value: "711"
      label: Qwen/Qwen2.5-3B-Instruct
      description: Qwen/Qwen2.5-3B-Instruct is a powerful 3 billion parameter AI language model designed to deliver high-quality, human-like responses across a wide range of natural language processing tasks.
      triggerwords: []
      generatesettings: []
    - value: "710"
      label: Qwen/Qwen2.5-0.5B-Instruct
      description: Qwen/Qwen2.5-0.5B-Instruct is a compact 500 million parameter AI language model optimized for generating human-like responses and assisting with various natural language understanding tasks.
      triggerwords: []
      generatesettings: []
    - value: "709"
      label: Qwen/Qwen2.5-1.5B-Instruct
      description: Qwen/Qwen2.5-1.5B-Instruct is a 1.5 billion parameter AI language model designed to generate human-like responses based on user inputs.
      triggerwords: []
      generatesettings: []
    - value: "692"
      label: Qwen/Qwen2-7B-Instruct
      description: Qwen2-7B-Instruct is a versatile large language model with 7 billion parameters, optimized for instruction-following tasks. It delivers high performance in natural language understanding, content generation, and complex reasoning, making it ideal for a wide range of AI-driven applications, from chatbots to educational tools.
      triggerwords: []
      generatesettings: []
    - value: "691"
      label: Qwen/Qwen2.5-Math-7B-Instruct
      description: Qwen2.5-Math-7B-Instruct is a large language model with 7 billion parameters, specialized in advanced mathematical problem-solving and reasoning. It is fine-tuned for instruction-based tasks and excels in fields like algebra, calculus, and quantitative analysis, making it suitable for research, education, and technical applications.
      triggerwords: []
      generatesettings: []
    - value: "690"
      label: Qwen/Qwen2.5-Math-1.5B-Instruct
      description: Qwen2.5-Math-1.5B-Instruct is a math-specialized large language model with 1.5 billion parameters, fine-tuned for solving complex mathematical problems and handling instruction-based tasks. It excels in areas like algebra, calculus, and numerical reasoning, making it ideal for educational tools and research applications.
      triggerwords: []
      generatesettings: []
    - value: "689"
      label: Qwen/Qwen2.5-Coder-7B-Instruct
      description: Qwen2.5-Coder-7B-Instruct is a specialized large language model with 7 billion parameters, designed for code understanding and generation. Fine-tuned for instruction-based tasks, it supports multiple programming languages, making it ideal for coding assistance, code completion, and educational applications.
      triggerwords: []
      generatesettings: []
    - value: "688"
      label: Qwen/Qwen1.5-0.5B-Chat
      description: Qwen1.5-0.5B-Chat is a lightweight conversational AI model with 500 million parameters, designed for efficient and interactive chat-based applications. Despite its smaller size, it offers fast performance and reliable natural language understanding, making it ideal for real-time AI interactions on resource-constrained systems.
      triggerwords: []
      generatesettings: []
    - value: "686"
      label: microsoft/phi-4
      description: Phi-4 by Microsoft is an advanced large language model designed for precise natural language understanding and generation. It offers robust performance in complex conversational AI and NLP tasks, making it well-suited for enterprise and research applications.
      triggerwords: []
      generatesettings: []
    - value: "685"
      label: Qwen/Qwen2.5-Coder-32B-Instruct
      description: Qwen2.5-Coder-32B-Instruct is a powerful code-focused large language model with 32 billion parameters, fine-tuned for coding and instruction-based tasks. It excels in code generation, code completion, and understanding complex programming instructions across multiple languages.
      triggerwords: []
      generatesettings: []
    - value: "684"
      label: google/gemma-2-2b-it
      description: Gemma-2-2B is a compact large language model by Google, featuring 2 billion parameters and optimized for instruction-based and multilingual tasks. It provides efficient natural language processing, making it suitable for a variety of AI-driven applications with lower computational requirements.
      triggerwords: []
      generatesettings: []
    - value: "683"
      label: google/gemma-2-9b-it
      description: Gemma-2-9B is an advanced large language model by Google, featuring 9 billion parameters and designed for multilingual and instruction-following tasks. It delivers high-quality performance in complex natural language understanding and generation across various domains.
      triggerwords: []
      generatesettings: []
    - value: "682"
      label: meta-llama/Meta-Llama-3-8B-Instruct
      description: Meta-Llama-3-8B-Instruct is a state-of-the-art large language model with 8 billion parameters, optimized for instruction-based tasks. It excels in understanding and generating natural language, making it ideal for a wide range of AI-driven applications.
      triggerwords: []
      generatesettings: []
    - value: "681"
      label: mistralai/Mistral-7B-Instruct-v0.3
      description: Mistral-7B-Instruct-v0.3 is a highly efficient large language model with 7 billion parameters, fine-tuned for instruction-following tasks. It delivers strong performance across a range of applications, offering reliable responses and enhanced task comprehension.
      triggerwords: []
      generatesettings: []
    - value: "680"
      label: meta-llama/Llama-3.1-8B-Instruct
      description: Llama-3.1-8B-Instruct is an advanced large language model designed for instruction-based tasks and fine-tuned for improved user interaction. With 8 billion parameters, it offers enhanced accuracy and versatility across various AI applications.
      triggerwords: []
      generatesettings: []
    - value: "679"
      label: Qwen/Qwen2.5-7B-Instruct
      description: Qwen2.5 is the large language model series of Qwen, offering advanced natural language understanding and generation capabilities. It is designed to handle complex tasks with high accuracy, making it ideal for diverse AI applications.
      triggerwords: []
      generatesettings: []
    - value: "676"
      label: wiro/wiroai-turkish-llm-8b
      description: Meet Wiro AI's Turkish large language model (LLM): a robust model with enhanced support for the Turkish language and culture.
      triggerwords: []
      generatesettings: []
    - value: "675"
      label: wiro/wiroai-turkish-llm-9b
      description: Meet Wiro AI's Turkish large language model (LLM): a robust model with enhanced support for the Turkish language and culture.
      triggerwords: []
      generatesettings: []
    - value: "617"
      label: meta-llama/Llama-2-7b-chat-hf
      description:
      triggerwords: []
      generatesettings: []
- name: selectedModelPrivate
  label: select-model-private
  help: select-model-private-help
  type: select
  default: ""
  options:
- name: prompt
  label: prompt
  help: Prompt to send to the model.
  type: textarea
  default:
- name: user_id
  label: user_id
  help: You can leave it blank. The user_id parameter is a unique identifier for the user. It is used to store and retrieve the chat history specific to that user. You should provide a value that uniquely identifies the user across different sessions. For example, it can be the user’s email address, username, or a system-generated ID.
  type: text
  default:
- name: session_id
  label: session_id
  help: You can leave it blank. The session_id parameter represents a specific session for a user. It allows you to manage multiple sessions for the same user. If you want to maintain separate chat histories for different sessions of the same user, use a unique session_id for each session. If not specified or kept the same, the system will treat all interactions as part of the same session. (See the sketch after this input list.)
  type: text
  default:
- name: system_prompt
  label: system_prompt
  help: System prompt to send to the model. This is prepended to the prompt and helps guide the model's behavior.
  type: textarea
  default: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
- name: temperature
  label: temperature
  help: Adjusts the randomness of outputs; values greater than 1 are more random, 0 is deterministic, and 0.75 is a good starting value.
  type: float
  default: 0.7
- name: top_p
  label: top_p
  help: When decoding text, samples from the top p fraction of the most likely tokens; lower it to ignore less likely tokens.
  type: float
  default: 0.95
- name: top_k
  label: top_k
  help: When decoding text, samples from the top k most likely tokens; lower it to ignore less likely tokens.
  type: number
  default: 0
- name: repetition_penalty
  label: repetition_penalty
  help: A hyperparameter that reduces the likelihood of repetitive text by applying a penalty to previously generated tokens, encouraging more diverse and coherent output.
  type: float
  default: 1.0
- name: length_penalty
  label: length_penalty
  help: A parameter that controls how long the outputs are. Values < 1 make the model tend to generate shorter outputs; values > 1 tend to produce longer outputs.
  type: float
  default: 1
- name: max_tokens
  label: max_tokens
  help: Maximum number of tokens to generate. Use 0 to apply the model's own maximum limit. A word is generally 2-3 tokens.
  type: number
  default: 0
- name: min_tokens
  label: min_tokens
  help: Minimum number of tokens to generate. To disable, set to -1. A word is generally 2-3 tokens.
  type: number
  default: 0
- name: max_new_tokens
  label: max_new_tokens
  help: Use 0 to set a dynamic response limit. This parameter has been renamed to max_tokens; max_new_tokens exists only for backwards compatibility, and we recommend using max_tokens instead. Both may not be specified at once.
  type: number
  default: 0
- name: min_new_tokens
  label: min_new_tokens
  help: This parameter has been renamed to min_tokens; min_new_tokens exists only for backwards compatibility, and we recommend using min_tokens instead. Both may not be specified at once.
  type: number
  default: -1
- name: stop_sequences
  label: stop_sequences
  help: A semicolon-separated list of sequences to stop generation at. For example, 'end;stop' will stop generation at the first instance of 'end' or 'stop'.
  type: text
  default:
- name: seed
  label: seed
  help: seed-help
  type: text
  default: 123456
- name: quantization
  label: quantization
  help: Quantization is a technique that reduces the precision of model weights (e.g., from FP32 to INT8) to decrease memory usage and improve inference speed. When enabled (true), the model uses less VRAM, making it suitable for resource-constrained environments, but output quality might be slightly affected. When disabled (false), the model runs at full precision, ensuring maximum accuracy but requiring more GPU memory and running slower.
  type: checkbox
  default: --quantization
- name: do_sample
  label: do_sample
  help: The do_sample parameter controls whether the model generates text deterministically or with randomness. For precise tasks like translation or code generation, set do_sample = false to ensure consistent and predictable outputs. For creative tasks like storytelling or poetry, set do_sample = true to allow the model to produce diverse and imaginative results.
  type: checkbox
  default: --do_sample
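Together, user_id and session_id select which stored chat history the model sees. As a minimal sketch (using the authentication headers prepared in the Integration Header section below; the identifier values, and the omission of the other inputs, are illustrative assumptions), two requests that share a user_id but differ in session_id keep separate histories:

```bash
# Same user, two independent conversations (illustrative identifier values).
# Assumes YOUR_API_KEY, NONCE, and SIGNATURE are set as shown in the next section.
curl -X POST "https://api.wiro.ai/v1/Run/wiro/chat" \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${YOUR_API_KEY}" -H "x-nonce: ${NONCE}" -H "x-signature: ${SIGNATURE}" \
  -d '{"selectedModel": "617", "prompt": "Hi, my name is Ada.", "user_id": "ada@example.com", "session_id": "session-a"}'

# A different session_id starts a fresh history; the model will not see "Ada" here.
curl -X POST "https://api.wiro.ai/v1/Run/wiro/chat" \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${YOUR_API_KEY}" -H "x-nonce: ${NONCE}" -H "x-signature: ${SIGNATURE}" \
  -d '{"selectedModel": "617", "prompt": "What is my name?", "user_id": "ada@example.com", "session_id": "session-b"}'
```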
## Integration Header Prepare

```bash
# Sign up on the Wiro dashboard and create a project
export YOUR_API_KEY="{{useSelectedProjectAPIKey}}";
export YOUR_API_SECRET="XXXXXXXXX";

# Unix time or any random integer value
export NONCE=$(date +%s);

# HMAC-SHA256 of (YOUR_API_SECRET + NONCE), keyed with YOUR_API_KEY
export SIGNATURE="$(echo -n "${YOUR_API_SECRET}${NONCE}" | openssl dgst -sha256 -hmac "${YOUR_API_KEY}")";
```
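Note that `openssl dgst` prefixes its output with the digest name (for example `SHA2-256(stdin)= <hex>` on OpenSSL 3.x), so the command above stores that whole string in SIGNATURE. If the API expects only the hex digest, which is an assumption worth verifying for your project, strip the prefix:

```bash
# Variant that keeps only the hex digest; whether Wiro expects the bare hex
# value (rather than the full openssl output) is an assumption to verify.
export SIGNATURE="$(echo -n "${YOUR_API_SECRET}${NONCE}" | openssl dgst -sha256 -hmac "${YOUR_API_KEY}" | sed 's/^.*= //')";
```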
## Run Command - Make HTTP Post Request

```bash
curl -X POST "https://api.wiro.ai/v1/Run/wiro/chat" \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${YOUR_API_KEY}" \
  -H "x-nonce: ${NONCE}" \
  -H "x-signature: ${SIGNATURE}" \
  -d '{
    "selectedModel": "",
    "selectedModelPrivate": "",
    "prompt": "",
    "user_id": "",
    "session_id": "",
    "system_prompt": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. \nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don'\''t know the answer to a question, please don'\''t share false information.",
    "temperature": "0.7",
    "top_p": "0.95",
    "top_k": 0,
    "repetition_penalty": "1.0",
    "length_penalty": "1",
    "max_tokens": 0,
    "min_tokens": 0,
    "max_new_tokens": 0,
    "min_new_tokens": -1,
    "stop_sequences": "",
    "seed": "123456",
    "quantization": "--quantization",
    "do_sample": "--do_sample",
    "callbackUrl": "You can provide a callback URL; Wiro will send a POST request to it when the task is completed."
  }';
```

## Run Command - Response

```json
{
  "errors": [],
  "taskid": "2221",
  "socketaccesstoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt",
  "result": true
}
```
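The `socketaccesstoken` in this response is the token used as `tasktoken` when polling task status below, and `taskid` identifies the task. A minimal sketch of capturing both in the shell, assuming `jq` is installed and that the other inputs may be omitted from the payload:

```bash
# Submit the task and keep the identifiers needed for polling (requires jq).
RESPONSE=$(curl -s -X POST "https://api.wiro.ai/v1/Run/wiro/chat" \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${YOUR_API_KEY}" -H "x-nonce: ${NONCE}" -H "x-signature: ${SIGNATURE}" \
  -d '{"selectedModel": "617", "prompt": "Hello!"}')
TASKTOKEN=$(echo "${RESPONSE}" | jq -r '.socketaccesstoken')
TASKID=$(echo "${RESPONSE}" | jq -r '.taskid')
```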
## Get Task Detail - Make HTTP Post Request

```bash
curl -X POST "https://api.wiro.ai/v1/Task/Detail" \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${YOUR_API_KEY}" \
  -H "x-nonce: ${NONCE}" \
  -H "x-signature: ${SIGNATURE}" \
  -d '{
    "tasktoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt"
  }';
```

## Get Task Detail - Response

```json
{
  "total": "1",
  "errors": [],
  "tasklist": [
    {
      "id": "2221",
      "uuid": "15bce51f-442f-4f44-a71d-13c6374a62bd",
      "socketaccesstoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt",
      "parameters": {},
      "debugoutput": "",
      "debugerror": "",
      "starttime": "1734513809",
      "endtime": "1734513813",
      "elapsedseconds": "6.0000",
      "status": "task_postprocess_end",
      "createtime": "1734513807",
      "canceltime": "0",
      "assigntime": "1734513807",
      "accepttime": "1734513807",
      "preprocessstarttime": "1734513807",
      "preprocessendtime": "1734513807",
      "postprocessstarttime": "1734513813",
      "postprocessendtime": "1734513814",
      "outputs": [
        {
          "id": "6bc392c93856dfce3a7d1b4261e15af3",
          "name": "0.png",
          "contenttype": "image/png",
          "parentid": "6c1833f39da71e6175bf292b18779baf",
          "uuid": "15bce51f-442f-4f44-a71d-13c6374a62bd",
          "size": "202472",
          "addedtime": "1734513812",
          "modifiedtime": "1734513812",
          "accesskey": "dFKlMApaSgMeHKsJyaDeKrefcHahUK",
          "url": "https://cdn1.wiro.ai/6a6af820-c5050aee-40bd7b83-a2e186c6-7f61f7da-3894e49c-fc0eeb66-9b500fe2/0.png"
        }
      ],
      "size": "202472"
    }
  ],
  "result": true
}
```

## Task Status Information

This section defines the possible task status values returned by the API when polling for task completion.

### Completed Task Statuses (Polling can stop)

These indicate that the task has reached a terminal state, either success or failure. Once any of these is received, polling should stop.

- task_postprocess_end : Task completed successfully and post-processing is done.
- task_cancel : Task was cancelled by the user or system.

### Running Task Statuses (Continue polling)

These statuses indicate that the task is still in progress. Polling should continue while one of these is returned.

- task_queue : Task is waiting in the queue.
- task_accept : Task has been accepted for processing.
- task_assign : Task is being assigned to a worker.
- task_preprocess_start : Preprocessing is starting.
- task_preprocess_end : Preprocessing is complete.
- task_start : Task execution has started.
- task_output : Output is being generated.
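Putting the pieces together, here is a minimal polling sketch in the shell. It assumes `jq` is installed and reuses the headers prepared earlier; the 5-second interval is an arbitrary choice, and if the API requires a fresh nonce and signature per request (the examples above compute them once), regenerate NONCE and SIGNATURE inside the loop.

```bash
# Poll Task/Detail until the task reaches a terminal status (requires jq).
TASKTOKEN="eDcCm5yyUfIvMFspTwww49OUfgXkQt"  # socketaccesstoken from the Run response

while true; do
  STATUS=$(curl -s -X POST "https://api.wiro.ai/v1/Task/Detail" \
    -H "Content-Type: application/json" \
    -H "x-api-key: ${YOUR_API_KEY}" \
    -H "x-nonce: ${NONCE}" \
    -H "x-signature: ${SIGNATURE}" \
    -d "{\"tasktoken\": \"${TASKTOKEN}\"}" | jq -r '.tasklist[0].status')
  echo "status: ${STATUS}"

  case "${STATUS}" in
    task_postprocess_end|task_cancel) break ;;  # terminal state: stop polling
    *) sleep 5 ;;                               # still running: wait and retry
  esac
done
```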