## Basic tool info

Tool name: wiro/rag-chat-youtube
Tool description: Extract insights directly from YouTube videos by simply providing a URL. Choose your LLM model, access video transcripts or summaries, and create contextually rich conversations effortlessly!
Tool cover: https://cdn.wiro.ai/uploads/models/wiro-rag-chat-youtube-cover.png
Tool categories:
  - llm
  - persistent
  - tool
  - chat
  - rag
  - youtube

Tool tags:
  - conversational
  - text-generation-inference
  - gemma2-chat
  - text generation
  - wiro-chat
  - question answer
  - chat
  - llama2-chat
  - llama3-chat
  - rag
  - ask file
  - ask document
  - document chat
  - pdf chat

Run Task Endpoint (POST):
https://api.wiro.ai/v1/Run/wiro/rag-chat-youtube

Get Task Detail Endpoint (POST):
https://api.wiro.ai/v1/Task/Detail

## Tool Inputs: 

  - name: selectedModel
    label: select-model
    help: select-model-help
    type: select
    default: 
    options:
      - value: "679"
        label: Qwen/Qwen2.5-7B-Instruct
        description: Qwen 2.5 is the large language model of Qwen, offering advanced natural language understanding and generation capabilities. It’s designed to handle complex tasks with high accuracy, making it ideal for diverse AI applications.
        triggerwords: []
        generatesettings: []

      - value: "730"
        label: m42-health/Llama3-Med42-8B
        description: m42-health/Llama3-Med42-8B is an open-access clinical large language model fine-tuned by M42, based on LLaMA-3, with 8 billion parameters. It is designed to provide high-quality, reliable answers to medical queries, enhancing access to medical knowledge.
        triggerwords: []
        generatesettings: []

      - value: "719"
        label: CohereForAI/aya-expanse-8b
        description: Aya Expanse 8B is an open-weight research release of a model with highly advanced multilingual capabilities. It focuses on pairing a highly performant pre-trained Command family of models with the result of a year’s dedicated research from Cohere For AI, including data arbitrage, multilingual preference training, safety tuning, and model merging. The result is a powerful multilingual large language model.
        triggerwords: []
        generatesettings: []

      - value: "714"
        label: mistralai/Mathstral-7B-v0.1
        description: mistralai/Mathstral-7B-v0.1 is a 7 billion parameter AI model optimized for mathematical reasoning and problem-solving, designed to deliver precise and efficient solutions across a variety of mathematical domains.
        triggerwords: []
        generatesettings: []

      - value: "690"
        label: Qwen/Qwen2.5-Math-1.5B-Instruct
        description: Qwen2.5-Math-1.5B-Instruct is a math-specialized large language model with 1.5 billion parameters, fine-tuned for solving complex mathematical problems and handling instruction-based tasks. It excels in areas like algebra, calculus, and numerical reasoning, making it ideal for educational tools and research applications.
        triggerwords: []
        generatesettings: []

  - name: selectedModelPrivate
    label: select-model-private
    help: select-model-private-help
    type: select
    default: 
    options:


  - name: websiteUrl
    label: YouTube Video URL
    help: Enter a YouTube video URL.
    type: text
    default: https://youtube.com/watch?v=Yq0QkCxoTHM

  - name: prompt
    label: prompt
    help: Prompt to send to the model.
    type: textarea
    default: What is the YouTube video about?

  - name: user_id
    label: user_id
    help: You can leave it blank. The user_id parameter is a unique identifier for the user. It is used to store and retrieve the chat history specific to that user. You should provide a value that uniquely identifies the user across different sessions. For example, it can be the user’s email address, username, or a system-generated ID.
    type: text
    default: 

  - name: session_id
    label: session_id
    help: You can leave it blank. The session_id parameter represents a specific session for a user. It allows you to manage multiple sessions for the same user. If you want to maintain separate chat histories for different sessions of the same user, use a unique session_id for each session. If not specified or kept the same, the system will treat all interactions as part of the same session.
    type: text
    default: 

  - name: system_prompt
    label: system_prompt
    help: System prompt to send to the model. This is prepended to the prompt and helps guide system behavior.
    type: textarea
    default: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. 
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

  - name: temperature
    label: temperature
    help: Adjusts the randomness of outputs. 0 is deterministic and values greater than 1 are highly random; 0.75 is a good starting value.
    type: float
    default: 0.7

  - name: top_p
    label: top_p
    help: When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens.
    type: float
    default: 0.90

  - name: top_k
    label: top_k
    help: When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens.
    type: number
    default: 50

  - name: chunk_size
    label: chunk_size
    help: Defines the size (number of tokens) of each chunk of text that is retrieved and processed by the model during the retrieval phase. Larger values may improve context, but can increase processing time and memory usage.
    type: number
    default: 256

  - name: chunk_overlap
    label: chunk_overlap
    help: Specifies how many tokens from the previous chunk should overlap with the next chunk to ensure continuity and avoid missing important context. Higher overlap ensures smoother transitions but may result in redundant processing.
    type: number
    default: 25

  - name: similarity_top_k
    label: similarity_top_k
    help: Determines the number of most similar documents or chunks to retrieve based on their similarity scores. A higher value increases the amount of context provided but may also introduce irrelevant information.
    type: number
    default: 5

  - name: context_window
    label: context_window
    help: Specifies the maximum number of tokens a language model can process at once, including both the input query and retrieved chunks, ensuring the model operates within its token limit. Use 0 to apply the model's maximum limit.
    type: number
    default: 0

  - name: max_new_tokens
    label: max_new_tokens
    help: Specifies the maximum number of tokens that the language model is allowed to generate in response to a query. This parameter controls the length of the model’s output, helping to prevent overly long or incomplete responses. Use 0 for a dynamic response limit.
    type: number
    default: 0

  - name: seed
    label: seed
    help: seed-help
    type: text
    default: 123456

  - name: quantization
    label: quantization
    help: Quantization is a technique that reduces the precision of model weights (e.g., from FP32 to INT8) to decrease memory usage and improve inference speed. When enabled (true), the model uses less VRAM, making it suitable for resource-constrained environments, but might slightly affect output quality. When disabled (false), the model runs at full precision, ensuring maximum accuracy but requiring more GPU memory and running slower.
    type: checkbox
    default: --quantization

  - name: do_sample
    label: do_sample
    help: The do_sample parameter controls whether the model generates text deterministically or with randomness. For precise tasks like translations or code generation, set do_sample = false to ensure consistent and predictable outputs. For creative tasks like storytelling or poetry, set do_sample = true to allow the model to produce diverse and imaginative results.
    type: checkbox
    default: --do_sample
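
The `chunk_size` and `chunk_overlap` inputs describe a sliding window over the transcript. The sketch below illustrates that idea only; it is not Wiro's implementation. The function name `chunk_tokens` is hypothetical, and whitespace-separated words stand in for real model tokens.

```python
def chunk_tokens(tokens, chunk_size=256, chunk_overlap=25):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


# Assumption: words approximate tokens for the purpose of this illustration.
words = "the quick brown fox jumps over the lazy dog".split()
for chunk in chunk_tokens(words, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

Each window repeats the last `chunk_overlap` tokens of the previous one, which is why a higher overlap gives smoother transitions at the cost of redundant processing.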


## Prepare Integration Headers
```bash
  # Sign up on the Wiro dashboard and create a project to obtain these values
  export YOUR_API_KEY="{{useSelectedProjectAPIKey}}"
  export YOUR_API_SECRET="XXXXXXXXX"

  # Unix time or any random integer value
  export NONCE=$(date +%s)

  # HMAC-SHA256 of (YOUR_API_SECRET + NONCE), keyed with YOUR_API_KEY.
  # awk keeps only the hex digest and drops the "(stdin)= " prefix that openssl prints.
  export SIGNATURE="$(echo -n "${YOUR_API_SECRET}${NONCE}" | openssl dgst -sha256 -hmac "${YOUR_API_KEY}" | awk '{print $NF}')"
```
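
For clients that do not shell out to `openssl`, the same signature can be computed with Python's standard library. This is a minimal sketch that mirrors the shell recipe (HMAC-SHA256 over the secret concatenated with the nonce, keyed with the API key); the function name `build_auth_headers` is hypothetical.

```python
import hashlib
import hmac
import time


def build_auth_headers(api_key, api_secret, nonce=None):
    """Return the x-api-key, x-nonce and x-signature headers for a request."""
    if nonce is None:
        nonce = str(int(time.time()))  # Unix time, as in the shell example
    message = f"{api_secret}{nonce}".encode("utf-8")
    # The API key is the HMAC key; the message is secret + nonce.
    signature = hmac.new(api_key.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return {"x-api-key": api_key, "x-nonce": nonce, "x-signature": signature}


headers = build_auth_headers("YOUR_API_KEY", "YOUR_API_SECRET")
print(sorted(headers))
```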

## Run Command - Make HTTP Post Request 
```bash
curl -X POST "https://api.wiro.ai/v1/Run/wiro/rag-chat-youtube" \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${YOUR_API_KEY}" \
  -H "x-nonce: ${NONCE}" \
  -H "x-signature: ${SIGNATURE}" \
  -d @- <<'EOF'
{
  "selectedModel": "",
  "selectedModelPrivate": "",
  "websiteUrl": "https://youtube.com/watch?v=Yq0QkCxoTHM",
  "prompt": "What is the YouTube video about?",
  "user_id": "",
  "session_id": "",
  "system_prompt": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. \nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.",
  "temperature": "0.7",
  "top_p": "0.90",
  "top_k": 50,
  "chunk_size": 256,
  "chunk_overlap": 25,
  "similarity_top_k": 5,
  "context_window": 0,
  "max_new_tokens": 0,
  "seed": "123456",
  "quantization": "--quantization",
  "do_sample": "--do_sample",
  "callbackUrl": "You can provide a callback URL; Wiro will send a POST request to it when the task is completed."
}
EOF
```
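
The same run request can be prepared from Python. The sketch below only builds the request object and does not send it. It assumes the endpoint accepts a JSON body with `Content-Type: application/json`; the helper name `build_run_request` and the placeholder header values are hypothetical.

```python
import json
import urllib.request

RUN_URL = "https://api.wiro.ai/v1/Run/wiro/rag-chat-youtube"


def build_run_request(auth_headers, inputs):
    """Build (but do not send) the POST request that starts a task."""
    body = json.dumps(inputs).encode("utf-8")
    # Assumption: the API reads the inputs from a JSON-encoded body.
    headers = {"Content-Type": "application/json", **auth_headers}
    return urllib.request.Request(RUN_URL, data=body, headers=headers, method="POST")


request = build_run_request(
    # Placeholder values; compute real ones as described in the header section.
    {"x-api-key": "YOUR_API_KEY", "x-nonce": "1734513807", "x-signature": "HEX_DIGEST"},
    {
        "websiteUrl": "https://youtube.com/watch?v=Yq0QkCxoTHM",
        "prompt": "What is the YouTube video about?",
        "chunk_size": 256,
        "chunk_overlap": 25,
    },
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen(request)` would send it; that step is left out so the sketch runs without credentials.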


## Run Command - Response
```json
{
    "errors": [],
    "taskid": "2221",
    "socketaccesstoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt",
    "result": true
}
```
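
A client typically needs the `taskid` and `socketaccesstoken` fields from this response for the follow-up detail request. The sketch below parses the sample response shown above. Treating a false `result` or a non-empty `errors` list as failure is an assumption, and `parse_run_response` is a hypothetical helper name.

```python
import json


def parse_run_response(raw):
    """Return (taskid, socketaccesstoken) from a run response body."""
    data = json.loads(raw)
    # Assumption: result == false or any entry in errors means the run was rejected.
    if not data.get("result") or data.get("errors"):
        raise RuntimeError(f"Run request failed: {data.get('errors')}")
    return data["taskid"], data["socketaccesstoken"]


sample = """
{
    "errors": [],
    "taskid": "2221",
    "socketaccesstoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt",
    "result": true
}
"""
task_id, task_token = parse_run_response(sample)
print(task_id, task_token)
```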


## Get Task Detail - Make HTTP Post Request
```bash
curl -X POST "https://api.wiro.ai/v1/Task/Detail" \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${YOUR_API_KEY}" \
  -H "x-nonce: ${NONCE}" \
  -H "x-signature: ${SIGNATURE}" \
  -d '{
  "tasktoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt"
}'
```

## Get Task Detail - Response
```json
{
  "total": "1",
  "errors": [],
  "tasklist": [
      {
          "id": "2221",
          "uuid": "15bce51f-442f-4f44-a71d-13c6374a62bd",
          "socketaccesstoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt",
          "parameters": {},
          "debugoutput": "",
          "debugerror": "",
          "starttime": "1734513809",
          "endtime": "1734513813",
          "elapsedseconds": "6.0000",
          "status": "task_postprocess_end",
          "createtime": "1734513807",
          "canceltime": "0",
          "assigntime": "1734513807",
          "accepttime": "1734513807",
          "preprocessstarttime": "1734513807",
          "preprocessendtime": "1734513807",
          "postprocessstarttime": "1734513813",
          "postprocessendtime": "1734513814",
          "outputs": [
              {
                  "id": "6bc392c93856dfce3a7d1b4261e15af3",
                  "name": "0.png",
                  "contenttype": "image/png",
                  "parentid": "6c1833f39da71e6175bf292b18779baf",
                  "uuid": "15bce51f-442f-4f44-a71d-13c6374a62bd",
                  "size": "202472",
                  "addedtime": "1734513812",
                  "modifiedtime": "1734513812",
                  "accesskey": "dFKlMApaSgMeHKsJyaDeKrefcHahUK",
                  "url": "https://cdn1.wiro.ai/6a6af820-c5050aee-40bd7b83-a2e186c6-7f61f7da-3894e49c-fc0eeb66-9b500fe2/0.png"
              }
          ],
          "size": "202472"
      }
  ],
  "result": true
}
```
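
Once a task has finished, the `outputs` array of each `tasklist` entry carries the result files. The sketch below collects their URLs from a response shaped like the sample above. The helper name `output_urls` is hypothetical, and the sample dictionary is trimmed to the fields the helper reads.

```python
def output_urls(detail):
    """Collect the url of every output across all tasks in a detail response."""
    urls = []
    for task in detail.get("tasklist", []):
        for output in task.get("outputs", []):
            urls.append(output["url"])
    return urls


# Trimmed copy of the sample response, keeping only the fields used here.
detail = {
    "total": "1",
    "errors": [],
    "tasklist": [
        {
            "id": "2221",
            "status": "task_postprocess_end",
            "outputs": [
                {
                    "name": "0.png",
                    "contenttype": "image/png",
                    "url": "https://cdn1.wiro.ai/6a6af820-c5050aee-40bd7b83-a2e186c6-7f61f7da-3894e49c-fc0eeb66-9b500fe2/0.png",
                }
            ],
        }
    ],
    "result": True,
}
print(output_urls(detail))
```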


## Task Status Information
This section defines the possible task status values returned by the API when polling for task completion.

### Completed Task Statuses (Polling can stop)
These indicate that the task has reached a terminal state, either successful completion or cancellation. Once any of these is received, polling should stop.
- task_postprocess_end : Task completed successfully and post-processing is done.
- task_cancel : Task was cancelled by the user or system.

### Running Task Statuses (Continue polling)
These statuses indicate that the task is still in progress. Polling should continue if one of these is returned.
- task_queue : Task is waiting in the queue.
- task_accept : Task has been accepted for processing.
- task_assign : Task is being assigned to a worker.
- task_preprocess_start : Preprocessing is starting.
- task_preprocess_end : Preprocessing is complete.
- task_start : Task execution has started.
- task_output : Output is being generated.
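
The two status groups above translate into a simple polling loop. The following is a minimal sketch, not an official client: `poll_until_done` is a hypothetical helper, the caller supplies `fetch_status` (for example, a function that calls the Task Detail endpoint and returns the `status` field), and the interval and attempt limit are arbitrary assumptions.

```python
import time

TERMINAL_STATUSES = {"task_postprocess_end", "task_cancel"}
RUNNING_STATUSES = {
    "task_queue", "task_accept", "task_assign", "task_preprocess_start",
    "task_preprocess_end", "task_start", "task_output",
}


def poll_until_done(fetch_status, interval_seconds=2.0, max_attempts=150, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status, then return it."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if status not in RUNNING_STATUSES:
            raise ValueError(f"Unexpected task status: {status}")
        sleep(interval_seconds)  # still running, wait before the next poll
    raise TimeoutError("Task did not reach a terminal status in time")


# Simulated status sequence so the sketch runs without network access.
statuses = iter(["task_queue", "task_start", "task_output", "task_postprocess_end"])
final_status = poll_until_done(lambda: next(statuses), sleep=lambda _: None)
print(final_status)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP library and makes it easy to exercise with canned statuses, as the simulated sequence shows.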