## Basic tool info

Tool name: microsoft/Phi-3.5-mini-instruct
Tool description: microsoft/Phi-3.5-mini-instruct is a lightweight AI language model optimized for instruction-based tasks, offering efficient performance and high-quality responses with a compact architecture.
Tool cover: https://cdn.wiro.ai/uploads/models/microsoft-Phi-3.5-mini-instruct-cover.jpg
Tool categories:
  - model
  - llm
  - persistent
  - chat
  - checkpoint-folder
  - bf16
  - phi3.5-chat
  - rag
Tool tags:
  - text generation
  - transformers
  - safetensors
  - multilingual
  - phi3
  - nlp
  - code
  - conversational
  - custom_code
  - text-generation-inference
  - inference endpoints
  - mit
Run Task Endpoint (POST): https://api.wiro.ai/v1/Run/microsoft/Phi-3.5-mini-instruct
Get Task Detail Endpoint (POST): https://api.wiro.ai/v1/Task/Detail

## Tool Inputs:

- name: prompt
  label: prompt
  help: Prompt to send to the model.
  type: textarea
  default: What are some interesting historical events that took place near the Tower of London, and how could they inspire a fictional story?
- name: user_id
  label: user_id
  help: You can leave it blank. The user_id parameter is a unique identifier for the user. It is used to store and retrieve the chat history specific to that user. You should provide a value that uniquely identifies the user across different sessions. For example, it can be the user’s email address, username, or a system-generated ID.
  type: text
  default:
- name: session_id
  label: session_id
  help: You can leave it blank. The session_id parameter represents a specific session for a user. It allows you to manage multiple sessions for the same user. If you want to maintain separate chat histories for different sessions of the same user, use a unique session_id for each session. If not specified or kept the same, the system will treat all interactions as part of the same session.
  type: text
  default:
- name: system_prompt
  label: system_prompt
  help: System prompt to send to the model.
  This is prepended to the prompt and helps guide system behavior.
  type: textarea
  default: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
- name: temperature
  label: temperature
  help: Adjusts the randomness of outputs. Values greater than 1 are more random, 0 is deterministic, and 0.75 is a good starting value.
  type: float
  default: 0.7
- name: top_p
  label: top_p
  help: When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens.
  type: float
  default: 0.95
- name: top_k
  label: top_k
  help: When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens.
  type: number
  default: 0
- name: repetition_penalty
  label: repetition_penalty
  help: A hyperparameter that reduces the likelihood of repetitive text by applying a penalty to previously generated tokens, encouraging more diverse and coherent output.
  type: float
  default: 1.0
- name: length_penalty
  label: length_penalty
  help: A parameter that controls how long the outputs are. If < 1, the model will tend to generate shorter outputs, and > 1 will tend to generate longer outputs.
  type: float
  default: 1
- name: max_tokens
  label: max_tokens
  help: Maximum number of tokens to generate. A word is generally 2-3 tokens.
  type: number
  default: 0
- name: min_tokens
  label: min_tokens
  help: Minimum number of tokens to generate. To disable, set to -1. A word is generally 2-3 tokens.
  type: number
  default: 0
- name: max_new_tokens
  label: max_new_tokens
  help: This parameter has been renamed to max_tokens. max_new_tokens only exists for backwards compatibility purposes. We recommend you use max_tokens instead. Do not specify both.
  type: number
  default: 0
- name: min_new_tokens
  label: min_new_tokens
  help: This parameter has been renamed to min_tokens. min_new_tokens only exists for backwards compatibility purposes. We recommend you use min_tokens instead. Do not specify both.
  type: number
  default: -1
- name: stop_sequences
  label: stop_sequences
  help: A semicolon-separated list of sequences to stop generation at. For example, `<end>;<stop>` will stop generation at the first instance of `<end>` or `<stop>`.
  type: text
  default:
- name: seed
  label: seed
  help: seed-help
  type: text
  default: 123456
- name: quantization
  label: quantization
  help: Quantization is a technique that reduces the precision of model weights (e.g., from FP32 to INT8) to decrease memory usage and improve inference speed. When enabled (true), the model uses less VRAM, making it suitable for resource-constrained environments, but output quality might be slightly affected. When disabled (false), the model runs at full precision, ensuring maximum accuracy but requiring more GPU memory and running slower.
  type: checkbox
  default: --quantization
- name: do_sample
  label: do_sample
  help: The do_sample parameter controls whether the model generates text deterministically or with randomness. For precise tasks like translations or code generation, set do_sample = false to ensure consistent and predictable outputs. For creative tasks like storytelling or poetry, set do_sample = true to allow the model to produce diverse and imaginative results.
  type: checkbox
  default: --do_sample

## Integration Header Prepare

```bash
# Sign up on the Wiro dashboard and create a project
export YOUR_API_KEY="{{useSelectedProjectAPIKey}}";
export YOUR_API_SECRET="XXXXXXXXX";

# Unix time or any random integer value
export NONCE=$(date +%s);

# HMAC-SHA256 of (YOUR_API_SECRET + NONCE), keyed with YOUR_API_KEY.
# awk keeps only the hex digest, because some OpenSSL versions prefix it with "(stdin)= ".
export SIGNATURE="$(printf '%s' "${YOUR_API_SECRET}${NONCE}" | openssl dgst -sha256 -hmac "${YOUR_API_KEY}" | awk '{print $NF}')";
```

## Run Command - Make HTTP Post Request

```bash
# Apostrophes inside the single-quoted JSON body are written as \u0027
# so that they do not terminate the shell string.
curl -X POST "https://api.wiro.ai/v1/Run/microsoft/Phi-3.5-mini-instruct" \
  -H "Content-Type: multipart/form-data" \
  -H "x-api-key: ${YOUR_API_KEY}" \
  -H "x-nonce: ${NONCE}" \
  -H "x-signature: ${SIGNATURE}" \
  -d '{
    "prompt": "What are some interesting historical events that took place near the Tower of London, and how could they inspire a fictional story?",
    "user_id": "",
    "session_id": "",
    "system_prompt": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. \nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don\u0027t know the answer to a question, please don\u0027t share false information.",
    "temperature": "0.7",
    "top_p": "0.95",
    "top_k": 0,
    "repetition_penalty": "1.0",
    "length_penalty": "1",
    "max_tokens": 0,
    "min_tokens": 0,
    "max_new_tokens": 0,
    "min_new_tokens": -1,
    "stop_sequences": "",
    "seed": "123456",
    "quantization": "--quantization",
    "do_sample": "--do_sample",
    "callbackUrl": "You can provide a callback URL; Wiro will send a POST request to it when the task is completed."
  }';
```

## Run Command - Response

```json
{
  "errors": [],
  "taskid": "2221",
  "socketaccesstoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt",
  "result": true
}
```

## Get Task Detail - Make HTTP Post Request

```bash
# tasktoken is the socketaccesstoken value returned by the Run command
curl -X POST "https://api.wiro.ai/v1/Task/Detail" \
  -H "Content-Type: multipart/form-data" \
  -H "x-api-key: ${YOUR_API_KEY}" \
  -H "x-nonce: ${NONCE}" \
  -H "x-signature: ${SIGNATURE}" \
  -d '{
    "tasktoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt"
  }';
```

## Get Task Detail - Response

```json
{
  "total": "1",
  "errors": [],
  "tasklist": [
    {
      "id": "2221",
      "uuid": "15bce51f-442f-4f44-a71d-13c6374a62bd",
      "socketaccesstoken": "eDcCm5yyUfIvMFspTwww49OUfgXkQt",
      "parameters": {},
      "debugoutput": "",
      "debugerror": "",
      "starttime": "1734513809",
      "endtime": "1734513813",
      "elapsedseconds": "6.0000",
      "status": "task_postprocess_end",
      "createtime": "1734513807",
      "canceltime": "0",
      "assigntime": "1734513807",
      "accepttime": "1734513807",
      "preprocessstarttime": "1734513807",
      "preprocessendtime": "1734513807",
      "postprocessstarttime": "1734513813",
      "postprocessendtime": "1734513814",
      "outputs": [
        {
          "id": "6bc392c93856dfce3a7d1b4261e15af3",
          "name": "0.png",
          "contenttype": "image/png",
          "parentid": "6c1833f39da71e6175bf292b18779baf",
          "uuid": "15bce51f-442f-4f44-a71d-13c6374a62bd",
          "size": "202472",
          "addedtime": "1734513812",
          "modifiedtime": "1734513812",
          "accesskey": "dFKlMApaSgMeHKsJyaDeKrefcHahUK",
          "url": "https://cdn1.wiro.ai/6a6af820-c5050aee-40bd7b83-a2e186c6-7f61f7da-3894e49c-fc0eeb66-9b500fe2/0.png"
        }
      ],
      "size": "202472"
    }
  ],
  "result": true
}
```

## Task Status Information

This section defines the possible task status values returned by the API when polling for task completion.

### Completed Task Statuses (Polling can stop)

These statuses indicate that the task has reached a terminal state, either success or failure. Once any of these is received, polling should stop.

- task_postprocess_end : Task completed successfully and post-processing is done.
- task_cancel : Task was cancelled by the user or system.

### Running Task Statuses (Continue polling)

These statuses indicate that the task is still in progress. Polling should continue if one of these is returned.

- task_queue : Task is waiting in the queue.
- task_accept : Task has been accepted for processing.
- task_assign : Task is being assigned to a worker.
- task_preprocess_start : Preprocessing is starting.
- task_preprocess_end : Preprocessing is complete.
- task_start : Task execution has started.
- task_output : Output is being generated.
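
The polling rule described by the two status lists can be sketched as a small shell loop. This is a minimal illustration, not part of the Wiro tooling. The helper names `is_terminal_status`, `extract_status`, and `poll_task` are hypothetical, and so are the polling interval and attempt limit. It assumes that `YOUR_API_KEY`, `NONCE`, and `SIGNATURE` are already exported, that the same nonce and signature can be reused across polling requests, and that the response carries a `status` field as in the sample Task Detail response.

```bash
# Hypothetical helper: succeeds (exit 0) only for the documented terminal statuses.
is_terminal_status() {
  case "$1" in
    task_postprocess_end|task_cancel) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prints the first "status" value found in a JSON string.
# Uses grep/sed to stay dependency-free; a real JSON parser would be more robust.
extract_status() {
  printf '%s' "$1" \
    | grep -o '"status"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed 's/.*"\([^"]*\)"$/\1/'
}

# Hypothetical helper: polls Task/Detail until a terminal status is returned.
# Arguments: task token, seconds between requests (default 2), max attempts (default 30).
# Prints the final status and returns 0, or returns 1 if the attempt limit is reached.
poll_task() {
  local task_token="$1"
  local interval="${2:-2}"
  local max_attempts="${3:-30}"
  local attempt=1
  local response status

  while [ "$attempt" -le "$max_attempts" ]; do
    response="$(curl -s -X POST "https://api.wiro.ai/v1/Task/Detail" \
      -H "Content-Type: multipart/form-data" \
      -H "x-api-key: ${YOUR_API_KEY}" \
      -H "x-nonce: ${NONCE}" \
      -H "x-signature: ${SIGNATURE}" \
      -d "{\"tasktoken\": \"${task_token}\"}")"
    status="$(extract_status "$response")"

    if is_terminal_status "$status"; then
      printf '%s\n' "$status"
      return 0
    fi

    # Any other status (including an empty one) is treated as still running.
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 1
}
```

With the headers prepared, a call such as `poll_task "eDcCm5yyUfIvMFspTwww49OUfgXkQt" 2 30` would check the task every two seconds, up to 30 times. Treating unknown or empty statuses as "still running" keeps the loop tolerant of transient errors, at the cost of waiting for the full attempt limit if the request never succeeds.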