
farhan0167/otto-m8: Flowchart-like UI to interconnect LLMs and Huggingface models, and deploy them as a REST API with little to no code.
A flowchart-based automation platform that runs deep learning workloads with minimal or no coding.
otto-m8 (read "automate") lets users launch a wide range of AI models, from traditional deep learning models to large language models, through a flowchart-like user interface. At its core, otto-m8 spins up a Docker container for each deployed workflow, which you can use as an API to integrate with existing systems, build an AI assistant chatbot, or run as a standalone API/application.
The idea is simple: provide an easy-to-use user interface for launching AI models. Much of the code required to run AI models, whether LLMs or traditional deep learning models, is boilerplate, including the REST API that typically serves the model. The goal of otto-m8 is not only to abstract away that code, but to abstract the entire process into a UI. otto-m8 operates on an input, process, output paradigm: every workflow takes some form of input, processes it through a series of blocks, and produces an output.
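As a rough illustration of that paradigm (a hypothetical sketch in plain Python, not otto-m8's actual API), a workflow is simply a chain of blocks that threads state from input to output:
def input_block(payload):
    # Input block: collect the raw user input for the workflow.
    return {"text": payload["Input_Block"]}

def process_block(state):
    # Process block: transform the state (this is where a model call would go).
    state["text"] = state["text"].upper()
    return state

def output_block(state):
    # Output block: package the final result, as the deployed REST API would return it.
    return {"message": state["text"]}

workflow = [input_block, process_block, output_block]

state = {"Input_Block": "What was amazon's net sales?"}
for block in workflow:
    state = block(state)
print(state)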
Currently this is an MVP and the code is source-available, which means it is not open-source software.
- Prerequisites: make sure Docker or Docker Desktop is installed on your machine, and, in order to execute Ollama blocks, make sure the Ollama server is running in the background (a quick reachability check is sketched after the setup steps below).
- Run the following command to make run.sh executable: chmod +x run.sh
- Then start the application: ./run.sh
This should start the dashboard and server. To access the dashboard, go to http://localhost:3000/, log in with the default credentials, and create your first workflow.
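If you plan to run Ollama blocks, a quick way to confirm the Ollama server is reachable (assuming Ollama's default port, 11434):
import requests

# Sanity check that the Ollama server is up before executing Ollama blocks.
# 11434 is Ollama's default port; adjust if your setup differs.
try:
    resp = requests.get("http://localhost:11434", timeout=2)
    print("Ollama server reachable:", resp.ok)
except requests.ConnectionError:
    print("Ollama server is not running - start it before executing Ollama blocks.")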
Below is an example workflow that uses Langchain's PDF Parser to parse a PDF and pass its contents to an LLM:
To run it as an API:
import requests
import base64
import json

# Find the deployment URL on the Template page
deployment_url = "http://localhost:8001/workflow_run"

path_to_pdf = "./AMZN-Q1-2024-Earnings-Release.pdf"

# Document upload blocks expect a base64-encoded string.
with open(path_to_pdf, "rb") as f:
    data = f.read()
    data_base64 = base64.b64encode(data).decode("utf-8")

# Key each input by the block's display name (the `id` on the block's config tab).
payload = {
    "Langchain_PDF_Parser": data_base64,
    "Input_Block": "What was amazon's net sales?"
}

request = requests.post(
    deployment_url,
    json={"data": payload}
)
response = request.json()['message']
response = json.loads(response)
print(response)
Output:
"""
{
"f92cffae-14d2-43f4-a961-2fcd5829f1bc": {
"id": "chatcmpl-AgOrZExPgef0TzHVGNVJJr1vmrJyR",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "In the first quarter of 2024, Amazon's net sales increased by 13% to $143.3 billion, compared with $127.4 billion in the first quarter of 2023.",
"refusal": null,
"role": "assistant",
"function_call": null,
"tool_calls": null
}
}
],
"created": 1734668713,
"model": "gpt-4o-mini-2024-07-18",
"object": "chat.completion",
"service_tier": null,
"system_fingerprint": "fp_0aa8d3e20b",
"usage": {
"completion_tokens": 42,
"prompt_tokens": 13945,
"total_tokens": 13987,
"completion_tokens_details": {
"audio_tokens": 0,
"reasoning_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
},
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 13824
}
},
"conversation": [
{
"role": "user",
"content": "What was amazon's net sales?"
},
{
"role": "assistant",
"content": "In the first quarter .."
}
]
}
}
"""
To interact with your workflow through a chat interface, use the Chat Output block:
You can run almost any Hugging Face model that works through Hugging Face's pipeline abstraction (though not every model is supported). Here is a simple demonstration with the Salesforce/blip-image-captioning-base model.
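For reference, running that model directly through Hugging Face's pipeline abstraction looks roughly like this (the image path below is a placeholder for any local image or URL):
from transformers import pipeline

# "image-to-text" is the pipeline task used by captioning models such as BLIP.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

result = captioner("./example.jpg")  # placeholder path to an image
print(result)  # e.g. [{'generated_text': 'a photo of ...'}]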