Inference
flytekitplugins-inference
Serve models natively in Flyte tasks using inference providers like NIM, Ollama, and others.
pip install flytekitplugins-inference

Quick Start (example, may need adjustment)
from flytekit import task, workflow
from flytekitplugins.inference import NIM, NIMSecrets, Model, Ollama

config = NIM(...)

@task
def my_task() -> None:
    ...

@workflow
def my_workflow() -> None:
    my_task()

Available Imports (6)
Configuration for serving a model with NVIDIA NIM in a Flyte task.
extends dataclass — configuration or data structure for plugin setup
from flytekitplugins.inference import NIM

Secrets (such as NGC credentials) required to pull and run a NIM container.
extends dataclass — configuration or data structure for plugin setup
from flytekitplugins.inference import NIMSecrets

Represents the configuration for a model used in a Kubernetes pod template.
extends dataclass — configuration or data structure for plugin setup
from flytekitplugins.inference import Model

Configuration for serving a model with Ollama in a Flyte task.
from flytekitplugins.inference import Ollama

Configuration for serving a model with vLLM in a Flyte task.
from flytekitplugins.inference import VLLM

Secret configuration for accessing Hugging Face models (e.g. an access token).
extends dataclass — configuration or data structure for plugin setup
from flytekitplugins.inference import HFSecret
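To show how these imports fit together, here is a minimal sketch of serving a model with Ollama inside a task. It is a sketch only: the model name `llama3`, the `pod_template` and `base_url` attributes, and the Ollama REST call follow the plugin's documented pattern, but exact parameters may differ across versions.

```python
from flytekit import task, workflow
from flytekitplugins.inference import Model, Ollama

# Sketch: run an Ollama server as a sidecar in the task pod.
# The attribute names below are assumptions based on the plugin's
# documented usage pattern; verify against your installed version.
ollama_instance = Ollama(model=Model(name="llama3"))

@task(pod_template=ollama_instance.pod_template)
def query_model(prompt: str) -> str:
    import requests

    # Inside the task, the sidecar is reachable at ollama_instance.base_url.
    resp = requests.post(
        f"{ollama_instance.base_url}/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
    )
    return resp.json()["response"]

@workflow
def wf(prompt: str = "Hello") -> str:
    return query_model(prompt=prompt)
```

The NIM and VLLM configurations are used the same way: construct the instance, pass its `pod_template` to the task decorator, and call the served endpoint from inside the task.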
Related Plugins
SGLang
Serve large language models using SGLang with Flyte Apps.
vLLM
Serve large language models using vLLM with Flyte Apps.
Dgxc-lepton
Deploy and manage AI inference endpoints on Lepton AI infrastructure from within Flyte workflows.
ONNX PyTorch
This plugin allows you to generate ONNX models from your PyTorch models.