Inference
flytekitplugins-inference
Serve models natively in Flyte tasks using inference providers like NIM, Ollama, and others.
pip install flytekitplugins-inference

Quick Start (example, may need adjustment)
from flytekit import task, workflow
from flytekitplugins.inference import NIM, NIMSecrets, Model, Ollama

config = NIM(...)

@task
def my_task() -> None:
    ...

@workflow
def my_workflow() -> None:
    my_task()

Available Imports (6)
Configuration for serving a model with NVIDIA NIM in a Flyte task.
extends dataclass — configuration or data structure for plugin setup
from flytekitplugins.inference import NIM

Secrets (such as NGC credentials) required to pull and run a NIM container.
extends dataclass — configuration or data structure for plugin setup
from flytekitplugins.inference import NIMSecrets

Represents the configuration for a model used in a Kubernetes pod template.
extends dataclass — configuration or data structure for plugin setup
from flytekitplugins.inference import Model

Configuration for serving a model with Ollama in a Flyte task.
from flytekitplugins.inference import Ollama

Configuration for serving a model with vLLM in a Flyte task.
from flytekitplugins.inference import VLLM

Secret configuration for accessing Hugging Face models (e.g. an access token).
extends dataclass — configuration or data structure for plugin setup
from flytekitplugins.inference import HFSecret
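To show how these imports fit together, here is a minimal sketch of serving a model with Ollama inside a task. It is a sketch only: the model name `llama3`, the `pod_template` and `base_url` attributes, and the Ollama REST call follow the plugin's documented pattern, but exact parameters may differ across versions.

```python
from flytekit import task, workflow
from flytekitplugins.inference import Model, Ollama

# Sketch: run an Ollama server as a sidecar in the task pod.
# The attribute names below are assumptions based on the plugin's
# documented usage pattern; verify against your installed version.
ollama_instance = Ollama(model=Model(name="llama3"))

@task(pod_template=ollama_instance.pod_template)
def query_model(prompt: str) -> str:
    import requests

    # Inside the task, the sidecar is reachable at ollama_instance.base_url.
    resp = requests.post(
        f"{ollama_instance.base_url}/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
    )
    return resp.json()["response"]

@workflow
def wf(prompt: str = "Hello") -> str:
    return query_model(prompt=prompt)
```

The NIM and VLLM configurations are used the same way: construct the instance, pass its `pod_template` to the task decorator, and call the served endpoint from inside the task.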
Related Plugins
SGLang
Serve large language models using SGLang with Flyte Apps.
vLLM
Serve large language models using vLLM with Flyte Apps.
Dgxc-lepton
Deploy and manage AI inference endpoints on Lepton AI infrastructure from within Flyte workflows.
ONNX PyTorch
This plugin allows you to generate ONNX models from your PyTorch models.