Model Serving

Model inference, serving, and deployment · 9 plugins

Dgxc-lepton

flytekitplugins-dgxc-lepton

A Flytekit plugin for deploying and managing AI inference endpoints on Lepton AI infrastructure from within Flyte workflows.

11 modules

251/mo

1.9.1+

Inference

Flytekit

flytekitplugins-inference

Serve models natively in Flyte tasks using inference providers like NIM, Ollama, and others.

6 modules

1.13.0+

ONNX PyTorch

Flytekit

flytekitplugins-onnxpytorch

This plugin allows you to generate ONNX models from your PyTorch models.

2 modules

1.3.0+

ONNX ScikitLearn

Flytekit

flytekitplugins-onnxscikitlearn

This plugin allows you to generate ONNX models from your scikit-learn models.

2 modules

ONNX TensorFlow

Flytekit

flytekitplugins-onnxtensorflow

This plugin allows you to generate ONNX models from your TensorFlow Keras models.

2 modules

1.3.0+

OpenAI

Flytekit

flytekitplugins-openai

The plugin currently features ChatGPT and Batch API connectors.

9 modules

1.10.7+

OpenAI

Flyte SDK (v2)

flyteplugins-openai

This plugin provides a drop-in replacement for OpenAI packages. It provides…

1 module

992/mo

SGLang

Flyte SDK (v2)

flyteplugins-sglang

Serve large language models using SGLang with Flyte Apps.

2 modules

vLLM

Flyte SDK (v2)

flyteplugins-vllm

Serve large language models using vLLM with Flyte Apps.

2 modules