vLLM
flyteplugins-vllm
Flyte SDK (v2) | Model Serving | Tags: vllm, inference, llm, serving, gpu
Serve large language models using vLLM with Flyte Apps.
Install

pip install flyteplugins-vllm

Quick Start (example, may need adjustment)
from flytekit import task, workflow
from flyteplugins.vllm import DEFAULT_VLLM_IMAGE, VLLMAppEnvironment

# NOTE: DEFAULT_VLLM_IMAGE is, by its naming, most likely the plugin's default
# container image rather than a callable; pass it into the app environment
# configuration instead of instantiating it.
vllm_app = VLLMAppEnvironment(...)

@task
def my_task() -> None:
    ...

@workflow
def my_workflow() -> None:
    my_task()

Available Imports (2)
type: DEFAULT_VLLM_IMAGE
Configuration type for vLLM.
from flyteplugins.vllm import DEFAULT_VLLM_IMAGE
config: VLLMAppEnvironment
App environment backed by vLLM for serving large language models.
extends dataclass (configuration or data structure for plugin setup)
from flyteplugins.vllm import VLLMAppEnvironment
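A minimal sketch of how an app environment might be declared. The constructor parameters shown here (name, image, model) are illustrative assumptions, not confirmed against the plugin's API; check the plugin documentation for the actual signature.

```python
from flyteplugins.vllm import DEFAULT_VLLM_IMAGE, VLLMAppEnvironment

# Hypothetical configuration sketch; parameter names are assumptions.
# DEFAULT_VLLM_IMAGE is assumed to be the plugin-provided default
# serving image for vLLM.
vllm_app = VLLMAppEnvironment(
    name="my-llm-server",                # illustrative app name
    image=DEFAULT_VLLM_IMAGE,            # default vLLM container image
    model="Qwen/Qwen2.5-0.5B-Instruct",  # any model id vLLM can load
)
```

Once deployed, vLLM itself exposes an OpenAI-compatible HTTP API, so any OpenAI-style client pointed at the app's URL can query the served model.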
Related Plugins
SGLang
Serve large language models using SGLang with Flyte Apps.
Inference
Serve models natively in Flyte tasks using inference providers like NIM, Ollama, and others.
DGXC Lepton
A Flytekit plugin for deploying and managing AI inference endpoints on Lepton AI infrastructure within Flyte workflows.
OpenAI
The plugin currently features ChatGPT and Batch API connectors.
Package Info
Min Flyte SDK
Modules: 2
Downloads:
  Last day: 58
  Last week: 265
  Last month: 893