flyteplugins-pytorch

Flyte SDK (v2) · ML Training · pytorch · deep-learning · distributed · elastic

Union can execute PyTorch distributed training jobs natively on a Kubernetes cluster, managing the lifecycle of worker pods, rendezvous coordination, spin-up, and tear-down. The plugin leverages the open-source TorchElastic launcher (torch.distributed.elastic) and the Kubeflow PyTorch Operator, enabling fault-tolerant, elastic training across multiple nodes.
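Under an elastic launch, each worker process derives its identity from environment variables the launcher injects (WORLD_SIZE, RANK, LOCAL_RANK), following torchrun's conventions. A minimal sketch of that bookkeeping, independent of torch itself (the helper name is illustrative, not part of this plugin):

```python
# Sketch of how an elastic launcher derives per-worker identity.
# The env-var names mirror torchrun's contract; worker_identity is a
# hypothetical helper for illustration only.

def worker_identity(nnodes: int, nproc_per_node: int,
                    node_rank: int, local_rank: int) -> dict:
    """Compute the values torchrun would export for one worker process."""
    world_size = nnodes * nproc_per_node              # total worker processes
    global_rank = node_rank * nproc_per_node + local_rank
    return {"WORLD_SIZE": world_size,
            "RANK": global_rank,
            "LOCAL_RANK": local_rank}

# Example: 2 nodes x 4 processes -> 8 workers;
# node 1, local rank 2 is global rank 6.
ident = worker_identity(nnodes=2, nproc_per_node=4, node_rank=1, local_rank=2)
```

Training code inside the task typically reads these variables (directly or via torch.distributed) to decide which shard of work each process owns.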

Install
pip install flyteplugins-pytorch

Quick Start (example; may need adjustment)


from flytekit import task, workflow
from flyteplugins.pytorch import Elastic

# Attach the Elastic config to the task via task_config.
# Parameter names (nnodes, nproc_per_node) follow torchrun conventions;
# adjust to your cluster and the plugin's actual signature.
config = Elastic(nnodes=2, nproc_per_node=4)

@task(task_config=config)
def my_task() -> None:
    ...  # distributed training code; runs once per worker process

@workflow
def my_workflow() -> None:
    my_task()

Available Imports (1)

Elastic (config)

Elastic defines the configuration for running a PyTorch elastic job using torch.distributed.

Extends dataclass — a configuration data structure for plugin setup.

from flyteplugins.pytorch import Elastic

Dependencies

torch


Package Info

Min Flyte SDK
Modules: 1

Downloads

Last day: 56
Last week: 337
Last month: 945