This is the Flyte SDK (v2) version of this plugin. The Flytekit version is available as flytekitplugins-spark.
Spark
flyteplugins-spark
Union can execute Spark jobs natively on a Kubernetes cluster, which manages the virtual cluster’s lifecycle, spin-up, and teardown. It leverages the open-source Spark On K8s Operator and can be enabled without signing up for any service. This is like running a transient Spark cluster: a cluster spun up for a specific Spark job and torn down after completion.
pip install flyteplugins-spark
Quick Start (example, may need adjustment)
from flytekit import task, workflow
from flyteplugins.spark import Spark

config = Spark(...)

@task
def my_task() -> None:
    ...

@workflow
def my_workflow() -> None:
    my_task()

Available Imports (1)
Use this to configure a SparkContext for your task.
extends dataclass — configuration or data structure for plugin setup
from flyteplugins.spark import Spark
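The page does not list the fields of the Spark dataclass, so the following is only a minimal sketch of how the settings for a transient cluster might be assembled before being handed to it. The property names are standard Apache Spark configuration keys; the helper build_spark_conf is hypothetical, and the spark_conf field shown in the trailing comment is an assumption carried over from the Flytekit plugin, not something this page confirms.

```python
from typing import Dict


def build_spark_conf(
    executor_instances: int = 2,
    executor_cores: int = 1,
    executor_memory: str = "1g",
    driver_memory: str = "1g",
) -> Dict[str, str]:
    """Build a string-to-string mapping of standard Spark properties.

    Hypothetical helper for illustration; it is not part of flyteplugins-spark.
    """
    if executor_instances < 1 or executor_cores < 1:
        raise ValueError("executor_instances and executor_cores must be at least 1")
    return {
        "spark.executor.instances": str(executor_instances),
        "spark.executor.cores": str(executor_cores),
        "spark.executor.memory": executor_memory,
        "spark.driver.memory": driver_memory,
    }


# Three executors with 2 GB each; the remaining values use the defaults above.
conf = build_spark_conf(executor_instances=3, executor_memory="2g")

# Assumption: the Spark dataclass accepts these properties through a
# `spark_conf` field, as the Flytekit plugin does. Check the dataclass
# definition before relying on this.
# from flyteplugins.spark import Spark
# config = Spark(spark_conf=conf)
```

Keeping the properties in a plain dictionary makes them easy to inspect or adjust per task before they are attached to the plugin configuration.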
Dependencies
Related Plugins
Spark
Flyte can execute Spark jobs natively on a Kubernetes cluster, which manages the virtual cluster’s lifecycle, spin-up, and teardown. It leverages the open-source Spark On K8s Operator and can be enabled without signing up for any service. This is like running a transient Spark cluster: a cluster spun up for a specific Spark job and torn down after completion.
Dask
Flyte can execute Dask jobs natively on a Kubernetes cluster, which manages the virtual Dask cluster's lifecycle.
Kubeflow MPI
This plugin uses the Kubeflow MPI Operator and provides an extremely simplified interface for executing distributed training.