This is the Flyte SDK (v2) version of this plugin. The Flytekit version is available as flytekitplugins-spark.

Spark

flyteplugins-spark

Tags: Flyte SDK (v2), ML Training, spark, pyspark, distributed, big-data

Union can run Spark jobs natively on a Kubernetes cluster, which manages the lifecycle of a virtual Spark cluster, including spin-up and teardown. It uses the open-source Spark on K8s Operator and can be enabled without signing up for any external service. The effect is a transient Spark cluster: one that is spun up for a specific Spark job and torn down once the job completes.

Install
pip install flyteplugins-spark

Quick Start (example, may need adjustment)

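# Assumes the Flyte SDK v2 API (flyte.TaskEnvironment with plugin_config);
# adjust if your SDK version differs.
import flyte
from flyteplugins.spark import Spark

config = Spark(...)  # fill in your Spark settings

# Attach the Spark config to an environment so its tasks run as Spark jobs.
env = flyte.TaskEnvironment(name="spark_env", plugin_config=config)

@env.task
def my_task() -> None:
    ...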
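
The quick start leaves the task body elided. As a rough illustration of the kind of work a Spark task might do, here is a minimal sketch of the classic Monte Carlo estimate of pi. The helper names (`is_inside`, `pi_from_counts`, `estimate_pi`) and the default sample sizes are hypothetical, and the function expects an already-created `SparkSession` to be passed in; how the plugin exposes that session to a task is not stated on this page.

```python
import random
from operator import add


def is_inside(_: int) -> int:
    """Draw one random point in [-1, 1] x [-1, 1]; return 1 if it falls in the unit circle."""
    x = random.uniform(-1.0, 1.0)
    y = random.uniform(-1.0, 1.0)
    return 1 if x * x + y * y <= 1.0 else 0


def pi_from_counts(inside: int, total: int) -> float:
    """The circle covers pi/4 of the square, so pi is roughly 4 * inside / total."""
    return 4.0 * inside / total


def estimate_pi(spark, partitions: int = 3, samples_per_partition: int = 100_000) -> float:
    """Spread the sampling over `partitions` Spark partitions and combine the counts.

    `spark` is assumed to be a pyspark.sql.SparkSession supplied by the caller.
    """
    total = partitions * samples_per_partition
    inside = (
        spark.sparkContext.parallelize(range(total), partitions)
        .map(is_inside)   # one random sample per element, run on the executors
        .reduce(add)      # sum the hits back on the driver
    )
    return pi_from_counts(inside, total)
```

Inside a task, you would call `estimate_pi(session)` with whatever session the plugin provides. Keeping the sampling and arithmetic in plain functions means they can be checked locally without a cluster.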

Available Imports (1)

Spark (config)

Use this to configure a SparkContext for your task.

Extends dataclass: a configuration or data structure for plugin setup.

from flyteplugins.spark import Spark
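
The fields of `Spark` are not listed on this page. As a sketch only, assuming the config accepts a `spark_conf` mapping of standard Spark properties (as the Flytekit plugin's `Spark` config does), the settings for a small transient cluster could be assembled like this. The helper name `transient_cluster_conf` and its defaults are hypothetical; the property keys are standard Spark configuration keys.

```python
def transient_cluster_conf(
    executors: int = 2,
    executor_cores: int = 1,
    executor_memory: str = "1g",
    driver_memory: str = "1g",
) -> dict[str, str]:
    """Build a Spark property mapping sized for a short-lived, per-job cluster.

    Spark properties are strings, so numeric values are converted here.
    """
    if executors < 1 or executor_cores < 1:
        raise ValueError("executors and executor_cores must be at least 1")
    return {
        "spark.executor.instances": str(executors),
        "spark.executor.cores": str(executor_cores),
        "spark.executor.memory": executor_memory,
        "spark.driver.memory": driver_memory,
    }


conf = transient_cluster_conf(executors=2, executor_memory="2g")
# Assumption: the plugin config takes this mapping, e.g. Spark(spark_conf=conf).
```

Because the cluster exists only for one job, sizing it per task this way avoids paying for idle executors between runs.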

Dependencies

pyspark

Package Info

Min Flyte SDK
Modules: 1

Downloads

Last day: 123
Last week: 730
Last month: 2,934