About dagster-ray
The community-supported dagster-ray
package allows orchestrating distributed Ray compute from Dagster pipelines. This integration includes RayRunLauncher, ray_executor, RayIOManager, and PipesRayJobClient for distributed computing workflows.
`dagster-ray` enables working with distributed Ray compute from Dagster pipelines, combining Dagster's excellent orchestration capabilities and Ray's distributed computing power together.
Learn more in `dagster-ray` docs
Installation
pip install dagster dagster-ray
# For KubeRay support:
pip install dagster 'dagster-ray[kuberay]'
Example
from dagster import job, op, asset, Definitions
from dagster_ray import ray_executor, RayIOManager
@op(
tags={
"dagster-ray/config": {
"num_cpus": 8,
"num_gpus": 2,
"runtime_env": {"pip": {"packages": ["torch"]}},
}
}
)
def my_ray_op():
import torch
# your expensive computation here
result = ...
return result
@asset(io_manager_key="ray_io_manager")
def ray_asset() -> int:
return 42
@job(executor_def=ray_executor.configured({"ray": {"num_cpus": 1}}))
def my_ray_job():
return my_ray_op()
defs = Definitions(
assets=[ray_asset],
jobs=[my_ray_job],
resources={
"ray_io_manager": RayIOManager(),
}
)
About Ray
Ray is a unified framework for scaling AI and Python applications. It provides a simple, universal API for building distributed applications, enabling you to parallelize single machine code with minimal code changes. Ray is used for distributed computing, machine learning, and hyperparameter tuning at scale.