Dagster + PySpark
About this integration
The dagster-pyspark library provides a resource that gives your Dagster assets and ops access to a PySpark SparkSession, so you can run PySpark code within Dagster.
Installation
pip install dagster-pyspark
For a fuller example, see the with_pyspark_emr example project.
About PySpark
PySpark is the Python API for Apache Spark, a distributed framework and set of libraries for real-time, large-scale data processing. With PySpark, you can write analyses and data pipelines in Python that scale across a cluster.