Dagster Integration:
Dagster + MLflow
Streamline the process of productionizing, maintaining and monitoring machine learning models.
About this integration
With dagster-mlflow, you can initialize an MLflow run and use it for all steps within a Dagster run. Additionally, you can access all of MLflow’s methods as well as the MLflow tracking client’s methods.
Installation
pip install dagster-mlflow
Example
from dagster import job, op
from dagster_mlflow import end_mlflow_on_run_finished, mlflow_tracking

# Placeholder values for illustration; replace with your own.
some_params = {"learning_rate": 0.01}
some_model_name = "my_model"


@op(required_resource_keys={"mlflow"})
def mlflow_op(context):
    # Resources are reached through context.resources.
    context.resources.mlflow.log_params(some_params)
    context.resources.mlflow.tracking.MlflowClient().create_registered_model(some_model_name)


mlflow = mlflow_tracking.configured({
    "experiment_name": "my_experiment",
    "mlflow_tracking_uri": "http://localhost:5000",
    # to create a nested run, provide the parent run's ID
    "parent_run_id": "an_existing_mlflow_run_id",
    # environment variables to pass to MLflow
    "env": {
        "MLFLOW_S3_ENDPOINT_URL": "my_s3_endpoint",
        "AWS_ACCESS_KEY_ID": "my_aws_key_id",
        "AWS_SECRET_ACCESS_KEY": "my_secret",
    },
    # environment variables to log as MLflow tags
    "env_to_tag": ["DOCKER_IMAGE_TAG"],
    # key-value tags to add to your experiment
    "extra_tags": {"super": "experiment"},
})


# End the MLflow run once the Dagster run finishes.
@end_mlflow_on_run_finished
@job(resource_defs={"mlflow": mlflow})
def mlf_example():
    mlflow_op()
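The same settings can likely be supplied as run config at launch time instead of through mlflow_tracking.configured(...). The sketch below only builds the nested dictionary Dagster expects for a resource registered under the key "mlflow". The helper name build_mlflow_run_config is hypothetical, and the sketch assumes the job was defined with the unconfigured mlflow_tracking resource.

```python
from typing import Any, Dict, Optional


def build_mlflow_run_config(
    experiment_name: str,
    tracking_uri: str,
    parent_run_id: Optional[str] = None,
    extra_tags: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build run config for a resource registered under the key "mlflow".

    Hypothetical helper: the config keys mirror the ones used in the
    example above; nothing here calls Dagster or MLflow.
    """
    config: Dict[str, Any] = {
        "experiment_name": experiment_name,
        "mlflow_tracking_uri": tracking_uri,
    }
    # Include optional keys only when a value was provided.
    if parent_run_id is not None:
        config["parent_run_id"] = parent_run_id
    if extra_tags:
        config["extra_tags"] = dict(extra_tags)
    return {"resources": {"mlflow": {"config": config}}}


run_config = build_mlflow_run_config(
    "my_experiment",
    "http://localhost:5000",
    extra_tags={"super": "experiment"},
)
```

Under those assumptions, the dictionary would be passed when launching the job, for example mlf_example.execute_in_process(run_config=run_config). This keeps environment-specific values such as the tracking URI out of the job definition.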
About MLflow
MLflow is an open source platform for managing the end-to-end machine learning lifecycle.