Release

New in Dagster 0.13.0: Logging Improvements!

Logging without context, instance-wide handlers, capturing python logs, and more! Learn about the improvements we've made to Dagster logging since 0.12.0.

November 8, 2021 · 2 minute read

Owen Kephart

This is the first in a series of smaller blog posts detailing some of the new features released between 0.12.0 and 0.13.0.

Introduction

Logs are critical to understanding the behavior of your data applications. Dagster's event log viewer makes it easy to navigate these logs, and to connect them to the specific computation (job, run, op, etc.) that they relate to.

Some log messages are automatically generated by the Dagster framework, recording information like when steps are started or if a run fails. Other messages are more free-form, often created directly by the writer of a particular job, and give specific insights on the state of the program at a given time. These could include the current status of an external system, the progress of a long-running task, the contents of a particular runtime object, and really any information that might be useful during or after a run.

A mixture of automatically and manually generated logs in the Dagster event log, viewed in Dagit. Each message is linked to the step that it came from.

In prior versions of Dagster, the utility of these logs was somewhat hampered by the interfaces that we exposed to interact with them. The following sections will walk through some of those rough edges, and what we’ve done to improve on them since Dagster 0.12.0.

As always, if you have any features (logging-related or otherwise) that you’re interested in seeing, or just want to get involved, let us know on Slack!

Dagster logging from anywhere

In prior releases, context.log was the only entry point for creating Dagster log messages. However, you don’t always have easy access to a context object. For example, if you’re calling functions from within your op, and want those functions to spit out helpful logging information, it can be painful to thread the context parameter through every single potentially-relevant call.

Now, you can get a Dagster-managed logger anywhere, simply by calling get_dagster_logger():

from dagster import get_dagster_logger

def my_logged_function():
    log = get_dagster_logger()
    for i in range(10):
        log.info("Did %d things!", i+1)

Messages produced this way are treated identically to messages created using context.log: they appear in the event log, linked to the run and step that produced them.

This feature is also useful when you’re converting existing code into the Dagster framework. A common Python logging pattern is to have a line like

logger = logging.getLogger(__name__)

at the top level of a file, and use this logger in all the functions or objects defined therein. If you want your logs to be captured by Dagster (and linked to the relevant computations), you can replace this line with logger = get_dagster_logger() — no need to modify function signatures to pass around context!
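As a concrete sketch of that before state (process_batch and its contents are made up for illustration), the module looks like ordinary Python logging:

```python
import logging

# Standard module-level logger pattern: every function in this file
# logs through the same named logger.
logger = logging.getLogger(__name__)

def process_batch(rows):
    """Uppercase each row, reporting progress through the module logger."""
    logger.info("Processing %d rows", len(rows))
    return [r.upper() for r in rows]
```

To route these messages into the Dagster event log, only the constructor line changes to logger = get_dagster_logger(); process_batch and every function that calls it are untouched.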

Capturing other python logs

Sometimes, using the new get_dagster_logger() functionality is not an option. If your log-producing code is impossible or inconvenient for you to edit (for example, if it's imported from an external library), you can't simply swap out the current logger with a Dagster-specific one.

For these cases, you can specify the names of specific python loggers that you’d like to capture inside your dagster.yaml file:

python_logs:
  managed_python_loggers:
    - some_external_logger
Configuration to treat logs produced by some_external_logger as if they had come from context.log

Doing so applies this setting to all runs launched from your Dagster instance. After this value is set, any log message produced by one of these loggers will be handled identically to a context.log call. Check out the docs to learn more.
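This capture is possible because stdlib loggers are registered globally by name: any handler attached to a named logger receives every record emitted through it. A minimal stdlib simulation of the idea (some_external_logger and external_task are hypothetical stand-ins for third-party code, and CaptureHandler is only a rough sketch of what a framework-attached handler does):

```python
import logging

# Hypothetical third-party code you can't edit: it logs through a
# named stdlib logger, as many libraries do.
ext_logger = logging.getLogger("some_external_logger")

def external_task():
    ext_logger.info("fetched %d records", 42)

# Attach a handler to the named logger and collect everything it
# emits -- no changes to the third-party code required.
captured = []

class CaptureHandler(logging.Handler):
    def emit(self, record):
        captured.append(record.getMessage())

ext_logger.addHandler(CaptureHandler())
ext_logger.setLevel(logging.INFO)
external_task()
# captured == ["fetched 42 records"]
```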

Standardizing context.log

Although we've made it possible to log without a context object, if you have one handy (which you often will) then calling context.log is still an easy and convenient way to create log messages.

While the available methods (context.log.info(), context.log.error(), etc.) mirror those of the standard python logging.Logger class, in past versions of Dagster they had subtle implementation differences which were sure to cause confusion if you ran into them. For example, the old implementation would only accept strings as the message parameter, meaning seemingly-natural calls like context.log.debug(123) would spit out errors. We also did not support the traditional %-based string formatting options, or allow you to set certain properties on the message object.

Now, context.log is a subclass of logging.Logger, so it behaves identically to any other python logger. On the whole, this makes Dagster logging significantly more consistent and intuitive.

# now a valid statement!
context.log.info("Dagster logging is %s", "better!")
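Because the behavior now matches logging.Logger, the same semantics can be demonstrated with a plain stdlib logger (a sketch; the "demo" logger name and StringIO sink are arbitrary):

```python
import io
import logging

logger = logging.getLogger("demo")
logger.setLevel(logging.DEBUG)
buf = io.StringIO()
logger.addHandler(logging.StreamHandler(buf))

logger.debug(123)  # non-string message: str() is applied at format time
logger.info("Dagster logging is %s", "better!")  # lazy %-formatting

print(buf.getvalue())
```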

Instance-wide log handling

Especially in production environments, it can be useful to send all of your Dagster logs to an external monitoring service such as Datadog or CloudWatch, or even to a particular local file. Often, this configuration will be the same regardless of the job that you’re running. However, it was previously difficult to set a log handling policy that applied to every single run launched from an instance — your only option was to configure every single job individually, a time-consuming and error-prone process.

Now, we offer a way to define logging handlers that apply to logs from every job run on an instance. In a method similar to the new python log capture feature, you can use your dagster.yaml file to define any number of standard logging.Handlers to process the logs from your jobs:

python_logs:
  dagster_handler_config:
    handlers:
      myHandler:
        class: logging.FileHandler
        level: INFO
        filename: 'logs.txt'
        mode: 'a'
Configuration to send all log messages produced during a run to logs.txt

This feature dramatically simplifies workflows with custom, uniform logging policies, so if this seems useful to you, you can learn more by checking out the docs.
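The handler entries in dagster_handler_config follow the same shape as the stdlib logging.config.dictConfig schema, so the behavior of the configuration above can be sketched directly with stdlib tools (the "my_job" logger name and temp-file path are illustrative):

```python
import logging
import logging.config
import os
import tempfile

# Stdlib sketch of the same FileHandler the dagster.yaml block defines;
# the handler entry below matches the dictConfig handler format.
logfile = os.path.join(tempfile.mkdtemp(), "logs.txt")
logging.config.dictConfig({
    "version": 1,
    "handlers": {
        "myHandler": {
            "class": "logging.FileHandler",
            "level": "INFO",
            "filename": logfile,
            "mode": "a",
        },
    },
    "root": {"level": "INFO", "handlers": ["myHandler"]},
})

logging.getLogger("my_job").info("run started")
with open(logfile) as f:
    print(f.read())
```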

Wrapping up

These improvements to the logging system make it easier than ever to create, view, and export important information about your jobs. We’re always happy to hear your feedback, so don’t hesitate to reach out with questions or comments!
