Author: Owen Kephart
This is the first in a series of smaller blog posts detailing some of the new features released between Dagster 0.12.0 and 0.13.0.
Introduction
Logs are critical to understanding the behavior of your data applications. Dagster's event log viewer makes it easy to navigate these logs, and to connect them to the specific computation (job, run, op, etc.) that they relate to.
Some log messages are automatically generated by the Dagster framework, recording information like when steps are started or if a run fails. Other messages are more free-form, often created directly by the writer of a particular job, and give specific insights on the state of the program at a given time. These could include the current status of an external system, the progress of a long-running task, the contents of a particular runtime object, and really any information that might be useful during or after a run.
In prior versions of Dagster, the utility of these logs was somewhat hampered by the interfaces that we exposed to interact with them. The following sections will walk through some of those rough edges, and what we’ve done to improve on them since Dagster 0.12.0.
As always, if you have any features (logging-related or otherwise) that you’re interested in seeing, or just want to get involved, let us know on Slack!
Dagster logging from anywhere
In prior releases, context.log was the only entry point for creating Dagster log messages. However, you don't always have easy access to a context object. For example, if you're calling functions from within your op and want those functions to emit helpful logging information, it can be painful to thread the context parameter through every potentially relevant call.
Now, you can get a Dagster-managed logger anywhere, simply by calling get_dagster_logger():
from dagster import get_dagster_logger

def my_logged_function():
    log = get_dagster_logger()
    for i in range(10):
        log.info("Did %d things!", i + 1)
Messages produced this way are treated identically to messages created using context.log, so helper functions no longer need a context argument just to log.
This feature is also useful when you're porting existing code to Dagster. A common Python logging pattern is to have a line like
logger = logging.getLogger(__name__)
at the top level of a file, and use this logger in all the functions or objects defined therein. If you want your logs to be captured by Dagster (and linked to the relevant computations), you can replace this line with logger = get_dagster_logger(). There is no need to modify function signatures to pass around context!
Capturing other Python logs
Sometimes, using the new get_dagster_logger() functionality is not an option. If your log-producing code is impossible or inconvenient for you to edit (for example, if it's imported from an external library), you can't simply swap out the current logger for a Dagster-specific one.
For these cases, you can list the names of the Python loggers that you'd like to capture in your dagster.yaml file:
python_logs:
  managed_python_loggers:
    - some_external_logger
Doing so applies this setting to all runs launched from your Dagster instance. After this value is set, any log message produced by one of these loggers will be handled identically to a context.log call. Check out the docs to learn more.
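To make "capturing a logger by name" concrete, here is a minimal sketch using only the standard library. It is not Dagster's implementation. It only illustrates the underlying logging mechanism that makes this kind of capture possible: any code can look up a logger by name and attach a handler to it, without touching the code that emits the messages. The ListHandler class and the capture and external_library_call functions are hypothetical names invented for this illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical in-memory sink standing in for a real log destination."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # getMessage() applies any %-style arguments to the message.
        self.messages.append(
            f"{record.name}:{record.levelname}:{record.getMessage()}"
        )


def external_library_call():
    # Pretend this lives in a third-party package that we cannot edit.
    logging.getLogger("some_external_logger").warning("retrying %d of %d", 2, 3)


def capture(logger_name, handler):
    """Attach a handler to a named logger so its records reach our sink."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return logger


captured = ListHandler()
capture("some_external_logger", captured)
external_library_call()
print(captured.messages)
```

The library code never learns that its messages are being collected, which is why a name in a config file is all that is needed to opt a logger in.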
Standardizing context.log
Although we've made it possible to log without a context object, if you have one handy (which you often will), calling context.log is still an easy and convenient way to create log messages.
While the available methods (context.log.info(), context.log.error(), etc.) mirror those of the standard Python logging.Logger class, in past versions of Dagster they had subtle implementation differences which were sure to cause confusion if you ran into them. For example, the old implementation would only accept strings as the message parameter, meaning seemingly natural calls like context.log.debug(123) would raise errors. We also did not support the traditional %-based string formatting options, or allow you to set certain properties on the message object.
Now, context.log is an instance of a logging.Logger subclass, so it accepts the same calls as any other Python logger. On the whole, this makes Dagster logging significantly more consistent and intuitive.
# now a valid statement!
context.log.info("Dagster logging is %s", "better!")
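For reference, the sketch below exercises the standard logging.Logger behaviors mentioned above on a plain standard-library logger: a non-string message, %-style formatting, and an extra attribute on the record. It does not use Dagster. The logger name, the ListHandler class, and the "table" attribute are hypothetical, and reading "set certain properties on the message object" as the standard extra argument is an assumption.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps every record it receives."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger("standard_logger_demo")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the demo output away from the root logger
sink = ListHandler()
log.addHandler(sink)

log.debug(123)  # non-string message: converted with str()
log.info("Dagster logging is %s", "better!")  # %-style arguments
log.warning("row count low", extra={"table": "users"})  # custom attribute

rendered = [record.getMessage() for record in sink.records]
print(rendered)
print(sink.records[2].table)
```

Because these behaviors come from logging.Logger itself, a logger that subclasses it inherits them rather than having to reimplement each one.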
Instance-wide log handling
Especially in production environments, it can be useful to send all of your Dagster logs to an external monitoring service such as Datadog or CloudWatch, or even to a particular local file. Often, this configuration will be the same regardless of the job that you're running. However, it was previously difficult to set a log handling policy that applied to every run launched from an instance. Your only option was to configure each job individually, a time-consuming and error-prone process.
Now, we offer a way to define logging handlers that apply to logs from every job run on an instance. Much like the new Python log capture feature, you can use your dagster.yaml file to define any number of standard logging.Handler objects to process the logs from your jobs:
python_logs:
  dagster_handler_config:
    handlers:
      myHandler:
        class: logging.FileHandler
        level: INFO
        filename: 'logs.txt'
        mode: 'a'
This feature dramatically simplifies pipelines with custom, uniform logging policies, so if this seems useful to you, you can learn more by checking out the docs.
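The handler entry in the YAML above has the same shape as a handler definition in the standard library's logging.config.dictConfig schema. As a rough illustration of what such an entry means in plain Python (not of how Dagster loads it), the sketch below builds the same FileHandler with dictConfig and shows the effect of level: INFO. The "handler_demo" logger and the temporary file path are assumptions made so that the example is self-contained.

```python
import logging
import logging.config
import os
import tempfile

# Write to a throwaway directory instead of 'logs.txt' in the working directory.
log_path = os.path.join(tempfile.mkdtemp(), "logs.txt")

config = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Same keys as the myHandler entry in the YAML above.
        "myHandler": {
            "class": "logging.FileHandler",
            "level": "INFO",
            "filename": log_path,
            "mode": "a",
        }
    },
    "loggers": {
        # Hypothetical logger used only to drive the handler in this demo.
        "handler_demo": {
            "handlers": ["myHandler"],
            "level": "DEBUG",
            "propagate": False,
        }
    },
}
logging.config.dictConfig(config)

demo = logging.getLogger("handler_demo")
demo.debug("dropped: below the handler's INFO level")
demo.info("kept: written to the file")

logging.shutdown()  # flush and close the file handler before reading
with open(log_path) as f:
    contents = f.read()
print(contents)
```

Only the INFO message reaches the file, since the handler's own level filters records even when the logger itself lets DEBUG through.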
Wrapping up
These improvements to the logging system make it easier than ever to create, view, and export important information about your jobs.
We're always happy to hear your feedback, so please reach out to us! If you have any questions, ask them in the Dagster community Slack (join here!) or start a GitHub discussion. If you run into any bugs, let us know with a GitHub issue. And if you're interested in working with us, check out our open roles!