Changelog#

1.6.9 (core) / 0.22.8 (libraries)#

New#

[ui] When viewing logs for a run, the date for a single log row is now shown in the tooltip on the timestamp. This helps when viewing a run that takes place over more than one date.
Added suggestions to the error message when selecting asset keys that do not exist as an upstream asset or in an AssetSelection.
Improved error messages when trying to materialize a subset of a multi-asset which cannot be subset.
[dagster-snowflake] dagster-snowflake now requires snowflake-connector-python>=3.4.0
[embedded-elt] @sling_assets accepts an optional name parameter for the underlying op
[dagster-openai] dagster-openai library is now available.
[dagster-dbt] Added a new setting on DagsterDbtTranslatorSettings called enable_duplicate_source_asset_keys that allows users to set duplicate asset keys for their dbt sources. Thanks @hello-world-bfree!
Log messages in the Dagster daemon for unloadable sensors and schedules have been removed.
[ui] Search now uses a cache that persists across pageloads which should greatly improve search performance for very large orgs.
[ui] groups/code locations in the asset graph’s sidebar are now sorted alphabetically.

Bugfixes#

Fixed issue where the input/output schemas of configurable IOManagers could be ignored when providing explicit input / output run config.
Fixed an issue where enum values could not properly have a default value set in a ConfigurableResource.
Fixed an issue where graph-backed assets would sometimes lose user-provided descriptions due to a bug in internal copying.
[auto-materialize] Fixed an issue introduced in 1.6.7 where updates to ExternalAssets would be ignored when using AutoMaterializePolicies which depended on parent updates.
[asset checks] Fixed a bug with asset checks in step launchers.
[embedded-elt] Fix a bug when creating a SlingConnectionResource where a blank keyword argument would be emitted as an environment variable
[dagster-dbt] Fixed a bug where emitting events from dbt source freshness would cause an error.
[ui] Fixed a bug where using the “Terminate all runs” button with filters selected would not apply the filters to the action.
[ui] Fixed an issue where typing a search query into the search box before the search data was fetched would yield “No results” even after the data was fetched.

Community Contributions#

[docs] fixed typo in embedded-elt.mdx (thanks @cameronmartin)!
[dagster-databricks] log the url for the run of a databricks job (thanks @smats0n)!
Fix missing partition property (thanks christeefy)!
Add op_tags to @observable_source_asset decorator (thanks @maxfirman)!
[docs] typo in MultiPartitionMapping docs (thanks @dschafer)
Allow github actions to checkout branch from forked repo for docs changes (ci fix) (thanks hainenber)!

Experimental#

[asset checks] UI performance of asset checks related pages has been improved.
[dagster-dbt] The class DbtArtifacts has been added for managing the behavior of rebuilding the manifest during development but expecting a pre-built one in production.

Documentation#

Added example of writing compute logs to AWS S3 when customizing agent configuration.
"Hello, Dagster" is now "Dagster Quickstart" with the option to use a Github Codespace to explore Dagster.
Improved guides and reference to better running multiple isolated agents with separate queues on ECS.

Dagster Cloud#

Microsoft Teams is now supported for alerts. Documentation
A send sample alert button now exists on both the alert policies page and in the alert policies editor to make it easier to debug and configure alerts without having to wait for an event to kick them off.

1.6.8 (core) / 0.22.8 (libraries)#

Bugfixes#

[dagster-embedded-elt] Fixed a bug in the SlingConnectionResource that raised an error when connecting to a database.

Experimental#

[asset checks] graph_multi_assets with check_specs now support subsetting.

1.6.7 (core) / 0.22.7 (libraries)#

New#

Added a new run_retries.retry_on_op_or_asset_failures setting that can be set to false to make run retries only occur when there is an unexpected failure that crashes the run, allowing run-level retries to co-exist more naturally with op or asset retries. See the docs for more information.
dagster dev now sets the environment variable DAGSTER_IS_DEV_CLI allowing subprocesses to know that they were launched in a development context.
[ui] The Asset Checks page has been updated to show more information on the page itself rather than in a dialog.

Bugfixes#

[ui] Fixed an issue where the UI disallowed creating a dynamic partition if its name contained the “|” pipe character.
AssetSpec previously dropped the metadata and code_version fields, resulting in them not being attached to the corresponding asset. This has been fixed.

Experimental#

The new @multi_observable_source_asset decorator enables defining a set of assets that can be observed together with the same function.
[dagster-embedded-elt] New Asset Decorator @sling_assets and Resource SlingConnectionResource have been added for the [dagster-embedded-elt.sling](http://dagster-embedded-elt.sling) package. Deprecated build_sling_asset, SlingSourceConnection and SlingTargetConnection.
Added support for op-concurrency aware run dequeuing for the QueuedRunCoordinator.

Documentation#

Fixed reference documentation for isolated agents in ECS.
Corrected an example in the Airbyte Cloud documentation.
Added API links to OSS Helm deployment guide.
Fixed in-line pragmas showing up in the documentation.

Dagster Cloud#

Alerts now support Microsoft Teams.
[ECS] Fixed an issue where code locations could be left undeleted.
[ECS] ECS agents now support setting multiple replicas per code server.
[Insights] You can now toggle the visibility of a row in the chart by clicking on the dot for the row in the table.
[Users] Added a new column “Licensed role” that shows the user's most permissive role.

1.6.6 (core) / 0.22.6 (libraries)#

New#

Dagster officially supports Python 3.12.
dagster-polars has been added as an integration. Thanks @danielgafni!
[dagster-dbt] @dbt_assets now supports loading projects with semantic models.
[dagster-dbt] @dbt_assets now supports loading projects with model versions.
[dagster-dbt] get_asset_key_for_model now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok!
[dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
[UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.

Bugfixes#

Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
Fixed an issue with the type annotations on the @asset decorator causing a false positive in Pyright strict mode. Thanks @tylershunt!
[ui] On the asset graph, nodes are slightly wider allowing more text to be displayed, and group names are no longer truncated.
[ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
[dagster-k8s] Fixed an issue where setting the security_context field on the k8s_job_executor didn't correctly set the security context on the launched step pods. Thanks @krgn!

Experimental#

Observable source assets can now yield ObserveResults with no data_version.
You can now include FreshnessPolicys on observable source assets. These assets will be considered “Overdue” when the latest value for the “dagster/data_time” metadata value is older than what’s allowed by the freshness policy.
[ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.

Documentation#

Updated docs to reflect newly-added support for Python 3.12.

Dagster Cloud#

[kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling kubernetes services if the agent was interrupted during the middle of being terminated.

1.6.5 (core) / 0.22.5 (libraries)#

New#

Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
[dagster-k8s] Include k8s pod debug info in run worker failure messages.
[dagster-dbt] Events emitted by DbtCliResource now include metadata from the dbt adapter response. This includes fields like rows_affected, query_id from the Snowflake adapter, or bytes_processed from the BigQuery adapter.

Bugfixes#

A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
[dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the k8s_job_executor.
[instigator-tick-logs] Fixed an issue where invoking context.log.exception in a sensor or schedule did not properly capture exception information.
[asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
[dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.

Experimental#

@observable_source_asset-decorated functions can now return an ObserveResult. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.
[auto-materialize] A new AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron class allows you to construct AutoMaterializePolicys which wait for all parents to be updated after the latest tick of a given cron schedule.
[Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.

Documentation#

Fixed an error in our asset checks docs. Thanks @vaharoni!
Fixed an error in our Dagster Pipes Kubernetes docs. Thanks @cameronmartin!
Fixed an issue on the Hello Dagster! guide that prevented it from loading.
Add specific capabilities of the Airflow integration to the Airflow integration page.
Re-arranged sections in the I/O manager concept page to make info about using I/O versus resources more prominent.

0.13.16#

New#

Added an integration with Airbyte, under the dagster-airbyte package (thanks Marcos Marx).
An op that has a config schema is no longer required to have a context argument.

Bugfixes#

Fixed an issue introduced in 0.13.13 where jobs with DynamicOutputs would fail when using the k8s_job_executor due to a label validation error when creating the step pod.
In Dagit, when searching for asset keys on the Assets page, string matches beyond a certain character threshold on deeply nested key paths were ignored. This has been fixed, and all keys in the asset path are now searchable.
In Dagit, links to Partitions views were broken in several places due to recent URL querystring changes, resulting in page crashes due to JS errors. These links have been fixed.
The “Download Debug File” menu link is fixed on the Runs page in Dagit.
In the “Launch Backfill” dialog on the Partitions page in Dagit, the range input sometimes discarded user input due to page updates. This has been fixed. Additionally, pressing the return key now commits changes to the input.
When using a mouse wheel or touchpad gestures to zoom on a DAG view for a job or graph in Dagit, the zoom behavior sometimes was applied to the entire browser instead of just the DAG. This has been fixed.
Dagit fonts now load correctly when using the --path-prefix option.
Date strings in tool tips on time-based charts no longer duplicate the meridiem indicator.

Experimental#

Software-defined assets can now be partitioned. The @asset decorator has a partitions_def argument, which accepts a PartitionsDefinition value. The asset details page in Dagit now represents which partitions are filled in.

Documentation#

Fixed the documented return type for the sync_and_poll method of the dagster-fivetran resource (thanks Marcos Marx).
Fixed a typo in the Ops concepts page (thanks Oluwashina Aladejubelo).

0.13.14#

New#

When you produce a PartitionedConfig object using a decorator like daily_partitioned_config or static_partitioned_config, you can now directly invoke that object to invoke the decorated function.
The end_offset argument to PartitionedConfig can now be negative. This allows you to define a schedule that fills in partitions further in the past than the current partition (for example, you could define a daily schedule that fills in the partition from two days ago by setting end_offset to -1.
The runConfigData argument to the launchRun GraphQL mutation can now be either a JSON-serialized string or a JSON object , instead of being required to be passed in as a JSON object. This makes it easier to use the mutation in typed languages where passing in unserialized JSON objects as arguments can be cumbersome.
Dagster now always uses the local working directory when resolving local imports in job code, in all workspaces. In the case where you want to use a different base folder to resolve local imports in your code, the working_directory argument can now always be specified (before, it was only available when using the python_file key in your workspace). See the Workspace docs (https://docs.dagster.io/concepts/code-locations/workspace-files#loading-relative-imports) for more information.

Bugfixes#

In Dagit, when viewing an in-progress run, the logic used to render the “Terminate” button was backward: it would appear for a completed run, but not for an in-progress run. This bug was introduced in 0.13.13, and is now fixed.
Previously, errors in the instance’s configured compute log manager would cause runs to fail. Now, these errors are logged but do not affect job execution.
The full set of DynamicOutputs returned by a op are no longer retained in memory if there is no hook to receive the values. This allows for DynamicOutput to be used for breaking up a large data set that can not fit in memory.

Breaking Changes#

When running your own gRPC server to serve Dagster code, jobs that launch in a container using code from that server will now default to using dagster as the entry point. Previously, the jobs would run using PYTHON_EXECUTABLE -m dagster, where PYTHON_EXECUTABLE was the value of sys.executable on the gRPC server. For the vast majority of Dagster jobs, these entry points will be equivalent. To keep the old behavior (for example, if you have multiple Python virtualenvs in your image and want to ensure that runs also launch in a certain virtualenv), you can launch the gRPC server using the new ----use-python-environment-entry-point command-line arg.

Community Contributions#

[bugfix] Fixed an issue where log levels on handlers defined in dagster.yaml would be ignored (thanks @lambdaTW!)

Documentation#

Typo fix in the jobs page (thanks kmiku7 (https://github.com/kmiku7))!
Added docs on how to modify k8s job TTL

UI#

When re-launching a run, the log/step filters are now preserved in the new run’s page.
Step execution times/recent runs now appear in the job/graph sidebar.

0.13.13#

New#

[dagster-dbt] dbt rpc resources now surface dbt log messages in the Dagster event log.
[dagster-databricks] The databricks_pyspark_step_launcher now streams Dagster logs back from Databricks rather than waiting for the step to completely finish before exporting all events. Fixed an issue where all events from the external step would share the same timestamp. Immediately after execution, stdout and stderr logs captured from the Databricks worker will be automatically surfaced to the event log, removing the need to set the wait_for_logs option in most scenarios.
[dagster-databricks] The databricks_pyspark_step_launcher now supports dynamically mapped steps.
If the scheduler is unable to reach a code server when executing a schedule tick, it will now wait until the code server is reachable again before continuing, instead of marking the schedule tick as failed.
The scheduler will now check every 5 seconds for new schedules to run, instead of every 30 seconds.
The run viewer and workspace pages of Dagit are significantly more performant.
Dagit loads large (100+ node) asset graphs faster and retrieves information about the assets being rendered only.
When viewing an asset graph in Dagit, you can now rematerialize the entire graph by clicking a single “Refresh” button, or select assets to rematerialize them individually. You can also launch a job to rebuild an asset directly from the asset details page.
When viewing a software-defined asset, Dagit displays its upstream and downstream assets in two lists instead of a mini-graph for easier scrolling and navigation. The statuses of these assets are updated in real-time. This new UI also resolves a bug where only one downstream asset would appear.

Bugfixes#

Fixed bug where execute_in_process would not work for graphs with nothing inputs.
In the Launchpad in Dagit, the Ctrl+A command did not correctly allow select-all behavior in the editor for non-Mac users, this has now been fixed.
When viewing a DAG in Dagit and hovering on a specific input or output for an op, the connections between the highlighted inputs and outputs were too subtle to see. These are now a bright blue color.
In Dagit, when viewing an in-progress run, a caching bug prevented the page from updating in real time in some cases. For instance, runs might appear to be stuck in a queued state long after being dequeued. This has been fixed.
Fixed a bug in the k8s_job_executor where the same step could start twice in rare cases.
Enabled faster queries for the asset catalog by migrating asset database entries to store extra materialization data.
[dagster-aws] Viewing the compute logs for in-progress ops for instances configured with the S3ComputeLogManager would cause errors in Dagit. This is now fixed.
[dagster-pandas] Fixed bug where Pandas categorical dtype did not work by default with dagster-pandas categorical_column constraint.
Fixed an issue where schedules that yielded a SkipReason from the schedule function did not display the skip reason in the tick timeline in Dagit, or output the skip message in the dagster-daemon log output.
Fixed an issue where the snapshot link of a finished run in Dagit would sometimes fail to load with a GraphQL error.
Dagit now supports software-defined assets that are defined in multiple jobs within a repo, and displays a warning when assets in two repos share the same name.

Breaking Changes#

We previously allowed schedules to be defined with cron strings like @daily rather than 0 0 * * *. However, these schedules would fail to actually run successfully in the daemon and would also cause errors when viewing certain pages in Dagit. We now raise an DagsterInvalidDefinitionError for schedules that do not have a cron expression consisting of a 5 space-separated fields.

Community Contributions#

In dagster-dask, a schema can now be conditionally specified for ops materializing outputs to parquet files, thank you @kudryk!
Dagster-gcp change from @AndreaGiardini that replaces get_bucket() calls with bucket(), to avoid unnecessary bucket metadata fetches, thanks!
Typo fix from @sebastianbertoli, thank you!
[dagster-k8s] Kubernetes jobs and pods created by Dagster now have labels identifying the name of the Dagster job or op they are running. Thanks @skirino!

Experimental#

[dagit] Made performance improvements for loading the asset graph.
[dagit] The debug console logging output now tracks calls to fetch data from the database, to help track inefficient queries.

0.13.12#

New#

The dagit and dagster-daemon processes now use a structured Python logger for command-line output.
Dagster command-line logs now include the system timezone in the logging timestamp.
When running your own Dagster gRPC code server, the server process will now log a message to stdout when it starts up and when it shuts down.
[dagit] The sensor details page and sensor list page now display links to the assets tracked by @asset_sensors.
[dagit] Improved instance warning in Dagit. Previously, Dagit showed an instance warning for daemon not running when no repos have schedulers or sensors.
[dagster-celery-k8s] You can now specify volumes and volume mounts to runs using the CeleryK8sRunLauncher that will be included in all launched jobs.
[dagster-databricks] You are no longer required to specify storage configuration when using the databricks_pyspark_step_launcher.
[dagster-databricks] The databricks_pyspark_step_launcher can now be used with dynamic mapping and collect steps.
[dagster-mlflow] The end_mlflow_on_run_finished hook is now a top-level export of the dagster mlflow library. The API reference also now includes an entry for it.

Bugfixes#

Better backwards-compatibility for fetching asset keys materialized from older versions of dagster.
Fixed an issue where jobs running with op subsets required some resource configuration as part of the run config, even when they weren’t required by the selected ops.
RetryPolicy is now respected when execution is interrupted.
[dagit] Fixed "Open in Playground" link on the scheduled ticks.
[dagit] Fixed the run ID links on the Asset list view.
[dagit] When viewing an in-progress run, the run status sometimes failed to update as new logs arrived, resulting in a Gantt chart that either never updated from a “queued” state or did so only after a long delay. The run status and Gantt chart now accurately match incoming logs.

Community Contributions#

[dagster-k8s] Fixed an issue where specifying job_metadata in tags did not correctly propagate to Kubernetes jobs created by Dagster. Thanks @ibelikov!

Experimental#

[dagit] Made performance improvements for loading the asset graph.

Documentation#

The Versioning and Memoization guide has been updated to reflect a new set of core memoization APIs.
[dagster-dbt] Updated the dagster-dbt integration guide to mention the new dbt Cloud integration.
[dagster-dbt] Added documentation for the default_flags property of DbtCliResource.

0.13.11#

New#

[dagit] Made performance improvements to the Run page.
[dagit] Highlighting a specific sensor / schedule ticks is now reflected in a shareable URL.

Bugfixes#

[dagit] On the Runs page, when filtering runs with a tag containing a comma, the filter input would incorrectly break the tag apart. This has been fixed.
[dagit] For sensors that do not target a specific job (e.g. un_status_sensor, we are now hiding potentially confusing Job details
[dagit] Fixed an issue where some graph explorer views generated multiple scrollbars.
[dagit] Fixed an issue with the Run view where the Gantt view incorrectly showed in-progress steps when the run had exited.
[dagster-celery-k8s] Fixed an issue where setting a custom Celery broker URL but not a custom Celery backend URL in the helm chart would produce an incorrect Celery configuration.
[dagster-k8s] Fixed an issue where Kubernetes volumes using list or dict types could not be set in the Helm chart.

Community Contributions#

[dagster-k8s] Added the ability to set a custom location name when configuring a workspace in the Helm chart. Thanks @pcherednichenko!

Experimental#

[dagit] Asset jobs now display with spinners on assets that are currently in progress.
[dagit] Assets jobs that are in progress will now display a dot icon on all assets that are not yet running but will be re-materialized in the run.
[dagit] Fixed broken links to the asset catalog entries from the explorer view of asset jobs.
The AssetIn input object now accepts an asset key so upstream assets can be explicitly specified (e.g. AssetIn(asset_key=AssetKey("asset1")))
The @asset decorator now has an optional non_argument_deps parameter that accepts AssetKeys of assets that do not pass data but are upstream dependencies.
ForeignAsset objects now have an optional description attribute.

Documentation#

“Validating Data with Dagster Type Factories” guide added.

You are viewing an unreleased or outdated version of the documentation

Changelog#

1.6.9 (core) / 0.22.8 (libraries)#

New#

Bugfixes#

Community Contributions#

Experimental#

Documentation#

Dagster Cloud#

1.6.8 (core) / 0.22.8 (libraries)#

Bugfixes#

Experimental#

1.6.7 (core) / 0.22.7 (libraries)#

New#

Bugfixes#

Experimental#

Documentation#

Dagster Cloud#

1.6.6 (core) / 0.22.6 (libraries)#

New#

Bugfixes#

Experimental#

Documentation#

Dagster Cloud#

1.6.5 (core) / 0.22.5 (libraries)#

New#

Bugfixes#

Experimental#

Documentation#

0.13.16#

New#

Bugfixes#

Experimental#

Documentation#

0.13.14#

New#

Bugfixes#

Breaking Changes#

Community Contributions#

Documentation#

UI#

0.13.13#

New#

Bugfixes#

Breaking Changes#

Community Contributions#

Experimental#

0.13.12#

New#

Bugfixes#

Community Contributions#

Experimental#

Documentation#

0.13.11#

New#

Bugfixes#

Community Contributions#

Experimental#

Documentation#