Changelog#

1.6.9 (core) / 0.22.9 (libraries)#

New#

  • [ui] When viewing logs for a run, the date for a single log row is now shown in the tooltip on the timestamp. This helps when viewing a run that takes place over more than one date.
  • Added suggestions to the error message when selecting asset keys that do not exist as an upstream asset or in an AssetSelection.
  • Improved error messages when trying to materialize a subset of a multi-asset which cannot be subset.
  • [dagster-snowflake] dagster-snowflake now requires snowflake-connector-python>=3.4.0
  • [embedded-elt] @sling_assets accepts an optional name parameter for the underlying op
  • [dagster-openai] dagster-openai library is now available.
  • [dagster-dbt] Added a new setting on DagsterDbtTranslatorSettings called enable_duplicate_source_asset_keys that allows users to set duplicate asset keys for their dbt sources. Thanks @hello-world-bfree!
  • Log messages in the Dagster daemon for unloadable sensors and schedules have been removed.
  • [ui] Search now uses a cache that persists across pageloads which should greatly improve search performance for very large orgs.
  • [ui] Groups/code locations in the asset graph’s sidebar are now sorted alphabetically.

Bugfixes#

  • Fixed issue where the input/output schemas of configurable IOManagers could be ignored when providing explicit input / output run config.
  • Fixed an issue where enum values could not properly have a default value set in a ConfigurableResource.
  • Fixed an issue where graph-backed assets would sometimes lose user-provided descriptions due to a bug in internal copying.
  • [auto-materialize] Fixed an issue introduced in 1.6.7 where updates to ExternalAssets would be ignored when using AutoMaterializePolicies which depended on parent updates.
  • [asset checks] Fixed a bug with asset checks in step launchers.
  • [embedded-elt] Fixed a bug where creating a SlingConnectionResource would emit a blank keyword argument as an environment variable.
  • [dagster-dbt] Fixed a bug where emitting events from dbt source freshness would cause an error.
  • [ui] Fixed a bug where using the “Terminate all runs” button with filters selected would not apply the filters to the action.
  • [ui] Fixed an issue where typing a search query into the search box before the search data was fetched would yield “No results” even after the data was fetched.

Community Contributions#

  • [docs] fixed typo in embedded-elt.mdx (thanks @cameronmartin)!
  • [dagster-databricks] log the url for the run of a databricks job (thanks @smats0n)!
  • Fix missing partition property (thanks christeefy)!
  • Add op_tags to @observable_source_asset decorator (thanks @maxfirman)!
  • [docs] typo in MultiPartitionMapping docs (thanks @dschafer)
  • Allow github actions to checkout branch from forked repo for docs changes (ci fix) (thanks hainenber)!

Experimental#

  • [asset checks] UI performance of asset checks related pages has been improved.
  • [dagster-dbt] The class DbtArtifacts has been added for managing the behavior of rebuilding the manifest during development but expecting a pre-built one in production.

Documentation#

  • Added example of writing compute logs to AWS S3 when customizing agent configuration.
  • "Hello, Dagster" is now "Dagster Quickstart" with the option to use a GitHub Codespace to explore Dagster.
  • Improved guides and reference to better cover running multiple isolated agents with separate queues on ECS.

Dagster Cloud#

  • Microsoft Teams is now supported for alerts. See the documentation.
  • A send sample alert button now exists on both the alert policies page and in the alert policies editor to make it easier to debug and configure alerts without having to wait for an event to kick them off.

1.6.8 (core) / 0.22.8 (libraries)#

Bugfixes#

  • [dagster-embedded-elt] Fixed a bug in the SlingConnectionResource that raised an error when connecting to a database.

Experimental#

  • [asset checks] graph_multi_assets with check_specs now support subsetting.

1.6.7 (core) / 0.22.7 (libraries)#

New#

  • Added a new run_retries.retry_on_op_or_asset_failures setting that can be set to false to make run retries only occur when there is an unexpected failure that crashes the run, allowing run-level retries to co-exist more naturally with op or asset retries. See the docs for more information.
  • dagster dev now sets the environment variable DAGSTER_IS_DEV_CLI allowing subprocesses to know that they were launched in a development context.
  • [ui] The Asset Checks page has been updated to show more information on the page itself rather than in a dialog.
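A subprocess could detect the development context by checking for the DAGSTER_IS_DEV_CLI variable mentioned above. The following is a minimal sketch: the changelog does not say what value the variable holds, so only its presence is checked, and the helper name launched_from_dagster_dev is hypothetical.

```python
import os
from typing import Mapping, Optional


def launched_from_dagster_dev(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if DAGSTER_IS_DEV_CLI is present in the environment.

    Only presence is checked; the variable's value is not assumed.
    """
    env = os.environ if environ is None else environ
    return "DAGSTER_IS_DEV_CLI" in env


if __name__ == "__main__":
    mode = "development" if launched_from_dagster_dev() else "non-development"
    print(f"Running in a {mode} context")
```

Passing the environment as an argument keeps the check easy to test without modifying os.environ.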

Bugfixes#

  • [ui] Fixed an issue where the UI disallowed creating a dynamic partition if its name contained the “|” pipe character.
  • AssetSpec previously dropped the metadata and code_version fields, resulting in them not being attached to the corresponding asset. This has been fixed.

Experimental#

  • The new @multi_observable_source_asset decorator enables defining a set of assets that can be observed together with the same function.
  • [dagster-embedded-elt] New Asset Decorator @sling_assets and Resource SlingConnectionResource have been added for the dagster-embedded-elt.sling package. Deprecated build_sling_asset, SlingSourceConnection and SlingTargetConnection.
  • Added support for op-concurrency aware run dequeuing for the QueuedRunCoordinator.

Documentation#

  • Fixed reference documentation for isolated agents in ECS.
  • Corrected an example in the Airbyte Cloud documentation.
  • Added API links to OSS Helm deployment guide.
  • Fixed in-line pragmas showing up in the documentation.

Dagster Cloud#

  • Alerts now support Microsoft Teams.
  • [ECS] Fixed an issue where code locations could be left undeleted.
  • [ECS] ECS agents now support setting multiple replicas per code server.
  • [Insights] You can now toggle the visibility of a row in the chart by clicking on the dot for the row in the table.
  • [Users] Added a new column “Licensed role” that shows the user's most permissive role.

1.6.6 (core) / 0.22.6 (libraries)#

New#

  • Dagster officially supports Python 3.12.
  • dagster-polars has been added as an integration. Thanks @danielgafni!
  • [dagster-dbt] @dbt_assets now supports loading projects with semantic models.
  • [dagster-dbt] @dbt_assets now supports loading projects with model versions.
  • [dagster-dbt] get_asset_key_for_model now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok!
  • [dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
  • [UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.

Bugfixes#

  • Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
  • Fixed an issue with the type annotations on the @asset decorator causing a false positive in Pyright strict mode. Thanks @tylershunt!
  • [ui] On the asset graph, nodes are slightly wider allowing more text to be displayed, and group names are no longer truncated.
  • [ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
  • [dagster-k8s] Fixed an issue where setting the security_context field on the k8s_job_executor didn't correctly set the security context on the launched step pods. Thanks @krgn!

Experimental#

  • Observable source assets can now yield ObserveResults with no data_version.
  • You can now include FreshnessPolicys on observable source assets. These assets will be considered “Overdue” when the latest value for the “dagster/data_time” metadata value is older than what’s allowed by the freshness policy.
  • [ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.
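The "Overdue" rule for observable source assets described above can be sketched as a simple age comparison. This is a simplified illustration, not Dagster's implementation: it assumes a policy expressed only as a maximum lag in minutes (the parameter name mirrors FreshnessPolicy's maximum_lag_minutes), ignores cron-based policies, and the function name is_overdue is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(data_time: datetime, now: datetime, maximum_lag_minutes: float) -> bool:
    """Return True when the observed data time is older than the allowed lag."""
    return now - data_time > timedelta(minutes=maximum_lag_minutes)


now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2024, 3, 1, 11, 30, tzinfo=timezone.utc)  # 30 minutes old
stale = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)    # 3 hours old

print(is_overdue(fresh, now, maximum_lag_minutes=60))
print(is_overdue(stale, now, maximum_lag_minutes=60))
```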

Documentation#

  • Updated docs to reflect newly-added support for Python 3.12.

Dagster Cloud#

  • [kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling Kubernetes services if the agent was interrupted while it was being terminated.

1.6.5 (core) / 0.22.5 (libraries)#

New#

  • Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
  • [dagster-k8s] Include k8s pod debug info in run worker failure messages.
  • [dagster-dbt] Events emitted by DbtCliResource now include metadata from the dbt adapter response. This includes fields like rows_affected, query_id from the Snowflake adapter, or bytes_processed from the BigQuery adapter.
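Lexicographical ordering of partition keys, as mentioned for run submission above, compares keys as strings. The sketch below uses Python's built-in string sort to show the practical consequence; it assumes Dagster's ordering behaves like ordinary string comparison, which the changelog does not spell out.

```python
# Partition keys are plain strings, so lexicographical order compares them
# character by character rather than by numeric or calendar value.
date_keys = ["2024-01-10", "2024-01-02", "2023-12-31"]
numeric_keys = ["10", "9", "2"]

# ISO-formatted date keys sort chronologically under string comparison.
ordered_dates = sorted(date_keys)

# Unpadded numeric keys do not: "10" sorts before "2" and "9".
ordered_numeric = sorted(numeric_keys)

print(ordered_dates)
print(ordered_numeric)
```

If submission order matters for numeric keys, zero-padding them (for example "02", "09", "10") makes string order match numeric order.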

Bugfixes#

  • A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
  • [dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the k8s_job_executor.
  • [instigator-tick-logs] Fixed an issue where invoking context.log.exception in a sensor or schedule did not properly capture exception information.
  • [asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
  • [dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.

Experimental#

  • @observable_source_asset-decorated functions can now return an ObserveResult. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.
  • [auto-materialize] A new AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron class allows you to construct AutoMaterializePolicys which wait for all parents to be updated after the latest tick of a given cron schedule.
  • [Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.
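The idea behind skip_on_not_all_parents_updated_since_cron can be sketched as a comparison between each parent's last update time and the latest cron tick. This is only an illustration: the tick time is passed in directly rather than derived from a cron string, the strict "after" comparison is an assumption, and should_skip is a hypothetical helper, not part of Dagster.

```python
from datetime import datetime, timezone
from typing import Mapping


def should_skip(latest_tick: datetime, parent_updates: Mapping[str, datetime]) -> bool:
    """Skip unless every parent was updated after the latest cron tick."""
    return not all(updated > latest_tick for updated in parent_updates.values())


latest_tick = datetime(2024, 3, 1, 0, 0, tzinfo=timezone.utc)

all_updated = {
    "orders": datetime(2024, 3, 1, 0, 5, tzinfo=timezone.utc),
    "customers": datetime(2024, 3, 1, 0, 10, tzinfo=timezone.utc),
}
one_behind = {
    "orders": datetime(2024, 3, 1, 0, 5, tzinfo=timezone.utc),
    "customers": datetime(2024, 2, 29, 23, 50, tzinfo=timezone.utc),
}

print(should_skip(latest_tick, all_updated))
print(should_skip(latest_tick, one_behind))
```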

Documentation#

  • Fixed an error in our asset checks docs. Thanks @vaharoni!
  • Fixed an error in our Dagster Pipes Kubernetes docs. Thanks @cameronmartin!
  • Fixed an issue on the Hello Dagster! guide that prevented it from loading.
  • Add specific capabilities of the Airflow integration to the Airflow integration page.
  • Re-arranged sections in the I/O manager concept page to make info about using I/O versus resources more prominent.

1.5.4 / 0.21.4 (libraries)#

New#

  • Added a report_asset_check REST API endpoint for runless external asset check evaluation events. This is available in cloud as well.
  • The config argument is now supported on @graph_multi_asset
  • [ui] Improved performance for global search UI, especially for deployments with very large numbers of jobs or assets.
  • [dagster-pipes] Add S3 context injector/reader.
  • [dagster-dbt] When an exception occurs while running a dbt command, error messages from the underlying dbt invocation are now properly surfaced to the Dagster exception.
  • [dagster-dbt] The path to the dbt executable is now configurable in DbtCliResource.

Bugfixes#

  • Fixed a bug introduced in 1.5.3 that caused errors when launching specific Ops in a Job.
  • Fixed a bug introduced in 1.5.0 that prevented using the AssetExecutionContext type annotation for the context parameter in @asset_check functions.
  • Fixed an issue where the Dagster scheduler would sometimes fail to retry a tick if there was an error reloading a code location in the middle of the tick.
  • [dagster-dbt] Fixed an issue where explicitly passing in profiles_dir=None into DbtCliResource would cause incorrect validation.
  • [dagster-dbt] Fixed an issue where partial parsing was not working when reusing existing target paths in subsequent dbt invocations.
  • [ui] Fixed an issue where the job partitions UI would show “0 total partitions” if the job consisted of more than 100 assets

Community Contributions#

  • [dagster-duckdb] The DuckDBResource and DuckDBIOManager accept a connection_config configuration that will be passed as config to the DuckDB connection. Thanks @xjhc!

Experimental#

  • Added events in the run log when a step is blocked by a global op concurrency limit.
  • Added a backoff for steps querying for open concurrency slots.
  • The auto-materialize logic that skips materializing when (1) a backfill is in progress or (2) parent partitions are required but nonexistent has been refactored into skip rules.
  • [ui] Added 2 new asset graph layout algorithms under user settings that are significantly faster for large graphs (1000+ assets).

Documentation#

Dagster Cloud#

  • Running multiple agents is no longer considered experimental.
  • When the agent spins up a new code server while updating a code location, it will now wait until the new code location uploads any changes to Dagster Cloud before allowing the new server to serve requests.

1.5.3 / 0.21.3 (libraries)#

New#

  • Alert policies can now be set on assets + asset checks (currently experimental). Check out the alerting docs for more information.
  • Added a new flag --live-data-poll-rate that allows configuring how often the UI polls for new asset data when viewing the asset graph, asset catalog, or overview assets page. It defaults to 2000 ms.
  • Added back the ability to materialize changed and missing assets from the global asset-graph. A dialog will open allowing you to preview and select which assets to materialize.
  • Added an experimental AMP Timeline page to give more visibility into the automaterialization daemon. You can enable it under user settings
  • Added a report_asset_materialization REST API endpoint for creating external asset materialization events. This is available in cloud as well.
  • [dbt] The @dbt_assets decorator now accepts a backfill_policy argument, for controlling how the assets are backfilled.
  • [dbt] The @dbt_assets decorator now accepts an op_tags argument, for passing tags to the op underlying the produced AssetsDefinition.
  • [pipes] Added get_materialize_result & get_asset_check_result to PipesClientCompletedInvocation
  • [dagster-datahub] The acryl-datahub pin in the dagster-datahub package has been removed.
  • [dagster-databricks] The PipesDatabricksClient now performs stdout/stderr forwarding from the Databricks master node to Dagster.
  • [dagster-dbt] The hostname of the dbt API can now be configured when executing the dagster-dbt-cloud CLI.
  • [dagster-k8s] Added the ability to customize how raw k8s config tags set on an individual Dagster job are merged with raw k8s config set on the K8sRunLauncher. See the docs for more information.
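As a rough sketch of calling the report_asset_materialization REST endpoint listed above, the example below builds (but does not send) a POST request using only the standard library. The URL layout (endpoint name followed by the asset key), the local webserver address, and the JSON body shape are assumptions for illustration; check the REST API reference for the exact contract. The helper name and asset key are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_materialization_request(
    base_url: str, asset_key: str, metadata: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST request reporting a materialization."""
    # Assumed URL layout: <base>/report_asset_materialization/<asset_key>
    url = (
        f"{base_url.rstrip('/')}/report_asset_materialization/"
        f"{urllib.parse.quote(asset_key)}"
    )
    # Assumed body shape: a JSON object carrying a "metadata" mapping.
    body = json.dumps({"metadata": metadata}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_materialization_request(
    "http://localhost:3000", "my_external_asset", {"rows": 42}
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen against a running webserver.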

Bugfixes#

  • Previously, the asset backfill page would display negative counts if failed partitions were manually re-executed. This has been fixed.

  • Fixed an issue where the run list dialog for viewing the runs occupying global op concurrency slots did not expand to fit the content size.

  • Fixed an issue where selecting a partition would clear the launchpad and typing in the launchpad would clear the partition selection

  • Fixed various issues with the asset-graph displaying the wrong graph

  • The IO manager’s handle_output method is no longer invoked when observing an observable source asset.

  • [ui] Fixed an issue where the run config dialog could not be scrolled.

  • [pipes] Fixed an issue in the PipesDockerClient with parsing logs fetched via the docker client.

  • [external assets] Fixed an issue in external_assets_from_specs where providing multiple specs would error

  • [external assets] Correct copy in tooltip to explain why Materialize button is disabled on an external asset.

Breaking Changes#

  • [pipes] A change has been made to the environment variables used to detect if the external process has been launched with pipes. Update the dagster-pipes version used in the external process.
  • [pipes] The top level function is_dagster_pipes_process has been removed from the dagster-pipes package.

Community Contributions#

  • Override a method in the azure data lake IO manager (thanks @0xfabioo)!
  • Add support of external launch types in ECS run launcher (thanks @cuttius)!

Experimental#

  • The Python GraphQL client is considered stable and is no longer marked as experimental.

1.5.2 / 0.21.2 (libraries)#

Bugfixes#

  • Previously, asset backfills targeting assets with multi-run backfill policies would raise a "did not submit all run requests" error. This has been fixed.

Dagster Cloud#

  • The experimental dagster-insights package has received some API surface area updates and bugfixes.

1.5.1 / 0.21.1 (libraries)#

New#

  • Dagster now automatically infers a dependency relationship between a time-partitioned asset and a multi-partitioned asset with a time dimension. Previously, this was only inferred when the time dimension was the same in each asset.
  • The EnvVar utility will now raise an exception if it is used outside of the context of a Dagster resource or config class. The get_value() utility will retrieve the value outside of this context.
  • [ui] The runs page now displays a “terminate all” button at the top, to bulk terminate in-progress runs.
  • [ui] Asset Graph - Various performance improvements that make navigating large asset graphs smooth
  • [ui] Asset Graph - The graph now only fetches data for assets within the viewport solving timeout issues with large asset graphs
  • [ui] Asset Graph Sidebar - The sidebar now shows asset status
  • [dagster-dbt] When executing dbt invocations using DbtCliResource, an explicit target_path can now be specified.
  • [dagster-dbt] Asset checks can now be enabled by using DagsterDbtTranslator and DagsterDbtTranslatorSettings: see the docs for more information.
  • [dagster-embedded-elt] Dagster library for embedded ELT

Bugfixes#

  • [ui] Fixed various issues on the asset details page where partition names would overflow outside their containers
  • [ui] Backfill notification - Fixed an issue where the backfill link didn’t take the --path-prefix option into account
  • [ui] Fixed an issue where the instance configuration yaml would persist rendering even after navigating away from the page.
  • [ui] Fixed issues where config yaml displays could not be scrolled.
  • [dagster-webserver] Fixed a performance issue that caused the UI to load slowly

Deprecations#

  • [dagster-dbt] Enabling asset checks using dbt project metadata has been deprecated.

1.5.0 (core) / 0.21.0 (libraries) "How Will I Know"#

Major Changes since 1.4.0 (core) / 0.20.0 (libraries)#

Core#

  • Improved ergonomics for execution dependencies in assets - We introduced a set of APIs that simplify working with assets that don't use the I/O manager system for handling data between assets. I/O manager workflows will not be affected.

    • AssetDep type allows you to specify upstream dependencies with partition mappings when using the deps parameter of @asset and AssetSpec.
    • MaterializeResult can be optionally returned from an asset to report metadata about the asset when the asset handles any storage requirements within the function body and does not use an I/O manager.
    • AssetSpec has been added as a new way to declare the assets produced by @multi_asset. When using AssetSpec, the multi_asset does not need to return any values to be stored by the I/O manager. Instead, the multi_asset should handle any storage requirements in the body of the function.
  • Asset checks (experimental) - You can now define, execute, and monitor data quality checks in Dagster [docs].

    • The @asset_check decorator, as well as the check_specs argument to @asset and @multi_asset enable defining asset checks.
    • Materializing assets from the UI will default to executing their asset checks. You can also execute individual checks.
    • When viewing an asset in the asset graph or the asset details page, you can see whether its checks have passed, failed, or haven’t run successfully.
  • Auto materialize customization (experimental) - AutoMaterializePolicies can now be customized [docs].

    • All policies are composed of a set of AutoMaterializeRules which determine if an asset should be materialized or skipped.
    • To modify the default behavior, rules can be added to or removed from a policy to change the conditions under which assets will be materialized.
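The composition of a policy from materialize and skip rules can be sketched with stand-in classes. This is not Dagster's implementation: Rule and Policy here are hypothetical, and the decision logic (materialize when at least one materialize rule applies and no skip rule applies) is an assumption about how the rules combine. Only the idea of adding and removing rules comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Rule:
    """Stand-in rule: a named predicate that either triggers or blocks a run."""
    name: str
    kind: str  # "materialize" or "skip"
    applies: Callable[[dict], bool]


@dataclass(frozen=True)
class Policy:
    """Stand-in policy: an immutable set of rules."""
    rules: FrozenSet[Rule]

    def with_rules(self, *rules: Rule) -> "Policy":
        return Policy(self.rules | set(rules))

    def without_rules(self, *rules: Rule) -> "Policy":
        return Policy(self.rules - set(rules))

    def should_materialize(self, state: dict) -> bool:
        # Assumed semantics: any materialize rule fires and no skip rule fires.
        wants = any(r.applies(state) for r in self.rules if r.kind == "materialize")
        blocked = any(r.applies(state) for r in self.rules if r.kind == "skip")
        return wants and not blocked


parent_updated = Rule("parent_updated", "materialize", lambda s: s["parent_updated"])
parent_missing = Rule("parent_missing", "skip", lambda s: s["parent_missing"])

default_policy = Policy(frozenset({parent_updated, parent_missing}))
relaxed_policy = default_policy.without_rules(parent_missing)

state = {"parent_updated": True, "parent_missing": True}
print(default_policy.should_materialize(state))
print(relaxed_policy.should_materialize(state))
```

Removing the skip rule changes the outcome for the same asset state, which is the kind of customization the rule-based design is meant to allow.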

dagster-pipes#

  • Dagster pipes is a new library that implements a protocol for launching compute into external execution environments and consuming streaming logs and Dagster metadata from those environments. See https://github.com/dagster-io/dagster/discussions/16319 for more details on the motivation and vision behind Pipes.
  • Out-of-the-box integrations
    • Clients: local subprocess, Docker containers, Kubernetes, and Databricks
      • PipesSubprocessClient, PipesDockerClient, PipesK8sClient, PipesDatabricksClient
    • Transport: Unix pipes, Filesystem, s3, dbfs
    • Languages: Python
  • Dagster pipes is composable with existing launching infrastructure via open_pipes_session. One can augment existing invocations rather than replacing them wholesale.

Since 1.4.17 (core) / 0.20.17 (libraries)#

New#

  • [ui] Global Asset Graph performance improvement - the first time you load the graph it will be cached to disk and any subsequent load of the graph should load instantly.

Bugfixes#

  • Fixed a bug where deleted runs could retain instance-wide op concurrency slots.

Breaking Changes#

  • AssetExecutionContext is now a subclass of OpExecutionContext, not a type alias. The code
def my_helper_function(context: AssetExecutionContext):
    ...

@op
def my_op(context: OpExecutionContext):
    my_helper_function(context)

will cause type checking errors. To migrate, update type hints to respect the new subclassing.

  • AssetExecutionContext cannot be used as the type annotation for @ops run in @jobs. To migrate, update the type hint in @op to OpExecutionContext. @ops that are used in @graph_assets may still use the AssetExecutionContext type hint.
# old
@op
def my_op(context: AssetExecutionContext):
    ...

# correct
@op
def my_op(context: OpExecutionContext):
    ...
  • [ui] We have removed the option to launch an asset backfill as a single run. To achieve this behavior, add backfill_policy=BackfillPolicy.single_run() to your assets.
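The effect of the AssetExecutionContext subclassing change described above can be illustrated without Dagster installed. The classes below are stand-ins with the same names, not the real Dagster classes, and shared_helper is a hypothetical function. The sketch shows why a helper annotated with the base class accepts both contexts, while an op's context does not satisfy an annotation that names the subclass.

```python
class OpExecutionContext:
    """Stand-in for the base context type (not the Dagster class)."""


class AssetExecutionContext(OpExecutionContext):
    """Stand-in for the subclass (not the Dagster class)."""


def shared_helper(context: OpExecutionContext) -> str:
    # Annotating with the base class lets both ops and assets call this helper.
    return type(context).__name__


op_context = OpExecutionContext()
asset_context = AssetExecutionContext()

# Every asset context is an op context, but not the other way around,
# which is why passing an op context to a helper typed for the subclass
# is flagged by type checkers.
print(isinstance(asset_context, OpExecutionContext))
print(isinstance(op_context, AssetExecutionContext))
print(shared_helper(op_context), shared_helper(asset_context))
```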

Community Contributions#

  • has_dynamic_partition implementation has been optimized. Thanks @edvardlindelof!
  • [dagster-airbyte] Added an optional stream_to_asset_map argument to build_airbyte_assets to support the Airbyte prefix setting with special characters. Thanks @chollinger93!
  • [dagster-k8s] Moved “labels” to a lower precedence. Thanks @jrouly!
  • [dagster-k8s] Improved handling of failed jobs. Thanks @Milias!
  • [dagster-databricks] Fixed an issue where DatabricksPysparkStepLauncher fails to get logs when job_run doesn’t have cluster_id at root level. Thanks @PadenZach!
  • Docs typo fix from @sethusabarish, thank you!

Documentation#

  • Our Partitions documentation has gotten a facelift! We’ve split the original page into several smaller pages, as follows:

Dagster Cloud#

  • New dagster-insights sub-module - We have released an experimental dagster_cloud.dagster_insights module that contains utilities for capturing and submitting external metrics about data operations to Dagster Cloud via an API. Dagster Cloud Insights is a soon-to-be released feature that improves visibility into usage and cost metrics such as run duration and Snowflake credits in the Cloud UI.