In May 2023, I took the dbt Analytics Engineering Certification Exam. The exam is designed to test one’s ability to “build, test, and maintain models to make data accessible to others” and “use dbt to apply engineering principles to analytics infrastructure.” One of the questions that puzzled me at the time was related to the topic of Directed Acyclic Graph (DAG) execution within multi-thread environments.

The Scenarios

The premise of the question went along the lines of: Picture a DAG with a root node named model_alpha and two directed edges leading to two nodes, model_beta_a and model_beta_b. Each of these nodes connects to a single leaf node: model_beta_a leads to model_gamma_a, and model_beta_b leads to model_gamma_b.

Line graph of the described DAG
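For readers who prefer code to pictures, the same graph can be sketched in Python. This is only an illustration of the dependency structure, not how dbt stores its graph; the `DAG` dictionary and the `run_order` helper are names I made up for this post, and the sorting relies on the standard library's `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical representation: each key is a model,
# each value is the set of models it depends on (its parents).
DAG = {
    "model_alpha": set(),
    "model_beta_a": {"model_alpha"},
    "model_beta_b": {"model_alpha"},
    "model_gamma_a": {"model_beta_a"},
    "model_gamma_b": {"model_beta_b"},
}


def run_order(dag):
    """Return one valid order in which every model comes after its parents."""
    return list(TopologicalSorter(dag).static_order())


order = run_order(DAG)
print(order)
```

Any order it prints starts with model_alpha and places each gamma model after its beta parent; the relative order of the two branches is not fixed by the dependencies alone.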


The question posed was: In a scenario where this DAG is run using two threads, with the fail-fast argument enabled, what would happen if model_beta_a encounters an error?

To tackle this question, we’ll explore three scenarios using the hypothetical DAG, aiming to observe the differences between running it with a single thread and running it with two threads. The three scenarios are:

  1. Smooth Run: Executing the DAG without any errors.
  2. Error Run: Running the DAG where model_beta_a encounters a division-by-zero error.
  3. Fail-Fast Error Run: Executing the DAG with model_beta_a encountering a division-by-zero error and the fail-fast argument enabled.

All three scenarios were run using dbt Core 1.6.1 with the BigQuery adapter (dbt-bigquery) 1.6.4. For simplicity, all models use the table materialization.

Some Background

In the context of dbt, a thread represents a path of ordered commands. The number of threads determines the maximum number of paths dbt will concurrently process when executing a DAG. You can specify the number of threads in the profiles.yml file or directly through the dbt command using the --threads argument.

Once the order and the number of paths are determined, dbt proceeds with the compilation and execution of each node. Compilation involves assembling queries and syntax error checking, followed by execution of the compiled query on the target database.
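To build some intuition for how the thread count limits concurrency, here is a toy simulation. It is a simplified model under loud assumptions, not dbt's actual scheduler: every model is assumed to take exactly one "round", ready models are picked in alphabetical order, and the `simulate` function and `DAG` dictionary are hypothetical names of my own.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical representation: model -> set of parent models.
DAG = {
    "model_alpha": set(),
    "model_beta_a": {"model_alpha"},
    "model_beta_b": {"model_alpha"},
    "model_gamma_a": {"model_beta_a"},
    "model_gamma_b": {"model_beta_b"},
}


def simulate(dag, threads):
    """Return a list of rounds; each round lists the models run side by side.

    Assumption: every model takes one round, so at most `threads`
    models can be processed per round.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    queue = []   # models whose parents are done, waiting for a free thread
    rounds = []
    while sorter.is_active():
        # Collect models that just became runnable.
        queue.extend(sorted(sorter.get_ready()))
        # Hand out as many models as there are threads.
        batch, queue = queue[:threads], queue[threads:]
        rounds.append(batch)
        # Mark the batch as finished so their children become runnable.
        for model in batch:
            sorter.done(model)
    return rounds


for n in (1, 2):
    print(n, "thread(s):", simulate(DAG, n))
```

With one thread the toy model needs five rounds, one per model. With two threads it needs three, because the two beta models share a round and so do the two gamma models.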

Results and metadata from each dbt run are displayed in the command-line interface (CLI) and also stored in the target/run_results.json file. This file is rewritten by each run, including runs that end with errors, as the later scenarios show.
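The "target results" referred to throughout this post come from that file. A minimal sketch of how one might pull out the fields of interest is shown below. It assumes the artifact has a top-level `results` list whose entries carry `unique_id`, `status`, `thread_id`, `execution_time` and `message`; the `summarize_results` helper is a hypothetical name, and the script assumes it is run from the project root.

```python
import json
from pathlib import Path


def summarize_results(run_results):
    """Reduce a parsed run_results.json dict to one small dict per model."""
    rows = []
    for result in run_results.get("results", []):
        rows.append({
            "model": result.get("unique_id"),
            "status": result.get("status"),
            "thread": result.get("thread_id"),
            "seconds": result.get("execution_time"),
            "message": result.get("message"),
        })
    return rows


# Assumption: the script is run from the dbt project root after a dbt run.
artifact = Path("target/run_results.json")
if artifact.exists():
    for row in summarize_results(json.loads(artifact.read_text())):
        print(row)
```

Using `.get` throughout keeps the sketch from crashing if a field is missing in a different dbt version.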

Smooth Run

In this scenario, we initiate the DAG run with the command dbt run --threads 1.

CLI results from dbt run --threads 1 within the smooth run scenario


The execution proceeds as expected, with all models being ordered, compiled, and executed in succession. The CLI provides a summary of time results in seconds, while the target folder contains more detailed timing information.

Model execution order, threads and status according to results from the smooth run scenario with one thread


One observation to be made is that model compilation doesn’t begin until the preceding model’s execution is completed.

Timeline of compilation and execution for beta models from the smooth run scenario with one thread


Now, let’s compare this to the two-thread run with dbt run --threads 2.

CLI results from dbt run --threads 2 within the smooth run scenario


As expected, running the DAG with two threads results in a slightly shorter execution time compared to a single-thread run.

Model execution order, threads and status according to results from the smooth run scenario with two threads


In the two-thread case, two observations are worth mentioning. Firstly, despite running on separate threads with identical queries, the two beta models start and end at different times.

Timeline of compilation and execution for beta models from the smooth run scenario with two threads


Secondly, unlike the single-thread case, compilation and execution in the two-thread case occur concurrently for the beta models.
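The concurrency claim can be checked from the timing entries in the target results. The sketch below assumes each result has a `timing` list with `started_at` and `completed_at` ISO timestamps ending in `Z`; the helper names and the two sample records are hypothetical, and the timestamps are made up for illustration rather than taken from my runs.

```python
from datetime import datetime


def parse(timestamp):
    """Parse an ISO timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def window(result):
    """Return (start, end) spanning all timing entries of one model result."""
    entries = result["timing"]
    start = min(parse(entry["started_at"]) for entry in entries)
    end = max(parse(entry["completed_at"]) for entry in entries)
    return start, end


def overlaps(result_a, result_b):
    """True if the two models were being processed at the same time."""
    a_start, a_end = window(result_a)
    b_start, b_end = window(result_b)
    return a_start < b_end and b_start < a_end


# Made-up sample records, shaped like entries in run_results.json.
beta_a = {"timing": [
    {"name": "compile", "started_at": "2023-09-05T10:00:00.000000Z",
     "completed_at": "2023-09-05T10:00:01.000000Z"},
    {"name": "execute", "started_at": "2023-09-05T10:00:01.000000Z",
     "completed_at": "2023-09-05T10:00:04.000000Z"},
]}
beta_b = {"timing": [
    {"name": "compile", "started_at": "2023-09-05T10:00:00.500000Z",
     "completed_at": "2023-09-05T10:00:01.500000Z"},
    {"name": "execute", "started_at": "2023-09-05T10:00:01.500000Z",
     "completed_at": "2023-09-05T10:00:05.000000Z"},
]}

print(overlaps(beta_a, beta_b))  # True for these made-up timestamps
```

In a single-thread run the same check should come back False for the beta models, since one model's window closes before the other's opens.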

Error Run

In this scenario, we intentionally introduce a division-by-zero error in model_beta_a to force an error during the DAG run. We then execute the command dbt run --threads 1.

CLI results from dbt run --threads 1 within the error run scenario


As expected, model_beta_a is compiled and executed but fails to materialize due to the error. As a result of its dependency, model_gamma_a is skipped, and materialization doesn’t occur. Consistent with single-thread execution, compilation of model_beta_b starts after model_beta_a’s execution ends.
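The skipping behaviour follows directly from the graph: anything downstream of a failed model cannot be built. A small sketch of that rule, using a hypothetical `skipped_models` helper and the same made-up `DAG` dictionary (model mapped to its parents), might look like this. It illustrates the dependency rule only and is not dbt's implementation.

```python
# Hypothetical representation: model -> set of parent models.
DAG = {
    "model_alpha": set(),
    "model_beta_a": {"model_alpha"},
    "model_beta_b": {"model_alpha"},
    "model_gamma_a": {"model_beta_a"},
    "model_gamma_b": {"model_beta_b"},
}


def skipped_models(dag, failed):
    """Return every model that depends, directly or indirectly, on `failed`."""
    skipped = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for model, parents in dag.items():
            # A model is skipped if one of its parents failed or was skipped.
            if current in parents and model not in skipped:
                skipped.add(model)
                frontier.append(model)
    return skipped


print(skipped_models(DAG, "model_beta_a"))  # {'model_gamma_a'}
```

Note that model_beta_b and model_gamma_b are untouched by the rule, which matches what the error run shows: only the failing branch is affected.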

Model execution order, threads and status according to target results from the error run scenario with one thread


Now, let’s run the same DAG with two threads with the command dbt run --threads 2.

CLI results from dbt run --threads 2 within the error run scenario


Similar to the single-thread case, model_beta_a and model_gamma_a do not materialize in the two-thread case. Also, compilation and execution of the beta models occur concurrently, just as in the smooth run.

Model execution order, threads and status according to target results from the error run scenario with two threads


Fail-Fast Error Run

We now introduce the fail-fast argument into our DAG run using the command dbt run --threads 1 --fail-fast.

Our aim is to analyze results from both the CLI and the target folder, specifically focusing on what happens to the remaining models after model_beta_a encounters an error.

CLI results from dbt run --threads 1 --fail-fast within the fail-fast error run scenario


At this point the results become interesting. In the single-thread case, there are discrepancies between the CLI and target results.

As seen in the CLI results, model_beta_a encounters an error and fails to materialize. Surprisingly, despite the single-thread setup, and the fact that models are compiled and executed successively, model_beta_b is given a green light to run.

Then, model_beta_b cannot be stopped, as stated in the CLI: “The bigquery adapter does not support query cancellation.” In the end, the model is materialized.

This story contrasts with the target results. A quick inspection indicates that model_beta_b is skipped “due to fail fast.” Naturally, since the model was skipped, there is no record of compilation and execution times.

Model execution order, threads and status according to target results from the fail-fast error run scenario with one thread


Lastly, we run dbt run --threads 2 --fail-fast.

CLI results from dbt run --threads 2 --fail-fast within the fail-fast error run scenario


Similar to the single-thread case, the CLI reports that model_beta_b is triggered and materialized. Yet, once again, the target results suggest this model was skipped.

Model execution order, threads and status according to target results from the fail-fast error run scenario with two threads


Conclusion

In summary, we see that the execution of a DAG respects dependencies. If an upstream model fails, its downstream models are skipped, regardless of the number of threads. However, the execution of a DAG does not consistently respect the fail-fast argument.

So, what is the answer to the original question? The answer is, it depends.

In theory, models running in parallel in a two-thread environment should be canceled and marked as skipped.

Yet, in practice, whether a model running in parallel still runs and materializes depends on the dbt adapter’s support for query cancellation. The BigQuery adapter, at least in the version tested here, reports that it does not support query cancellation, so models already running in parallel can still be materialized.

Understanding DAGs in dbt: Threads, Errors and Failing Fast