>_ Analyst Engineering

Airflow vs Dagster: Orchestrating Tasks vs Orchestrating Assets

Written by Ahmed at Analyst Engineering, a Senior Technical Business Analyst with 10+ years in banking and payments delivery.


Key takeaways

  • Airflow and Dagster both decide what runs when in a data platform. Airflow models the work as tasks in scheduled DAGs; Dagster models the outcome as software-defined assets, the tables and files the tasks produce.
  • The task view answers “did the job run?”; the asset view answers “is the data fresh and correct?”, which is the question the business actually asks.
  • Airflow, created at Airbnb in 2014 and now an Apache project, is the industry default with the largest ecosystem of providers and operators. Dagster is the strongest rethink, with lineage, freshness, and data-aware scheduling built into the asset model.
  • For an analyst, the practical difference is diagnosability: an asset graph shows which tables are stale and why, while a task view makes you translate from failed jobs to affected data yourself.

Airflow and Dagster both answer the same operational question: what runs, when, in what order, and what happens on failure. The philosophical split is in what they model. Airflow orchestrates the tasks; Dagster orchestrates the data assets the tasks produce, and that changes which questions the tool can answer for you.

Airflow and Dagster are orchestrators: the schedulers that run a data platform’s work in the right order, retry what fails, and show you what happened. Apache Airflow, created at Airbnb in 2014 to manage its pipeline sprawl and open-sourced a year later, models that work as DAGs of tasks: run the extract, then the load, then trigger dbt, each a unit of work with a schedule. Dagster, the strongest of the newer rethinks, models the same platform as software-defined assets: declarations that fct_payments exists, how it is computed, and what it depends on, with execution derived from the asset graph. The distinction sounds subtle and is not. A task view answers “did last night’s job run?” An asset view answers “is the settlement table fresh, and if not, which upstream failure made it stale?”, which is the question the business was actually asking when they paged you.
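To make “execution derived from the asset graph” concrete, here is a minimal plain-Python sketch. It is not Dagster's API. It only assumes that each asset declares what it depends on, and it uses the standard library to derive a run order from those declarations. Apart from fct_payments, every asset name is hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical asset declarations: each asset lists the upstream assets
# it is computed from. Only fct_payments appears in the article; the
# other names are illustrative assumptions.
ASSET_DEPS = {
    "raw_payments": set(),
    "raw_fx_rates": set(),
    "stg_payments": {"raw_payments"},
    "fct_payments": {"stg_payments", "raw_fx_rates"},
}


def execution_order(asset_deps):
    """Derive a valid run order (upstream assets first) from the asset graph."""
    return list(TopologicalSorter(asset_deps).static_order())


order = execution_order(ASSET_DEPS)
print(order)
```

Nobody wrote “run the extract, then the load” here. The order falls out of the dependency declarations, which is the inversion the asset model makes.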

Airflow vs Dagster at a glance

| Dimension | Airflow | Dagster |
| --- | --- | --- |
| Core abstraction | Tasks in scheduled DAGs | Software-defined assets |
| The graph shows | Jobs and their order | Tables and files and their dependencies |
| Primary question answered | Did the job run and succeed? | Is each data asset materialized and fresh? |
| Scheduling | Time-based schedules per DAG | Schedules plus freshness and data-aware triggers |
| dbt integration | Runs dbt as tasks (e.g. via Cosmos) | Maps each dbt model to an asset natively |
| Lineage | Task lineage; data lineage needs other tools | Asset lineage built in |
| Origin and governance | Created at Airbnb 2014, Apache project | Dagster Labs, open core |
| Ecosystem | The largest: providers, operators, managed services | Smaller, growing, analytics-focused |

What does the asset model actually change?

It changes what is observable, and observability is most of what an orchestrator is for. In an asset world, the platform’s state is a graph of data objects, each with a last-materialized time, a freshness policy, and upstream and downstream edges. When ingestion fails at 3am, the asset graph shows the blast radius directly: these six tables are stale, these dashboards depend on them, everything else is fine. That is the walk-the-lineage investigation with the walking done for you, and it is why Dagster’s dbt integration feels natural: each dbt model becomes an asset, so transformation lineage and orchestration lineage are one picture instead of two tools’ worth.
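The blast-radius idea can be sketched in a few lines of plain Python. This is an illustration of the concept, not how Dagster implements it. The graph, the asset names, and the simple “older than the allowed lag” freshness rule are all assumptions made for the example.

```python
from collections import deque
from datetime import datetime, timedelta

# Hypothetical asset graph: asset -> assets that read from it.
DOWNSTREAM = {
    "raw_payments": ["stg_payments"],
    "stg_payments": ["fct_payments"],
    "fct_payments": ["settlement_dashboard"],
    "raw_customers": ["dim_customers"],
    "dim_customers": [],
    "settlement_dashboard": [],
}


def blast_radius(failed_asset, downstream):
    """Return every asset downstream of the failed one, breadth-first."""
    seen = {failed_asset}
    queue = deque([failed_asset])
    affected = []
    while queue:
        current = queue.popleft()
        for child in downstream.get(current, []):
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected


def is_stale(last_materialized, max_lag, now):
    """Assumed freshness policy: stale once older than the allowed lag."""
    return now - last_materialized > max_lag


affected = blast_radius("raw_payments", DOWNSTREAM)
print(affected)  # ['stg_payments', 'fct_payments', 'settlement_dashboard']
```

The customer tables never appear in the result, which is the “everything else is fine” half of the answer. That half is as useful at 3am as the list of what broke.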

In a task world, the same 3am failure shows you a red task in a DAG, and translating from “task extract_payments failed” to “which tables are stale and who is affected” is work you do in your head, with tribal knowledge about what each task feeds. Airflow’s model is not wrong: plenty of orchestrated work genuinely is task-shaped, such as sending a file to the regulator, rotating credentials, or calling an API. But for analytics platforms specifically, where nearly everything exists to keep tables fresh, the asset is the truer unit, and modeling tasks means maintaining the task-to-data mapping yourself.
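Written down, the task-to-data mapping might look like the sketch below. It is plain Python, not Airflow code. Apart from extract_payments, the task names, the table names, and the mapping itself are hypothetical, and the sketch assumes downstream tasks are skipped when an upstream task fails.

```python
# Hypothetical, hand-maintained mapping from each task to the tables it writes.
TASK_WRITES = {
    "extract_payments": ["raw_payments"],
    "load_payments": ["stg_payments"],
    "run_dbt": ["fct_payments", "dim_customers"],
}

# Task order within the DAG: task -> tasks that run after it.
TASK_DOWNSTREAM = {
    "extract_payments": ["load_payments"],
    "load_payments": ["run_dbt"],
    "run_dbt": [],
}


def tables_at_risk(failed_task):
    """Translate a failed task into the tables that were not refreshed:
    its own outputs plus the outputs of every task that runs after it."""
    stack = [failed_task]
    visited = set()
    tables = []
    while stack:
        task = stack.pop()
        if task in visited:
            continue
        visited.add(task)
        tables.extend(TASK_WRITES.get(task, []))
        stack.extend(TASK_DOWNSTREAM.get(task, []))
    return tables


stale_tables = tables_at_risk("extract_payments")
print(stale_tables)  # ['raw_payments', 'stg_payments', 'fct_payments', 'dim_customers']
```

Two things stand out. TASK_WRITES lives outside the orchestrator, so it drifts whenever someone adds a table without updating it. And because run_dbt is one coarse task, dim_customers is reported as at risk even though, in this made-up example, it has no data dependency on payments.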

Why does Airflow remain the default anyway?

For the same reasons defaults usually hold: ecosystem, operational track record, and people. A decade of production use means there is a provider package for nearly every system you are likely to touch, a managed offering on the major clouds and beyond (Google Cloud Composer, Amazon MWAA, Astronomer), an answer on the internet for most failure modes, and engineers who already know it in every hiring pool. Those are not tie-breakers; for many organizations they are the decision. An existing Airflow estate that works is rarely worth migrating on architectural preference alone: the regression-risk calculus is the same as for any working system.

The honest guidance runs on two axes. Greenfield analytics platform, dbt-centric, team that cares about data-aware orchestration: Dagster starts ahead, because its model matches the job. Existing estate, heterogeneous task-shaped workloads, or a team hired for the default: Airflow’s gravity wins, and its newer releases keep narrowing the gaps, with data-aware scheduling arriving as Datasets in Airflow 2.4 and renamed Assets in Airflow 3. Either way, evaluate like an analyst rather than a fan: pick one real pipeline, including its failure and backfill scenarios, and run it through both, because orchestration tools reveal their character in the failure paths, not the happy-path demo.

The takeaway

Airflow orchestrates tasks: the most mature, most deployed scheduler in data, with an ecosystem that answers every question except “which tables did that failure make stale?” Dagster orchestrates assets: the data objects themselves, with lineage, freshness, and dbt-native integration built into the model, at the cost of a smaller ecosystem. Choose the asset model when your platform exists to keep tables fresh, choose the incumbent when gravity and task-shaped work dominate, and test both on your failure scenarios rather than their demos.

About the author

Analyst Engineering is written by Ahmed, a Senior Technical Business Analyst with 10+ years of banking and payments delivery experience: ISO 20022 and SWIFT messaging, payments API integration, Kafka event validation, and production support. Every article comes from real delivery work, and each one is reviewed and updated as tools and standards change.
