
With Prefect I created one task, 'collect machine details for Windows', another, 'collect machine details for Linux', and a third, 'collect software inventory'. I needed to collect disk size, RAM, software inventory, and some custom config, if present. We looked into it about a year ago. I haven't heard a lot about it lately, and I wonder if anyone has had success with it at scale. I stood up a Prefect deployment for a hackathon and found that it solved a ton of the issues with Airflow (sane deployment options, not the insane file-based polling that Airflow does). We'll stay at 2.0 until we eventually move off Airflow altogether.
#Helm chart airflow 2.0 upgrade#
We've since tried multiple times to upgrade past the 2.0 release and hit issues every time, so we are just done with it. The scalability improvements in Airflow 2 were a boon for our runtimes, since before that we would often see 5-15 minutes of scheduling overhead between tasks, but man, it was a bear of an upgrade. Upgrades have been an absolute nightmare and so disruptive. Considering we have a pretty simplified DAG structure, I wish we had gone with a simpler, more robust/scalable solution (even if just rolling our own scheduler) for our specific needs. We've hit scaling issues at the k8s level, scheduling overhead in Airflow, random race conditions deep in the Airflow code, etc. We have had SO many headaches operating Airflow over the years, and each time we invest in fixing an issue I feel more and more entrenched.

Our DAGs are all config-driven, and the configs populate a few different templates (e.g. ingestion = ingest > validate > publish > scrub PII > publish), so we really don't need all the flexibility that Airflow provides. We weren't aware of a great alternative when we started.
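To make the config-driven template idea concrete, here is a rough sketch in plain Python, not tied to Airflow or any other scheduler. Only the ingestion stage order comes from the comment above; the names `TEMPLATES`, `Pipeline`, and `build_pipeline`, and the config keys `name` and `template`, are made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical template registry: template name -> ordered stage names.
# Only the "ingestion" stage order is taken from the comment; the rest is assumed.
TEMPLATES = {
    "ingestion": ["ingest", "validate", "publish", "scrub_pii", "publish"],
}


@dataclass
class Pipeline:
    name: str
    tasks: list[str]              # unique task ids, in execution order
    edges: list[tuple[str, str]]  # (upstream, downstream) pairs


def build_pipeline(config: dict) -> Pipeline:
    """Expand one config entry into a strictly linear chain of tasks."""
    stages = TEMPLATES[config["template"]]
    # Prefix each stage with its position so repeated stages
    # (the two "publish" steps) still get unique task ids.
    tasks = [
        f"{config['name']}.{i}_{stage}"
        for i, stage in enumerate(stages, start=1)
    ]
    # Each task depends only on the one directly before it.
    edges = list(zip(tasks, tasks[1:]))
    return Pipeline(config["name"], tasks, edges)


if __name__ == "__main__":
    # Hypothetical config entries; in practice these would come from files.
    configs = [
        {"name": "orders", "template": "ingestion"},
        {"name": "customers", "template": "ingestion"},
    ]
    for cfg in configs:
        pipeline = build_pipeline(cfg)
        print(pipeline.name, "->", " > ".join(pipeline.tasks))
```

With pipelines this linear, the scheduler only has to walk a list in order, which is the point being made: very little of a general DAG engine's flexibility is actually in use.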
We've also been running Airflow for the past 2-3 years at a similar scale (~5000 DAGs, 100k+ task executions daily) for our data platform.
