
During one of our projects, we helped a customer optimize their customer experience processes by introducing machine learning models that classify incoming customer support emails and a chatbot that provides support 24/7. Since these ML models were the customer's first production-grade AI applications, there was a need not just for the models but for machine learning infrastructure too. The most important requirements for this infrastructure were maintainability, observability, and easy enrichment of models with new features, along with automated deployment and rapid model development cycles. To fulfill these needs, we introduced machine learning operations for this customer with the Airflow framework.

Machine learning operations aim to unify and automate the whole lifecycle of a machine learning model: data acquisition, feature engineering, model development, validation, deployment, monitoring, and retraining. This includes rapid, automated testing of models and the introduction of new data sources and features, together with an easy, automated way of serving, monitoring, and validating models.

Airbnb initially developed Apache Airflow to manage the company's increasingly complex data engineering workflows. The platform takes care of authoring, scheduling, monitoring, and alerting for workflows, and it manages connections to external systems such as databases, cloud services, and Hadoop. In Airflow, tasks are written in Python and organized into Directed Acyclic Graphs (DAGs), which define the relationships and dependencies between tasks. The most popular use cases for Airflow are ETL pipelines, automated reporting, automated backups, and machine learning operations. In this article, we explore the last of these. The most widely used Airflow executor is the Celery executor, which consists of three components: a message broker (usually Redis), a result backend, and the worker nodes.
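
To make the idea of tasks organized into a Directed Acyclic Graph more concrete, here is a minimal sketch that uses only the Python standard library. It is not the Airflow API: in Airflow the same structure would be declared with a DAG object and operators, with dependencies set using `>>`. The task names, the toy data, and the `run_pipeline` helper are all hypothetical and chosen only to mirror the lifecycle steps described in this article.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical tasks: each one reads from and writes to a shared context dict.
def acquire_data(ctx):
    ctx["emails"] = ["refund request", "login problem"]  # toy data


def engineer_features(ctx):
    # Toy feature: number of whitespace-separated tokens per email.
    ctx["features"] = [len(text.split()) for text in ctx["emails"]]


def train_model(ctx):
    # Placeholder "model": the mean token count.
    ctx["model"] = sum(ctx["features"]) / len(ctx["features"])


def validate_model(ctx):
    ctx["valid"] = ctx["model"] > 0


def deploy_model(ctx):
    ctx["deployed"] = ctx["valid"]


TASKS = {
    "acquire_data": acquire_data,
    "engineer_features": engineer_features,
    "train_model": train_model,
    "validate_model": validate_model,
    "deploy_model": deploy_model,
}

# Each task maps to the set of tasks that must finish before it starts.
DEPENDENCIES = {
    "acquire_data": set(),
    "engineer_features": {"acquire_data"},
    "train_model": {"engineer_features"},
    "validate_model": {"train_model"},
    "deploy_model": {"validate_model"},
}


def run_pipeline():
    """Run every task in an order that respects the dependencies."""
    ctx = {}
    # static_order() yields predecessors first and raises CycleError on a cycle.
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        TASKS[name](ctx)
    return order, ctx


if __name__ == "__main__":
    order, ctx = run_pipeline()
    print(order)
    print(ctx)
```

Keeping the dependency graph separate from the task functions means a new step, such as an extra feature source, can be added by writing one function and one dependency entry. This is the same separation a workflow scheduler relies on.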
