Top Apache Airflow (2024) Interview Questions | JavaInUse

Top Apache Airflow frequently asked interview questions.

In this post we will look at Apache Airflow Interview questions. Examples are provided with explanations.

  1. What is Apache Airflow?
  2. What are the features of Apache Airflow?
  3. How does Apache Airflow act as a solution?
  4. What are the basic concepts in Apache Airflow?
  5. What are some Airflow dependencies?
  6. What are some of the integrations in Airflow?
  7. What is the command line in Airflow?
  8. How do we create a new DAG?
  9. How can we restart the Airflow webserver?
  10. How can we run a bash script file in Airflow?
  11. How can we add logs to Airflow Logs?

What is Apache Airflow?

Apache Airflow is an open-source platform for authoring, scheduling, and monitoring data pipelines. It began at Airbnb in 2014, and it models each pipeline as a DAG (Directed Acyclic Graph).

What are the features of Apache Airflow?

Features of Apache Airflow are:
  • Airflow lets us schedule jobs and view their historical status.
  • Airflow lets us view Directed Acyclic Graphs and the dependencies between their tasks.
  • Airflow supports managing executions through the Web UI, including CRUD operations on DAGs.

How does Apache Airflow act as a solution?

Airflow solves problems like:
  • Failures - Airflow retries a task when it fails.
  • Monitoring - Airflow shows whether each run succeeded or failed.
  • Dependency - There are 2 types of dependencies:
      • Data Dependencies - a task waits until its upstream data is available.
      • Execution Dependencies - a task runs only after the tasks it depends on have finished, for example when deploying new changes.
  • Scalability - a centralized scheduler coordinates the work.
  • Deployment - changes to pipelines can be deployed easily.
  • Processing Historic Data - past periods can be re-processed by backfilling.
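
As a minimal sketch of how retries are usually configured, the dictionary below uses `retries` and `retry_delay`, which are standard operator arguments in Airflow. The values are hypothetical. A dictionary like this is typically passed as `default_args` when the DAG is created, so that every task in the DAG inherits the same retry behaviour.

```python
from datetime import timedelta

# Hypothetical values; tune them for each pipeline.
default_args = {
    "retries": 3,                         # re-run a failed task up to 3 more times
    "retry_delay": timedelta(minutes=5),  # wait 5 minutes between attempts
}
```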

What are the basic concepts in Apache Airflow?


Airflow consists of 4 concepts:
  • DAG - describes the order in which work should be done.
  • Operator - a template for carrying out a unit of work.
  • Task - a parameterized instance of an operator.
  • Task Instance - a task that has been assigned to a DAG and has a state for a specific run of that DAG.
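
To make the "directed acyclic" part of a DAG concrete, here is a small sketch that uses only the `graphlib` module from Python's standard library (Python 3.9+), not Airflow itself. The task names are hypothetical. It shows that a set of dependencies yields a valid run order, and that adding a cycle makes ordering impossible.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each key maps a task to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# A valid run order places every task after its upstream tasks.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)

# Making "extract" depend on "load" creates a cycle, so no valid order exists.
cyclic = {**dependencies, "extract": {"load"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

This is why a workflow must be acyclic: a scheduler can only decide what to run next when the dependencies can be put in order.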

What are some Airflow dependencies?

Some system-level dependencies of Airflow (Debian/Ubuntu package names) are as follows:
  • freetds-bin
  • krb5-user
  • ldap-utils
  • libffi6
  • libsasl2-2
  • libsasl2-modules
  • locales
  • lsb-release
  • sasl2-bin
  • sqlite3




What are some of the integrations in Airflow?

Some of the Integrations in Airflow are as follows:
  • Apache Pig
  • Kubernetes
  • AWS Glue
  • Azure Data Lake
  • Hadoop
  • Amazon S3
  • Amazon EMR

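What is the command line in Airflow?

Apache Airflow can be driven from the command line. There are some major commands all users need to know. The names below are the Airflow 1.x forms, with the Airflow 2.x equivalent in parentheses where it differs:
airflow run - used for running a task (airflow tasks run).
airflow test - used for testing and debugging a task (airflow tasks test).
airflow backfill - used for running a part of a DAG over a date range (airflow dags backfill).
airflow webserver - used for starting the GUI.
airflow show_dag - used for showing tasks and their dependencies (airflow dags show).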

How do we create a new DAG?

We can create a new DAG by 2 methods:


How can we restart the Airflow webserver?

We can restart the Airflow webserver by stopping the running webserver process and then starting it again as a background (daemon) process with the following command:
airflow webserver -p 8080 -D


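How can we run a bash script file in Airflow?

We can run a bash script file by using the BashOperator:
from airflow.operators.bash import BashOperator  # Airflow 2.x path; 1.10 uses airflow.operators.bash_operator

# `dag` is assumed to be a DAG object defined earlier in the same file.
# Note: a relative path is resolved against the task's working directory,
# so an absolute path is usually safer.
create_command = """
 ./scripts/create_file.sh
"""
t1 = BashOperator(
        task_id='create_file',
        bash_command=create_command,
        dag=dag
)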


How can we add logs to Airflow Logs?

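We can add logs by using the logging module inside a task's callable, for example:
import logging

from airflow.operators.python import PythonOperator  # Airflow 2.x path; 1.10 uses airflow.operators.python_operator

# `dag` is assumed to be a DAG object defined earlier in the same file.

def print_params_fn(**kwargs):
    # Messages logged here are written to the task's log.
    logging.info(kwargs)
    return None

# On Airflow 1.10, also pass provide_context=True so the context reaches kwargs;
# Airflow 2.x passes it automatically.
print_params = PythonOperator(task_id="print_params",
                              python_callable=print_params_fn,
                              dag=dag)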

