IAM Roles

  • Go to the IAM Console - Roles

  • Search for the Airflow Instance role, which looks similar to AmazonMWAA-airflow-xxxx-instance-xxxx

  • Attach the following permissions to the Airflow Instance role

    • AWSGlueConsoleFullAccess
    • AmazonElasticMapReduceFullAccess (if you can’t find this policy, attach AmazonEMRFullAccessPolicy_v2 instead)
    • AmazonS3FullAccess
    • AmazonSageMakerFullAccess

      In a production environment, follow the standard security advice of granting least privilege when you create IAM policies: grant only the permissions required to perform a task. Determine what each role needs to do, then craft policies that allow it to perform only those tasks. Read more - here
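If you prefer scripting over the console, the attachment step above can be sketched with boto3. This is a minimal sketch, not the workshop's official method: it assumes you pass in a client created with boto3.client("iam"), that your credentials are allowed to call iam:AttachRolePolicy, and that you substitute the real role name for the placeholder. The helper name attach_managed_policies is made up for this example.

```python
# Assumption: AWS managed policies listed in this step live directly under this ARN prefix.
AWS_MANAGED_PREFIX = "arn:aws:iam::aws:policy/"

# The four managed policies named in the step above.
AIRFLOW_POLICIES = [
    "AWSGlueConsoleFullAccess",
    "AmazonElasticMapReduceFullAccess",  # swap for AmazonEMRFullAccessPolicy_v2 if unavailable
    "AmazonS3FullAccess",
    "AmazonSageMakerFullAccess",
]


def attach_managed_policies(iam, role_name, policy_names=AIRFLOW_POLICIES):
    """Attach each named AWS managed policy to role_name and return the ARNs used.

    `iam` is expected to behave like boto3.client("iam").
    """
    arns = [AWS_MANAGED_PREFIX + name for name in policy_names]
    for arn in arns:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    return arns


# Example call (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# attach_managed_policies(boto3.client("iam"), "AmazonMWAA-airflow-xxxx-instance-xxxx")
```

Attaching a policy that is already attached to the role is harmless, so the sketch can be re-run safely.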

Amazon IAM setup


Next, we will create a Glue service role to run the crawler and the jobs. To create an IAM role for AWS Glue:

  • Go to the IAM Console - Roles

  • Choose Create role

  • For role type, choose AWS Service, find and choose Glue, and choose Next: Permissions

  • On the Attach permissions policy page, choose

    • AWS managed policy AWSGlueServiceRole for general AWS Glue permissions
    • AWS managed policy AmazonS3FullAccess for access to Amazon S3 resources
  • Then choose Next: Tags and then Next: Review.

  • For Role name, enter AWSGlueServiceRoleDefault and choose Create role

Note down the role ARN of both roles. A role ARN looks something like arn:aws:iam::111111111111:role/AWSGlueServiceRoleDefault, where the 12-digit number is your AWS account ID.
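Choosing Glue as the trusted AWS service in the console amounts to giving the role a trust policy for the glue.amazonaws.com service principal. The sketch below shows roughly what the Glue role steps look like in boto3, under the assumption that `iam` is a client from boto3.client("iam") with permission to create roles. The function names are illustrative only.

```python
import json

# Managed policies chosen on the "Attach permissions policy" page.
GLUE_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole",
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
]


def service_trust_policy(service_principal):
    """Trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_glue_service_role(iam, role_name="AWSGlueServiceRoleDefault"):
    """Create the Glue service role, attach its policies, and return the role ARN."""
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(service_trust_policy("glue.amazonaws.com")),
    )
    for arn in GLUE_POLICY_ARNS:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    # create_role returns the new role's description, including its ARN.
    return response["Role"]["Arn"]
```

The returned string is the role ARN you are asked to note down.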


Next, we will create a SageMaker service role to be used in the ML pipeline module. To create an IAM role for Amazon SageMaker:

  • Go to the IAM Console - Roles

  • Choose Create role

  • For role type, choose AWS Service, find and choose SageMaker, and choose Next: Permissions

  • On the Attach permissions policy page, choose (if not already selected)

    • AWS managed policy AmazonSageMakerFullAccess
    • AWS managed policy AmazonS3FullAccess for access to Amazon S3 resources
  • Then choose Next: Tags and then Next: Review.

  • For Role name, enter AirflowSageMakerExecutionRole and choose Create role
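After creating the role, it can be useful to confirm that it really trusts the SageMaker service (sagemaker.amazonaws.com). The following is a hedged sketch of such a check: it assumes `iam` is a boto3 IAM client, and it accepts the trust policy either as a parsed dict or as a URL-encoded JSON string, since both shapes can appear depending on the tooling. The helper names are invented for this example.

```python
import json
from urllib.parse import unquote


def trusted_services(role):
    """Return the set of service principals allowed to assume the given role.

    `role` is the "Role" dict from an IAM get_role response.
    """
    doc = role["AssumeRolePolicyDocument"]
    if isinstance(doc, str):
        # Raw API responses carry the document as URL-encoded JSON.
        doc = json.loads(unquote(doc))
    statements = doc.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    services = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. a wildcard principal, which names no service
        service = principal.get("Service", [])
        services.update([service] if isinstance(service, str) else service)
    return services


def role_trusts_sagemaker(iam, role_name="AirflowSageMakerExecutionRole"):
    """True if the named role can be assumed by the SageMaker service."""
    role = iam.get_role(RoleName=role_name)["Role"]
    return "sagemaker.amazonaws.com" in trusted_services(role)
```

The same check works for the other roles in this section if you compare against a different service principal.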


Finally, we will create the EMR default IAM roles, which Airflow uses when it creates the EMR cluster. To create the default IAM roles for Amazon EMR:

  • Go to the IAM Console - Roles

  • Choose Create role

  • For role type, choose AWS Service, find and choose EMR

  • Under Select your use case, click on EMR and click Next: Permissions

  • On the Attach permissions policy page, choose (if not already selected)

    • AWS managed policy AmazonElasticMapReduceRole
    • AWS managed policy AmazonS3FullAccess
  • Then choose Next: Tags and then Next: Review.

  • For Role name, enter EMR_DefaultRole and choose Create role
    (If you see the message “A role named EMR_DefaultRole already exists”, you can cancel this step)

  • Choose Create role again

  • For role type, choose AWS Service, find and choose EMR

  • Under Select your use case, click on EMR Role for EC2 and click Next: Permissions

  • On the Attach permissions policy page, choose (if not already selected)

    • AWS managed policy AmazonElasticMapReduceforEC2Role
    • AWS managed policy AmazonS3FullAccess
  • Then choose Next: Tags and then Next: Review.

  • For Role name, enter EMR_EC2_DefaultRole and choose Create role
    (If you see the message “A role named EMR_EC2_DefaultRole already exists”, you can cancel this step)
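The two EMR roles can also be sketched in boto3. This is only an illustration under a few assumptions: `iam` is a client from boto3.client("iam"), the EMR service role trusts elasticmapreduce.amazonaws.com, and the EC2 role trusts ec2.amazonaws.com. When you create an EC2 role in the console, an instance profile with the same name is created for you; through the API it has to be created explicitly, which the sketch does. Roles that already exist are skipped, mirroring the "cancel this step" note above. The function name is hypothetical.

```python
import json

S3_FULL_ACCESS = "arn:aws:iam::aws:policy/AmazonS3FullAccess"

# role name -> (trusted service principal, role-specific managed policy)
EMR_ROLES = {
    "EMR_DefaultRole": (
        "elasticmapreduce.amazonaws.com",
        "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole",
    ),
    "EMR_EC2_DefaultRole": (
        "ec2.amazonaws.com",
        "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role",
    ),
}


def trust_policy(service):
    """Trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Principal": {"Service": service}, "Action": "sts:AssumeRole"}
        ],
    }


def ensure_emr_default_roles(iam):
    """Create whichever EMR default roles are missing; return the names created."""
    created = []
    for role_name, (service, policy_arn) in EMR_ROLES.items():
        try:
            iam.create_role(
                RoleName=role_name,
                AssumeRolePolicyDocument=json.dumps(trust_policy(service)),
            )
        except iam.exceptions.EntityAlreadyExistsException:
            continue  # role already exists: nothing to do, same as cancelling in the console
        created.append(role_name)
        for arn in (policy_arn, S3_FULL_ACCESS):
            iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
        if service == "ec2.amazonaws.com":
            # EC2 instances pick up the role through an instance profile of the same name.
            iam.create_instance_profile(InstanceProfileName=role_name)
            iam.add_role_to_instance_profile(InstanceProfileName=role_name, RoleName=role_name)
    return created
```

The sketch does not handle the case where the role is missing but an instance profile with that name already exists; in that situation create_instance_profile would raise and need separate handling.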


That’s it. The Airflow instance can now run Glue crawlers and jobs, perform EMR operations, and use the SageMaker execution role in the ML pipeline module.