In this module we will learn how to set up a private
PyPI repository with Amazon MWAA. We will use AWS CodeArtifact as our code repository. This also removes the need to give MWAA internet access through a
NAT Gateway to install required dependencies, which reduces the overall infrastructure cost. You will also be able to use the AWS CodeArtifact repository to publish your private libraries.
The solution that we will deploy includes the following AWS services:
An AWS Lambda function runs every 10 hours to obtain an authorization token for AWS CodeArtifact. This token is then used to create an index-url for the
PyPI remote repository in CodeArtifact. The generated
index-url is saved to a
codeartifact.txt file, which is then uploaded to an Amazon S3 bucket. MWAA fetches the DAGs and
codeartifact.txt at runtime, connects to the CodeArtifact repository, and installs the Python dependencies.
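The token refresh step described above can be sketched roughly as follows. This is a minimal illustration and not the exact function deployed in this module. The environment variable names (CA_DOMAIN, CA_DOMAIN_OWNER, CA_REPOSITORY, BUCKET_NAME), the S3 key dags/codeartifact.txt, and the 12-hour token lifetime are assumptions chosen for the example. The lifetime is set longer than the 10-hour schedule so that the stored token does not expire between runs.

```python
import os


def build_index_url(token, domain, owner, region, repository):
    """Return the pip index URL for a CodeArtifact PyPI repository."""
    host = f"{domain}-{owner}.d.codeartifact.{region}.amazonaws.com"
    return f"https://aws:{token}@{host}/pypi/{repository}/simple/"


def lambda_handler(event, context):
    # Imported here so build_index_url can be used without boto3 installed.
    import boto3

    # Hypothetical environment variable names, set on the Lambda function.
    domain = os.environ["CA_DOMAIN"]
    owner = os.environ["CA_DOMAIN_OWNER"]
    repository = os.environ["CA_REPOSITORY"]
    bucket = os.environ["BUCKET_NAME"]
    region = os.environ["AWS_REGION"]  # provided by the Lambda runtime

    # Request a token; 43200 seconds (12 hours) is an assumed lifetime.
    codeartifact = boto3.client("codeartifact", region_name=region)
    response = codeartifact.get_authorization_token(
        domain=domain,
        domainOwner=owner,
        durationSeconds=43200,
    )
    token = response["authorizationToken"]

    # Write a single pip option line that points at the private repository.
    index_url = build_index_url(token, domain, owner, region, repository)
    body = f"--index-url={index_url}\n"

    # Assumed key: the file sits next to the DAGs so MWAA syncs it with them.
    boto3.client("s3").put_object(
        Bucket=bucket,
        Key="dags/codeartifact.txt",
        Body=body.encode("utf-8"),
    )
    return {"statusCode": 200}
```

Because the generated file contains a live credential, the bucket holding it should not be publicly readable.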
The diagram below shows an architectural overview of this solution:
Now, let’s build this!