In this post, we demonstrate automating deployment of Amazon Managed Workflows for Apache Airflow (Amazon MWAA) using customer-managed endpoints in a VPC, providing compatibility with shared, or otherwise restricted, VPCs.
Data scientists and engineers have made Apache Airflow a leading open source tool to create data pipelines due to its active open source community, familiar Python development as Directed Acyclic Graph (DAG) workflows, and extensive library of pre-built integrations. Amazon MWAA is a managed service for Airflow that makes it easy to run Airflow on AWS without the operational burden of having to manage the underlying infrastructure. For each Airflow environment, Amazon MWAA creates a single-tenant service VPC, which hosts the metadatabase that stores states and the web server that provides the user interface. Amazon MWAA further manages Airflow scheduler and worker instances in a customer-owned and managed VPC, in order to schedule and run tasks that interact with customer resources. Those Airflow containers in the customer VPC access resources in the service VPC via a VPC endpoint.
Many organizations choose to centrally manage their VPC using AWS Organizations, allowing a VPC in an owner account to be shared with resources in a different participant account. However, because creating a new route outside of a VPC is considered a privileged operation, participant accounts can't create endpoints in owner VPCs. Furthermore, many customers don't want to extend the security privileges required to create VPC endpoints to all users provisioning Amazon MWAA environments. In addition to VPC endpoints, customers also wish to restrict data egress via Amazon Simple Queue Service (Amazon SQS) queues, and Amazon SQS access is a requirement in the Amazon MWAA architecture.
Shared VPC support for Amazon MWAA adds the ability for you to manage your own endpoints within your VPCs, adding compatibility to shared and otherwise restricted VPCs. Specifying customer-managed endpoints also provides the ability to meet strict security policies by explicitly restricting VPC resource access to just those needed by your Amazon MWAA environments. This post demonstrates how customer-managed endpoints work with Amazon MWAA and provides examples of how to automate the provisioning of those endpoints.
Solution overview
Shared VPC support for Amazon MWAA allows multiple AWS accounts to create their Airflow environments in shared, centrally managed VPCs. The account that owns the VPC (owner) shares the two private subnets required by Amazon MWAA with other accounts (participants) that belong to the same organization from AWS Organizations. After the subnets are shared, the participants can view, create, modify, and delete Amazon MWAA environments in the subnets shared with them.
When users specify the need for a shared, or otherwise policy-restricted, VPC during environment creation, Amazon MWAA first creates the service VPC resources, then enters a pending state for up to 72 hours and emits an Amazon EventBridge notification of the change in state. This allows owners to create the required endpoints on behalf of participants based on endpoint service information from the Amazon MWAA console or API, or programmatically via an AWS Lambda function and EventBridge rule, as in the example in this post.
After those endpoints are created in the owner account, the endpoint service in the single-tenant Amazon MWAA VPC detects the endpoint connection event and resumes environment creation. Should there be an issue, you can cancel environment creation by deleting the environment during this pending state.
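For the API route, a minimal Python sketch of reading the endpoint service information might look like the following. It assumes the `GetEnvironment` response exposes `Status`, `DatabaseVpcEndpointService`, `WebserverVpcEndpointService`, and `CeleryExecutorQueue` under `Environment`; the helper name `endpoint_details` is hypothetical and not part of any AWS SDK.

```python
def endpoint_details(get_environment_response):
    """Collect the values an owner needs to create endpoints manually.

    The key names read here are assumptions about the GetEnvironment
    response shape; missing keys are returned as None rather than raising.
    """
    env = get_environment_response["Environment"]
    return {
        "status": env.get("Status"),
        "database_service": env.get("DatabaseVpcEndpointService"),
        "webserver_service": env.get("WebserverVpcEndpointService"),
        "celery_queue_arn": env.get("CeleryExecutorQueue"),
    }


# Typical use (requires boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   response = boto3.client("mwaa").get_environment(Name="my-environment")
#   print(endpoint_details(response))
```

Returning `None` for absent keys keeps the helper usable for environments with a public web server, where no web server endpoint service is expected.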
This feature also allows you to remove the create, modify, and delete VPC endpoint (VPCE) privileges from the AWS Identity and Access Management (IAM) principal creating Amazon MWAA environments, even when not using a shared VPC, because that permission is instead required of the IAM principal creating the endpoint (the Lambda function in our example). Furthermore, the Amazon MWAA environment provides the SQS queue Amazon Resource Name (ARN) used by the Airflow Celery Executor to queue tasks (the Celery Executor Queue), allowing you to explicitly enter those resources into your network policy rather than having to provide a more open and generalized permission.
In this example, we create the VPC and Amazon MWAA environment in the same account. For shared VPCs across accounts, the EventBridge rule and Lambda function would exist in the owner account, and the Amazon MWAA environment would be created in the participant account. See Sending and receiving Amazon EventBridge events between AWS accounts for more information.
Prerequisites
You should have the following prerequisites:
- An AWS account
- An AWS user in that account, with permissions to create VPCs, VPC endpoints, and Amazon MWAA environments
- An Amazon Simple Storage Service (Amazon S3) bucket in that account, with a folder called dags
Create the VPC
We begin by creating a restrictive VPC using an AWS CloudFormation template, in order to simulate creating the necessary VPC endpoint and modifying the SQS endpoint policy. If you want to use an existing VPC, you can proceed to the next section.
- On the AWS CloudFormation console, choose Create stack and choose With new resources (standard).
- Under Specify template, choose Upload a template file.
- Now we edit our CloudFormation template to restrict access to Amazon SQS. In cfn-vpc-private-bjs.yml, edit the SqsVpcEndoint section to look as follows:
This additional policy document entry prevents Amazon SQS egress to any resource not explicitly listed.
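To make the idea of an explicit resource list concrete, here is a small Python sketch of such an endpoint policy and of adding one queue to it. It is not the template's actual content: the statement shape, the `sqs:*` action, and the helper names `restrictive_sqs_policy` and `allow_queue` are all assumptions chosen for illustration.

```python
import copy


def restrictive_sqs_policy(allowed_queue_arns):
    """Build an endpoint policy that allows only the listed queue ARNs."""
    return {
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "sqs:*",  # assumed action scope for this sketch
                "Resource": list(allowed_queue_arns),
            }
        ]
    }


def allow_queue(policy, queue_arn):
    """Return a copy of the policy with queue_arn added to the first Allow statement."""
    updated = copy.deepcopy(policy)
    for statement in updated["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        resources = statement["Resource"]
        if isinstance(resources, str):
            resources = [resources]  # normalize a single ARN to a list
        if "*" not in resources and queue_arn not in resources:
            resources.append(queue_arn)
        statement["Resource"] = resources
        break
    return updated
```

Working on a copy keeps the original document available for comparison, which is handy when deciding whether the endpoint policy needs to be updated at all.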
Now we can create our CloudFormation stack.
- On the AWS CloudFormation console, choose Create stack.
- Select Upload a template file.
- Choose Choose file.
- Browse to the file you modified.
- Choose Next.
- For Stack name, enter MWAA-Environment-VPC.
- Choose Next until you reach the review page.
- Choose Submit.
Create the Lambda function
We have two options for self-managing our endpoints: manual and automated. In this example, we create a Lambda function that responds to the Amazon MWAA EventBridge notification. You could also use the EventBridge notification to send an Amazon Simple Notification Service (Amazon SNS) message, such as an email, to someone with permission to create the VPC endpoint manually.
First, we create a Lambda function to respond to the EventBridge event that Amazon MWAA will emit.
- On the Lambda console, choose Create function.
- For Name, enter mwaa-create-lambda.
- For Runtime, choose Python 3.11.
- Choose Create function.
- For Code, in the Code source section, for lambda_function, enter the following code:
- Choose Deploy.
- On the Configuration tab of the Lambda function, in the General configuration section, choose Edit.
- For Timeout, increase to 5 minutes, 0 seconds.
- Choose Save.
- In the Permissions section, under Execution role, choose the role name to edit the permissions of this function.
- For Permission policies, choose the link under Policy name.
- Choose Edit and add a comma and the following statement:
The complete policy should look similar to the following:
- Choose Next until you reach the review page.
- Choose Save changes.
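As a rough sketch of what such a function could do (not the post's original code), the handler below reacts to a pending environment by resolving the VPC from the subnets and creating the database endpoint, plus a web server endpoint for private web server access. The event field names (`status`, `networkConfiguration`, `databaseVpcEndpointService`, `webserverAccessMode`, `webserverVpcEndpointService`) are assumptions about the Amazon MWAA event payload, the optional `ec2` parameter is a hypothetical hook for testing, and the SQS endpoint policy update is omitted for brevity. The execution role would need permission for the EC2 calls used here, such as `ec2:DescribeSubnets` and `ec2:CreateVpcEndpoint`.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context, ec2=None):
    """Create customer-managed endpoints for an MWAA environment in PENDING state."""
    detail = event.get("detail", {})
    # Assumed field name: only act when the environment is waiting for endpoints.
    if detail.get("status") != "PENDING":
        return {"created": []}

    if ec2 is None:
        import boto3  # imported lazily so the module loads without AWS access

        ec2 = boto3.client("ec2")

    network = detail["networkConfiguration"]
    subnet_ids = network["subnetIds"]
    security_group_ids = network["securityGroupIds"]

    # The event carries subnets rather than a VPC ID, so look it up from a subnet.
    subnets = ec2.describe_subnets(SubnetIds=subnet_ids)["Subnets"]
    vpc_id = subnets[0]["VpcId"]

    services = [detail["databaseVpcEndpointService"]]
    if detail.get("webserverAccessMode") == "PRIVATE_ONLY":
        services.append(detail["webserverVpcEndpointService"])

    created = []
    for service_name in services:
        logger.info("Creating endpoint to %s", service_name)
        response = ec2.create_vpc_endpoint(
            VpcEndpointType="Interface",
            VpcId=vpc_id,
            ServiceName=service_name,
            SubnetIds=subnet_ids,
            SecurityGroupIds=security_group_ids,
        )
        created.append(response["VpcEndpoint"]["VpcEndpointId"])
    return {"created": created}
```

Lambda invokes the handler with only `event` and `context`, so the extra `ec2` argument stays `None` in production and a real client is created.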
Create an EventBridge rule
Next, we configure EventBridge to send the Amazon MWAA notifications to our Lambda function.
- On the EventBridge console, choose Create rule.
- For Name, enter mwaa-create.
- Select Rule with an event pattern.
- Choose Next.
- For Creation method, choose Use pattern form.
- Choose Edit pattern.
- For Event pattern, enter the following:
- Choose Next.
- For Select a target, choose Lambda function.
You can also specify an SNS notification in order to receive a message when the environment state changes.
- For Function, choose mwaa-create-lambda.
- Choose Next until you reach the final section, then choose Create rule.
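For orientation, an event pattern for this rule could be expressed as in the Python sketch below. The `source` and `detail-type` values are assumptions about how Amazon MWAA labels its status-change events, so verify them against a real event before relying on them. The `matches` helper is hypothetical and only imitates exact matching on top-level fields, which is a small subset of what EventBridge supports.

```python
import json

# Assumed values for MWAA environment status-change events.
MWAA_STATUS_PATTERN = {
    "source": ["aws.airflow"],
    "detail-type": ["MWAA Environment Status Change"],
}


def matches(pattern, event):
    """Return True if every top-level pattern key lists the event's value."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


# JSON text suitable for pasting into the event pattern editor.
PATTERN_JSON = json.dumps(MWAA_STATUS_PATTERN, indent=2)
```

Matching on the event type alone means the Lambda function receives every status change and decides for itself which states to act on.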
Create an Amazon MWAA environment
Finally, we create an Amazon MWAA environment with customer-managed endpoints.
- On the Amazon MWAA console, choose Create environment.
- For Name, enter a unique name for your environment.
- For Airflow version, choose the latest Airflow version.
- For S3 bucket, choose Browse S3 and choose your S3 bucket, or enter the Amazon S3 URI.
- For DAGs folder, choose Browse S3 and choose the dags/ folder in your S3 bucket, or enter the Amazon S3 URI.
- Choose Next.
- For Virtual private cloud, choose the VPC you created earlier.
- For Web server access, choose Public network (Internet accessible).
- For Security groups, deselect Create new security group.
- Choose the shared VPC security group created by the CloudFormation template.
Because the security groups of the AWS PrivateLink endpoints from the earlier step are self-referencing, you must choose the same security group for your Amazon MWAA environment.
- For Endpoint management, choose Customer managed endpoints.
- Keep the remaining settings as default and choose Next.
- Choose Create environment.
When your environment is available, you can access it via the Open Airflow UI link on the Amazon MWAA console.
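The same choices can be sketched as a `CreateEnvironment` request built in Python, for readers who prefer the API over the console. This is an illustration under assumptions: the helper name `customer_managed_environment_request` is hypothetical, the role and bucket ARNs are supplied by the caller, and optional settings such as the Airflow version are left to service defaults.

```python
def customer_managed_environment_request(
    name, execution_role_arn, bucket_arn, subnet_ids, security_group_ids
):
    """Build keyword arguments for an MWAA create_environment call."""
    if len(subnet_ids) != 2:
        # Amazon MWAA requires two private subnets.
        raise ValueError("exactly two private subnets are required")
    return {
        "Name": name,
        "ExecutionRoleArn": execution_role_arn,
        "SourceBucketArn": bucket_arn,
        "DagS3Path": "dags",
        "NetworkConfiguration": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        "WebserverAccessMode": "PUBLIC_ONLY",
        # Ask the service to wait for endpoints that we create ourselves.
        "EndpointManagement": "CUSTOMER",
    }


# Typical use (requires boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   request = customer_managed_environment_request(...)
#   boto3.client("mwaa").create_environment(**request)
```

Passing the same security group that the PrivateLink endpoints use mirrors the console step above, because those security groups are self-referencing.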
Clean up
Cleaning up resources that are not actively being used reduces costs and is a best practice. If you don't delete your resources, you can incur additional charges. To clean up your resources, complete the following steps:
- Delete your Amazon MWAA environment, EventBridge rule, and Lambda function.
- Delete the VPC endpoints created by the Lambda function.
- Delete any security groups created, if applicable.
- After the above resources have completed deletion, delete the CloudFormation stack to ensure that you have removed all of the remaining resources.
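If you would rather script the endpoint deletion, one possible approach is to select endpoints by the service names they connect to, as in the Python sketch below. The helper name `endpoint_ids_for_services` is hypothetical, and the sketch assumes you recorded the endpoint service names of the environment before deleting it.

```python
def endpoint_ids_for_services(describe_response, service_names):
    """Return IDs of VPC endpoints whose ServiceName is in service_names."""
    wanted = set(service_names)
    return [
        endpoint["VpcEndpointId"]
        for endpoint in describe_response.get("VpcEndpoints", [])
        if endpoint.get("ServiceName") in wanted
    ]


# Typical use (requires boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   ec2 = boto3.client("ec2")
#   ids = endpoint_ids_for_services(ec2.describe_vpc_endpoints(), recorded_services)
#   if ids:
#       ec2.delete_vpc_endpoints(VpcEndpointIds=ids)
```

Filtering by service name avoids touching unrelated endpoints in the VPC, such as the SQS endpoint created by the CloudFormation stack.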
Summary
This post described how to automate environment creation with shared VPC support in Amazon MWAA. This gives you the ability to manage your own endpoints within your VPC, adding compatibility to shared, or otherwise restricted, VPCs. Specifying customer-managed endpoints also provides the ability to meet strict security policies by explicitly restricting VPC resource access to just those needed by your Amazon MWAA environments. To learn more about Amazon MWAA, refer to the Amazon MWAA User Guide. For more posts about Amazon MWAA, visit the Amazon MWAA resources page.
About the author
John Jackson has over 25 years of software experience as a developer, systems architect, and product manager in both startups and large corporations and is the AWS Principal Product Manager responsible for Amazon MWAA.