Wednesday, June 7, 2023

Introducing in-place version upgrades with Amazon MWAA


Today, AWS is announcing the availability of in-place version upgrades for Amazon Managed Workflows for Apache Airflow (Amazon MWAA). This enhancement allows you to seamlessly upgrade your existing Apache Airflow version 2.x environments to newer available versions while retaining the workflow run history and environment configurations. You can now benefit from the latest capabilities of the Apache Airflow platform without having to create an entirely new Amazon MWAA environment.

Until now, if you wanted to upgrade your Amazon MWAA environment to a different Apache Airflow version, you had to follow the Amazon MWAA environment migration instructions. This involved creating a new Amazon MWAA environment and then migrating all of your configurations and Directed Acyclic Graphs (DAGs) to it. If you also needed to preserve the history of DAG runs, you had to take a backup of your metadata database and then restore that backup on the newly created environment. This process was error prone, manual, and involved additional costs to maintain two separate Amazon MWAA environments until you could verify the new one and decommission the old one.

In this post, we provide an overview of the in-place version upgrade feature, explore applicable use cases, detail the steps to use it, and provide additional guidance on its capabilities.

Overview of solution

The newly introduced in-place version upgrades by Amazon MWAA provide a streamlined transition from your existing Apache Airflow version 2.x-based environments to newer available Apache Airflow versions. Amazon MWAA manages the entire upgrade process, from provisioning new Apache Airflow versions to upgrading the metadata database. In the event of an upgrade failure, Amazon MWAA is designed to roll back to the previous stable version using the associated metadata database snapshot.

Upgrading your existing environments on Amazon MWAA is a straightforward process. You can upgrade your existing Apache Airflow 2.0 and later environments on Amazon MWAA with just a few clicks on the Amazon MWAA console, by using the Amazon MWAA API, the AWS Command Line Interface (AWS CLI), or by using tools like AWS CloudFormation, the AWS Cloud Development Kit (AWS CDK), or Terraform. This feature is available in all currently supported Amazon MWAA Regions.

On the Amazon MWAA console, simply edit the environment and select an available Apache Airflow version higher than the current version of your existing environment. You can also use the UpdateEnvironment API and specify the new Apache Airflow version to trigger an upgrade process. To learn more about in-place version upgrades, refer to Upgrading the Apache Airflow version in the Amazon MWAA documentation.
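As a rough illustration of the API route, the following Python sketch builds an UpdateEnvironment request and sends it through a client object. It assumes a boto3 `mwaa` client, whose `update_environment` call accepts `Name` and `AirflowVersion`; the helper names and the environment name in the usage comment are hypothetical and not part of the original post.

```python
from typing import Any, Dict


def build_upgrade_request(environment_name: str, target_version: str) -> Dict[str, str]:
    """Build the keyword arguments for an UpdateEnvironment call that changes the Airflow version."""
    if not environment_name:
        raise ValueError("environment_name must not be empty")
    if not target_version:
        raise ValueError("target_version must not be empty")
    return {"Name": environment_name, "AirflowVersion": target_version}


def start_upgrade(mwaa_client: Any, environment_name: str, target_version: str) -> Dict[str, Any]:
    """Trigger the upgrade through any client exposing update_environment.

    The client is passed in (assumed to be boto3.client("mwaa")) so the
    function can be exercised with a stub before touching a real environment.
    """
    request = build_upgrade_request(environment_name, target_version)
    return mwaa_client.update_environment(**request)


# Usage (requires boto3 and AWS credentials; the environment name is hypothetical):
#   import boto3
#   start_upgrade(boto3.client("mwaa"), "my-staging-environment", "2.4.3")
```

Passing the client in as a parameter keeps the sketch free of credentials and makes it easy to try against a staging environment first.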

During an upgrade, Amazon MWAA first creates a snapshot of the existing environment’s metadata database, which then serves as the basis for a new database. Subsequently, all Apache Airflow components (web server, scheduler, and workers) are upgraded. Finally, the newly created metadata database is upgraded, effectively completing the transition to the new environment.

Applicable use cases

You should consider upgrading your Apache Airflow version on Amazon MWAA if your existing workflows can accommodate the change and a new version is available with features or improvements that align with your use case. By upgrading, you can benefit from the latest capabilities of the Apache Airflow platform and maintain compatibility with new features and best practices like data-driven scheduling and new Amazon provider packages introduced in Apache Airflow 2.4.3. The upgrade process involves environment downtime that can take up to 2 hours to complete depending on the environment size, and it can be performed on demand at a time that best suits you. If your existing environment is so heavily used that you can’t afford downtime, consider creating a new environment instead.

Prerequisites

When preparing for the upgrade, make sure you complete the following prerequisite steps:

  1. Verify Apache Airflow changes between your existing and new versions of the environment. Review the Apache Airflow release notes to understand the impact of new features, significant changes, and bug fixes that all intermediate Apache Airflow releases made between your source and destination versions.
  2. Review your existing requirements.txt file to verify the correct set of dependencies required for your target environment. Additionally, verify that your requirements.txt file has the correct constraints file added at the top of the file to match your target environment. The Apache Airflow constraints file specifies the dependent modules and provider versions available at the time of an Apache Airflow release. Adding a constraints file prevents incompatible libraries from being installed in your environment. In the following example, replace {Airflow-version} with your target environment’s version number, and {Python-version} with the version of Python that’s compatible with your environment: --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-{Airflow-version}/constraints-{Python-version}.txt"
  3. Review the compatibility of additional Python libraries mentioned in your requirements.txt file to match your target environment. Apache Airflow v2.4.3 and above use Python v3.10, while older Apache Airflow versions use Python v3.7. Therefore, if you are trying to upgrade your existing Apache Airflow v2.0.2/2.2.2-based environment to Apache Airflow v2.4.3 or higher, you should update your additional Python libraries to match Python v3.10.
  4. With Apache Airflow v2.4.3 and above, the list of provider packages Amazon MWAA installs by default for your environment has changed. Note that some imports and operator names have changed in the new provider packages in Apache Airflow in order to standardize the naming convention across the provider packages. Compare the list of provider packages installed by default in Apache Airflow v2.2.2 or v2.0.2, and configure any additional packages you might need for your new Apache Airflow v2.4.3 and higher environment.
  5. Make sure that your DAGs and other workflow resources are compatible with the new Apache Airflow version you are upgrading to.
  6. Use the aws-mwaa-local-runner utility to test your existing DAGs, requirements, plugins, and dependencies locally before deploying to Amazon MWAA. You can create a target Apache Airflow environment that’s similar to an Amazon MWAA production image locally using aws-mwaa-local-runner and verify all your components work before attempting to upgrade your Amazon MWAA environment. Additionally, test the new environment upgrade process in lower Amazon MWAA environments like dev or staging before rolling out the upgrade in production environments.
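The constraints check in step 2 can be automated before an upgrade. The sketch below is one possible approach, not an official tool: it builds the constraints URL from the template shown in step 2 and checks that the first meaningful line of a requirements.txt file references it. The function names are hypothetical.

```python
from typing import Optional

# URL template taken from the prerequisite above.
CONSTRAINTS_URL_TEMPLATE = (
    "https://raw.githubusercontent.com/apache/airflow/"
    "constraints-{airflow_version}/constraints-{python_version}.txt"
)


def constraints_url(airflow_version: str, python_version: str) -> str:
    """Return the constraints file URL for a given Airflow and Python version."""
    return CONSTRAINTS_URL_TEMPLATE.format(
        airflow_version=airflow_version, python_version=python_version
    )


def first_requirement_line(requirements_text: str) -> Optional[str]:
    """Return the first line that is neither blank nor a comment, or None."""
    for line in requirements_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped
    return None


def has_expected_constraint(
    requirements_text: str, airflow_version: str, python_version: str
) -> bool:
    """Check that the file starts with a constraint line for the target versions."""
    first = first_requirement_line(requirements_text)
    if first is None:
        return False
    expected = constraints_url(airflow_version, python_version)
    return first.startswith(("--constraint", "-c")) and expected in first
```

For example, an environment targeting Apache Airflow v2.4.3 would call `has_expected_constraint(text, "2.4.3", "3.10")`, because that version runs on Python v3.10.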

Upgrade process

When an upgrade has been initiated, Amazon MWAA stops the existing underlying Apache Airflow components (web server, scheduler, and workers). This process halts any worker tasks that are currently running. The status of your environment at this stage will show as UPDATING. The upgrade process then creates a database snapshot of the metadata database, marked by the status CREATING_SNAPSHOT. When the snapshot is complete, the environment status returns to UPDATING as Amazon MWAA triggers the creation of a new Apache Airflow environment that matches your version selection and applies the required schema changes to the existing metadata database to align it with the target Apache Airflow environment. During this phase, your specified requirements, plugins, and other dependencies are installed.

Upon completion, your new environment is marked as AVAILABLE, indicating that the upgrade process has been successful and the environment is ready for testing. You can now log in to your Apache Airflow UI to verify the presence of your existing DAGs, their historical runs, configured connections, and more.

However, if there are failures in installing your specified requirements, plugins, and dependencies files, the environment initiates a rollback to the previous stable version. During this process, your environment status will show as ROLLING_BACK. If the rollback is successful, your previous stable environment will be available and the status will display as UPDATE_FAILED until a new update is attempted and succeeds. If the rollback fails, the status will show as UNAVAILABLE, indicating that your environment is not functional.
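The status transitions described above lend themselves to a simple polling loop. The following is a minimal sketch, assuming a boto3 `mwaa` client whose `get_environment` response carries the status under `Environment.Status`; the function name, polling interval, and poll limit are illustrative choices rather than recommendations from the original post.

```python
import time
from typing import Any, Callable

# Statuses the post describes as transitional during an upgrade or rollback.
IN_PROGRESS_STATUSES = {"UPDATING", "CREATING_SNAPSHOT", "ROLLING_BACK"}


def wait_for_upgrade(
    mwaa_client: Any,
    environment_name: str,
    poll_seconds: float = 60.0,
    max_polls: int = 180,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the environment until it leaves the in-progress statuses.

    Returns the first status that is not transitional, for example
    AVAILABLE, UPDATE_FAILED, or UNAVAILABLE. Raises TimeoutError if the
    environment is still in progress after max_polls checks.
    """
    for attempt in range(max_polls):
        response = mwaa_client.get_environment(Name=environment_name)
        status = response["Environment"]["Status"]
        if status not in IN_PROGRESS_STATUSES:
            return status
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(
        f"{environment_name} still in progress after {max_polls} status checks"
    )
```

A caller would typically treat AVAILABLE as success, UPDATE_FAILED as a completed rollback that needs investigation, and UNAVAILABLE as a non-functional environment. Because a freshly requested upgrade may briefly still report its old status, it can help to wait until UPDATING appears before starting the loop.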

If your environment upgrade process fails, it’s likely that the underlying Amazon Elastic Container Service (Amazon ECS) AWS Fargate clusters had stabilization issues due to conflicting requirements and plugins, networking issues, or DB migration issues after the Apache Airflow component upgrade. To mitigate these issues, make sure that your DAGs and requirements work without issues using the aws-mwaa-local-runner utility and, ideally, test in a staging Amazon MWAA environment.

Additional considerations

Consider the following additional information about this feature:

  • The upgrade process is available on demand, and is restricted to moving to newer versions. In-place version upgrades on Amazon MWAA are not supported for version 1.10.z. To perform a major version upgrade, for example from version 1.y.z to 2.y.z, you must create a new environment and migrate your resources.
  • You can only select applicable higher versions that you can upgrade to. Downgrading to a lower version is not available.
  • The rollback process can take additional time. If you have Amazon Simple Storage Service (Amazon S3) bucket versioning enabled, Amazon MWAA is designed to revert the environment to the previous working configuration, including plugins and requirements. However, any manual changes made to your DAGs will not be reverted during this process.
  • After the upgrade process has completed successfully and the environment is available, any running DAGs that were interrupted during the upgrade are scheduled for a retry, depending on the way you configure retries for your DAGs. You can also trigger them manually or wait for the next scheduled run.
  • You should iteratively upgrade your environments, starting with the least critical ones first.
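The version rules in the first two considerations can be expressed as a quick pre-check. This is only a sketch of the rules as stated above (source on 2.x or later, no major version jump, target strictly higher). It does not know which versions Amazon MWAA currently offers, so passing it is necessary but not sufficient. The function names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as "2.4.3" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def can_upgrade_in_place(current: str, target: str) -> bool:
    """Apply the stated rules for in-place upgrades.

    - The source environment must run Apache Airflow 2.x or later.
    - Major version jumps (for example 1.y.z to 2.y.z) need a new environment.
    - The target must be strictly higher; downgrades are not available.
    """
    current_version = parse_version(current)
    target_version = parse_version(target)
    return (
        current_version[0] >= 2
        and target_version[0] == current_version[0]
        and target_version > current_version
    )
```

Running a check like this ahead of an UpdateEnvironment request can catch an accidental downgrade or a 1.10.z source environment before any downtime is incurred.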

Conclusion

In this post, we discussed the new feature of Amazon MWAA that allows you to upgrade your existing Amazon MWAA environment to higher Apache Airflow versions. This feature is supported on new and existing Amazon MWAA environments running Apache Airflow 2.x and above. Use this feature to upgrade your Apache Airflow versions while retaining your existing workflow run histories and environment configurations. By upgrading, you can benefit from the latest capabilities of the Apache Airflow platform, maintain compatibility with new features, and adhere to best practices.

For additional details and code examples on Amazon MWAA, visit the Amazon MWAA User Guide and the Amazon MWAA examples GitHub repo.

Apache, Apache Airflow, and Airflow are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.


About the Authors

Parnab Basak is a Solutions Architect and a Serverless Specialist at AWS. He specializes in creating new solutions that are cloud native using modern software development practices like serverless, DevOps, and analytics. Parnab works closely in the analytics and integration services space, helping customers adopt AWS services for their workflow orchestration needs.

Fernando Gamero is a Senior Solutions Architect engineer at AWS, with more than 25 years of experience in the technology industry, from telecommunications and banking to startups. He is now helping customers with building event-driven architectures, adopting IoT solutions at the edge, and transforming their data and machine learning pipelines at scale.

Shubham Mehta is an experienced product manager with over eight years of experience and a proven track record of delivering successful products. In his current role as a Senior Product Manager at AWS, he oversees Amazon Managed Workflows for Apache Airflow (Amazon MWAA) and spearheads the Apache Airflow open-source contributions to further enhance the product’s functionality.


