This blog post is co-written with Govind Mohan and Kausik Dhar from Cognizant.
Migrating on-premises data warehouses to the cloud is no longer viewed as an option but as a necessity for companies to save cost and take advantage of what the latest technology has to offer. Although we have seen a lot of focus on migrating data from legacy data warehouses to the cloud, and multiple tools to support this initiative, data is only part of the journey. Successful migration of the legacy extract, transform, and load (ETL) processes that acquire, enrich, and transform the data plays a key role in the success of any end-to-end data warehouse migration to the cloud.
The traditional approach of manually rewriting a large number of ETL processes to cloud-native technologies like AWS Glue is time consuming and can be prone to human error. Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool automates this process, bringing in more predictability and accuracy, eliminating the risk associated with manual conversion, and providing faster time to market for customers.
Cognizant is an AWS Premier Tier Services Partner with several AWS Competencies. With its industry-based, consultative approach, Cognizant helps clients envision, build, and run more innovative and efficient businesses.
In this post, we describe how Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool can help you automatically convert legacy ETL code to AWS Glue quickly and effectively. We also describe the main steps involved, the supported features, and their benefits.
Solution overview
Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool automates conversion of ETL pipelines and orchestration code from legacy tools to AWS Glue and AWS Step Functions and eliminates the manual processes involved in a customer’s ETL cloud migration journey.
It comes with an intuitive user interface (UI). You can use these accelerators by selecting the source and target ETL tool for conversion and then uploading an XML file of the ETL mapping to be converted as input.
The tool also supports continuous monitoring of the overall progress, and alerting mechanisms are in place in the event of any failures, errors, or operational issues.
Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool internally uses many native AWS services, such as Amazon Simple Storage Service (Amazon S3) and Amazon Relational Database Service (Amazon RDS) for storage and metadata management; Amazon Elastic Compute Cloud (Amazon EC2) and AWS Lambda for processing; Amazon CloudWatch, AWS Key Management Service (AWS KMS), and AWS IAM Identity Center (successor to AWS Single Sign-On) for monitoring and security; and AWS CloudFormation for infrastructure management. The following diagram illustrates this architecture.
How to use CDIT: ETL Conversion Tool for ETL migration
Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool supports the following legacy ETL tools as sources and generates corresponding AWS Glue ETL scripts in both Python and Scala:
- Informatica
- DataStage
- SSIS
- Talend
Let’s look at the migration steps in more detail.
Assess the legacy ETL process
Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool enables you to assess in bulk the potential automation percentage and complexity of a set of ETL jobs and workflows that are in scope for migration to AWS Glue. The assessment option helps you understand what kind of saving can be achieved using the tool, the complexity of the ETL mappings, and the extent of manual conversion needed, if any. You can upload a single ETL mapping or a folder containing multiple ETL mappings as input for assessment and generate an assessment report, as shown in the following figure.
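To make the idea of an automation percentage concrete, the following is a minimal Python sketch of how such a figure could be derived for one mapping. It is not CDIT’s actual assessment logic: the `AUTO_CONVERTIBLE` set, the complexity thresholds, and the `assess` function are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical set of transformation types assumed to convert automatically.
# Illustrative only; this is not CDIT's actual support matrix.
AUTO_CONVERTIBLE = {"Source Qualifier", "Expression", "Filter", "Joiner", "Aggregator"}


@dataclass
class MappingAssessment:
    name: str
    total: int       # number of transformations in the mapping
    automated: int   # number assumed to convert without manual work

    @property
    def automation_pct(self) -> float:
        # Share of transformations that would need no manual conversion.
        return 100.0 * self.automated / self.total if self.total else 0.0

    @property
    def complexity(self) -> str:
        # Arbitrary illustrative thresholds based on transformation count.
        if self.total <= 5:
            return "simple"
        if self.total <= 15:
            return "medium"
        return "complex"


def assess(name: str, transformation_types: list[str]) -> MappingAssessment:
    """Count how many transformations fall in the auto-convertible set."""
    automated = sum(1 for t in transformation_types if t in AUTO_CONVERTIBLE)
    return MappingAssessment(name, len(transformation_types), automated)


report = assess(
    "m_load_customers",
    ["Source Qualifier", "Expression", "Filter", "Custom Java"],
)
print(f"{report.name}: {report.automation_pct:.0f}% automated, {report.complexity}")
```

Running the same function over every mapping in a folder and aggregating the results would give a bulk view similar in spirit to the assessment report described above.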
Convert the ETL code to AWS Glue
To convert legacy ETL code, you upload the XML file of the ETL mapping as input to the tool. User inputs are stored in the tool’s internal metadata repository. Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool parses these XML input files and breaks them down to a patented canonical model, which is then forward engineered into the target AWS Glue scripts in Python or Scala. The following screenshot shows an example of the Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool GUI and Output Console pane.
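The parse step can be pictured with a short Python sketch. The canonical model itself is proprietary and is not described in this post, so the `Step` class below is only a hypothetical stand-in for a tool-neutral intermediate form, and `SAMPLE_XML` is a heavily simplified, made-up mapping export rather than a real Informatica file.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Simplified, hypothetical mapping export; real tool exports are far richer.
SAMPLE_XML = """
<MAPPING NAME="m_load_customers">
  <TRANSFORMATION NAME="SQ_customers" TYPE="Source Qualifier"/>
  <TRANSFORMATION NAME="FIL_active" TYPE="Filter" CONDITION="status = 'A'"/>
</MAPPING>
"""


@dataclass
class Step:
    """Hypothetical tool-neutral description of one transformation."""
    name: str
    kind: str
    options: dict


def parse_mapping(xml_text: str) -> list[Step]:
    """Turn each TRANSFORMATION element into a Step, in document order."""
    root = ET.fromstring(xml_text)
    steps = []
    for node in root.iter("TRANSFORMATION"):
        attrs = dict(node.attrib)
        # NAME and TYPE identify the step; remaining attributes become options.
        steps.append(Step(attrs.pop("NAME"), attrs.pop("TYPE"), attrs))
    return steps


steps = parse_mapping(SAMPLE_XML)
for step in steps:
    print(step)
```

A code generator would then walk such a list and emit the matching AWS Glue code for each step; that forward-engineering stage is not shown here.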
If any part of the input ETL job couldn’t be converted completely to the equivalent AWS Glue script, it’s tagged between comment lines in the output so that it can be manually fixed.
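When many scripts are generated, it can help to list these tagged regions automatically. The Python sketch below shows one way to do that. The post does not specify the comment format the tool emits, so the `MANUAL_FIX_START` and `MANUAL_FIX_END` markers and the `GENERATED` sample script are assumptions made for illustration.

```python
# Hypothetical marker comments; the tool's real tag format is not documented here.
START = "# MANUAL_FIX_START"
END = "# MANUAL_FIX_END"


def find_manual_blocks(script: str) -> list[tuple[int, int]]:
    """Return (first_line, last_line) pairs, 1-indexed, for each tagged block."""
    blocks = []
    start_line = None
    for number, line in enumerate(script.splitlines(), start=1):
        stripped = line.strip()
        if stripped == START:
            start_line = number
        elif stripped == END and start_line is not None:
            blocks.append((start_line, number))
            start_line = None
    return blocks


# Made-up fragment of a generated script with one unconverted section.
GENERATED = """\
df = spark.read.parquet(path)
# MANUAL_FIX_START
# unsupported: custom Java transformation
# MANUAL_FIX_END
df.write.parquet(out)
"""

blocks = find_manual_blocks(GENERATED)
print(blocks)
```

The resulting line ranges could feed a review checklist so that no tagged section is missed before the job is deployed.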
Convert the workflow to Step Functions
The next logical step after converting the legacy ETL jobs is to orchestrate the run of these jobs in the correct order. Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool lets you automate the conversion of on-premises ETL workflows by converting them to corresponding Step Functions workflows. The following figure illustrates a sample input Informatica workflow.
Workflow conversion follows a pattern similar to that of the ETL mapping. XML files for ETL workflows are uploaded as input, and Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool generates the equivalent Step Functions JSON file based on the input XML file data.
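For readers unfamiliar with the target format, the following Python sketch builds a small Step Functions definition in Amazon States Language that runs AWS Glue jobs one after another. It is not output from the tool: the `build_state_machine` helper and the job names are hypothetical, and a converted workflow would typically also carry branching, retries, and error handling. The sketch uses the Step Functions service integration for AWS Glue (`glue:startJobRun.sync`), which waits for each job run to finish before moving on.

```python
import json


def build_state_machine(job_names: list[str]) -> dict:
    """Chain Glue jobs sequentially; each state is named after its job.

    Assumes job names are unique, because state names must be unique.
    """
    if not job_names:
        raise ValueError("at least one job is required")
    states = {}
    for index, job in enumerate(job_names):
        state = {
            "Type": "Task",
            # Run the Glue job and wait for it to complete.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": job},
        }
        if index + 1 < len(job_names):
            state["Next"] = job_names[index + 1]
        else:
            state["End"] = True
        states[job] = state
    return {
        "Comment": "Sequential run of converted ETL jobs (illustrative)",
        "StartAt": job_names[0],
        "States": states,
    }


# Hypothetical job names standing in for converted mappings.
definition = build_state_machine(["stage_customers", "load_customers"])
print(json.dumps(definition, indent=2))
```

The printed JSON can be used as the definition of a state machine, provided the execution role is allowed to start and monitor the referenced Glue jobs.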
Benefits of using Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool
The following are the key benefits of using Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool to automate legacy ETL conversion:
- Cost reduction – You can reduce the overall migration effort by as much as 80% by automating the conversion of ETL and workflows to AWS Glue and Step Functions
- Better planning and implementation – You can assess the ETL scope and determine automation percentage, complexity, and unsupported patterns before the start of the project, resulting in accurate estimation and timelines
- Completeness – Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool offers one solution with support for multiple legacy ETL tools like Informatica, DataStage, Talend, and more
- Improved customer experience – You can achieve migration goals seamlessly, with a high automation percentage and without errors caused by manual conversion
Case study: Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool proposed implementation
A large US-based insurance and annuities company wanted to migrate their legacy ETL process in Informatica to AWS Glue as part of their cloud migration strategy.
As part of this engagement, Cognizant helped the customer successfully migrate their Informatica-based data acquisition and integration ETL jobs and workflows to AWS. A proof of concept (PoC) using Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool was completed first to showcase and validate automation capabilities.
Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool was used to automate the conversion of over 300 Informatica mappings and workflows to equivalent AWS Glue jobs and Step Functions workflows, respectively. As a result, the customer was able to migrate all legacy ETL code to AWS as planned and retire the legacy application.
The following are key highlights from this engagement:
- Migration of over 300 legacy Informatica ETL jobs to AWS Glue
- Automated conversion of over 6,000 transformations from legacy ETL to AWS Glue
- 85% automation achieved using CDIT: ETL Conversion Tool
- The customer saved licensing fees and retired their legacy application as planned
Conclusion
In this post, we discussed how migrating legacy ETL processes to the cloud is critical to the success of a cloud migration journey. Cognizant Data & Intelligence Toolkit (CDIT): ETL Conversion Tool enables you to perform an assessment of the existing ETL process to derive complexity and automation percentage for better estimation and planning. We also discussed the ETL technologies supported by the tool and how ETL jobs can be converted to corresponding AWS Glue scripts. Finally, we showed how existing ETL workflows can be used to automatically generate corresponding Step Functions orchestration jobs.
To learn more, please reach out to Cognizant.
About the Authors
Deepak Singh is a Senior Solutions Architect at Amazon Web Services with 20+ years of experience in Data & AIA. He enjoys working with AWS partners and customers on building scalable analytical solutions for their business outcomes. When not at work, he loves spending time with family or exploring new technologies in the analytics and AI space.
Piyush Patra is a Partner Solutions Architect at Amazon Web Services, where he supports partners with their Analytics journeys and is the global lead for strategic Data Estate Modernization and Migration partner programs.
Govind Mohan is an Associate Director with Cognizant with over 18 years of experience in the data and analytics space. He has helped design and implement multiple large-scale data migration, application lift & shift, and legacy modernization projects, and works closely with customers in accelerating the cloud modernization journey using the Cognizant Data and Intelligence Toolkit (CDIT) platform.
Kausik Dhar is a technology leader with more than 23 years of IT experience, primarily focused on Data & Analytics, Data Modernization, Application Development, Delivery Management, and Solution Architecture. He has played a pivotal role in guiding clients through designing and executing large-scale data and process migrations, in addition to spearheading successful cloud implementations. Kausik has expertise in formulating migration strategies for complex programs and in constructing data lake/Lakehouse architecture using a wide array of tools and technologies.