When we talk with customers, we hear that they want to harness insights from data to make timely, impactful, and actionable business decisions. A common pattern with data-driven organizations is that they have many different data sources they need to ingest into their analytics systems. This requires them to build manual data pipelines spanning their operational databases, data lakes, streaming data, and data within their warehouse. As a consequence of this complex setup, it can take data engineers weeks or even months to build data ingestion pipelines. These data pipelines are costly, and the delays can lead to missed business opportunities. Additionally, data warehouses are increasingly becoming mission-critical systems that require high availability, reliability, and security.
Amazon Redshift is a fully managed petabyte-scale data warehouse used by tens of thousands of customers to easily, quickly, securely, and cost-effectively analyze all their data at any scale. This year at re:Invent, Amazon Redshift has announced a number of features to help you simplify data ingestion and get to insights easily and quickly, within a secure, reliable environment.
In this blog post, I introduce some of these new features, which fit into two main categories:
- Simplify data ingestion
  - Amazon Redshift now supports auto-copy from Amazon S3 (available in preview). With this new capability, Amazon Redshift automatically loads the files that arrive in an Amazon Simple Storage Service (Amazon S3) location that you specify into your data warehouse. The files can use any of the formats supported by the Amazon Redshift COPY command, such as CSV, JSON, Parquet, and Avro. In this way, you don’t need to manually or repeatedly run copy procedures. Amazon Redshift automates file ingestion and takes care of data-loading steps under the hood.
  - With Amazon Aurora zero-ETL integration with Amazon Redshift, you can use Amazon Redshift for near real-time analytics and machine learning on petabytes of transactional data stored on Amazon Aurora MySQL databases (available in limited preview). With this capability, you can choose the Amazon Aurora databases containing the data you want to analyze with Amazon Redshift. Data is then replicated into your data warehouse within seconds after transactional data is written into Amazon Aurora, eliminating the need to build and maintain complex data pipelines. You can replicate data from multiple Amazon Aurora databases into the same Amazon Redshift instance to run analytics across multiple applications. With near real-time access to transactional data, you can leverage Amazon Redshift’s analytics and capabilities, such as built-in machine learning (ML), materialized views, data sharing, and federated access to multiple data stores and data lakes, to derive insights from transactional and other data.
  - With the general availability of Amazon Redshift Streaming Ingestion, you can now natively ingest hundreds of megabytes of data per second from Amazon Kinesis Data Streams and Amazon MSK into an Amazon Redshift materialized view and query it in seconds. Learn more in this post.
- Make your data warehouse more secure and reliable
  - You can now improve the availability of your data warehouse by choosing multiple Availability Zone (AZ) deployments. Multi-AZ deployments for your Amazon Redshift clusters are available in preview and reduce recovery times to seconds through automatic recovery. In this way, you can build solutions that are more compliant with the recommendations of the Reliability Pillar of the AWS Well-Architected Framework.
  - With dynamic data masking (available in preview), you can protect sensitive information stored in your data warehouse and ensure that only the relevant data is accessible by users based on their roles. You can limit how much identifiable data is visible to users using multiple levels of policies, so different users and groups can have different levels of data access without having to create multiple copies of data. Dynamic data masking complements other granular access control capabilities in Amazon Redshift, including row-level and column-level security and role-based access controls. In this way, dynamic data masking helps you meet requirements for GDPR, CCPA, and other privacy regulations.
  - Amazon Redshift now supports central access controls for data sharing with AWS Lake Formation (available in public preview). You can now use Lake Formation to simplify governance of data shared from Amazon Redshift and centrally manage granular access across all data-sharing consumers.
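To make the idea behind dynamic data masking more concrete, here is a minimal sketch in plain Python. It is not Redshift SQL and does not call any Redshift API: the `MaskingPolicy` class, the `read_column` function, the role names, and the rule that the highest-priority policy wins are all assumptions made for illustration. What it shows is the general pattern described above: a single stored value is masked differently at read time depending on the roles of the user, so no extra copies of the data are needed.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# All names below are hypothetical and exist only for this illustration.


@dataclass(frozen=True)
class MaskingPolicy:
    role: str                     # role the policy is attached to
    priority: int                 # assumption: higher priority wins
    mask: Callable[[str], str]    # transformation applied at read time


def full_mask(value: str) -> str:
    """Hide every character."""
    return "X" * len(value)


def partial_mask(value: str) -> str:
    """Keep only the last four characters visible."""
    return "X" * max(len(value) - 4, 0) + value[-4:]


def no_mask(value: str) -> str:
    """Return the stored value unchanged."""
    return value


def read_column(value: str, user_roles: Iterable[str],
                policies: List[MaskingPolicy]) -> str:
    """Return the value as a user with the given roles would see it."""
    # Assumption: every user implicitly holds the "public" role.
    roles = set(user_roles) | {"public"}
    applicable = [p for p in policies if p.role in roles]
    if not applicable:
        return value
    winner = max(applicable, key=lambda p: p.priority)
    return winner.mask(value)


policies = [
    MaskingPolicy(role="public", priority=0, mask=full_mask),
    MaskingPolicy(role="support", priority=10, mask=partial_mask),
    MaskingPolicy(role="auditor", priority=20, mask=no_mask),
]

card = "4111111111111111"  # a single stored copy of the sensitive value
print(read_column(card, [], policies))           # fully masked
print(read_column(card, ["support"], policies))  # last four digits visible
print(read_column(card, ["auditor"], policies))  # unmasked
```

The stored value never changes. Only the view returned to each caller differs, which is why one table can serve users with different levels of access.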
There was other interesting news for Amazon Redshift at re:Invent that you might have already heard about:
- The general availability of Amazon Redshift integration for Apache Spark makes it easy to build and run Spark applications on Amazon Redshift and Redshift Serverless, opening up the data warehouse for a broader set of AWS analytics and machine learning solutions.
- AWS Backup now supports Amazon Redshift. AWS Backup allows you to define a central backup policy to manage data protection of your applications and can also protect your Amazon Redshift clusters. In this way, you have a consistent experience when managing data protection across all supported services.
Availability and Pricing
Multi-AZ deployments, central access control for data sharing with AWS Lake Formation, auto-copy from Amazon S3, and dynamic data masking are available in preview in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), Europe (Ireland), and Europe (Stockholm).
There is no additional cost for using auto-copy from Amazon S3 and near real-time analytics on transactional data. There is no additional charge for dynamic data masking and central access control for data sharing. For more information, see Amazon Redshift pricing.
These new capabilities take you one step further in analyzing all your data across data sources with simple data ingestion capabilities, while improving the security and reliability of your data warehouse.
— Danilo