Transaction Assist in Cloudera Operational Database (COD)

January 11, 2023

1

Posted in Technical |
November 30, 2022 6 min learn

What’s CDP Operational Database (COD)

CDP Operational Database allows builders to rapidly construct future-proof purposes which might be architected to deal with knowledge evolution. It helps builders automate and simplify database administration with capabilities like auto-scale, and is totally built-in with Cloudera Knowledge Platform (CDP). For extra data and to get began with COD, discuss with Getting Began with Cloudera Knowledge Platform Operational Database (COD).

Background

We’ve got divided the “Transaction Assist in Cloudera Operational Database (COD)” weblog into two components.

On this first publish, we’re masking the overview and usages of transaction help in COD.
Within the second, we’re demonstrating easy methods to use transactions in your COD atmosphere with a step-by-step instance. See easy methods to use transactions in COD.

An outline of transaction help in COD

Transactions are a sequence of a number of modifications in a database that have to be accomplished in an order, or canceled to make sure integrity and consistency.

The transaction help in COD allows you to carry out advanced distributed transactions and run atomic cross-row and cross-table database operations. The atomic database operations make sure that your database operations should both be accomplished or terminated.

COD helps Apache OMID (Optimistically Transaction Administration In Datastores) transactional framework that enables massive knowledge purposes to execute ACID transactions—adhering to the ACID properties of atomicity, consistency, isolation and sturdiness—on COD tables. OMID gives lock-free transactional help on prime of HBase with snapshot isolation assure. OMID allows massive knowledge purposes to profit from one of the best of each worlds: the scalability offered by NoSQL datastores akin to HBase, and the concurrency and atomicity offered by transaction processing programs.

How COD manages transactions

When a number of transactions are occurring concurrently at completely different terminals, COD ensures both the HBase tables are up to date for every transaction finish to finish, marking the transaction as accomplished, or terminates the transaction and HBase tables usually are not up to date. COD archives this transaction administration utilizing OMID, which is a transactional processing service, together with HBase and Phoenix.

COD additionally manages related configurations required to help transactions in an effort to use transactions in purposes with none further effort.

Figure1: OMID shopper view

COD robotically performs all of the steps to handle Phoenix transactions. These steps are described in Annexure1.

Learn how to use transactions with completely different purposes

You should utilize COD transactions in streaming purposes or OLTP (On-line Transaction Processing) purposes in addition to batch-oriented Spark purposes.

For extra particulars on deploying transaction help on COD, see Learn how to Use Transactions on COD.

The next are the alternative ways and eventualities via which you should utilize COD transactions.

1 (a): Phoenix thick shopper and skinny shopper (utilizing SQLLine command line) :

// create transactional desk

0: jdbc:phoenix:> create desk bankaccount(customer_id varchar main key,identify varchar, stability double) transactional=true;

No rows affected (2.287 seconds)

// Preliminary knowledge inhabitants

0: jdbc:phoenix:> upsert into bankaccount values('CU001', 'foo', 100.0);

1 row affected (0.017 seconds)

0: jdbc:phoenix:> upsert into bankaccount values('CU002', ' baa', 100.0);

1 row affected (0.015 seconds)

0: jdbc:phoenix:> choose * from bankaccount;

+-------------+------+---------+

| CUSTOMER_ID | NAME | BALANCE |

+-------------+------+---------+

| CU001       | foo  | 100.0   |

| CU002       |  baa | 100.0   |

+-------------+------+---------+

// Auto commit off

0: jdbc:phoenix:> !autocommit off

Autocommit standing: false

// Begins transaction 1 to switch 50 from CU001 to CU002

0: jdbc:phoenix:> upsert into bankaccount(customer_id, stability) choose customer_id, stability - 50 from bankaccount the place customer_id = 'CU001';

1 row affected (0.075 seconds)

0: jdbc:phoenix:> upsert into bankaccount(customer_id, stability) choose customer_id, stability + 50 from bankaccount the place customer_id = 'CU002';

1 row affected (0.021 seconds)

0: jdbc:phoenix:> !commit

Commit full (0.044 seconds)

0: jdbc:phoenix:> choose * from bankaccount;

+-------------+------+---------+

| CUSTOMER_ID | NAME | BALANCE |

+-------------+------+---------+

| CU001       | foo  | 50.0    |

| CU002       |  baa | 150.0   |

+-------------+------+---------+

2 rows chosen (0.068 seconds)

// Begins transaction 2  to switch 20 from CU001 to CU002

0: jdbc:phoenix:> upsert into bankaccount(customer_id, stability) choose customer_id, stability - 20 from bankaccount the place customer_id = 'CU001';

1 row affected (0.014 seconds)

0: jdbc:phoenix:> upsert into bankaccount(customer_id, stability) choose customer_id, stability + 20 from bankaccount the place customer_id = 'CU002';

1 row affected (0.349 seconds)

// Rollback the modifications

0: jdbc:phoenix:> !rollback 

Rollback full (0.007 seconds)

// Ought to get the identical consequence as above after rollback.

0: jdbc:phoenix:> choose * from bankaccount;

+-------------+------+---------+

| CUSTOMER_ID | NAME | BALANCE |

+-------------+------+---------+

| CU001       | foo  | 50.0    |

| CU002       |  baa | 150.0   |

+-------------+------+---------+

2 rows chosen (0.038 seconds)

1 (b): Phoenix thick and skinny shopper (utilizing Java software):

strive (Connection conn = DriverManager.getConnection(jdbcUrl)) {

 Assertion stmt = conn.createStatement();

 stmt.execute("CREATE TABLE IF NOT EXISTS ITEM " +

     " (id varchar not null main key, identify varchar, amount integer) transactional=true");

 conn.setAutoCommit(false);

 stmt = conn.createStatement();

 stmt.execute("UPSERT INTO ITEM VALUES('ITM001','E-book', 5)");

 stmt.execute("UPSERT INTO ITEM VALUES('ITM002','Pen', 5)");

 conn.commit();

 stmt.execute("UPSERT INTO ITEM VALUES('ITM003','Cleaning soap', 5)");

 conn.rollback();

 ResultSet rs = stmt.executeQuery("SELECT rely(*) FROM ITEM");

 // The variety of rows needs to be two.

 System.out.println("Variety of rows " + rs.subsequent());

}

With exception dealing with

strive (Connection conn = DriverManager.getConnection(jdbcUrl)) {

   strive {

       Assertion stmt = conn.createStatement();

       stmt.execute("CREATE TABLE IF NOT EXISTS ITEM " +

               " (id varchar not null main key, identify varchar, amount integer) transactional=true");

       conn.setAutoCommit(false);

       stmt = conn.createStatement();

       stmt.execute("UPSERT INTO ITEM VALUES('ITM001','E-book', 5)");

       stmt.execute("UPSERT INTO ITEM VALUES('ITM002','Pen', 5)");

       conn.commit();

   } catch (SQLException e) {

       LOG.error("Error occurred whereas performing transaction:", e);

       conn.rollback();

       // dealing with the exception object

       throw new RuntimeException(e);

   }

}

2: Phoenix spark software: you should utilize Phoenix-Spark connector transactions to retry the Spark duties if there are any conflicts with different jobs or streaming purposes.

COD helps the next two kinds of transactions whereas writing into the tables.

Batch clever transactions: Set phoenix.upsert.batch.measurement to any constructive integer worth to create transactions for a batch of a specific variety of rows.
Partition clever transactions: Set phoenix.upsert.batch.measurement to 0 to create one transaction per process.

Git hyperlink for the instance code: https://github.com/cloudera/cod-examples/tree/fundamental/phoenix-spark-transactions

val tableName: String = "SPARK_TEST"

val conn = DriverManager.getConnection(url)

var stmt = conn.createStatement();

stmt.execute("CREATE TABLE SPARK_TEST

 " (ID INTEGER PRIMARY KEY, COL1 VARCHAR, COL2 INTEGER) TRANSACTIONAL=true" +

 " SPLIT ON (200, 400, 600, 800, 1000)")

val spark = SparkSession

 .builder()

 .appName("phoenix-test")

 .grasp("native")

 .getOrCreate()

val schema = StructType(

 Seq(StructField("ID", IntegerType, nullable = false),

   StructField("COL1", StringType),

   StructField("COL2", IntegerType)))

// Write rows from 1 to 500.

var dataSet = Checklist(Row(1, "1", 1), Row(2, "2", 2))

for (w <- 3 to 500) {

 dataSet = dataSet :+ Row(w, "foo", w);

}

var rowRDD = spark.sparkContext.parallelize(dataSet)

var df = spark.sqlContext.createDataFrame(rowRDD, schema)

// Batch clever transactions:

// ========================

// Setting batch measurement to 100. For every batch of 100 information one transaction will get created.

var extraOptions = "phoenix.transactions.enabled=true,phoenix.upsert.batch.measurement=100";

df.write

 .format("phoenix")

 .choices(Map("desk" -> tableName, PhoenixDataSource.ZOOKEEPER_URL -> zkUrl,

   PhoenixDataSource.PHOENIX_CONFIGS -> extraOptions))

 .mode(SaveMode.Overwrite)

 .save()

// Write rows from 500 to 1000.

dataSet = Checklist(Row(501, "500", 500), Row(502, "502", 502))

for (w <- 503 to 1000) {

 dataSet = dataSet :+ Row(w, ""+w, w);

}

// Partition clever transactions:

// ===========================

// Setting batch measurement 0 means for partition one transaction will get created.

rowRDD = spark.sparkContext.parallelize(dataSet)

df = spark.sqlContext.createDataFrame(rowRDD, schema)

extraOptions = "phoenix.transactions.enabled=true,phoenix.upsert.batch.measurement=0";

df.write

 .format("phoenix")

 .choices(Map("desk" -> tableName, PhoenixDataSource.ZOOKEEPER_URL -> zkUrl,

   PhoenixDataSource.PHOENIX_CONFIGS -> extraOptions))

 .mode(SaveMode.Overwrite)

 .save()

Learn how to use transactions with completely different instruments

The principle operations that you simply use whereas accessing COD transactions are auto commit on/off, commit, and rollback. These operations are carried out in several methods with completely different instruments.

On this part, you’ll find a hyperlink to a well-liked SQL improvement software like DbVisualizer and an instance snippet.

DbVisualizer:

https://confluence.dbvis.com/show/UG100/Auto+Commitpercent2C+Commit+and+Rollback

Abstract

On this weblog publish, we’ve mentioned how COD manages transactions occurring at a number of terminals utilizing OMID. We’ve got additionally included numerous eventualities the place you embrace COD transactions and an end-to-end stream describing how one can implement transactions in an actual time state of affairs.

So, are you able to check out COD transactions help? Right here’s your first step on making a database utilizing COD.

Annexure

Annexure1:

Step 1: The next property in HBase UI > Configurations tab is about as ‘true’.

phoenix.transactions.enabled=true

Step 2: COD generates OMID shopper configuration file, hbase-omid-client-config.yml, that accommodates the transaction server tackle.

You should utilize the next command to obtain the shopper configuration file and use the configuration in software classpath together with hbase-site.xml.