DB Invent announces YaXAHA Cluster

March 27, 2022 · 3 minute read
Kirill Zinov

We’re excited to announce the upcoming beta release of our PostgreSQL database clustering solution. We named our project YaXAHA Cluster, and it allows turning almost any PostgreSQL database into a scalable failover cluster.

Internals

We expect that by the time of release, both the transformation of an existing PostgreSQL instance into a cluster node and the deployment of new cluster nodes will be automated. Now, a few words about the YaXAHA Cluster architecture.

Our extension is installed into the PostgreSQL installation directory, and then our server application is launched; PgBouncer is also highly recommended. After a restart, your PostgreSQL instance essentially becomes a cluster node.

Of course, to provide fault tolerance, redundancy must be added. Say our cluster will consist of three nodes; that means we need to set up two more nodes in the same way as described above.

In this announcement, without going into much detail, we'll briefly go over the components of the solution and how it all works in general.

Before committing a transaction on the node where it was started, our extension sends the transaction data to our server application and waits for a response. The server application tries to replay the same transaction on the other cluster nodes and, depending on success or failure, the transaction on the initiating node either commits or rolls back.
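To make this flow concrete, here is a minimal Python sketch. The names `server_app.replay_on_other_nodes`, `local_db`, and `ReplicationResult` are illustrative placeholders of our own, not the actual YaXAHA interfaces.

```python
# Illustrative sketch only: hypothetical names, not the actual YaXAHA API.
# It models the commit flow described above: the initiating node asks the
# server application to replay the transaction elsewhere before committing.

from dataclasses import dataclass


@dataclass
class ReplicationResult:
    ok: bool          # did the other nodes replay the transaction successfully?
    reason: str = ""  # error description when ok is False


def commit_or_rollback(transaction, server_app, local_db):
    """Decide the fate of a local transaction based on the cluster's answer."""
    # 1. Hand the transaction data to the server application and wait.
    result: ReplicationResult = server_app.replay_on_other_nodes(transaction)

    # 2. Commit locally only if the cluster confirmed the replay.
    if result.ok:
        local_db.commit(transaction)
    else:
        local_db.rollback(transaction)
        raise RuntimeError(f"Transaction rolled back: {result.reason}")
```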

Performance and consistency

At the moment, we have completed internal tests for the mode in which the initiating node only needs confirmation of successful execution from a subset of the other cluster nodes in order to consider a transaction successful.

The minimum number of confirming nodes required for the desired fault tolerance can be adjusted in the settings to match the requirements of a particular deployment. The remaining nodes will, of course, also replay the transaction asynchronously.

For example, if your cluster has 10 nodes, you can set the minimum required number of successful confirmations from other nodes to five, and the initiating node will not wait for all cluster nodes to report success.
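As an illustration of this quorum behavior (not the actual implementation), the following Python sketch waits for the first `min_confirmations` successful replays and lets the rest finish in the background; `replay_on_node` is a hypothetical callback that returns True or False for one node.

```python
# A minimal sketch of quorum-based confirmation, assuming replay_on_node(node)
# returns True on success and False on failure rather than raising.

from concurrent.futures import ThreadPoolExecutor, as_completed


def wait_for_quorum(replay_on_node, nodes, min_confirmations):
    """Return True once `min_confirmations` nodes confirm the transaction."""
    pool = ThreadPoolExecutor(max_workers=len(nodes))
    futures = [pool.submit(replay_on_node, node) for node in nodes]
    confirmed = 0
    try:
        for future in as_completed(futures):
            if future.result():              # node reported success
                confirmed += 1
                if confirmed >= min_confirmations:
                    return True              # quorum reached
        return False                         # all nodes answered, quorum not met
    finally:
        # Let the remaining replays finish in the background (asynchronously,
        # from the initiator's point of view) instead of blocking on them.
        pool.shutdown(wait=False)


# Example: a 10-node cluster where 5 confirmations from the other 9 nodes
# are enough for the initiator to consider the transaction successful.
# ok = wait_for_quorum(send_transaction_to, other_nodes, min_confirmations=5)
```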

This speeds up responses to clients and provides fault tolerance for the cluster: when something goes wrong on a node, or a data transfer error occurs between nodes, the cluster as a whole remains operational, and the problem node can catch up later or be replaced.

No dedicated master!

YaXAHA Cluster does not have a dedicated master at all. Technically, the cluster periodically elects a master node that makes cluster-wide decisions, but if this node fails, the cluster re-elects a healthy one.

Although the cluster does elect a master node and performs re-elections from time to time, a client application can start a transaction on any node of the cluster, not necessarily the elected master. Thus, the cluster has neither a single entry point for transactions nor a single point of failure.
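We don't describe the election algorithm itself here, so the sketch below only illustrates the promised behavior: the current master keeps its role while it is healthy, and otherwise any healthy node can take over. The deterministic tie-break rule is ours for the sake of the example, not YaXAHA's.

```python
# Purely illustrative re-election sketch, not the actual election algorithm.

def pick_master(nodes, current_master, is_healthy):
    """Keep the current master while it is healthy; otherwise re-elect."""
    if current_master in nodes and is_healthy(current_master):
        return current_master
    healthy = [n for n in nodes if is_healthy(n)]
    if not healthy:
        raise RuntimeError("no healthy node available for election")
    # Deterministic tie-break so every node arrives at the same answer;
    # the real cluster may use a different rule.
    return min(healthy)
```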

Smart data locks

To prevent logical data conflicts, we detect the boundaries of each transaction by analyzing the dependencies among the tables participating in it.

Transaction boundaries are determined from the changes being made and the dependency types of the data involved. If a transaction's boundaries overlap with those of another unfinished transaction, it waits its turn.
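Here is a minimal sketch of the "overlap means wait" rule, under the simplifying assumption that a boundary is just the set of tables a transaction touches plus the tables they depend on; the real boundary analysis is more detailed than this.

```python
# Simplified model: a transaction's boundary = touched tables + their
# dependent tables. Overlapping boundaries force the new transaction to wait.

def transaction_boundary(touched_tables, dependencies):
    """Expand the touched tables with their dependent tables."""
    boundary = set(touched_tables)
    for table in touched_tables:
        boundary |= dependencies.get(table, set())
    return boundary


def must_wait(new_boundary, in_flight_boundaries):
    """A transaction waits if its boundary overlaps any unfinished one."""
    return any(new_boundary & other for other in in_flight_boundaries)


# Example: an order insert depends on the customers table, so it conflicts
# with an unfinished transaction that is updating customers.
deps = {"orders": {"customers"}}
new_tx = transaction_boundary({"orders"}, deps)
print(must_wait(new_tx, [{"customers"}]))  # True -> wait for its turn
```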

Thus, YaXAHA Cluster helps to prevent logical data conflicts within the boundaries of the entire cluster.

Kirill Zinov
CIO / Principal Software Engineer