Business continuity and disaster recovery
Overview
This guide covers how to implement a robust Business Continuity and Disaster Recovery (BCDR) strategy within Rayls, mitigating risks and ensuring continuous operation in the face of unforeseen events.
This section outlines strategies for maintaining high availability and resilience across both the Private Subnet and Privacy Ledger infrastructures, with a focus on disaster recovery, containerized deployments using Docker Compose and Kubernetes, and multi-region setups.
Rayls Components
We recommend deploying all Rayls components in containers. Containerization enables several BCDR strategies that can accommodate different companies' Business Continuity Plan (BCP) policies and requirements.
Containers provide numerous benefits for disaster recovery capabilities, including isolation, portability, and fast deployment. By encapsulating applications and dependencies, containers ensure consistency and compatibility across different environments, facilitating rapid deployment and recovery processes.
Container orchestration platforms further enhance disaster recovery by automating tasks such as failover and load balancing, while versioning and rollback capabilities enable organizations to revert to known good states quickly and reliably.
Overall, containers offer a flexible, efficient, and reliable platform for implementing disaster recovery strategies, enabling organizations to maintain business continuity and minimize downtime in the face of disasters.
Major cloud providers offer managed Kubernetes services, which are available in multiple availability zones. Leveraging these services, Rayls components can be distributed across availability zones through Kubernetes orchestration.
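As an illustration of distributing a component across availability zones, the sketch below builds a Kubernetes Deployment manifest that uses a topology spread constraint on the standard `topology.kubernetes.io/zone` node label. The component name and image are hypothetical placeholders, not names from the Rayls Installation Package, and the replica count is an assumption.

```python
import json


def zone_spread_deployment(name: str, image: str, replicas: int = 3) -> dict:
    """Build a Deployment manifest whose pods are spread evenly across zones."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Keep the pod count per zone within 1 of every other zone.
                    "topologySpreadConstraints": [
                        {
                            "maxSkew": 1,
                            "topologyKey": "topology.kubernetes.io/zone",
                            "whenUnsatisfiable": "DoNotSchedule",
                            "labelSelector": {"matchLabels": labels},
                        }
                    ],
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


# Hypothetical component name and image, for illustration only.
manifest = zone_spread_deployment(
    "rayls-component", "registry.example.com/rayls-component:1.0.0"
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON manifests, the printed output can be saved to a file and applied with `kubectl apply -f`.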
Reach out to us for full access to our Rayls Installation Package.
Third Party Components
MongoDB
For customers opting to use the cloud-based MongoDB service, MongoDB Atlas, a replica set is deployed by default with at least three nodes, distributed across Availability Zones (AZs) provided by the Cloud Provider. Additionally, utilizing MongoDB Atlas enables the establishment of a Multi-Cloud and Multi-Region Disaster Recovery (DR) architecture.
For customers who prefer to install and manage their own MongoDB cluster, various resilient architectures are available. We recommend referring to the official MongoDB documentation for detailed guidance. As a starting point, a MongoDB Replica Set with five nodes spread across multiple Availability Zones provides a resilient architecture suitable for mission-critical workloads.
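A five-node layout can be checked before deployment. The sketch below builds a replica set configuration document from (host, zone) pairs and tests whether a majority of members survives the loss of any single zone, since a replica set needs a majority to elect a primary. The host names, zone names, and replica set name are hypothetical, and the check assumes every member is a voting member.

```python
from collections import Counter


def replica_set_config(name: str, members: list[tuple[str, str]]) -> dict:
    """Build a replica set configuration document from (host, zone) pairs."""
    return {
        "_id": name,
        "members": [
            {"_id": i, "host": host, "tags": {"zone": zone}}
            for i, (host, zone) in enumerate(members)
        ],
    }


def survives_any_zone_loss(config: dict) -> bool:
    """True if a majority of members remains after losing any single zone."""
    members = config["members"]
    majority = len(members) // 2 + 1
    per_zone = Counter(m["tags"]["zone"] for m in members)
    return all(len(members) - lost >= majority for lost in per_zone.values())


# Hypothetical hosts: five nodes in a 2-2-1 layout across three zones.
config = replica_set_config(
    "rs0",
    [
        ("mongo-1.example.internal:27017", "az-a"),
        ("mongo-2.example.internal:27017", "az-a"),
        ("mongo-3.example.internal:27017", "az-b"),
        ("mongo-4.example.internal:27017", "az-b"),
        ("mongo-5.example.internal:27017", "az-c"),
    ],
)
print(survives_any_zone_loss(config))  # True: at least 3 of 5 members remain
```

With only two zones, one zone would have to hold three of the five members, and losing it would leave the set without a majority, so three or more zones are the safer starting point.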
Read more in the MongoDB documentation.
Commit Chain
Out of the box, the Rayls Private Subnet supports any EVM-based blockchain in the commit-chain role. For the current version, we recommend Hyperledger Besu. The architecture of Hyperledger Besu is beyond the scope of this document and is covered in the official Hyperledger Besu documentation, but note that the QBFT consensus protocol requires a minimum of four validator nodes to be Byzantine fault tolerant. We therefore recommend distributing these Besu nodes across different Availability Zones, regions, or cloud providers for enhanced resilience and fault tolerance.
“In QBFT networks, approved accounts, known as validators, validate transactions and blocks. Validators take turns to create the next block. Before inserting the block onto the chain, a super-majority (greater than or equal to 2/3) of validators must first sign the block.”
For production and mission-critical applications, we highly recommend deploying five Besu validators distributed across multiple Availability Zones, regions, and/or cloud providers.
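The super-majority rule quoted above can be turned into a quick placement check. The sketch below assumes the quorum is the smallest whole number of validators that is at least 2/3 of the total, which is a direct reading of the quoted sentence rather than a statement about Besu internals. The zone names are hypothetical.

```python
from collections import Counter


def qbft_quorum(validators: int) -> int:
    """Smallest signature count that is >= 2/3 of the validator count."""
    return (2 * validators + 2) // 3  # integer form of ceil(2n / 3)


def tolerated_outages(validators: int) -> int:
    """Validators that can be unavailable while a quorum can still sign."""
    return validators - qbft_quorum(validators)


def keeps_quorum_after_zone_loss(zones: list[str]) -> bool:
    """True if a quorum remains after losing any single zone.

    `zones` holds one entry per validator, naming the zone it runs in.
    """
    total = len(zones)
    per_zone = Counter(zones)
    return all(total - lost >= qbft_quorum(total) for lost in per_zone.values())


for n in (4, 5, 7):
    print(n, qbft_quorum(n), tolerated_outages(n))

# One validator per zone keeps a quorum when any single zone is lost.
print(keeps_quorum_after_zone_loss(["az-a", "az-b", "az-c", "az-d", "az-e"]))
# Two validators sharing a zone can drop the network below quorum.
print(keeps_quorum_after_zone_loss(["az-a", "az-a", "az-b", "az-b", "az-c"]))
```

Under this reading, placement matters as much as the validator count: no single zone, region, or provider should host more validators than the network can afford to lose at once.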
Read more in Hyperledger Besu Documentation.
PostgreSQL
PostgreSQL offers features for creating multiple Read Replicas, which can be strategically spread across multiple Availability Zones, regions, or cloud providers to enhance fault tolerance and scalability. Furthermore, leading cloud providers offer advanced solutions, such as Amazon Aurora, which go beyond the standard PostgreSQL features. For instance, Amazon Aurora allows the deployment of multiple write instances and read replicas across different regions, providing even greater resilience and performance.
“Amazon Aurora Multi-Master is now generally available, allowing you to create multiple read-write instances of your Aurora database across multiple Availability Zones, which enables uptime-sensitive applications to achieve continuous write availability through instance failure. In the event of instance or Availability Zone failures, Aurora Multi-Master enables the Aurora database to maintain read and write availability with zero application downtime. With Aurora Multi-Master, there is no need for database failovers to resume write operations.”
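When a primary and its replicas are spread across zones, clients need a way to find the writable instance after a failover. One option for libpq-based PostgreSQL clients is a multi-host connection URI with `target_session_attrs=read-write`, which tries the listed hosts in order until one accepts writes. The sketch below only builds such a URI. The host names, database name, and user are hypothetical, and whether a given driver honors multi-host URIs should be confirmed in that driver's documentation.

```python
from urllib.parse import quote


def multi_host_dsn(
    hosts: list[str], database: str, user: str, read_write: bool = True
) -> str:
    """Build a libpq-style URI listing several hosts.

    With read_write=True the client is asked to connect only to a host
    that accepts writes (the current primary); otherwise any host is
    acceptable, which suits read-only traffic served by replicas.
    """
    attrs = "read-write" if read_write else "any"
    host_list = ",".join(hosts)
    return (
        f"postgresql://{quote(user)}@{host_list}/{quote(database)}"
        f"?target_session_attrs={attrs}"
    )


# Hypothetical hosts in two availability zones.
dsn = multi_host_dsn(
    ["pg-a.example.internal:5432", "pg-b.example.internal:5432"],
    database="rayls",
    user="app",
)
print(dsn)
```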
Multi-Region and Disaster Recovery Strategy
To ensure the highest level of availability and disaster tolerance, all key components should be deployed across multiple regions or availability zones, regardless of whether you use Docker Compose or Kubernetes.
- Data Replication: Databases like MongoDB should be configured to replicate data across multiple regions, minimizing downtime and ensuring data integrity in case of regional failures.
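- Failover Automation: Within a Kubernetes cluster, failover is handled automatically: failed pods are rescheduled onto healthy nodes, and traffic is routed only to pods that are ready. Failing over to another region typically requires a cluster in that region plus a global load balancer or DNS-based failover. With Docker Compose, failover must be configured manually, but it can still be achieved through load balancers or cloud services.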
- Cross-Region Load Balancing: Load balancing across regions ensures optimal performance and redundancy, keeping the infrastructure operational during high-traffic periods or failures.
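The failover and load-balancing points above can be sketched as a health-check-driven endpoint selector, which is the kind of logic a manually configured Docker Compose setup would place in front of its regional deployments. This is a minimal sketch under stated assumptions: the endpoint URLs and the `/health` path are hypothetical, and a production setup would more likely rely on a managed load balancer or DNS failover than on client-side code.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Sequence


def http_health_check(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


def pick_endpoint(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool] = http_health_check,
) -> Optional[str]:
    """Return the first healthy endpoint in priority order, or None."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


# Hypothetical regional endpoints, primary region first.
REGIONAL_ENDPOINTS = [
    "https://region-a.example.com/health",
    "https://region-b.example.com/health",
]

# Simulate an outage of the primary region with an injected health check.
active = pick_endpoint(REGIONAL_ENDPOINTS, is_healthy=lambda url: "region-b" in url)
print(active)
```

Passing the health check as a parameter keeps the selection logic testable without network access and lets the probe be swapped for whatever readiness signal each component exposes.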