System Redundancy and Node Fault Tolerance
Awanio is designed with redundancy at every architectural layer, ensuring that failure in any single component does not cause downtime or degrade performance.
Redundancy Features
- Multi-level Redundancy: Redundancy is implemented across management servers, network, storage, and compute nodes.
- Automatic Failover: Workloads (VMs, containers, services) are automatically migrated to healthy nodes with no manual intervention.
- Node Redundancy (tolerates at least 1 node failure): Each cluster is designed to continue normal operations even when one node becomes unavailable.
- Dynamic Load Redistribution: Load from a failed node is automatically balanced across the remaining active nodes.
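The failover and load redistribution behavior listed above can be illustrated with a small sketch. This is not Awanio's scheduler: the function name `redistribute`, the node and VM names, and the least-loaded placement policy are all assumptions chosen for illustration.

```python
from typing import Dict, List


def redistribute(placement: Dict[str, List[str]], failed: str) -> Dict[str, List[str]]:
    """Move workloads off a failed node onto the least-loaded healthy nodes.

    `placement` maps a node name to the workloads it runs. The least-loaded
    policy is an assumption, not Awanio's documented algorithm.
    """
    if failed not in placement:
        raise KeyError(f"unknown node: {failed}")
    # Copy the healthy nodes so the caller's mapping is left untouched.
    healthy = {node: list(work) for node, work in placement.items() if node != failed}
    if not healthy:
        raise RuntimeError("no healthy nodes left to receive workloads")
    for workload in placement[failed]:
        # Pick the node with the fewest workloads; ties break by node name.
        target = min(healthy, key=lambda node: (len(healthy[node]), node))
        healthy[target].append(workload)
    return healthy


# Hypothetical 3-node cluster in which node-c fails.
cluster = {
    "node-a": ["vm-1", "vm-2"],
    "node-b": ["vm-3"],
    "node-c": ["vm-4", "vm-5"],
}
after = redistribute(cluster, failed="node-c")
print(after)
```

In this example the two workloads from the failed node are spread over the two survivors rather than stacked on one of them, which is the intent of dynamic load redistribution. A real scheduler would also weigh CPU, memory, and storage locality, none of which are modeled here.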
Benefits
- Guarantees high availability and service continuity
- Eliminates single points of failure
- Maintains consistent performance during hardware faults
Technical Architecture
Awanio stores data with a default replication factor (RF) of 3, meaning each data block is stored on three different nodes. A copy of every block therefore survives the loss of one or two of those nodes, although a cluster reduced to a single active node is in a critical state (see the failover simulation). The Awanio management layer continuously monitors node health and workload distribution.
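To make the effect of RF=3 concrete, the following sketch places each block on three distinct nodes and counts the copies that remain after a failure. It is only an illustration: the names `place_replicas` and `surviving_copies`, the four-node cluster, and the round-robin placement are assumptions. Ceph itself decides placement with its CRUSH algorithm, which is not reproduced here.

```python
from typing import Dict, List, Set

REPLICATION_FACTOR = 3  # default RF stated in the text


def place_replicas(blocks: List[str], nodes: List[str],
                   rf: int = REPLICATION_FACTOR) -> Dict[str, List[str]]:
    """Assign each block to `rf` distinct nodes (round-robin stand-in for CRUSH)."""
    if rf > len(nodes):
        raise ValueError(f"need at least {rf} nodes, got {len(nodes)}")
    placement = {}
    for i, block in enumerate(blocks):
        # Consecutive indices modulo len(nodes) are distinct because rf <= len(nodes).
        placement[block] = [nodes[(i + k) % len(nodes)] for k in range(rf)]
    return placement


def surviving_copies(placement: Dict[str, List[str]], failed: Set[str]) -> Dict[str, int]:
    """Count, per block, the replicas that sit on nodes outside `failed`."""
    return {
        block: sum(1 for node in holders if node not in failed)
        for block, holders in placement.items()
    }


# Hypothetical 4-node cluster; two nodes fail at the same time.
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_replicas(["blk-0", "blk-1", "blk-2", "blk-3"], nodes)
copies = surviving_copies(placement, failed={"node-a", "node-b"})
print(copies)
```

Because every block has three holders on three different nodes, removing any two nodes leaves at least one copy of each block. Whether that last copy can still serve I/O depends on storage settings that this sketch does not model.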
An Awanio cluster requires a minimum of 3 nodes to satisfy the replication factor, and each node serves as both compute and storage. Virtual machines (VMs) run on top of the Awanio virtualization layer, while all VM data is stored in a Ceph distributed storage pool.
Failover Simulation – 3-Node Cluster
| Nodes Down | Nodes Active | Ceph Status (effective RF) | VM Status | Service Status | Description |
|---|---|---|---|---|---|
| 0 | 3 | RF=3 | Normal | Normal | Ideal |
| 1 | 2 | RF=2 | Auto failover | Still running | Safe |
| 2 | 1 | RF=1 | Unstable | At risk of outage | CRITICAL |
| 3 | 0 | - | - | DOWN | Total failure |
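The table can also be expressed as a small lookup, which is handy for monitoring scripts or capacity discussions. This is a minimal sketch under the table's own assumptions (a 3-node cluster with RF=3 and one replica per node); the names `Row`, `simulate`, and `DESCRIPTION` are hypothetical and not part of any Awanio API.

```python
from typing import NamedTuple

TOTAL_NODES = 3  # cluster size used in the table above

# "Description" column of the table, keyed by the number of active nodes.
DESCRIPTION = {3: "Ideal", 2: "Safe", 1: "CRITICAL", 0: "Total failure"}


class Row(NamedTuple):
    nodes_down: int
    nodes_active: int
    replicas_available: int
    description: str


def simulate(nodes_down: int) -> Row:
    """Return the cluster state for a given number of failed nodes."""
    if not 0 <= nodes_down <= TOTAL_NODES:
        raise ValueError(f"nodes_down must be between 0 and {TOTAL_NODES}")
    active = TOTAL_NODES - nodes_down
    # With one replica per node in a 3-node cluster, the copies still
    # reachable equal the number of active nodes.
    return Row(nodes_down, active, active, DESCRIPTION[active])


rows = [simulate(down) for down in range(TOTAL_NODES + 1)]
for row in rows:
    print(row)
```

The sketch reports zero available replicas when all nodes are down, where the table shows a dash. Larger clusters would need their own thresholds, which the source does not specify.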