System Redundancy and Node Fault Tolerance

Awanio is designed with redundancy at every architectural layer, so that the failure of any single component does not cause downtime or degrade performance.

Redundancy Features

  • Multi-level Redundancy
    Redundancy is implemented across management servers, network, storage, and compute nodes.

  • Automatic Failover
    Workloads (VMs, containers, services) are automatically migrated to healthy nodes with no manual intervention.

  • Node Redundancy (Tolerates at Least 1 Node Failure)
    Each cluster is designed to continue normal operations even when one node becomes unavailable.

  • Dynamic Load Redistribution
    Load from a failed node is automatically balanced across active nodes.
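
The automatic failover and load redistribution described above can be illustrated with a minimal sketch. It assumes a simple greedy rule (each displaced workload goes to the currently least-loaded healthy node); the scheduler Awanio actually uses is not described here, and the function name, node names, and load figures are hypothetical.

```python
import heapq


def redistribute(nodes, placement, load, failed_node):
    """Return a new placement with all workloads moved off failed_node.

    nodes:     list of node names in the cluster
    placement: dict mapping workload name -> node name
    load:      dict mapping workload name -> numeric load (e.g. vCPUs)
    Assumption: greedy least-loaded placement, not Awanio's real policy.
    """
    healthy = [n for n in nodes if n != failed_node]
    if not healthy:
        raise RuntimeError("no healthy nodes left to receive workloads")

    # Current load on each healthy node.
    totals = {n: 0 for n in healthy}
    for workload, node in placement.items():
        if node != failed_node:
            totals[node] += load[workload]

    # Min-heap of (load, node) so the least-loaded node is always on top.
    heap = [(total, n) for n, total in totals.items()]
    heapq.heapify(heap)

    new_placement = dict(placement)
    # Place the heaviest displaced workloads first for a more even result.
    displaced = sorted(
        (w for w, n in placement.items() if n == failed_node),
        key=lambda w: load[w],
        reverse=True,
    )
    for workload in displaced:
        total, node = heapq.heappop(heap)
        new_placement[workload] = node
        heapq.heappush(heap, (total + load[workload], node))
    return new_placement


nodes = ["node-1", "node-2", "node-3"]
placement = {"vm-a": "node-1", "vm-b": "node-1", "vm-c": "node-2", "vm-d": "node-3"}
load = {"vm-a": 4, "vm-b": 2, "vm-c": 2, "vm-d": 4}

after = redistribute(nodes, placement, load, failed_node="node-1")
print(after)
```

In this example the two workloads from the failed node end up on different surviving nodes, so neither healthy node absorbs the whole displaced load.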


Benefits

  • Guarantees high availability and service continuity
  • Eliminates single points of failure
  • Maintains consistent performance during hardware faults

Technical Architecture

Awanio stores data with a default replication factor (RF) of 3, meaning each data block is stored on three different nodes. A copy of every block therefore survives even if one or two of those nodes fail, although a block reduced to a single copy has no remaining redundancy. The Awanio management layer continuously monitors node health and workload distribution.
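
To make the replication factor concrete, the following sketch places each block on three distinct nodes and counts the copies that survive a failure. The round-robin placement rule and all names are assumptions made for illustration only; the real placement algorithm is not described in this document.

```python
def place_replicas(block_id, nodes, rf=3):
    """Pick rf distinct nodes for a block (illustrative round-robin rule)."""
    if rf > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    start = block_id % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]


def surviving_replicas(replicas, failed):
    """Return the replica locations that are not on a failed node."""
    return [node for node in replicas if node not in failed]


nodes = ["node-1", "node-2", "node-3", "node-4"]
replicas = place_replicas(block_id=5, nodes=nodes)
left = surviving_replicas(replicas, failed={"node-2", "node-3"})

print(replicas)  # ['node-2', 'node-3', 'node-4']
print(left)      # ['node-4']
```

With two of the three replica nodes down, one copy of the block remains readable, which matches the claim that data survives up to two node failures at RF=3.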

An Awanio cluster uses a minimum of 3 nodes, matching the replication factor, and each node serves as both a compute node and a storage node. Virtual machines (VMs) run on top of the Awanio virtualization layer, while all VM data is stored in a Ceph distributed storage pool.


Failover Simulation – 3-Node Cluster

Nodes Down  Nodes Active  Ceph Status  VM Status      Service Status  Description
0           3             RF=3         Normal         Normal          Ideal
1           2             RF=2         Auto failover  Still running   Safe
2           1             RF=1         Unstable       Risk of down    CRITICAL
3           0             -            -              DOWN            Total failure
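
The failover table above can also be expressed as a small lookup, which may be handy for monitoring scripts or documentation tests. This is only a sketch: the class and function names are hypothetical, and the values are copied from the table for a 3-node cluster rather than derived from any Awanio API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    nodes_down: int
    nodes_active: int
    ceph_status: str
    vm_status: str
    service_status: str
    description: str


TOTAL_NODES = 3

# One entry per table row, keyed by the number of failed nodes:
# (Ceph status, VM status, service status, description)
_ROWS = {
    0: ("RF=3", "Normal", "Normal", "Ideal"),
    1: ("RF=2", "Auto failover", "Still running", "Safe"),
    2: ("RF=1", "Unstable", "Risk of down", "CRITICAL"),
    3: ("-", "-", "DOWN", "Total failure"),
}


def cluster_state(nodes_down):
    """Return the expected state of a 3-node cluster with nodes_down failures."""
    if nodes_down not in _ROWS:
        raise ValueError(f"nodes_down must be between 0 and {TOTAL_NODES}")
    ceph, vm, service, description = _ROWS[nodes_down]
    return ClusterState(
        nodes_down, TOTAL_NODES - nodes_down, ceph, vm, service, description
    )


for down in range(TOTAL_NODES + 1):
    state = cluster_state(down)
    print(f"{state.nodes_down} down, {state.nodes_active} active: {state.description}")
```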