Fault-tolerant systems can respond to errors in two ways: roll forward and roll back. When an error is encountered, the roll-forward mechanism takes the system in its current state and corrects the error to bring it to the required state. The roll-back mechanism is the opposite: the system is returned to a previous state or version, and from there the error-correction mechanism moves it forward to the required state. Implementing a fault-tolerant system involves several requirements, a few of which are listed below.

  • There should be no single point of failure in the system
  • Faults should be isolated to the failing component so that they do not spread to the rest of the system
  • Fault containment should be provided at a high level, as specified by the system definition
  • Multiple versions of the system should be available for roll back and roll forward
  • Reliable failure recovery methods should be implemented so that the system can recover from possible failures
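
The roll-back and roll-forward requirement above can be illustrated with a minimal sketch. It assumes a toy account state held in a dictionary; the names `roll_back`, `roll_forward`, `deposit_50`, and `recompute_balance` are hypothetical and chosen only for illustration, not taken from any particular system.

```python
import copy


def roll_back(checkpoint, pending_ops):
    """Backward recovery: restore a saved state, then re-apply the operations."""
    state = copy.deepcopy(checkpoint)  # never mutate the saved version
    for op in pending_ops:
        op(state)
    return state


def roll_forward(current, repair):
    """Forward recovery: correct the erroneous current state without restoring."""
    state = copy.deepcopy(current)
    repair(state)
    return state


def deposit_50(state):
    # Hypothetical operation performed after the checkpoint was taken.
    state["ledger"].append(50)
    state["balance"] += 50


def recompute_balance(state):
    # Hypothetical repair: rebuild the balance from the intact ledger.
    state["balance"] = sum(state["ledger"])


checkpoint = {"ledger": [100], "balance": 100}     # last known-good version
corrupted = {"ledger": [100, 50], "balance": -1}   # current state with an error

recovered_back = roll_back(checkpoint, [deposit_50])
recovered_forward = roll_forward(corrupted, recompute_balance)
```

In this sketch both paths arrive at the same required state. Roll back needs a stored earlier version plus the operations to replay, while roll forward needs enough intact information in the current state to repair it.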

These requirements make it clear that a fault-tolerant system must be designed carefully to handle the errors that can occur in the system where it is deployed. Such systems are generally complex, and the level of complexity depends on the risks the system faces.

If a fault-tolerant system fails to detect an error or failure, this indicates that the risk of that failure is higher than was estimated when the fault-tolerant system was first developed. The performance of a fault-tolerant system does not depend on the hardware level alone; the actual implementation is done at the application level.

Fault-tolerant systems are generally built on the principle of redundancy: duplicate or higher versions of the system are maintained at different levels and are brought into service whenever a fault occurs in the running system. Replication and redundancy are the techniques most commonly used in fault-tolerant systems.
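
As a rough illustration of redundancy, the sketch below keeps a list of replicas and falls back to the next one when a replica raises an error. The function `call_with_failover` and the two toy replicas are hypothetical names, and the sketch assumes that a failed replica signals its fault by raising an exception.

```python
def call_with_failover(replicas, request):
    """Try each redundant replica in order and return the first successful result."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(exc)
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


def faulty_primary(request):
    # Hypothetical primary that is currently down.
    raise ConnectionError("primary unavailable")


def healthy_backup(request):
    # Hypothetical duplicate that provides the same service.
    return request.upper()


result = call_with_failover([faulty_primary, healthy_backup], "ping")
```

Here the caller still receives an answer although the primary has failed, because the backup is a duplicate of the same service. If every replica fails, the error is reported instead of being hidden.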

This paper is written and submitted by sai