
Answer
Fault-tolerant computer systems contain redundant hardware, software, and power supply components that together provide continuous, uninterrupted service. They include extra memory chips, processors, and disk storage devices that back up the system and keep it running when a component fails. Special software routines or self-checking logic built into their circuitry detect hardware failures and automatically switch to backup devices. Table 1 outlines some of the fault-tolerant capabilities used in many computer systems and networks.
Layer | Threats | Fault tolerance methods |
Applications | Environment, hardware, and software faults | Application-specific redundancy and rollback to previous checkpoints |
Systems | Outages | System isolation, data security, system integrity |
Databases | Data errors | Separation of transactions and safe updates, complete transaction histories, backup files |
Networks | Transmission errors | Reliable controllers, safe asynchrony and handshaking, alternative routing, error detection and error correction codes |
Processes | Hardware and software faults | Alternative computations, rollback to checkpoints |
Files | Media errors | Replication of critical data on different media and sites, archiving, backup, retrieval |
Processors | Hardware faults | Instruction retry, error correcting codes in memory and processing, replication, multiple processors and memories |
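The detect-and-switch behavior described above can be illustrated with a small sketch. This is only a software analogy under stated assumptions, not how any particular fault-tolerant machine is built: the names `ComponentFailure`, `run_with_failover`, `failed_primary`, and `healthy_backup` are hypothetical, and a raised exception stands in for a failed self-check.

```python
from typing import Callable, Sequence


class ComponentFailure(Exception):
    """Hypothetical signal that a component's self-check detected a fault."""


def run_with_failover(components: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant component in order; return the first successful result."""
    errors = []
    for component in components:
        try:
            return component(request)
        except ComponentFailure as exc:
            # Fault detected: record it and switch to the next backup.
            errors.append(str(exc))
    # Every redundant component failed, so the fault cannot be masked.
    raise RuntimeError(f"all {len(components)} redundant components failed: {errors}")


def failed_primary(request: str) -> str:
    # Assumed primary device that always fails, to exercise the failover path.
    raise ComponentFailure("primary did not respond")


def healthy_backup(request: str) -> str:
    # Assumed backup device that services the request normally.
    return f"backup handled {request!r}"


result = run_with_failover([failed_primary, healthy_backup], "read block 7")
print(result)  # backup handled 'read block 7'
```

The caller never sees the primary's failure; it only sees an error if every redundant component fails, which is the point of keeping backups in the first place.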