Reset-Driven Fault Tolerance
João Carlos Cunha (6, 7), António Correia (7), Jorge Henriques (7), Mário Zenha Rela (7) and João Gabriel Silva (7)
(6) Dep. Eng. Informática e Sistemas, Instituto Superior de Engenharia de Coimbra, 3030 Coimbra, Portugal
(7) CISUC/Dep. Eng. Informática, Universidade de Coimbra, 3030 Coimbra, Portugal
Abstract
A common approach to achieving fault tolerance in embedded systems is to reboot the computer whenever a non-permanent error
is detected. All system code and data are recreated from scratch, and a previously established checkpoint, hopefully not
corrupted, is used to restore the application data. Confidence in the computer's operation is thus restored. The idea
explored in this paper is to reset the computer unconditionally in each control frame (the classic read sensors →
calculate control action → update actuators cycle). A RAM-based stable storage preserves the system's state between
consecutive cleanups, and a standard watchdog timer guarantees that a reset is forced whenever an error crashes the system.
We have evaluated this approach by injecting faults into the controller of a standard temperature control system. The experimental
observations show that Reset-Driven Fault Tolerance is a simple yet effective technique for improving reliability at
an extremely low cost, since it is a conceptually simple, software-only solution that is also application independent.
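The per-frame cycle described above can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the implementation evaluated in the paper: the two-bank, CRC-protected layout of the stable storage, the PI control law, and all names (`StableStorage`, `control_frame`, `setpoint`, `kp`, `ki`) are hypothetical choices made for illustration. The abstract does not say how the stable storage is organised.

```python
import struct
import zlib


class StableStorage:
    """Simulated RAM region that survives a reset (assumed two-bank layout)."""

    RECORD_SIZE = 16  # 4-byte sequence number + 8-byte float + 4-byte CRC

    def __init__(self):
        self.banks = [b"", b""]

    def commit(self, seq, integral):
        payload = struct.pack("<Id", seq, integral)
        record = payload + struct.pack("<I", zlib.crc32(payload))
        # Alternate banks so the previous valid copy is never overwritten.
        self.banks[seq % 2] = record

    def restore(self):
        """Return the newest (seq, integral) whose CRC checks out."""
        best = None
        for record in self.banks:
            if len(record) != self.RECORD_SIZE:
                continue  # bank never written
            payload = record[:12]
            (crc,) = struct.unpack("<I", record[12:])
            if zlib.crc32(payload) != crc:
                continue  # corrupted copy: ignore it
            seq, integral = struct.unpack("<Id", payload)
            if best is None or seq > best[0]:
                best = (seq, integral)
        return best if best is not None else (0, 0.0)


def control_frame(storage, temperature, setpoint=50.0, kp=2.0, ki=0.1):
    """One control frame, executed from a clean state after a reset."""
    # All volatile state is rebuilt; only the stable storage carries over.
    seq, integral = storage.restore()
    error = setpoint - temperature          # read sensors
    integral += error                       # calculate control action
    output = kp * error + ki * integral
    storage.commit(seq + 1, integral)       # checkpoint for the next frame
    return output                           # update actuators, then reset
```

In this sketch each call to `control_frame` stands for everything that happens between two resets. If a frame crashes before `commit`, a watchdog-forced reset simply re-runs the frame from the last committed state; if one bank is corrupted, `restore` falls back to the other.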
This work was partially supported by the Portuguese Foundation for Science and Technology under the POSI programme and the
FEDER programme of the European Union, through the R&D Unit 326/94 (CISUC) and the project PRAXIS/P/EEI/10205/1998 (CRON).