Standby vs. Spare parts: an example of integrated reliability and maintenance design
Introduction:
Asset availability depends both on the sub-systems’ reliability and on down-time due to failures. The problem is that system reliability and down-time belong to different disciplines: component reliability is under the responsibility of the reliability engineer while down-time is an issue addressed by maintenance and operations engineers.
A striking example of the interconnection between reliability and maintenance is the choice between designing a system with standby redundancy and replacing that redundancy with a spare-parts maintenance policy. This paper explores the similarities and differences between the two approaches in terms of availability and cost.
Example – pump:
We begin with a simple example. Consider an oil pump with a mean time between failures (MTBF) of 3 years (26,280 hours). When the pump fails, the mean time to repair (MTTR) is one week (168 hours). The pump availability is therefore MTBF / (MTBF + MTTR) = 26,280 / 26,448 ≈ 99.365%. This means that the pump is unavailable, on average, about 2.3 days each year. To improve the situation, one can either design the system with a second pump on standby or keep a second pump nearby as a spare part.
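The availability figure above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the usual steady-state formula A = MTBF / (MTBF + MTTR); the function and variable names are illustrative only.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def steady_state_availability(mtbf_hours, mttr_hours):
    """Long-run fraction of time a single repairable unit is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


mtbf = 3 * HOURS_PER_YEAR  # 26,280 hours between failures
mttr = 7 * 24              # 168 hours to repair

availability = steady_state_availability(mtbf, mttr)
downtime_days_per_year = (1 - availability) * 365

print(f"Availability: {availability:.3%}")
print(f"Expected downtime: {downtime_days_per_year:.1f} days per year")
```

With the values from the example, this gives an availability of about 99.365% and roughly 2.3 days of downtime per year.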
Standby scenario:
Initially the main pump operates while the backup pump is idle (cold standby). When the operating pump fails, the backup pump immediately takes over and the failed pump is sent to the repair shop (hot repair). If the repair finishes before the backup pump fails, the system returns to the initial state; otherwise the system is down until one of the pumps is repaired.
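As a rough numerical illustration of this cycle, the sketch below treats it as a three-state birth-death process. It rests on assumptions that are not stated in the text: failure and repair times are exponentially distributed, switchover to the backup is instantaneous and always succeeds, the idle pump cannot fail, and a single repair crew handles one pump at a time. The names used are hypothetical.

```python
def standby_unavailability(mtbf_hours, mttr_hours):
    """Steady-state probability that both pumps are down.

    Assumed states: 0 = one pump running, one idle;
                    1 = one pump running, one in repair;
                    2 = both pumps failed (system down).
    Assumed rates: failure rate 1/MTBF out of states 0 and 1,
                   repair rate 1/MTTR out of states 1 and 2.
    """
    rho = mttr_hours / mtbf_hours  # ratio of failure rate to repair rate
    # Balance equations of a birth-death chain give weights 1, rho, rho^2.
    weights = [1.0, rho, rho ** 2]
    return weights[2] / sum(weights)


unavailability = standby_unavailability(26280, 168)
downtime_hours_per_year = unavailability * 8760

print(f"Standby system unavailability: {unavailability:.2e}")
print(f"Expected downtime: {downtime_hours_per_year:.2f} hours per year")
```

Under these assumptions the expected downtime falls from a few days per year for a single pump to well under one hour per year. A different repair policy or failure distribution would change the numbers.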
The scenario described above can be modeled as a renewal process for which a simple Markov chain diagram is given in Figure 1: