Last edit: 05/02/2024
THE DOUBT: What are the main reliability data for components used in Safety Applications?
When you deal with components used in Safety Critical Systems you are familiar with the concept of the Failure Rate. In reality there are two other important parameters you need to be aware of: the MTTF or Mean Time to Failure and the B10. The latter is important when dealing with the reliability of Electromechanical Components. In this paper, we will explain all three parameters and how they are linked to each other.
If you need the reliability values of a pressure transmitter to be used in a Safety Instrumented System, you will probably look for the component safety manual and, in particular, at the different types of Failure Rates. Those are normally calculated by specialised laboratories like Exida or one of the TÜV companies.
The situation is different if you need the reliability parameters of a pneumatic solenoid valve; in that case you would ask the manufacturer directly and rely upon the B10D value they provide.
But then you may ask, why is there a different approach? Why, for electronic components, do the data depend only upon the component itself, while for an electromechanical component they depend not only upon the component but also upon the way it is connected to the logic solver?
Why do I see all sorts of failure rates for an electronic component, while for the solenoid valve I have a B10D value? How should they be used?
It will take a few articles to answer all those questions. In this one we will focus on the three main parameters: the Failure Rate, the Mean Time to Failure and the B10D.
The Failure Rate
When dealing with the reliability of a component used in a safety critical system, the main parameter is the Failure rate λ, which has units of inverse time. It is common practice to use units of “failures per billion (10^9) hours”: this unit is known as FIT or Failures in Time. For example, an integrated circuit that experiences seven failures per billion operating hours at 25°C has a failure rate of 7 FITs.
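The FIT unit can be converted to more familiar units with a few lines of code. The following is a minimal sketch: the function names and the 8760 hours per year figure (365 days, leap years ignored) are assumptions made for illustration.

```python
HOURS_PER_YEAR = 8760  # assumption: 365 days x 24 h, leap years ignored


def fit_to_per_hour(fit: float) -> float:
    """Convert a failure rate in FIT (failures per 1e9 hours) to failures per hour."""
    return fit * 1e-9


def fit_to_per_year(fit: float) -> float:
    """Convert a failure rate in FIT to failures per year of continuous operation."""
    return fit_to_per_hour(fit) * HOURS_PER_YEAR


# The integrated circuit from the text: 7 FIT
lambda_per_hour = fit_to_per_hour(7)
lambda_per_year = fit_to_per_year(7)
print(f"7 FIT = {lambda_per_hour:.1e} failures/hour = {lambda_per_year:.3e} failures/year")
```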
According to IEC 61508, there are four types of Failures:
- Safe failures;
- Dangerous failures;
- No Effect failures; and
- No Part failures
It is quite intuitive to understand what Safe and Dangerous failures are. A No Part Failure, instead, is the failure of a component that plays no part in implementing the safety function. A No Effect Failure is the failure of an element that plays a part in implementing the safety function but has no direct effect on it.
Both no effect failures and no part failures were added in the 2010 edition of IEC 61508 to prevent the SFF calculation from being influenced by circuitry that is not relevant for the reliability of the safety function: neither type of failure should be used for the SFF calculation. We will discuss the SFF in another article. No effect failures were not mentioned in the previous edition of IEC 62061 and they are now important to understand some new aspects related to the failure of electromechanical components.
Besides being Safe or Dangerous, each failure can also be classified as Detected or Undetected.
Therefore, the total Failure Rate λT is the sum of five elements:
- λSD: Safe detected failure rate
- λSU: Safe undetected failure rate
- λDD: Dangerous detected failure rate
- λDU: Dangerous undetected failure rate
- λNE: No effect failure rate
That is, λT = λSD + λSU + λDD + λDU + λNE.
The Bathtub Curve
In general, any component displays a Failure rate that can be represented with a graph having the shape of a bathtub. The one shown in the figure is typical for electronic components:
In the initial phase of the component lifetime, λ(t) decreases rapidly with time; this is also called early mortality rate.
In the period called useful life, λ(t) is constant.
The last period is characterized by wear out, with a rapidly increasing failure rate λ(t).
During the useful life of a component with a constant failure rate λ, and taking as initial condition that the Reliability at time 0 is at its maximum, equal to 1, we have:
$$R(t) = e^{-\lambda t}$$
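For a constant failure rate, the reliability during the useful life follows the exponential law R(t) = exp(−λt), with R(0) = 1. The sketch below evaluates it for the 7 FIT integrated circuit mentioned earlier; the 10-year horizon and the function name are assumptions chosen only for illustration.

```python
import math


def reliability(lambda_per_hour: float, t_hours: float) -> float:
    """Reliability R(t) = exp(-lambda * t), valid only while lambda is constant."""
    return math.exp(-lambda_per_hour * t_hours)


lambda_ic = 7e-9          # 7 FIT expressed in failures per hour
ten_years = 10 * 8760     # assumed horizon: 10 years of continuous operation, in hours

r_start = reliability(lambda_ic, 0)          # initial condition, equal to 1
r_10y = reliability(lambda_ic, ten_years)    # slightly below 1
print(f"R(0) = {r_start}, R(10 years) = {r_10y:.6f}")
```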
For electromechanical components, like Electrical Power Contactors, the curve is shown here at the side.
Even after the early failure rate period, the failure rate is never constant, but it increases over time.
That is a problem for Functional Safety Standards like the IEC 61508 series. For that reason, for those components an approximation is made up to a maximum time T10D: a so-called “surrogate failure rate” is calculated, which is treated as a constant value but only within the time period T10D.
That is an important step for the functional safety standards used in Machinery: ISO 13849-1 and IEC 62061. In particular, that allows the use of Markov chain or Reliability Block Diagram models to calculate the reliability of a Safety Function. That is not so important for the IEC 61511-1, since it is mainly focussed on low demand and electronic components.
Example of Failure Rates for Electronic and Electromechanical Components
For electronic components the failure rate depends upon the component itself: the way it is designed and engineered and its ability to detect internal failures. That is assessed by specialised laboratories that analyse both the field return for the specific component and the way the component is produced.
The following table shows an example of the failure rates of a pressure transmitter.
Pressure Transmitter – failure rate data (Source: Exida SERH 2015 – 01 sensors – item 1.6.2)

| Failure type | Symbol | Per 10^9 hours (FITs) |
|---|---|---|
| Fail dangerous detected | λDD | 260 |
| Fail dangerous undetected | λDU | 84 |
| Fail safe detected | λSD | 0 |
| Fail safe undetected | λSU | 145 |
| No effect failure | λNE | 135 |
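As a quick check of how the individual contributions add up, the pressure transmitter values in the table above can be summed. This is only a sketch: the dictionary keys and variable names are illustrative, while the numbers are those quoted in the table.

```python
# Failure rates of the pressure transmitter example, in FIT (failures per 1e9 hours)
failure_rates_fit = {
    "DD": 260,  # dangerous detected
    "DU": 84,   # dangerous undetected
    "SD": 0,    # safe detected
    "SU": 145,  # safe undetected
    "NE": 135,  # no effect
}

# Total failure rate: sum of the five contributions
lambda_total_fit = sum(failure_rates_fit.values())

# Dangerous and safe subtotals
lambda_dangerous_fit = failure_rates_fit["DD"] + failure_rates_fit["DU"]
lambda_safe_fit = failure_rates_fit["SD"] + failure_rates_fit["SU"]

print(f"Total: {lambda_total_fit} FIT")
print(f"Dangerous: {lambda_dangerous_fit} FIT, Safe: {lambda_safe_fit} FIT")
```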
For electromechanical components the situation is somewhat different. They normally do not have an internal capability to assess whether they are subject to a failure. That means the failure rate depends upon how the component is connected to the logic system (to the Safety PLC, for example). Moreover, since they are subject to wear-out from the first moment they are used, their reliability is not given as a Failure rate value but as a B10 value.
Electromechanical components and the B10D
For components subject to mechanical wear, a straight failure rate λ cannot be defined. That is the reason why the concept of BX% Life was introduced: B10D is the mean number of cycles until 10 % of the components have failed dangerously.
If only the B10 of a component is available, B10D can be estimated as twice the B10 (50 % dangerous failures).
Sometimes, the component manufacturer provides the B10 value and the Ratio of Dangerous Failures (RDF). The relation among those parameters is the following:
$$B_{10D} = \frac{B_{10}}{RDF}$$
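The relation between B10, RDF and B10D is B10D = B10 / RDF, which reduces to "twice the B10" when the RDF is 50 %. A minimal sketch follows; the function name and the B10 value of 1,000,000 cycles are hypothetical and not taken from any datasheet.

```python
def b10d_from_b10(b10_cycles: float, rdf: float = 0.5) -> float:
    """Estimate B10D from B10 and the ratio of dangerous failures (RDF).

    The default RDF of 0.5 reflects the 50 % assumption used when the
    manufacturer gives no information.
    """
    if not 0 < rdf <= 1:
        raise ValueError("RDF must be greater than 0 and at most 1")
    return b10_cycles / rdf


b10 = 1_000_000  # hypothetical B10 value, in cycles
b10d_default = b10d_from_b10(b10)            # RDF = 0.5, so B10D = 2 x B10
b10d_all_dangerous = b10d_from_b10(b10, 1.0)  # every failure dangerous, B10D = B10
print(b10d_default, b10d_all_dangerous)
```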
To calculate the different types of λ, given a certain value of B10, we should look at the way the component is connected to the Logic Solver. We will discuss that subject in another article.
Linked to B10D is the number of operations a component performs in a year, nop.
The lower the RDF, the higher the B10D and therefore the longer the T10D. However, both ISO 13849-1 and IEC 62061 limit that value: if the ratio of dangerous failures is estimated to be less than 0.5 (50 % dangerous failures), the useful lifetime of the component is limited to twice the T10.
T10D is linked to the number of operations nop by the following formula:
$$T_{10D} = \frac{B_{10D}}{n_{op}}$$
If no information is available, the ratio of dangerous failures is estimated as 50 %.
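The sketch below puts these relations together: T10D = B10D / nop, with the useful lifetime capped at twice the T10 when the RDF is below 0.5. The expression used for nop (operating days per year times operating hours per day times 3600, divided by the cycle time in seconds) is the one commonly given in ISO 13849-1 Annex C and is an assumption here, since the article does not spell it out. All numeric inputs are hypothetical.

```python
def operations_per_year(days_per_year: float, hours_per_day: float, cycle_time_s: float) -> float:
    """Mean number of operations per year, n_op (assumed ISO 13849-1 Annex C form)."""
    return days_per_year * hours_per_day * 3600.0 / cycle_time_s


def t10d_years(b10_cycles: float, rdf: float, n_op: float) -> float:
    """T10D in years, limited to twice the T10 when RDF is below 0.5."""
    b10d = b10_cycles / rdf
    b10d_capped = min(b10d, 2.0 * b10_cycles)  # cap described in the text
    return b10d_capped / n_op


# Hypothetical usage: 220 days/year, 16 hours/day, one operation every 120 s
n_op = operations_per_year(220, 16, 120)
t10d_default = t10d_years(1_000_000, 0.5, n_op)    # RDF = 50 %
t10d_low_rdf = t10d_years(1_000_000, 0.25, n_op)   # RDF < 50 %, capped at 2 x T10
print(f"n_op = {n_op:.0f} operations/year, T10D = {t10d_default:.1f} years")
```

With these inputs a lower RDF does not extend the useful lifetime, because the cap at twice the T10 applies.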
How λD and MTTFD are derived from B10D
In Functional Safety and, in particular, in the high demand mode of Safety-related Control Systems, B10D is used to indicate the Reliability of components that do not have a constant failure rate.
Since a constant “surrogate” failure rate is associated with those components, the usage of the component is limited to the B10D number of operations, in order to limit the error in the calculation of the PFHD of the Safety Function. That means the component must be replaced when B10D is reached, or earlier if its Mission Time is shorter.
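Since the cycle duration corresponds to the reciprocal of the operating frequency nop, the point in time T10D at which the element has completed B10D cycles is:
$$T_{10D} = \frac{B_{10D}}{n_{op}}$$
Given a component with a constant dangerous failure rate λD, its unreliability function F(t) is
$$F(t) = 1 - e^{-\lambda_D t}$$
and the probability that the component has failed dangerously when T10D is reached is
$$F(T_{10D}) = 1 - e^{-\lambda_D T_{10D}}$$
But we know that the probability F(t = T10D) = 10 %, therefore:
$$\lambda_D = -\frac{\ln(0.9)}{T_{10D}} \approx \frac{0.1}{T_{10D}} = \frac{0.1 \cdot n_{op}}{B_{10D}}$$
In case of a constant failure rate,
$$MTTF_D = \frac{1}{\lambda_D}$$
Therefore, the following is the formula to be used to calculate the MTTFD from B10D:
$$MTTF_D = \frac{B_{10D}}{0.1 \cdot n_{op}}$$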
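As a worked example, the sketch below computes the surrogate dangerous failure rate and the MTTFD from B10D and nop, using the relation MTTFD = B10D / (0.1 × nop). It also compares the 0.1 approximation with the exact value −ln(0.9) obtained by setting the unreliability at T10D to 10 %. The function names and input values (B10D of 2,000,000 cycles, 105,600 operations per year) are hypothetical.

```python
import math


def lambda_d_exact(b10d_cycles: float, n_op: float) -> float:
    """Dangerous failure rate per year from F(T10D) = 0.1, without approximation."""
    return -math.log(0.9) * n_op / b10d_cycles


def lambda_d_approx(b10d_cycles: float, n_op: float) -> float:
    """Dangerous failure rate per year using the approximation -ln(0.9) ~ 0.1."""
    return 0.1 * n_op / b10d_cycles


def mttfd_years(b10d_cycles: float, n_op: float) -> float:
    """MTTFD in years: B10D / (0.1 * n_op)."""
    return b10d_cycles / (0.1 * n_op)


b10d = 2_000_000   # hypothetical B10D, in cycles
n_op = 105_600     # hypothetical number of operations per year

mttfd = mttfd_years(b10d, n_op)
ratio = lambda_d_exact(b10d, n_op) / lambda_d_approx(b10d, n_op)
print(f"MTTFD = {mttfd:.1f} years")
print(f"exact / approximate failure rate = {ratio:.4f}")
```

The ratio shows that the rounded factor 0.1 gives a failure rate about 5 % lower than the exact logarithm would.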
Conclusions
In this article we analysed the meaning of the three main parameters used to define the reliability of a component used in a Safety Critical System. There is a major distinction depending on whether the component is subject to wear-out or not. If it is, the failure rate is never constant. That issue is overcome by defining what is sometimes called a surrogate failure rate. For electronic components the situation is somewhat simpler, since the failure rate is constant over a relatively long period of time. Moreover, the different types of failure rates depend only upon the component itself and its ability to detect whether it is subject to a failure.