Reliability Engineering and Systematic Design Optimization

R&D engineers are adept at developing products that work correctly, and they are also good at calculating design lifetime. So what do you need a reliability engineer for?

A reliability engineer focuses more on service life, reliability and robustness, and knows a lot about statistics, failures and their causes. One tool the reliability engineer frequently uses to help designers is DFMEA (Design Failure Mode and Effects Analysis).

The reliability engineer can also “design” tests that investigate and substantiate the design life, e.g. ALT (Accelerated Lifetime Test), endurance tests or robustness tests. The aim is to demonstrate service life as early as the design phase.

Figure 1 is an example of a DFMEA

What is the difference between design life and service life?

The design life is, simply put, the lifetime calculated from formulas and tables, whereas service life is the actual lifetime in use. During the design phase, service life is often tested using reliability tests, for example ALT or endurance tests.

What is the difference between reliability and availability?

Reliability can be described as how long, on average, a population of a given component can be expected to function. It can, for example, be expressed as a distribution that shows the likelihood that a component still works after x years.
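To make the idea of a lifetime distribution concrete, here is a minimal sketch that assumes component life follows a Weibull distribution. The Weibull choice and the parameter values (shape 2.0, characteristic life 10 years) are illustrative assumptions and are not taken from any specific product.

```python
import math


def weibull_reliability(t_years: float, shape: float, scale_years: float) -> float:
    """Probability that a component still works after t_years.

    Assumes a two-parameter Weibull lifetime distribution:
    R(t) = exp(-(t / scale) ** shape).
    """
    return math.exp(-((t_years / scale_years) ** shape))


# Hypothetical component: shape 2.0 (wear-out behaviour), characteristic life 10 years.
for years in (1, 5, 10, 15):
    probability = weibull_reliability(years, shape=2.0, scale_years=10.0)
    print(f"Likelihood of working after {years} years: {probability:.3f}")
```

With a model like this, "the likelihood that it works after x years" becomes a single function call, and the curve can be read off for any x.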

Availability is a measure of how large a share of the total time a system is available when desired, i.e. how much time it is operational when you want it to produce.

Figure 2 is an example of a lifetime distribution

My approach focuses on availability rather than reliability, as I find it the more operational measure: reliability + corrective action = availability. The smart thing is that maintenance is designed in from the start, which gives you an extra tool in the toolbox when you optimize your design.
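One common way to put numbers on "reliability + corrective action = availability" is the textbook steady-state formula, availability = MTBF / (MTBF + MTTR). The sketch below uses that formula as an assumption. It is not necessarily the model behind the figures in this article, and the hour values are hypothetical.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run share of time a repairable item is operational.

    mtbf_hours: mean time between failures (the reliability part).
    mttr_hours: mean time to repair (the corrective action part).
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical values: the same component with a slow and a fast corrective action.
slow_repair = steady_state_availability(mtbf_hours=1000.0, mttr_hours=10.0)
fast_repair = steady_state_availability(mtbf_hours=1000.0, mttr_hours=5.0)
print(f"Slow repair: {slow_repair:.4f}, fast repair: {fast_repair:.4f}")
```

The example illustrates why maintenance is a design tool in its own right: availability improves when the repair is made faster, even though the component itself is no more reliable.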

What is the difference between reliability and robustness?

Reliability is a measure of how well the product will function in the expected time period.

A robust product is one where you have removed “all” the weak points of a component or product. This can be done in two ways: by robustness testing, or by correcting failures as they are discovered after the product has been shipped to the market. The first is the cheaper option.

If you do not want to spend too much money on tests, it is usually a good idea to choose robustness tests rather than reliability tests. This is because robustness testing is cheaper, takes significantly less time to perform and removes weaknesses, which leads to an improvement in reliability – and thus availability.

Examples of how the reliability engineer can be a valuable participant in the design phase

A reliability engineer can support the design phase by controlling the availability and finding the optimal performance between the different subsystems. I call these allocation (system level) and optimization (subsystem level), respectively.

Both can be applied to any of the following performance goals: reliability, availability, product price or OPEX. A combination of goals is also possible.

Allocation

For allocation, the goal is to have the optimal distribution of performance between the different subsystems required to support the desired system-level final goal. As an example, availability is used as a performance indicator.

You can start with equal availability for all subsystems and, for each subsystem, identify its feasibility: how easy or difficult it is to increase its availability.

Once you have the starting availability and feasibility under control, the owners of the individual subsystems can “trade” with each other, resulting in the desired system-level availability at the lowest cost.
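The allocation and trading steps above can be sketched as follows. The sketch assumes a series system, where every subsystem must be up for the system to be up, so system availability is the product of the subsystem availabilities. The subsystem names, the 0.95 target and the `trade` helper are hypothetical and chosen only for illustration.

```python
import math


def equal_allocation(system_target: float, names: list[str]) -> dict[str, float]:
    """Give every subsystem the same availability so that their product hits the target."""
    per_subsystem = system_target ** (1.0 / len(names))
    return {name: per_subsystem for name in names}


def system_availability(allocation: dict[str, float]) -> float:
    """Series-system assumption: the system is up only when all subsystems are up."""
    return math.prod(allocation.values())


def trade(allocation: dict[str, float], improver: str, beneficiary: str,
          improved_value: float) -> dict[str, float]:
    """The improver takes on a higher availability, which relaxes the beneficiary's
    requirement while leaving the system-level availability unchanged."""
    traded = dict(allocation)
    pair_product = allocation[improver] * allocation[beneficiary]
    traded[improver] = improved_value
    traded[beneficiary] = pair_product / improved_value
    return traded


# Hypothetical system with three subsystems and a 95 % availability target.
start = equal_allocation(0.95, ["drive", "control", "cooling"])
# "control" is assumed easy to improve (high feasibility), "cooling" expensive.
after_trade = trade(start, improver="control", beneficiary="cooling", improved_value=0.995)
print(after_trade, system_availability(after_trade))
```

In this sketch feasibility is only implied by who takes the higher target. A fuller model would attach a cost to each availability increase and let the trades minimize total cost.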

Figure 3 shows a hypothetical example of allocation

Optimization

With optimization, you obtain the highest availability for a subsystem by optimizing component reliability, redundancy, maintenance strategy, or a combination of these.
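As a small illustration of comparing such options, the sketch below evaluates redundancy against a more reliable single component. It assumes identical units that fail independently and that one working unit is enough to keep the subsystem running. The availability figures and option names are hypothetical.

```python
def redundant_availability(unit_availability: float, n_units: int) -> float:
    """Availability of n identical units in parallel, where one working unit suffices.

    Assumes the units fail independently of each other.
    """
    return 1.0 - (1.0 - unit_availability) ** n_units


# Hypothetical design options for one subsystem.
options = {
    "single standard unit": redundant_availability(0.95, 1),
    "two standard units in parallel": redundant_availability(0.95, 2),
    "single premium unit": redundant_availability(0.99, 1),
}
best_option = max(options, key=options.get)
for name, availability in options.items():
    print(f"{name}: {availability:.4f}")
print("Highest availability:", best_option)
```

A real optimization would also weigh product price and OPEX for each option, since the highest availability is not always the cheapest way to reach the allocated target.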

Figure 4 shows a hypothetical example of subsystem optimization

Reliability model

To determine my performance indicators, I build a reliability model of the individual subsystems. In it, each component is described by three things: a reliability model, the corrective action that gets the system running again after a failure, and the maintenance strategy.

Once these elements have been added to the model and the components are correctly connected, the subsystem can be simulated. Since each component is described by a distribution (its reliability model) and its corrective action, you can simulate as long a period of time as you wish.
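A minimal sketch of such a simulation, for a single repairable component, could look like this. It assumes Weibull-distributed time to failure, exponentially distributed repair time and a component that is as good as new after each repair. These choices, the function name and all parameter values are assumptions for illustration, not the model used for the figures in this article.

```python
import random


def simulate_availability(scale_hours: float, shape: float, mean_repair_hours: float,
                          horizon_hours: float, seed: int = 0) -> float:
    """Simulate run/fail/repair cycles and return the share of time spent running."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    clock = 0.0
    uptime = 0.0
    while clock < horizon_hours:
        # Reliability model: draw the time until the next failure.
        time_to_failure = rng.weibullvariate(scale_hours, shape)
        run_time = min(time_to_failure, horizon_hours - clock)
        uptime += run_time
        clock += run_time
        if clock >= horizon_hours:
            break
        # Corrective action: draw the time needed to get running again.
        repair_time = rng.expovariate(1.0 / mean_repair_hours)
        clock += min(repair_time, horizon_hours - clock)
    return uptime / horizon_hours


# Hypothetical component simulated over one million operating hours.
estimate = simulate_availability(scale_hours=1000.0, shape=1.0,
                                 mean_repair_hours=10.0, horizon_hours=1_000_000.0)
print(f"Simulated availability: {estimate:.4f}")
```

A subsystem model would combine several such components according to how they are connected, and a maintenance strategy could be added as scheduled downtime or as renewal of components before they fail.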

The great advantage of a reliability model is that, once it is built, getting a result takes only the simulation time. Without the model, you would have to wait for the system to operate for an extended period of time to get the same insight.

If you are satisfied with the result, you stop here. If you need further improvements, you make changes, update the reliability model, run the simulation again, analyze and evaluate, and repeat as needed.

CrossFields