Hazard and Risk analysis | ZHAW Institut für Angewandte Mathematik und Physik IAMP

The manufacturer must meet all the legal requirements necessary to sell a product on the market. These requirements can differ in their details, depending on the product type, but share the common trait that the manufacturer must be able to present a hazard and risk analysis to prove that their product does not expose either humans or the environment to intolerable risks.

Hazard and risk analyses should be performed as early as in the initial phases of the development process in order to define the product requirements and design in a way that achieves an acceptable level of risk.

These analyses are particularly important for systems critical to safety, in which safety-related requirements naturally assume an essential role.

Alongside the classical analysis methods we use, such as FMEA and FTA, we conduct research on the system-theoretic modeling and analysis of complex technical and socio-technical systems.

We will support you in the planning and performance of hazard and risk analyses, directly in concrete projects or by providing targeted training.

Methodological diversity

Risk analysis draws on a variety of methods, some of them long established. Which method is appropriate depends on the development phase.

Hazard analyses can already be performed in the early phases of development based on the initial concept. A Preliminary Hazard Analysis (PHA) is recommended here to recognize and evaluate potential hazards caused by the system and to defuse them at this stage by adapting the system design or requiring additional safety measures.

Dangerous system behavior can be triggered by different causes, and each cause may call for its own risk reduction measures. A Fault Tree Analysis (FTA), for example, decomposes the hazards "top down" by cause, which makes it possible to specify more detailed risk reduction measures.
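
The top-down decomposition of an FTA can be sketched in a few lines of Python. This is only an illustration: the events, their probabilities, and the class names are hypothetical, and the quantitative evaluation assumes statistically independent basic events, which a real analysis would have to justify.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class BasicEvent:
    """Leaf of the fault tree: a cause that is not decomposed further."""
    name: str
    probability: float  # hypothetical probability per demand

    def evaluate(self) -> float:
        return self.probability


@dataclass
class Gate:
    """Intermediate or top event, combined from its inputs by AND or OR."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: list = field(default_factory=list)

    def evaluate(self) -> float:
        probs = [node.evaluate() for node in self.inputs]
        if self.kind == "AND":
            # All inputs must occur (independence assumed).
            return prod(probs)
        if self.kind == "OR":
            # At least one input occurs (independence assumed).
            return 1.0 - prod(1.0 - p for p in probs)
        raise ValueError(f"unknown gate type: {self.kind}")


# Hypothetical example: overpressure needs a control failure AND a stuck valve.
sensor_fault = BasicEvent("sensor fault", 0.01)
software_fault = BasicEvent("software fault", 0.02)
valve_stuck = BasicEvent("relief valve stuck", 0.05)

control_failure = Gate("pressure control fails", "OR", [sensor_fault, software_fault])
top_event = Gate("vessel overpressure", "AND", [control_failure, valve_stuck])

top_probability = top_event.evaluate()
print(f"{top_event.name}: {top_probability:.6f}")
```

Each gate corresponds to one decomposition step, so a risk reduction measure can be attached to exactly the branch whose cause it addresses.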

The Failure Mode and Effects Analysis (FMEA) can be used if the system design is already known in detail. Starting from the fault modes of the individual system units, an evaluation is performed from the "bottom up" as to how these fault modes affect the overall system. This makes it possible to check, for example, if all potentially hazardous component faults in an electronic circuit are adequately covered by the risk reduction measures.
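
The coverage check mentioned above can be sketched as a small worksheet in Python. The components, fault modes, and measures are invented for illustration, and the record layout is an assumption rather than a standardized FMEA form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of a simplified, hypothetical FMEA worksheet."""
    component: str
    mode: str
    system_effect: str
    hazardous: bool
    measures: tuple = ()  # risk reduction measures covering this fault mode


def uncovered_hazardous_modes(worksheet):
    """Return hazardous fault modes that no risk reduction measure covers."""
    return [fm for fm in worksheet if fm.hazardous and not fm.measures]


# Bottom-up view: component fault mode -> effect on the overall system.
worksheet = [
    FailureMode("R1", "open circuit", "output stays off", False),
    FailureMode("C3", "short circuit", "supply overcurrent", True, ("fuse F1",)),
    FailureMode("Q2", "stuck on", "heater permanently energized", True),
]

gaps = uncovered_hazardous_modes(worksheet)
for fm in gaps:
    print(f"{fm.component} ({fm.mode}): {fm.system_effect} is not covered")
```

In this sketch only the transistor Q2 is reported, because its hazardous fault mode has no measure assigned.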

Along with these conventional methods, a wide range of additional tools are available, such as ETA, HAZOP, HACCP, LOPA, and various methodological variations such as FMECA or FMEDA.

This wide variety of methods inevitably leads to many questions: Which of these methods can be beneficially used in which phases of a development process? Which methods can be used solely for purely qualitative analyses and which can also be used for quantitative analyses? How can the results of one method be purposefully refined by other methods?

System-theoretical modeling and analysis

Accidents are typically seen as a chain of individual events where the trigger event is attributed to component failure or human error. Such an approach is still appropriate for many electro-mechanical systems, and established methods such as FTA (Fault Tree Analysis), ETA (Event Tree Analysis) and FMEA (Failure Mode and Effects Analysis) are ideally suited to analyze and evaluate any potential undesired behavior of such systems.

Systems-Theoretic Process Analysis (STPA)

The conventional methods of analysis, however, are not adequate for the analysis of complex, dynamic, and often very strongly software-based systems such as those currently used to control machines and installations. The causes of accidents can no longer be exclusively attributed to component failures and a chain of individual faults. Instead, the interaction between components that function properly (in terms of their specification), and thus the emergent behavior of a system whose individual components do not fail, must be analyzed as a whole.

On the basis of this knowledge, Prof. Nancy Leveson of MIT (Massachusetts Institute of Technology) suggests that the aspect of safety be considered from the viewpoint of system theory. To this end, Prof. Leveson developed an innovative analysis methodology referred to as Systems-Theoretic Process Analysis (STPA).

STPA is a hazard analysis method which investigates safety as an emergent property of a holistically considered socio-technological system. Human actions can also be included in the analysis as well as programmable units. Moreover, STPA is designed as a top-down method and is therefore especially suited to accompany the development process. This makes the Safety Guided Design paradigm tangible.

STPA hierarchical control structure

An example of a hierarchical STPA control structure with two controllers, control actions, and feedback channels. This functional representation is the starting point of STPA.

STPA starts from the hierarchical control structure, a functional representation of the system in which the flow of control and information between the subsystems is represented explicitly.

Every controller in the hierarchical control structure, be it a person or an automated system, bears responsibility for the implementation of certain system tasks and fulfills it by using control actions to exert influence on the lower levels of the hierarchy. The decision about which control actions to issue is made on the basis of the controller's process model. Along with the causal connections between system and environment, this process model also includes a representation of the current system status. The system status is updated by responses (feedback) from lower-level controllers or the controlled process itself. If the controller is an automated system, then the process model can consist, for example, of a control algorithm implemented in the software. If, however, the controller is a human such as the operator of a technical system, then their process model is based on their experience in handling the machine and their interpretation of the system standards.
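
The role of the process model can be sketched in Python. The tank, the threshold of 0.9, and all names are hypothetical. The sketch only shows that a controller decides on the basis of what it believes about the process, and that this belief is updated solely through feedback.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """A controller that selects control actions from its process model."""
    name: str
    process_model: dict = field(default_factory=dict)

    def receive_feedback(self, variable: str, value: float) -> None:
        # Feedback is the only way the process model gets updated.
        self.process_model[variable] = value

    def select_control_action(self) -> str:
        # The decision uses the believed level, not the real one.
        believed_level = self.process_model.get("tank_level")
        if believed_level is None:
            return "no action"
        if believed_level >= 0.9:  # hypothetical threshold
            return "close inlet valve"
        return "open inlet valve"


controller = Controller("level controller")
actual_level = 0.95  # true state of the controlled process

# Outdated feedback: the process model no longer matches the process.
controller.receive_feedback("tank_level", 0.50)
action_with_stale_model = controller.select_control_action()

# Current feedback brings the process model back in line with reality.
controller.receive_feedback("tank_level", actual_level)
action_with_current_model = controller.select_control_action()

print(action_with_stale_model, "->", action_with_current_model)
```

With the outdated value the controller opens the inlet valve although the tank is nearly full. No component has failed in this scenario, yet the resulting control action is dangerous.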

STPA process

The STPA analysis itself is divided into two sub-steps:

STPA Step 1: The goal of the first step is to identify potentially dangerous control actions (unsafe control actions). It may already be possible to eliminate such unsafe control actions by adjusting the system design.

STPA Step 2: Possible causes and scenarios, which can lead to unsafe control actions, or rather to their consequences, are determined in the second step. Additional safety measures for risk reduction can be specified here, or existing safety systems investigated for possible gaps.
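
Step 1 can be supported by systematically enumerating candidates, as in the following Python sketch. It assumes the four ways in which a control action can be unsafe that are commonly used in the STPA literature. The control actions and contexts are hypothetical, and deciding which candidates actually lead to a hazard remains the analyst's task.

```python
from itertools import product

# Four ways a control action can be unsafe, as commonly used in STPA.
UCA_TYPES = (
    "not provided",
    "provided",
    "provided too early, too late, or out of order",
    "stopped too soon or applied too long",
)


def uca_candidates(control_actions, contexts):
    """Combine every control action with every type and context."""
    return [
        {"action": action, "type": uca_type, "context": context}
        for action, uca_type, context in product(control_actions, UCA_TYPES, contexts)
    ]


# Hypothetical control actions and contexts for a tank level controller.
candidates = uca_candidates(
    ["open inlet valve", "close inlet valve"],
    ["tank nearly full", "tank nearly empty"],
)

for candidate in candidates[:3]:
    print(f'{candidate["action"]} {candidate["type"]} while {candidate["context"]}')
```

The list produced this way is only a worksheet. Each entry judged hazardous becomes an unsafe control action whose causes and scenarios are then examined in Step 2.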

We actively research the further development of these methods and work on making them more concrete in the context of diverse industrial sectors with the goal of promoting their usability in an industrial environment.

More about STAMP, STPA, and CAST

Project Examples:

Use of STPA for hazard and risk analysis of SmartRail 4.0 (site in German)
Application of STPA to digital instrumentation and control systems of a Nuclear Power Plant (site in German)
Risk analysis using STPA for the proton therapy facility Gantry 2 at the Paul Scherrer Institute (site in German)

Further information:

All about the 4th European STAMP Workshop 2016 hosted by the IAMP in Zurich.
More about the European STAMP Workshop and Conference