Cyber security and process safety, how do they converge?


There has been a lot of discussion on the relationship between cybersecurity and process safety in Industrial Control Systems (ICS). Articles have been published on the topic of safety / cybersecurity convergence, and on the topic to add safety to the cybersecurity triad for ICS. The cybersecurity community practicing securing ICS seems to be divided over this subject. This blog approaches the discussion from the process automation practice in combination with the asset integrity management practice as practiced in the chemical, refining, and oil and gas industry. The principles discussed are common for most automated processes, independent where applied. Let’s start the discussion by defining process / functional safety, asset integrity and cybersecurity before we discuss their relationships and dependencies.

Process safety is the discipline that aims at preventing loss of containment, the prevention of unintentional releases of chemicals, energy, or other potentially dangerous materials that can have a serious effect to the production process, people and the environment. The industry has developed several techniques to analyze process safety for a specific production process. Common methods to analyze the process safety risk are Process Hazard Analysis (PHA), Hazard and Operability study (HAZOP), and Layers Of Protection Analysis (LOPA). The area of interest for cybersecurity is what is called Functional Safety, the part that is implemented using programmable electronic devices. Functional safety implements the automated safety integrity functions (SIF), alarms, permissives and interlocks protecting the production process.

Figure 1 – Protection layers (Source: Center for Chemical Process Safety (CCPS))

Above diagram shows that several automation system components play a role in functional safety, there are two layers implemented by the basic process control system (BPCS), and two layers implemented by the safety instrumented system (SIS): the preventative layer implemented by the emergency shutdown function (ESD) and the mitigative layer implemented by the fire and gas system (F&G). The supervisory and basic control layers of the BPCS are generally implemented in the same system, therefore not considered independent and often shown as a single layer. Interlocks and permissives are implemented in the BPCS, where the safety integrity functions (SIF) are implemented in the SIS (ESD and F&G). Other functional safety functions exist such as the High Integrity Pressure Protection System (HIPPS) and Boiler Management System (BMS). For this discussion it is not important to make a distinction between these process safety functions. Important is too understand that the ESD safety function is responsible to act and return the production system to a safe state when the production process entered a hazardous state, where the F&G safety function acts on detection of smoke and gas and will activate mitigation systems depending on the nature, location, and severity of the detected hazard. This includes such actions as initiating warning alarms for personnel, releasing extinguishants, cutting off the process flow, isolating fuel sources, and venting equipment. The BPCS, ESD and F&G form independent layers, so their functions should not rely on each other but they don’t exist in isolation extending their ability to prevent or mitigate an incident by engaging with other systems.

Asset integrity is defined as the ability of an asset to perform its required function effectively and efficiently whilst protecting health, safety and the environment and the means of ensuring that the people, systems, processes, and resources that deliver integrity are in place, in use and will perform when required over the whole life-cycle of the asset.

Asset integrity includes elements of process safety. In this context, an asset is a process or facility that is involved in the use, storage, manufacturing, handling or transport of chemicals, but also the equipment comprising such a process or facility. Examples of process control assets include pumps, furnaces, tanks, vessels, piping systems, buildings, but also includes the BPCS and SIS among other process automation functions. As soon as a production process is started up the assets are subject to many different damage and degradation mechanisms depending on the type of asset. For electronic programmable components this can be hardware failures, software failures, but today also maliciously caused failures by cyber-attacks. From an asset integrity perspective there are two types of failure modes:

  • Loss of required performance;
  • Los of ability to perform as required;

The required performance of an asset is the successful functioning (of course while in service) achieving its operational / design intent as part of a larger system or process. For example in the context of a BPCS the control function adheres to the defined operating window, such as sensor ranges, data sampling rates, valve travel rates, etc. The BPCS presents the information correctly to the process operator, measured values accurately represent the actual temperatures, levels, flows, and pressures as present in the physical system. In the context of a SIS it means that the set trip points are correct, that the measured values are correct, and that the application logic acts as intended when triggered.

Loss of ability is not achieving that required performance. An example for a BPCS is loss of view or loss of control, the ability fails where in loss of required performance the ability is present but doesn’t execute correctly. Loss of ability is very much related to availability and loss of required performance to integrity. Never the less I prefer to use the terminology used by the asset integrity discipline because it more clearly defines what is missing.

The simplest definition of cybersecurity is that it is the practice of protecting computers, servers, mobile devices, electronic devices, networks, and data from malicious attacks. Typical malicious cyber-attacks are gaining unauthorized access into systems, and distributing malicious software.

In the context of ICS we often talk about Operational Technology (OT), which is defined as the hardware and software dedicated to detecting and / or causing changes in a production processes through monitoring and/or control of physical devices such as sensors, valves, pumps, etc. This is a somewhat limited definition because process automation systems contain other functions such as for example dedicated safety, diagnostic and analyzing functions.

The term OT was introduced by the IT community to differentiate between cybersecurity disciplines that protect these OT systems and those that protect the IT (corporate network) systems. There are various differences between IT and OT systems that justified to create this new category, though there is also significant overlap frequently confusing the discussion. In this document the OT systems are considered the process automation systems, the various embedded devices such as process and safety controllers, network components, computer servers running applications and stations for operators and engineers to interface with these process automation functions.

The IT security discipline defined a triad based on confidentiality, integrity, availability (CIA) as a model highlighting the three core security objectives in an information system. For OT systems ISA extended the number of security objectives by introducing 7 foundational requirement categories (Identification and authentication control, Use control, System integrity, Data confidentiality, Restricted data flow, Timely response to events, Resource availability) to group the requirements.

Now we have defined the three main disciplines I like to discuss the topic if we need to extend the triad with a fourth element safety.

Is safety a cybersecurity objective?

Based upon the above introduction to the three disciplines and taking asset integrity as the leading discipline for plant maintenance organizations, we can define three cybersecurity objectives for process automation systems:

  • Protection against loss of Required Performance;
  • Protection against Loss of Ability;
  • And protection against Loss of Confidentiality.

If we have established these three objectives we also established functional safety. Not necessarily process safety because this also depends non-functional safety elements. But these non-functional safety elements are not vulnerable to cyber attacks other than potentially revealing these elements through loss of confidentiality. Based upon all information on the TRISIS attack against a Triconex safety system, I believe all elements in this attack have been covered. The loss of confidentiality can be linked to the 2014 attack that is suggested to be the source of the disclosure of the application logic. The aim of the attack was most likely causing a loss of required performance by modifying the application logic that is part of the SIF. The loss of ability, the crash of the safety controller, was not the objective but an “unfortunate” consequence of an error in the malicious software.

Functional safety is established by the BPCS and SIS functionality, the combination of interlocks, permissives, and the safety integrity functions contribute to overall process safety. Cybersecurity contributes indirectly to functional safety by maintaining above three objectives. Loss of required performance and loss of ability would have a direct consequence to the process safety, loss of confidentiality might lead over time to the exposure of a vulnerability or contribute to the design of the attack. Required performance is overlapping what a process engineer would call operability, operability also includes safety so also from this angle nothing is missing in the three objectives.

Based upon above reasoning I don’t see a justification for adding safety to the triad, the issue is more that the availability and integrity objectives of the triad should be aligned with the terminology used by the asset integrity discipline to include safety. Which would make OT cybersecurity different from IT cybersecurity.

Author: Sinclair Koelemij

Date: April 2020

Geef een reactie