Like the Stuxnet attack in 2010 made the industry aware that targeted cyber-attacks on physical systems are a serious possibility, the TRISIS attack in 2017 targeting a safety system of a chemical plant also caused a considerable rise in awareness of the cyber-threat. Many asset owners started to reevaluate their company policies on how to connect BPCS and SIS systems.
That BPCS and SIS need to exchange information is not so much a discussion, all understand that this is required for most installations. But how to do this has become a debate within and between the asset owner and vendor organizations. The discussions have focused on the theme do we allow an integrated architecture with its advantages, or do we fallback to an interfaced architecture.
I call it a fallback because an interfaced connection between a Basic Process Control System (BPCS) and Safety Instrumented System (SIS) used to be the standard up to approximately 15-20 years ago.
Four main architectures are in use today for exchanging information between BPCS and SIS:
- Isolated / air-gapped – The two systems are not connected and limit their exchange of data through analog hard-wiring between the I/O cards of the safety controller and the I/O cards of the process controller;
- Interfaced – The process controller and the safety controller are connected through a serial interface (could also be an Ethernet interface, but most common is serial);
- Integrated – The BPCS and SIS are connected over a network connection, today the most common architecture;
- Common – Here the BPCS and SIS are no longer independent from each other and either run on a shared platform.
There are variations on above architecture, but above four categories summarize the main differences good enough for this discussion. Although security risk differs, because of differences in exposure, we cannot say that one architecture is better than another as business requirements and their associated business / mission risk differs. Take for instance a batch plant that produces multiple different products on the same production equipment, such a plant might require changes to its SIS configuration for every new product cycle, where a continuous process of a gas treatment plant or a refinery only incidentally require changes to the safety logic. These constraints do influence decisions on which architecture best meets business requirements. For the chemical, oil & gas, and refining industries the choice is mainly between interfaced or integrated the topic of the blog.
I start the discussion by introducing some basic requirements for the interaction between BPCS and SIS using the IEC 61511 as reference.
Following clauses of the IEC 61511 standard are describing the requirements for this separation.
“Where the SIS is to implement both SIFs and non-SIFs then all the hardware, embedded software and application program that can negatively affect any SIF under normal and fault conditions shall be treated as part of the SIS and comply with the requirements for the highest SIL of the SIFs it can impact.”
“Where the SIS is to implement SIF of different SIL, then the shared or common hardware and embedded software and application program shall conform to the highest SIL.”
A Safety Integrity Function (SIF) is the function that takes all necessary actions to bring the production system to a safe state when predefined process conditions are exceeded. A SIF can either be manually activated (a button or key on the console of the process operator) or acting when a predefined process limit (trip point) is exceeded, such as for example a high pressure limit. SIL stands for Safety Integrity Limit, it is a measure for process safety risk reduction, a level of performance for the SIF. The higher the SIL level the more criteria need to be met to reduce the probability on failure that the SIF is not able to act when demanded by the process conditions. In general a BPCS can’t meet the higher SIL (SIL 2 and above) and a SIS doesn’t provide all the control functions a BPCS has, so we end up with two systems. One system provides the control functions and one system enforces that the production process stays within a safe state, section 11.2.4 of the standard describes this. For a common architecture this is different because a common architecture has a shared hardware / software platform, therefore such an architecture is generally used for processes with no SIL requirements. Next clauses discuss the separation of the BPCS and SIS functions.
“If it is intended not to qualify the Basic Process Control System (BPCS) to this standard, then the basic process control system shall be designed to be separate and independent to the extent that the functional integrity of the safety instrumented system is not compromised.”
NOTE 1 Operating information may be exchanged but should not compromise the functional safety of the SIS.
NOTE 2 Devices of the SIS may also be used for functions of the basic process control system if it can be shown that a failure of the basic process control system does not compromise the safety instrumented functions of the safety instrumented system.
So BPCS and SIS should be separate and independent but it is allowed to exchange operating information (process values, status information, alarms) as long as a failure of the BPCS doesn’t compromise the safety instrumented functions. This requirement allows for instance the use process data from the SIS, reducing cost in the number of sensors installed. It also allows valves that combine a control and a safety function. And it allows for initiating safety override commands from the BPCS HMI to temporarily override a safety instrumented function to carry out maintenance activities. But functions that under normal operation fully meet the IEC 61511 requirements, can be maliciously manipulated when taking cybersecurity into account. An important requirement is that BPCS and SIS should be independent, the following sections discuss this:
The design of the SIS shall take into consideration all aspects of independence and dependence between the SIS and BPCS, and the SIS and other protection layers.
A device to perform part of a safety instrumented function shall not be used for basic process control purposes, where a failure of that device results in a failure of the basic process control function which causes a demand on the safety instrumented function, unless an analysis has been carried out to confirm that the overall risk is acceptable.
Below diagram shows the various protection layers that need to be independent. In today’s BPCS the supervisory function and control function are generally combined into a single system, as such form a single BPCS protection layer.
Figure 1 – Protection layers (Source: Center for Chemical Process Safety – CCPS)
The yellow parts in the diagram form the functional safety part implemented using what the standard calls Electrical/Electronic/Programmable Electronic Safety- related system (E/E/PES), or to be more specific systems such as: BPCS ( for the control and supervisory functions), SIS ESD ( for the preventative emergency shutdown function), SIS F&G (for the mitigative fire and gas detection function). Apart from these functions fire and alarm panels are used that interface with SIS, MIMIC panels also require interfaces, and public address general alarm systems (PAGA) require an interface. Though PAGA’s generally are hardwired connected, fire and alarm panels often use Modbus TCP to connect. These aren’t the only SIS that exist in a plant, also boiler management systems (BMS) and high integrity pressure protection systems (HIIPS) are preventative safety functions implemented in SIS.
Interfaced and integrated architectures
So far the discussion on the components and functions of the architecture let’s get into more detail to investigate these architectures. There are several reports published describing BPCS ó SIS architectures. For example the LOGIIC organization published a report in 2018 and ISA published a report in 2017 on the subject. The ISA-TR84.00.09-2017 report ignores architectures with a dedicated isolated network for SIF related traffic. Because the major SIS vendors support and promote this architecture and in my experience it is also the most commonly used architecture my examples will follow architectures defined in the LOGIIC report on SIS. LOGIIC defines following architectures as example of integrated and interfaced structures. (Small modifications are mine)
Figure 2 – Interfaced and integrated architecture principles
In the interfaced architecture there are two networks isolated from each other. The BPCS network generally has some path to the corporate network, while the SIS network remains fully isolated. The exchange of data between the two systems makes use of Modbus RTU where the BPCS is the master and the SIS is the slave. The example shows two serial connections for example using the RS-232C protocol, but a more common alternative is a multi-drop connection using the RS-485 protocol, where a single serial interface on the BPCS side connects with multiple logic solvers (safety controllers).
An integrated architecture has a common network connecting both the BPCS and SIS equipment. Variations exist where separation between a BPCS network segment and SIS network segment is created with a firewall between the two systems (A). Or alternatively a dual-homed OPC server is sometimes used when the communication is making use of the OPC protocol (B). See the diagrams below.
Figure 3 – Some frequently used integrated architectures
Because one of the main targets is the SIS engineering function, architecture C and D were developed. In this case the SIS engineering station can be connected to the SIF network or alternatively an isolated 3rd network segment is created. Different vendors support different architectures, but the four examples above seem to cover most of the installations. Depending on the communication protocol used for the communication between BPCS and SIS there might be a need for a publishing service. This is a function mapping the SIS internal addresses into a format the BPCS understands and uses for its communication.
In the C and D architecture an additional security function can be an integrated firewall in the logic solver restricting access to a configured set of network nodes (servers, stations, controllers) for a defined set of protocols. Above architectures are just some typicals relevant for comparing interfaced and integrated architectures. Actual installations can be more complex because of additional business requirements such as extending the control and SIF networks to a remote location and the use of instrument asset management systems (IAMS) for centrally managing field equipment.
The architectures above do not show a separation between level 1 and level 2 for the process controller and the logic solver. All controllers / logic solvers are shown as connected to level 2, this is not necessarily correct. Some vendors implement a level 1 / level 2 separation using a micro firewall function for each controller with a preset series of policies that limit access to a sub set of approved protocols and allow for throttling traffic to protect against high volumes of traffic. Other vendors use a dedicated level 1 network segment where they can enforce network traffic limitations, and sometimes it is a mix depending on the technology used. There are also vendor implementations with a dedicated firewall between the level 2 and level 1 network segment. Since I want to discuss the differences independent of a specific vendor implementation I ignore these differences and consider them all connected as shown in the diagrams.
Let’s now compare the pros and cons of these architectures.
A comparison between interfaced and serial architectures
For the discussion I assume that the attacker’s objective is to create a functional deviation in the Safety Integrity Function (SIF) to either deviate from the operation intent / design intent (What I call Loss of Required Performance, for example modified logic or trip point) or making the SIS fail when it needs to act. Just tripping the plant can be done in many ways, the more dangerous threat is causing physical damage to the production system.
If we want to compare the security of an interfaced and integrated architecture the first thing is to look at the possibilities to manipulate the BPCS <=> SIS traffic. For the interfaced architecture this traffic passes the serial connection, for the integrated architecture the pink high lighted part of the network passes this traffic.
Figure 4 – BPCS <=> SIS traffic
One potential attack scenario is that the attacker attempts to create a sink hole by intercepting all traffic and dropping it. In an integrated architecture relative simple malware can do this using ARP poisoning as a mechanism to create a Man-In-The-Middle attack (MITM). The malware hijacks the IP address of one or both of the communicating BPCS/SIS nodes and just drops the message. This is possible because level 1 and level 2 traffic are normally in the same Ethernet broadcast domain, so actual messages are exchanged using Ethernet MAC addresses. A malware infection on any of the station and server equipment within the broadcast domain can perform this simple attack. Possible consequences of such a simple attack would be that SIF alarms are lost, SIS status messages are lost, SIS process data is lost, and messages to enforce an override will not arrive at the logic solver.
The noticeability of such an attack is relatively high, but this differs depending on the type of traffic and additional ingenuity added by the attacker. The loss of a SIF alarm would not be detected that quickly, but the loss process data would most likely be immediately detected. So selectively removing traffic can make the attack less noticeable. Also the protocol used is of importance for the success of this attack, a Modbus master (BPCS) would quickly detect if the slave (SIS) would not respond and generate a series of retries and ultimately raises an alarm. Similar detection mechanisms exist when using vendor proprietary protocols. But in those situations where the SIS can initiate traffic independent of the BPCS, than a SIF alarm might get lost if there is not a confirmation from the BPCS is required causing that the process operator might not receive the new alarm.
More ingenious attacks can make use of the MITM mechanism to capture and replay BPCS ó SIS traffic. Replaying process data messages would essentially freeze the operators information, replaying override messages might override or undo an override command. And of course an attacker can overload an operator this way with false alarms and status change messages causing the potential for missing an important alarm. The danger of these two attacks is that the attacker doesn’t need to know that much specific system information to be successful.
Enhancing the attack even more by intercepting and modifying traffic or injecting new messages is another option. In that scenario the attacker can attempt to make “intelligent” malicious modifications. There are various ways to reduce the likelihood of a MITM attack, but the risk remains in the absence of secure communications that authenticate, encrypt, confirms reception of a message, and includes a timestamp to validate if the message was send within a specific time window. If we analyze the protocols used today for communication between BPCS and SIS most don’t offer these functions.
Above MITM scenarios are only possible in an integrated architecture, in an interfaced architecture the communication channel is isolated from the control network by the process controller. An attacker would need physical access to the cabling to perform a similar attack.
Apart from attacking the traffic it is possible to directly attack the logic solver if it would have an exposed software vulnerability. In the integrated architecture the logic solver is always exposed to this threat and only testing, limiting network access to the logic solver, and using software validation techniques can offer protection against this threat.
In an interfaced architecture, the logic solver is no longer exposed to an external connected network but this does not offer a guarantee that such an attack can’t happen. The SIS engineering station might get infected with malware and execute an attack. Also in the Stuxnet attack the target system was on an isolated network, this didn’t stop the attackers to penetrate. But it certainly requires more skills and resources to design such an attack for an interfaced architecture.
In an integrated architecture the SIS engineering station might be exposed. In architecture A above the exposure depends very much on the firewall configuration, but there might be connections required for the antivirus engine, security patches, and in some cases for the publishing function. In architectures C and D the exposure of the SIS engineering station is equivalent to the exposure in an interfaced architecture. However it can still be infected by malware when files are transferred, for example to support this publishing function, or through some supply chain attack. But in general architecture D, where the SIS engineering station is isolated from the SIF traffic reduces the exposure for a number of possible cyber-attacks. Integrated architectures C and D are considered more secure, but how do they compare with an interfaced architecture?
Two main cyber security hazards seem to make a difference:
a) Manipulation of BPCS <=> SIS traffic;
b) Exploitation of a logic solver resident software vulnerability;
To mitigate / reduce the risk of scenario a) the industry needs secure communications between the two systems. Protocols used today such as Modbus TCP, Classic OPC, and most of the proprietary protocols don’t offer this.
To mitigate / reduce the risk of scenario b) we would need mechanisms that prevent or detect the insertion of malicious code. These have partially been implemented, but not completely and not by all vendors as we can learn from the TRISIS incident.
So overall we can say that the exposure in an interfaced environment is smaller than the exposure in an integrated environment. Since exposure directly correlates with likelihood we can say that the risk in an interfaced architecture is smaller than the risk in an integrated architecture. But security is not all about risk reduction, in the end the cost of security something must be weight against the benefits the business has for selecting an architecture.
I think there is also a price to pay when the industry moves back to interfaced architectures.
Possible disadvantages of a change to interfaced solutions can be:
· Reduced communication capabilities, e.g. the bandwidth of a serial connection is lower than the bandwidth of an Ethernet connection;
· The support for Sequence Of Event (SOE) functionality over serial connections would be hindered by not having a single clock source;
· The cost of the architecture is higher because the need for additional components and being constrained by cable length restrictions;
· Sharing instrument readings between SIS and BPCS is limited because of the relatively low speed serial connectivity, this can increase the number of field instruments required;
· If Instrument Asset Management Systems (IAMS) is used these systems can no longer benefit from the HART pass-through technology and need to use the less secure HAT multiplexers;
· The cost for engineering and maintaining a Modbus RTU implementation is higher because of the more complex address mapping;
· Future developments w.r.t. the Industrial Internet Of Things (IIoT) would become more difficult for safety instrumentation related data. For example the new developments around Advanced Physical Layer (APL) technology are hard to combine with an interfaced architecture.
So generally serial connections were seen as a thing of the past and some vendors stopped supporting them on their newer equipment. However the TRISIS attack has put the interfaced architecture very prominent on the table again, specifically in the absence of secure communications between BPCS and SIS.
The cost of moving to an interfaced architecture must be weight against the lost benefits for the business. How high is the residual risk of an integrated architecture compared to the residual risk of an interfaced architecture. Is moving from integrated to interfaced justified by the difference and what are the alternatives?
Author: Sinclair Koelemij
Date: May 9, 2020