ISA Interchange

Welcome to the official blog of the International Society of Automation (ISA).



Optimize Your Industrial Automation System with a Resilient Design Strategy

A previous post on good automation system design concepts considered robust design at the controller and communication levels. This time we shift the focus to software programming best practices, at both the controller and the operator interface levels, which can help produce a solid automation system.

Probably the No. 1 programming recommendation is to validate all information before using it. Within the control system platform, validate any process signal data, classic or networked, to confirm it has good quality. This especially applies to ensuring that analog input values are scaled properly and fall within the allowable range. Incorporate debounce logic on discrete inputs so that a brief time delay confirms a genuine field signal has been received.
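As a minimal sketch of both checks, the Python below scales a raw analog count only when it falls inside the allowable range, and debounces a discrete input over a number of controller scans. The names `scale_analog` and `Debounce`, the example raw range, and the scan-count approach are illustrative assumptions rather than any particular platform's API; in practice this logic would be written in the controller's own language.

```python
from dataclasses import dataclass
from typing import Optional


def scale_analog(raw: int, raw_min: int, raw_max: int,
                 eng_min: float, eng_max: float) -> Optional[float]:
    """Scale a raw count to engineering units; return None if out of range."""
    if raw_max <= raw_min:
        raise ValueError("raw_max must be greater than raw_min")
    if not raw_min <= raw <= raw_max:
        return None  # treat as bad quality (for example, a broken wire)
    fraction = (raw - raw_min) / (raw_max - raw_min)
    return eng_min + fraction * (eng_max - eng_min)


@dataclass
class Debounce:
    """Accept a new input state only after it persists for required_scans scans."""
    required_scans: int
    _count: int = 0
    state: bool = False

    def update(self, raw: bool) -> bool:
        if raw == self.state:
            self._count = 0  # input agrees with accepted state; reset the timer
        else:
            self._count += 1
            if self._count >= self.required_scans:
                self.state = raw
                self._count = 0
        return self.state
```

A caller would treat a `None` result as a bad-quality signal and alarm on it instead of passing a value to the control logic.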

Validate operator inputs

Operators enter many kinds of data through a human-machine interface (HMI), operator interface terminal (OIT), or other device to command the control system. Always validate operator inputs. This primarily means checking that operator-entered setpoint values are within a legal range. However, it can even apply to discrete push-button presses. For instance, incorporating a brief “push and hold” delay on a start button can guard against accidental pushes. Furthermore, multi-mode push-buttons should be made mutually exclusive and configured so that the safest choice (if possible) dominates. For a start/stop pair of buttons, the “stop” function would prevail over the “start” function.
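The two push-button ideas can be sketched as follows. This is only an illustration in Python: `start_stop_latch`, `HoldToStart`, and the hold time are hypothetical names and values, and a real system would implement the same logic in the controller.

```python
def start_stop_latch(running: bool, start_pb: bool, stop_pb: bool) -> bool:
    """Stop-dominant seal-in: if both buttons are pressed, stop wins."""
    if stop_pb:
        return False
    return running or start_pb


class HoldToStart:
    """Report a valid start request only after a continuous press."""

    def __init__(self, hold_seconds: float):
        self.hold_seconds = hold_seconds
        self._held = 0.0

    def update(self, pressed: bool, dt: float) -> bool:
        # Accumulate time while pressed; any release resets the timer.
        self._held = self._held + dt if pressed else 0.0
        return self._held >= self.hold_seconds
```

Evaluating the stop button first is what makes the safer choice dominate; swapping the order would make the pair start-dominant.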

Many HMI/OIT packages can validate operator-entered data themselves. An even more rigorous method is to limit-test the data within the control logic before accepting it. Individual project needs will dictate whether improper data is clamped at the limit, or rejected in a way that warns the operator. Values that logic derives at runtime and then uses in subsequent operations need to be validated just as if they were operator-entered. The most common validation scenarios are ensuring a value is not over-range or under-range, or making sure a value is not negative.
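A limit test with both behaviors, clamp or reject, might look like the sketch below. The function name `check_setpoint`, its return shape, and the message text are assumptions made for illustration.

```python
import math
from typing import Optional, Tuple


def check_setpoint(value: float, low: float, high: float,
                   clamp: bool = False) -> Tuple[Optional[float], str]:
    """Return (accepted_value, operator_message).

    accepted_value is None when the entry is rejected.
    """
    if math.isnan(value):
        return None, "rejected: not a number"
    if low <= value <= high:
        return value, ""
    if clamp:
        limit = low if value < low else high
        return limit, f"clamped to {limit}"
    return None, f"rejected: must be between {low} and {high}"
```

The same function can be applied to runtime-calculated values before they feed later operations, with the message routed to an alarm instead of an operator prompt.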

Consequences of bad data

What are the consequences of bad data? Improperly ranged or negative values can cause control loops to wind up. An unexpected zero can trigger a processor-halting divide-by-zero error in a calculation. Or, an improperly set or incremented value used as an indirect array address can actually point to an illegal location outside the valid range, causing a processor to stop running.
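Two of these failure modes can be guarded with small helper routines. The sketch below is in Python, where an unguarded divide or index raises an exception instead of halting a processor, but the guarding pattern is the same. The names `safe_divide` and `safe_index`, the fallback values, and the `eps` threshold are hypothetical.

```python
from typing import Sequence, Tuple


def safe_divide(numerator: float, denominator: float,
                fallback: float = 0.0, eps: float = 1e-9) -> Tuple[float, bool]:
    """Return (result, ok); use the fallback when the divisor is effectively zero."""
    if abs(denominator) < eps:
        return fallback, False
    return numerator / denominator, True


def safe_index(table: Sequence[float], index: int,
               fallback: float) -> Tuple[float, bool]:
    """Return (value, ok); refuse any index outside 0..len(table)-1."""
    # Negative indices are rejected too: Python would silently wrap them around.
    if 0 <= index < len(table):
        return table[index], True
    return fallback, False
```

The `ok` flag gives the calling logic something to alarm on, so a bad value is reported instead of being silently replaced.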

At a higher level, functional systems should be programmed to enter a safe state (usually “off”) upon system boot up, or on any critical error. Startup routines can be created in order to initialize data and values, and to drive control logic to a safe state. Rarely should sequences automatically restart after an alarm condition is removed. Instead, consider a two-step procedure where operators clear the error, then trigger a restart.
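One way to picture the safe boot state and the two-step restart is a small state machine. The sketch below is an assumption-laden illustration: the state names, the `Sequence` class, and the `ack_pb`/`start_pb` inputs are invented for the example.

```python
from enum import Enum


class State(Enum):
    OFF = "off"
    RUNNING = "running"
    FAULTED = "faulted"


class Sequence:
    def __init__(self) -> None:
        self.state = State.OFF  # boot into the safe state

    def scan(self, fault_active: bool, ack_pb: bool, start_pb: bool) -> State:
        if fault_active:
            self.state = State.FAULTED        # any critical error forces the fault state
        elif self.state is State.FAULTED:
            if ack_pb:
                self.state = State.OFF        # step 1: operator clears the error
        elif self.state is State.OFF:
            if start_pb:
                self.state = State.RUNNING    # step 2: operator triggers the restart
        return self.state
```

Because only one transition is taken per scan, pressing acknowledge and start together still lands in `OFF`, so the restart always takes a separate, deliberate action.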

During detailed design, consider adding some enhanced alarming that can help operators identify unusual trouble. For instance, a system with a pump filling a tank may have high level, low level, and pump fail alarms. If it is known that the tank should only take 5 minutes to fill, then a “slow fill” alarm can be incorporated. This alarm will not specifically define why a slow fill is happening, but could prompt the operator to go look for a broken hose or fitting that is spilling water to the floor.
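A slow-fill alarm of this kind is essentially a timer on the fill operation. In the sketch below, the class name, the default limit of 360 seconds (the 5-minute expected fill plus an assumed one-minute margin), and the non-latching behavior are all illustrative choices.

```python
class SlowFillAlarm:
    """Raise an alarm when a fill runs longer than limit_s seconds."""

    def __init__(self, limit_s: float = 360.0) -> None:
        self.limit_s = limit_s
        self.elapsed_s = 0.0
        self.active = False

    def update(self, pump_running: bool, tank_full: bool, dt: float) -> bool:
        if not pump_running or tank_full:
            # Not filling, or fill complete: reset the timer and the alarm.
            self.elapsed_s = 0.0
            self.active = False
        else:
            self.elapsed_s += dt
            self.active = self.elapsed_s > self.limit_s
        return self.active
```

A project that needs the alarm to stay on until an operator acknowledges it would latch `active` instead of clearing it automatically.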

Configure software to self-recover

In addition to disallowing any invalid operational modes, software should be configured to recover to a safe mode on its own if it is ever inadvertently driven to an illegal mode. Sometimes it makes sense to give the operator a “reset” or “abort” control that can stop and re-initialize a problematic sequence. Keep in mind that some would consider this type of logic to be a Band-Aid intended to make up for other poor programming practices.
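A simple form of self-recovery is to check the mode value on every scan and fall back to a safe mode when it is not one of the defined modes. The mode names and the `resolve_mode` helper below are hypothetical.

```python
from enum import IntEnum
from typing import Tuple


class Mode(IntEnum):
    SAFE = 0
    MANUAL = 1
    AUTO = 2


def resolve_mode(raw_mode: int) -> Tuple[Mode, bool]:
    """Return (mode, recovered); any undefined value falls back to SAFE."""
    try:
        return Mode(raw_mode), False
    except ValueError:
        # Illegal mode value: force the safe mode and flag it for alarming.
        return Mode.SAFE, True
```

The `recovered` flag lets the system log or alarm the event, so the underlying cause of the illegal mode can still be investigated.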

Fault and alarm indications on HMI/OIT stations must be clear and understandable. Cryptic messages or codes cannot be reliably acted upon. System reactions to operator inputs must be responsive enough to prevent operators from making multiple selections which could trigger undesired operation. Just as with consumer devices like phones and DVRs, a lagging response will cause the frustrated user to keep pressing buttons fruitlessly.

Develop a test plan

How do you know if your good engineering efforts are sufficient to defend against the unexpected? Test, test, test! Develop a test plan, preferably around the time the system is designed, so that it tests all key features. Attempt to trigger or simulate various failures and potentially illegal operator actions. Execute the test plan, and don’t be afraid to use it as a springboard for developing additional specific test cases that look useful. Make some test actions faster and slower than typical to search out bad interactions.

We started this blog series by comparing an automation engineer’s tasks with those of a driver on a challenging road. In both cases, it is clear that training, planning, practice, and experience will lead to the most successful outcome. Not every bad situation can be prevented, but a multi-layered approach to defending against and reacting to the unknown is the best bet. There are usually few arguments against building a resilient automation system, which can safely and defensively respond to non-normal conditions. Always be on the lookout for opportunities to improve your designs by challenging them with various conditions that flush out potential weaknesses.

Paul Darnbrough
Paul Darnbrough, P.E., CAP, is a principal at ControlsPR and previously worked in the Automation Solutions Group at MAVERICK Technologies, a Rockwell Automation company. He has more than 25 years of experience in engineering, documentation, and construction of automated industrial and process control systems. Paul has worked with clients ranging in size from small single-owner operations up to Fortune 500 companies and government agencies, involving operations in the plastics, food, dairy, chemicals, material handling, discrete manufacturing, water treatment, and pharmaceutical industries.
