ISA Interchange

What are the Biggest Mistakes You Have Seen? Part 1

Written by Greg McMillan | May 10, 2022 9:30:00 AM

The following discussion is part of an occasional series, "Ask the Automation Pros," authored by Greg McMillan, industry consultant, author of numerous process control books, and 2010 ISA Life Achievement Award recipient. Program administrators collect submitted questions and solicit responses from automation professionals. Past Q&A videos are available on the ISA YouTube channel. View the playlist here. You can read all posts from this series here.

Looking for additional career guidance, or to offer support to those new to automation? Sign up for the ISA Mentor Program.

See Part 2 here. See Part 3 here. See Part 4 here. See Part 5 here.

Mistakes often offer the greatest learning experiences. Mentor program participants share what they have seen in this multi-part series. We are fortunate to have an extensive list in Part 1 from Luis Alberto Navas Guzmán, a senior process control engineer who has been an enthusiastic participant in the ISA Mentor Program, promoting its use in South America and contributing to Mentor Q&A posts and Control Talk columns.

Greg McMillan’s Question 

What are the biggest mistakes you have seen in automation system design, configuration, calibration, installation, commissioning, and maintenance? What were the consequences and the fixes and what can be done to prevent future occurrences? 

Luis Alberto Navas Guzmán’s Answer 

These examples come from years of experience at different companies and in different countries.

 

Installation 

 

1. Resistance Temperature Detector (RTD) installed in an oil pipeline without the thermowell.

  • Consequences: RTD maintenance is only possible when the plant is stopped, and the pipeline is depressurized.
  • What can be done? Quality assurance during installation and qualified technicians.

2. Water column pressure transmitter for an oil tank installed without a block valve.

  • Consequences: Transmitter maintenance is only possible when the tank is empty.
  • What can be done? Engineering quality assurance, quality assurance during installation, and qualified technicians.

3. Instruments with unused inlets closed only with plastic caps.

  • Consequences: Over time, the sun degrades the plastic caps, and eventually moisture and water damage the instrument.
  • What can be done? Close unused instrument inlets with metal plugs, quality assurance during installation, and qualified technicians.

4. A mid-size control valve with the wrong failure condition. It should have been fail-open, but it was purchased as fail-last-position.

  • Consequences: The valve stays in its last position on failure, with the associated risk.
  • What can be done? Engineering quality assurance.

5. Equipment/devices installed inside a cabinet without the minimum clearances recommended by the original equipment manufacturer (OEM).

  • Consequences: Poor heat dissipation and reduced mean time between failures (MTBF).
  • What can be done? Engineering quality assurance (read the installation manual!)

6. Wiring junction boxes (JBs) or any other kind of box installed in the field without cable glands/plugs.

  • Consequences: Ingress protection reduced (dust, moisture).
  • What can be done? Quality assurance during installation and qualified technicians.

7. Lack of names or lack of naming convention for equipment or cabinets installed.

  • Consequences: Lack of order and consistency.

  • What can be done? Good engineering practices: hardware design specification, quality assurance, etc.

8. Level control valve undersized.

  • Consequences: The manual bypass valve has to stay partially open all the time. Poor control. Flow never reaches zero, even when needed, because the bypass valve is open.

  • What can be done? Quality assurance during process engineering phase.

9. Electrical power socket not identified and/or uninterruptible power supply (UPS) power sockets easily accessible.

  • Consequences: During a cleaning activity, an electric vacuum cleaner was plugged into a UPS power socket inside a controller cabinet. This kind of UPS outlet is not intended for that kind of load. The inductive load caused an overcurrent that tripped the circuit breaker, and the controller lost its power supply, disrupting the process.

  • What can be done? Good design, clear identification, and restricted access to control cabinets and critical rooms. A solid permit-to-work process in place.

10. Objects found within pipelines (this has happened a couple of times in different plants in different industries).

  • Consequences: During commissioning or startup of existing or new plants, objects such as safety helmets or safety shoes were found inside the pipelines, affecting process startup and the integrity of the process, pipelines, and instrumentation.
  • What can be done? Construction quality assurance and qualified technicians.

 

 Configuration 

 

1. Logic execution order configured incorrectly.

  • Consequences: Unexpected logic behavior and eventually an impact on field/safety.
  • What can be done? Qualified programmers, software quality assurance, checking during factory acceptance test (FAT). Generally, the execution order should follow the natural data flow.
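The idea that execution order should follow the natural data flow can be checked with a topological sort of the signal dependencies. The following is only a minimal sketch: the block names and the `dependencies` mapping are hypothetical, and real control platforms set execution order through their own configuration tools.

```python
from graphlib import TopologicalSorter

# Hypothetical logic blocks: each key lists the blocks whose outputs it reads.
dependencies = {
    "AI_FLOW": [],             # analog input, no upstream block
    "FILTER": ["AI_FLOW"],     # filters the raw input
    "PID_FLOW": ["FILTER"],    # controller uses the filtered value
    "AO_VALVE": ["PID_FLOW"],  # analog output writes the PID output
}

def execution_order(deps):
    """Return an order in which every block runs after the blocks it reads."""
    return list(TopologicalSorter(deps).static_order())

def follows_data_flow(order, deps):
    """Check that no block executes before one of its upstream blocks."""
    position = {name: i for i, name in enumerate(order)}
    return all(
        position[upstream] < position[block]
        for block, upstreams in deps.items()
        for upstream in upstreams
    )

order = execution_order(dependencies)
print(order)
print(follows_data_flow(order, dependencies))
```

A check like `follows_data_flow` could be run against an exported block list during FAT, assuming the platform allows such an export.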

2. Fail-state configuration of analog and digital outputs left at default (applicable when the output cards lose communication with the central processing unit [CPU]).

  • Consequences: Potential impact on safety and product quality.
  • What can be done? Qualified programmers, considering this kind of failure during the identification and analysis of risk, process hazard analysis (PHA), etc.

3. Control system alarms left at their default settings.

  • Consequences: Unsafe conditions because the alarms have not been rationalized.
  • What can be done? Functional design specification, FAT testing, alarm specifications, qualified programmers, ISA-18.2.

4. Not all possible logical values covered in the logic configuration.

  • Consequences: Unexpected logic behavior, wrong human machine interface (HMI) animation, and eventually an impact on field/safety.
  • What can be done? Close all the gaps by covering all possible options. Qualified programmers, FAT testing.
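As an illustration of covering all possible options, consider a valve with two limit switches. Two boolean inputs give four combinations, and each one should map to a defined state. The sketch below is an assumption for illustration; the state names and the function `valve_state` are hypothetical and not taken from any particular platform.

```python
from enum import Enum

class ValveState(Enum):
    OPEN = "open"
    CLOSED = "closed"
    TRAVELLING = "travelling"
    FAULT = "fault"

def valve_state(open_limit: bool, closed_limit: bool) -> ValveState:
    """Map every combination of the two limit switches to a defined state."""
    if open_limit and closed_limit:
        # Both switches made at once: not possible for a healthy valve.
        return ValveState.FAULT
    if open_limit:
        return ValveState.OPEN
    if closed_limit:
        return ValveState.CLOSED
    # Neither switch made: the valve is between its end positions.
    return ValveState.TRAVELLING

# Enumerate all four input combinations to confirm none is left undefined.
for open_limit in (False, True):
    for closed_limit in (False, True):
        print(open_limit, closed_limit, valve_state(open_limit, closed_limit).value)
```

The same enumeration can drive the HMI animation, so that the graphic always shows one of the four defined states.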

5. Lack of naming convention of logic nodes, programmable logic controllers (PLCs), variables, graphics, etc.

  • Consequences: Lack of order and consistency. Long time to reverse engineer or perform troubleshooting or understand the logic.
  • What can be done? Good engineering practices. Software design specification, quality assurance, etc.

6. Suboptimal communication mapping between the main control system and the package units.

  • Consequences: Lack of order and consistency. Long time to reverse engineer or perform troubleshooting. Inefficient use of data packets.

  • What can be done? Good engineering practices, FAT testing, software design specification, quality assurance, and qualified programmers. Use integers instead of bits, and, if possible, define subsets of registers for each functionality.
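One reading of "use integers instead of bits" is to pack related status bits into a single integer register, so that one register read returns the whole status of a package unit. The sketch below assumes that reading; the bit assignments in `STATUS_BITS` are hypothetical.

```python
# Hypothetical bit assignments for one package-unit status word.
STATUS_BITS = {"running": 0, "tripped": 1, "remote": 2, "alarm": 3}

def pack_status(flags: dict) -> int:
    """Pack named boolean flags into one integer register value."""
    word = 0
    for name, bit in STATUS_BITS.items():
        if flags.get(name, False):
            word |= 1 << bit
    return word

def unpack_status(word: int) -> dict:
    """Recover the named flags from the integer register value."""
    return {name: bool((word >> bit) & 1) for name, bit in STATUS_BITS.items()}

word = pack_status({"running": True, "remote": True})
print(word)                 # bits 0 and 2 set
print(unpack_status(word))
```

Keeping the bit map in one documented table also supports the naming and consistency points made in the previous item.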

7. Critical variable values were not maintained after a controller cold restart.

  • Consequences: Undesired values when the CPU resumes operation after a cold restart and, in turn, an undesired response of the process under control. Examples include proportional-integral-derivative (PID) setpoints, PID tuning settings, or any other variable considered critical for the process.

  • What can be done? Use the mechanisms/methods that the system/technological platform has to preserve the values.

8. Numerical input fields in HMI without limits.

  • Consequences: Undesired or nonsense values with unexpected logic behavior and in turn undesired response of the process under control.
  • What can be done? Configure input limits according to the specific application. Clarify the input limits to the operator.
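The logic behind an input limit is simple: reject any entry outside the configured range before it reaches the control logic. The sketch below is a generic illustration; the function name `validate_entry` and the example limits are assumptions, and an actual HMI package would normally offer this as a field property.

```python
def validate_entry(value: float, low: float, high: float) -> float:
    """Accept an operator entry only if it lies within the configured limits."""
    if not (low <= value <= high):
        # The message states the limits so the operator knows what is allowed.
        raise ValueError(f"Entry {value} is outside the allowed range {low} to {high}")
    return value

# Hypothetical setpoint field limited to 0-100 %.
print(validate_entry(50.0, 0.0, 100.0))
try:
    validate_entry(150.0, 0.0, 100.0)
except ValueError as error:
    print(error)
```

Rejecting the entry with a clear message, rather than silently clamping it, also addresses the point about clarifying the limits to the operator.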

9. Inherently sequential process configured with function blocks.

  • Consequences: Long time to reverse engineer or perform troubleshooting or understand the logic.
  • What can be done? Use Sequential Function Chart (SFC) or the programming language that best fits the specific needs.
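To show why a step-and-transition structure is easier to read for sequential processes, here is a rough sketch of a sequence written as an explicit state machine. It is not SFC itself, and the steps and signal names are hypothetical; it only illustrates how each step names the one condition that ends it.

```python
from enum import Enum, auto

class Step(Enum):
    IDLE = auto()
    FILL = auto()
    HEAT = auto()
    DRAIN = auto()

# Hypothetical sequence: each step names the condition that ends it and the next step.
TRANSITIONS = {
    Step.IDLE: ("start_cmd", Step.FILL),
    Step.FILL: ("level_high", Step.HEAT),
    Step.HEAT: ("temp_reached", Step.DRAIN),
    Step.DRAIN: ("level_low", Step.IDLE),
}

def next_step(current: Step, signals: dict) -> Step:
    """Advance only when the transition condition of the active step is true."""
    condition, target = TRANSITIONS[current]
    return target if signals.get(condition, False) else current

step = Step.IDLE
for signals in ({"start_cmd": True}, {"level_high": False}, {"level_high": True}):
    step = next_step(step, signals)
    print(step.name)
```

The whole sequence is visible in one table, which is the property that makes troubleshooting faster than tracing the same behavior through interlocked function blocks.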

 PID 

 

1. All PID modes (e.g., automatic, manual, remote setpoint, remote output) enabled when not needed and/or not selecting the correct default mode.

  • Consequences: The operator can easily select the wrong mode.
  • What can be done? Enable only the required modes and select the correct default mode. Software design specification, FAT testing, qualified process control engineers, etc.

2. Wrong setpoint-process variable (SP-PV) tracking option selected.

  • Consequences: Wrong SP and potential impact on process safety and/or product quality.
  • What can be done? Check and verify the required type of tracking or setpoint initialization for the specific application and operating mode.

3. PID status options not checked.

  • Consequences: Undesirable effects on PID control under certain bad/uncertain status circumstances.
  • What can be done? Check/review the PID status options as required, software design specification, FAT testing, and qualified process control engineers.

4. Wrong perception by the operator and/or an inexperienced control engineer (PID action changed from direct to reverse, and the process is then upset).

  • Consequences: Process upset with downtime.
  • What can be done? Double check and confirm what the operator says. Ask for a more seasoned control engineer. Confirm the action by tests in manual. For example, change the PID algorithm output in manual and note whether the PV, after all signal reversals in the configuration and the field, moves in the same direction as the output (direct) or in the opposite direction (reverse). Then choose the opposite control action so that the controller corrects the observed PV change: reverse action for a direct response, direct action for a reverse response. The output changes considered in these tests are those of the PID algorithm before any output reversal caused by specifying “increase-to-close” valve action in the PID configuration. A seasoned control engineer is needed to do the tests.
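The decision rule from the manual test can be written down compactly. The sketch below assumes the common convention that direct action means the controller output increases when the PV increases; the function name and arguments are hypothetical, and the test itself still has to be done by a seasoned engineer on the real loop.

```python
def required_control_action(output_change: float, pv_change: float) -> str:
    """Choose the PID action from an open-loop test made in manual.

    output_change: change in PID algorithm output, before any output reversal
    pv_change: resulting change in the process variable
    """
    if output_change == 0 or pv_change == 0:
        raise ValueError("The test must produce a measurable change in both signals")
    same_direction = (output_change > 0) == (pv_change > 0)
    # PV moves with the output: the output must fall when the PV rises (reverse action).
    # PV moves against the output: the output must rise when the PV rises (direct action).
    return "reverse" if same_direction else "direct"

print(required_control_action(5.0, 2.0))   # PV rose with the output
print(required_control_action(5.0, -2.0))  # PV fell when the output rose
```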

5. Wrong characterization on a split-range control strategy.

  • Consequences: Two control valves working like on-off valves.
  • What can be done? Make a good design first, and test it in simulation.
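A simple simulation of the characterization can reveal this kind of problem before commissioning. The sketch below assumes a sequential arrangement in which valve A strokes over the lower part of the PID output and valve B over the upper part; the function `split_range` and the default split point of 50% are illustrative assumptions, not the design from the case described.

```python
def split_range(pid_output: float, split: float = 50.0):
    """Map one PID output (0-100 %) to two valve signals (each 0-100 %).

    Assumes 0 < split < 100. Valve A strokes from 0 to split,
    valve B strokes from split to 100.
    """
    out = min(max(pid_output, 0.0), 100.0)
    valve_a = min(out / split, 1.0) * 100.0
    valve_b = max((out - split) / (100.0 - split), 0.0) * 100.0
    return valve_a, valve_b

# Tabulate the characterization to check that both valves throttle smoothly.
for output in (0.0, 25.0, 50.0, 75.0, 100.0):
    print(output, split_range(output))
```

If the tabulated signals jump from 0 to 100 over a very small change in PID output, the valves would behave much like the on-off valves described above.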

 

Measurements

 

1. Mismatch between the instrument span and the corresponding control system's range.

  • Consequences: Wrong measurement and its associated consequences.
  • What can be done? Quality assurance during FAT, commissioning, maintenance, or control system changes. 
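A short calculation shows how large the error from a range mismatch can be. The numbers below are hypothetical: a transmitter calibrated for 0 to 200 degC feeding a control system input scaled 0 to 150 degC, with a standard 4-20 mA signal assumed.

```python
def scale_4_20(current_ma: float, low: float, high: float) -> float:
    """Convert a 4-20 mA signal to engineering units for a given range."""
    return low + (current_ma - 4.0) / 16.0 * (high - low)

# Hypothetical case: transmitter span 0-200 degC, control system range 0-150 degC.
signal = 12.0                               # mA, mid-scale
actual = scale_4_20(signal, 0.0, 200.0)     # value the transmitter represents
displayed = scale_4_20(signal, 0.0, 150.0)  # value the control system shows
print(actual, displayed, actual - displayed)
```

At mid-scale the control system in this example reads 25 degC low, and the error grows toward the top of the range.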

Wiring

 

1. Foundation Fieldbus segment with the trunk cable shield grounded in all the JBs.

  • Consequences: Ground loops, unreliable communication.
  • What can be done? Follow OEM installation recommendations, quality assurance during installation and commissioning, and qualified technicians.

2. Fieldbuses (e.g., Profibus DP/PA, Foundation Fieldbus) topologies not wired as buses.

  • Consequences: Unexpected behavior, unreliable communication.
  • What can be done? Follow OEM installation recommendations, quality assurance during installation and commissioning, and qualified technicians.

3. Foiled twisted pair (FTP) or shielded twisted pair (STP) Ethernet cable with plastic connectors at both ends.

  • Consequences: Electromagnetic interference/radio frequency interference (EMI/RFI) impact. Unreliable communication.
  • What can be done? Follow OEM installation recommendations.

4. Variable frequency drive (VFD) not properly grounded.

  • Consequences: Noise from pulse width modulation (PWM) disturbs industrial fieldbus or Highway Addressable Remote Transducer (HART) communication.
  • What can be done? Follow OEM installation recommendations, proper grounding and wiring, quality assurance during installation and commissioning, and qualified technicians.

5. Unsuitable wiring tools.

  • Consequences: Poor wiring termination, MTBF reduced.
  • What can be done? Use the appropriate tool, follow OEM installation recommendations, quality assurance during installation and commissioning, and qualified technicians.

6. Wiring not identified.

  • Consequences: Poor wiring termination. Lack of order and consistency. Long time to reverse engineer or perform troubleshooting.
  • What can be done? Good engineering practices, FAT testing, hardware design specification, quality assurance, etc. 

7. Level switches wired and configured in an unsafe way.

  • Consequences: Unsafe wiring practice, and therefore unsafe process condition.
  • What can be done? Always ensure that current flows when the switch is in its passive (not activated) position. For instance, wire the high-level switch to its normally closed (NC) contact. Current then always circulates, and if the high level is reached or a wire breaks, the control system triggers the alarm and the required actions and interlocks. Wire the low-level switch, on the other hand, to its normally open (NO) contact, because the liquid level normally keeps the switch activated. Current then circulates until the level drops far enough or a wire breaks; the loss of current makes the control system trigger the alarm and the required actions and interlocks.
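The fail-safe convention described above can be modeled in a few lines: current flowing means healthy, and loss of current, for any reason, means alarm. The function names below are hypothetical, and the model simply treats a broken wire as an open circuit.

```python
def level_alarm(input_energized: bool) -> bool:
    """Fail-safe convention: current flowing means healthy, no current means alarm."""
    return not input_energized

def high_switch_input(level_is_high: bool, wire_intact: bool = True) -> bool:
    """High-level switch wired to its NC contact: the contact opens at high level."""
    contact_closed = not level_is_high
    return contact_closed and wire_intact

def low_switch_input(level_is_low: bool, wire_intact: bool = True) -> bool:
    """Low-level switch wired to its NO contact: the liquid holds the contact closed."""
    contact_closed = not level_is_low
    return contact_closed and wire_intact

# A broken wire raises the alarm even though the level is normal.
print(level_alarm(high_switch_input(level_is_high=False, wire_intact=False)))
print(level_alarm(low_switch_input(level_is_low=False, wire_intact=False)))
```

With the opposite wiring, a broken wire would look identical to a healthy level, and the alarm would never be raised.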

Management/Maintenance

 

 1. Obsolete technical documentation.

  • Consequences: Long time to do troubleshooting and find issues. Lack of foundation for new projects.
  • What can be done? Ensure a correct handover between the engineering and maintenance departments and have a clear process in place to keep the documentation up to date.

Commissioning

 

1. Download to the wrong logic solver during the final stage of commissioning.

  • Consequences: Shutdown of the plant. Impact on all commissioning work fronts.
  • What can be done? A good naming convention that clearly differentiates the logic solvers. Double check before downloading.

2. Adding a new oil well-head node in a wireless network caused a loop.

  • Consequences: Wireless network went down for some minutes.
  • What can be done? Be careful not to create loops in wired or wireless networks. Make use of available technology and its features to avoid loops, e.g., Spanning Tree Protocol (STP). When a ring topology is necessary for availability reasons, Rapid Spanning Tree Protocol (RSTP) or Media Redundancy Protocol (MRP) are good options.
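Before adding a node, a planned topology can be checked offline for loops. The sketch below is not how STP, RSTP, or MRP work; it is only a simple planning check, using a union-find structure, that tells whether a new link would connect two nodes that are already connected. The node names are hypothetical.

```python
def creates_loop(links, new_link):
    """Return True if adding new_link to the existing links would close a loop."""
    parent = {}

    def find(node):
        # Follow parent pointers to the representative of the node's group.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two groups joined by this link
    a, b = new_link
    return find(a) == find(b)

# Hypothetical network: gateway GW with two nodes in a chain.
existing = [("GW", "N1"), ("N1", "N2")]
print(creates_loop(existing, ("N3", "GW")))  # new node with a single link
print(creates_loop(existing, ("N2", "GW")))  # link that closes a ring
```

A link that closes a ring is acceptable only when a redundancy protocol such as RSTP or MRP is configured to manage it.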