Automation networking demands resilience

Paul Boughton

Engineering professionals in disparate industries all need to integrate increasingly complex systems while at the same time ensuring reliability. Boris Sedacca asks engineers how to achieve this.

Increasing complexity in automated systems adds new points of failure. Engineers must ensure that individual component failure does not bring entire systems to a halt. Dramatic developments in supervisory control and data acquisition (SCADA), fieldbus and resilient computing have transformed the industrial landscape over the past two decades.

HMI, PLC and servo technology have all been affected by fixed and wireless communications. Fault-tolerant servers using dual- or multiple-redundant industrial networks need to be connected to plug-and-play hardware components, all controlled by resilient SCADA software systems. However, engineers have differing views on the best way to achieve this, and cost is a particularly relevant factor.

Michael Kohli, president of Kohltek, says: "I had a customer who works for a major robotic manufacturer and when one of the engineers said he needed redundancy, my customer reminded him they were not willing to pay for it. Apart from cost issues, hardware needs to be immune to interference and fade. It could mean using Spread Spectrum Hopping radios instead of Wi-Fi, RS-485 instead of RS-232, or fibre instead of wire."

Kohli contends that proper use of combined networks such as those under Rockwell's Common Industrial Protocol (CIP), or of switches with Internet Group Management Protocol (IGMP) snooping to control traffic, can ensure proper bandwidth allocation.
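To illustrate the IGMP point: when a device joins a multicast group it emits an IGMP membership report, and a snooping switch uses those reports to forward the group's traffic only to ports with subscribed devices rather than flooding every port. The Python sketch below shows the join from the device side; the group address is a hypothetical example, and the port is the one conventionally used for CIP implicit I/O traffic.

import socket
import struct

# Hypothetical multicast group for a producer/consumer I/O connection.
GROUP = "239.192.1.10"
PORT = 2222  # UDP port conventionally used for CIP implicit I/O

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# Joining the group causes the host to send an IGMP membership report.
# A switch with IGMP snooping enabled watches these reports and forwards
# the group's traffic only to subscribed ports instead of flooding it.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

while True:
    data, addr = sock.recvfrom(1500)
    print(f"{len(data)} bytes of I/O data from {addr}")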

Sensor voting

"The easiest way to provide redundancy is through sensor voting, where three sensor are used and the two that agree win," adds Kohli. "The use of 1+n redundancy is standard in most high reliability systems, but care needs to be taken to insure that repair can be made with taking the system down such as the use of hot swappable products."

Resilience in avionics software is as taxing as it can get for engineers. Rich Merritt, proprietor of M3 Marketing Communications, cites the case of an avionics company that abandoned Matlab in favour of C to develop resilient software. He says: "Matlab has three major issues: firstly, it takes about ten times as much memory as an equivalent C program; secondly, nobody understands what it is doing; and thirdly, it is almost impossible to test.

"On one airplane, a staff of about 50 programmers worked for two years and could not get it to work. So they called in a consultant and he rewrote the whole thing in C in six months - and it works. On another airplane, the thoroughly tested, simulated, and FAA-approved software went completely dead on the first flight. The test plane still had manual controls, so they could bring it home safely.

"When I worked for NASA, we had to test everything. Commercial jets continue to use ancient avionics and software developed in the 1980s, because they do not trust the new stuff or today's crop of programmers. Let us just hope that avionics technology does not work its way over to automation and process control."

Brian Chapman, SCADA software engineer at Schneider Electric, believes that centralised control is always more reliable than distributed control. "I do not believe that airliners or spaceships use distributed control. They use redundant central control. If anything goes wrong, the entire control system is replaced. As far as software validation is concerned, this is the best method.

"All the communications between controllers is a huge source of unpredictable errors that you do not want to find out about 40,000 feet in the air. Distributed control does not really make sense. If a controller that actuates the rudder loses communications with the cockpit, just what kind of decision can it make? There is none. You lose control of that rudder and everyone dies.

"Distributed control systems are not more reliable than distributed control systems; just the opposite. Distributed control systems are just cheaper. The hardware is cheaper. A PLC with 100 I/Os is more expensive than 4 PLCs with 25 I/Os each. Networking is cheaper. Add up the feet of wire and in most cases, there will be much less of it in a DCS. If one PLC dies, you only have to replace that one, rather than the entire system."

Chapman responded to Merritt's comments, saying VxWorks is C code. "If they wrote a real-time system in C, that is probably what they were using," he said. "It is really a neat system. It creates very small programs. It even re-compiles the operating system (Unix/Linux) to optimise and minimise it for running just that program.

"Try this in Windows. Go ahead and try to wipe FreeCell off your SCADA system. You can hide it, but you cannot get rid of it. I have never had the pleasure (or pain) of using Matlab for writing real time control software. I think what it is really good for is modelling systems, and developing control algorithms from the simulated models. But once you have decided on the control method, it should be hard coded in a real-time operating system in my personal opinion."

Dick Caro, computer networking consultant and owner of CMC Associates, argues that wired networks rarely suffer from failure of the network components, but can fail due to mechanical problems like the proverbial forklift truck that tears up the wiring. Wireless networks, particularly mesh networks such as ISA100 and WirelessHART, have in-built resilience through alternative signal paths.

User experience

Caro says: "User experience has proven the reliability of Foundation Fieldbus networks, where most failures occur in the I/O cards, not the network. Foundation Fieldbus HSE has built-in resilience if you install parallel Ethernet signal paths, one of which can be Wi-Fi, but HSE is rarely used because most DCS suppliers do not support it. I advise my clients to be highly selective in installing resilient networks to only those situations where failure is likely."

Kohli maintains that EtherNet/IP is not a good candidate for robustness or resilience: it can be broken too easily unless engineers take extreme precautions in designing and maintaining the network. They need to use industrial managed switches, and routers with special recovery protocols, not just spanning tree.

"With normal switches, recovery from a ring failure can take as much a two minutes in large networks," assert Kohli. "Most industrial switch companies use custom algorithms that can take as less than 100ms for the same network. This is a big deal when that vat of molten steel is starting to pour.

"In addition to using ring networks, the higher reliability networks need dual paths, which in a wire system need a dual homing switch. This equipment can be just as expensive as the dedicated buses. Ethernet networks are also sensitive to loads. While you may have +250 addresses available, most will not support nearly that many, and then you need to be concerned with things like vision systems and other peripherals that may be connected requiring separate networks or load balancing.

"There is a third problem that all Ethernet networks face, like Modbus TCP/IP, ProfiNet, CCLink over Ethernet, and so on: that is that anyone, from the guy with the laptop to the corporate IT Manager, can connect to your network with the right knowledge or permission, and break it. This requires additional safety measures that the engineer may not have due to company policies.

"Most systems allow things like MAC filtering or static IP address, but often those are controlled by the IT Department. A better way is to use networks like Rockwell Automation ControlNet, which can handle the faults and still talk to the Ethernet network via CIP. I think it's great that we can use inexpensive networks, but remember they are inexpensive for a reason, and features cost."[Page Break]

Redundant fieldbus expensive

Merritt retorts that the problem with fieldbus networks is that they are expensive, and redundant fieldbus segments even more so. He believes that is why EtherNet/IP will replace fieldbus in the not-so-distant future, citing Rockwell Automation, Endress+Hauser and other vendors who are using EtherNet/IP for process instrumentation.

Merritt adds: "Sure, fieldbus cuts down on the number of wires needed for wiring 4-20mA instrumentation, but you still have all those expensive device couplers, fieldbus I/O cards, and software from six big vendors to deal with. Then you have the practical limitation of only 12 devices a segment. Fieldbus was an ultra-expensive solution to the wiring problems of 4-20mA systems. Ethernet IP, HART, and wireless will replace fieldbus one of these days."

Caro rejoins: "There are no Ethernet IP process variable transmitters yet, even from Endress+Hauser. They have not publicly announced any either. There are intrinsic safety and distance problems in taking Ethernet wiring to the field. Wireless can solve those problems, but Wi-Fi (IEEE 802.11) is not a low power technology yet, and there is no meshing protocol for it."

Before the days of desktop PCs, companies provided their users with 'dumb terminals' connected to mainframe computers, while on the shop floor, equipment was controlled by relays, microcontrollers or dedicated industrial minicomputers. Today, companies want integrated control from the factory sensor/actuator to the boardroom, and the thin client has emerged as the technology of choice for companies around the world.

Thin client is almost a throwback to the days of dumb terminals. However, instead of a large mainframe slicing up its processor time among a number of terminals, a central server containing several rack-mounted processor 'blades' apportions hardware and software resources via communications links to desktop users. The clients are thin client PCs with limited hardware resources such as disk storage and memory, and the bulk of the software runs on the server.

Fat clients

Rob Dinsmore, of HardwarePT, explains: "At present, businesses have to cope with increasingly powerful desktop PCs (fat clients) and the attendant problems of support, maintenance and security being spread throughout an organisation - which may be geographically dispersed across several locations or even several countries.

"The concentration of resources at the client end opens up the possibility of vulnerabilities should the hardware fail or be attacked deliberately. Thin client has revolutionised corporate IT and is now set to revolutionise industrial automation and processes. It cuts the amount of hardware needed at the operator's end - the client end - and consumes less power."

The more bespoke applications that run on client machines, the more difficult it is to get a client back up and running in the event of failure, particularly if an engineer then has to travel to the client location to fix the problem. If these applications run on a server instead, measures would typically be in place to automatically restore the client session onto another blade processor with minimal disruption, along the lines sketched below.
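As a hedged sketch of that kind of supervision, the Python fragment below shows a monitor that watches heartbeats from the blade hosting a session and reassigns the session to a standby blade when the heartbeat stops. The blade names, timing and session table are illustrative assumptions, not any particular thin-client vendor's API.

import time

# Illustrative sketch only: blade names, timings and the session table
# are assumptions, not a specific vendor's product behaviour.
HEARTBEAT_TIMEOUT = 5.0                      # seconds without a heartbeat

sessions = {"operator-hmi-01": "blade-A"}    # session -> blade currently hosting it
standby_blades = ["blade-B", "blade-C"]
last_heartbeat = {"blade-A": time.time()}

def record_heartbeat(blade):
    """Called each time a blade reports in over the management network."""
    last_heartbeat[blade] = time.time()

def check_and_failover():
    """Reassign sessions hosted on any blade whose heartbeat has gone stale."""
    now = time.time()
    for session, blade in list(sessions.items()):
        if now - last_heartbeat.get(blade, 0.0) > HEARTBEAT_TIMEOUT:
            if not standby_blades:
                print(f"{blade} is down and no standby blade is available")
                continue
            new_blade = standby_blades.pop(0)
            sessions[session] = new_blade
            last_heartbeat[new_blade] = now
            print(f"{blade} missed heartbeats; {session} restored on {new_blade}")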
