Communication Failure or No Communication Failure


Thread Starter

Babar Jamil

Dear Fellows: I am working with a SCADA system integrator. It is a Windows NT, Ethernet/TCP-IP based SCADA system. I have run into a terminology issue with my contractor. Here is the issue. Any time, and for any reason, that the SCADA I/O server is not able to receive any or all of its data from a certain PLC, the SCADA I/O server declares a communication failure. The I/O server does not care about the reason for not getting data. Communication with the PLC NIC may be perfectly fine, but the I/O server will still declare a communication failure. I don't agree with this. I think that as long as the I/O server can talk to the PLC NIC, communication is perfectly fine and the I/O server should not call it a communication failure. The I/O server, based upon its built-in intelligence and to the best of its capacity, should try to identify the actual cause of not getting data from the PLC: for example, a problem with the PLC, a CRC error, a short message error, etc. This way the system operator will readily know the real cause of not getting data and will not have to chase a communication failure (switch or router failure, etc.), because communication did not fail. Any comments will be appreciated...


Unless your integrator actually wrote the code for the I/O server, they will be limited to providing only the functionality that the software itself will provide. Meaning, if the I/O server does not have the *capability* to tell you why it failed, then you can yell at the integrator till you are blue in the face, and at the end of the day, you will still only get a 'communication failure'. It probably is not an issue with the integrator, but with the developer of the software. You can contact them to try to get this feature added, but don't count on any of them responding to a lowly end user request. (Sarcasm intended). On the other hand, if the I/O server is capable of detailing the reason for failure, then it would seem beneficial to the integrator to break this out for the operator so that if the integrator gets called for service work, they can better ascertain the cause of the problem. Just my $.02 --Joe Jansen

Darold Woodward

Are you really going to present highly detailed alarm information to a process control person? That detail is not for the convenience of the process control person, but for the guy who has to fix it. It seems to me that any error is an error that the person at the SCADA terminal needs to see as an error. You also need to consider conditioning the alarm so that transient conditions are not reported as continuing alarms, but do rise to the surface so that they can be addressed. I don't know anybody responsible for keeping a process running who knows or cares what a CRC is. Perhaps you need to provide an alternate screen that displays more robust diagnostics. When the technician arrives to troubleshoot, the first thing he asks the process person to do is show him the diagnostic screen so that he knows how to attack the problem. If you do what I've seen (and it is a real disaster), you will swamp the process people with hundreds of errors with cryptic messages that don't help them run the process. Here's an example: In the first two minutes of the event at Three Mile Island, there were over 200 alarms, bells, etc. Would you be able to accomplish anything effective with all of that data? No, you must design the system taking into consideration what information is needed and what should be stored for someone else, and how to process data into information. So, in my opinion, if the data on the screen is suspect, then there is a failure. You can call it anything you want on the screen. Communications failure is just one choice and not an uncommon one. You can pick any terminology you want provided that your operators understand it. Maybe you should consider changing the display color of data points that aren't being collected, to signal that the data is old or perhaps corrupt, and skip the entire label issue. Darold Woodward PE SEL Inc. [email protected]
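The alarm conditioning described above could be sketched roughly as follows. This is only an illustration, not any SCADA package's actual logic: the class name ConditionedAlarm, the trip_after threshold, and the transient counter are all assumptions made for the example. The operator alarm goes active only after several consecutive failed polls, while shorter glitches are counted so the technician can still see them on a diagnostic screen.

```python
from dataclasses import dataclass


@dataclass
class ConditionedAlarm:
    """Raise an operator alarm only after several consecutive failed polls."""

    trip_after: int = 3            # hypothetical threshold: failed polls in a row
    consecutive_failures: int = 0
    transient_count: int = 0       # short glitches, kept for the technician
    active: bool = False

    def update(self, poll_ok: bool) -> bool:
        """Feed one poll result; return True while the operator alarm is active."""
        if poll_ok:
            # A failure run that ended before tripping was only a transient.
            if 0 < self.consecutive_failures < self.trip_after:
                self.transient_count += 1
            self.consecutive_failures = 0
            self.active = False
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.trip_after:
                self.active = True
        return self.active


alarm = ConditionedAlarm(trip_after=3)
poll_results = [True, False, True, False, False, False, True]
states = [alarm.update(ok) for ok in poll_results]
print(states)                 # alarm state after each poll
print(alarm.transient_count)  # glitches that never reached the operator
```

With these inputs the single failed poll is logged as a transient, and only the run of three failures reaches the operator. The right threshold depends on the poll rate and on how long the process can tolerate stale data.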

Robert Smyth

It seems that you're defining communications as being the physical and data link layers only. That is, you're saying that if the data link layer can talk to the device then communications is okay. But when telemetering data we are concerned with more than just these bottom two layers. I see them as a prerequisite rather than a satisfying condition for communications. If we have an application timeout, that is, it takes too long to read all the data, then I feel this is a communications failure even if the data link layer is still connected. My reasoning is that in a control (telemetering) application we are concerned with whether the communications is satisfying the application requirements. If it is not, then the communications has failed to do its job. But ... I agree that fault diagnostics are always important. So what is causing it to report the fault? Robert Smyth, EngWare Pty Ltd, Voice: 0411 202 514, Fax: 02 9222 1457, E-mail: [email protected]
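To make the layering argument concrete, here is a minimal sketch. The names CommStatus and classify_scan and the max_scan_seconds limit are hypothetical, not taken from any particular I/O server. A live link is treated as a prerequisite, and a scan that takes longer than the application allows still counts as a communications failure, but with a more specific status attached for diagnostics.

```python
from enum import Enum


class CommStatus(Enum):
    OK = "ok"
    LINK_DOWN = "link down"                       # lower layers failed
    APPLICATION_TIMEOUT = "application timeout"   # link up, data too slow


def classify_scan(link_up: bool, scan_seconds: float,
                  max_scan_seconds: float) -> CommStatus:
    """Classify one full data scan against the application's time limit."""
    if not link_up:
        return CommStatus.LINK_DOWN
    if scan_seconds > max_scan_seconds:
        return CommStatus.APPLICATION_TIMEOUT
    return CommStatus.OK


def is_comm_failure(status: CommStatus) -> bool:
    """Anything other than OK means communications did not do its job."""
    return status is not CommStatus.OK


# Link is connected, but reading all the data took 12 s against a 5 s limit.
status = classify_scan(link_up=True, scan_seconds=12.0, max_scan_seconds=5.0)
print(status.value, is_comm_failure(status))
```

The operator sees one failure condition either way; the status value is what a diagnostic screen or log would use to show why.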

Gerard Leemkuil

One suggestion. If you are using NT4, install Service Pack 6 and 6a. Some internal timers run better, and it resolves problems with Siemens OPC drivers for connection to RSView. For NT5 (Windows 2000), install Service Pack 1. It prevents some IP routing problems. Good luck, Gerard Leemkuil

Conrad van Rooyen

If you are using Citect for Windows you can do what you are asking. In Citect the I/O servers and related drivers are built into the SCADA software (there is no separate software for an I/O server). When a communication problem is detected, an error code for the particular protocol/medium being used is issued and displayed on every client's hardware alarm page, e.g. CRC error, Bad or Scrambled Response, No Response, Timeout, etc. This is a standard feature. The protocols are also configurable, so you can set the timeouts, retries and various other protocol settings. If you would like more information please contact me off-list at [email protected]
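For readers who want to see the general idea, here is a generic sketch of a poll with a configurable retry count that reports a specific error code. It is not Citect's driver code or API: the frame layout (payload followed by a 4-byte CRC-32), the code names, and the functions check_frame, poll and make_frame are all assumptions made up for the example.

```python
import zlib
from typing import Callable, Optional, Tuple

MIN_FRAME_LEN = 5  # assumed layout: at least 1 payload byte + 4 CRC bytes


def make_frame(payload: bytes) -> bytes:
    """Build a frame in the assumed layout: payload + big-endian CRC-32."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_frame(frame: Optional[bytes]) -> str:
    """Return an error code describing what is wrong with one reply."""
    if not frame:
        return "NO_RESPONSE"
    if len(frame) < MIN_FRAME_LEN:
        return "SHORT_MESSAGE"
    payload, received_crc = frame[:-4], frame[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != received_crc:
        return "CRC_ERROR"
    return "OK"


def poll(read_frame: Callable[[], Optional[bytes]],
         retries: int = 2) -> Tuple[str, int]:
    """Try up to retries + 1 times; return (last code, attempts used)."""
    code = "NO_RESPONSE"
    for attempt in range(1, retries + 2):
        try:
            code = check_frame(read_frame())
        except TimeoutError:
            code = "TIMEOUT"
        if code == "OK":
            return code, attempt
    return code, retries + 1


good = make_frame(b"data")
corrupt = bytearray(good)
corrupt[-1] ^= 0xFF  # flip bits in the CRC so the check must fail

# Simulated PLC replies: a short message, a corrupted frame, then a good one.
replies = iter([b"\x01", bytes(corrupt), good])
result = poll(lambda: next(replies), retries=2)
print(result)
```

If every attempt fails, the last error code is what would be shown on an alarm page, which gives the technician a starting point beyond a bare "communication failure".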