I'm having a problem with FIX 7.0 that I can't figure out. A couple of weeks ago we had a major power outage in the plant that shut everything down. After we got the power back on, I started having problems with one of our SCADA nodes: dbasrv.exe locks up whenever a remote SCADA tries to write data to the local SCADA's database.
I'll try to explain this and see if anyone else has seen this problem before.
SCADA1 provides process data for display on SCADA2 and is also used to transfer a control bit from SCADA2 to the PLC connected to SCADA1. SCADA2 can read data from SCADA1 all day long with no problems. When SCADA2 tries to write data to SCADA1, dbasrv.exe on SCADA1 locks up and communication between the two is lost. (There is also a SCADA3 that causes the same problem when it tries to write to SCADA1's database.)
My first impression was that something on SCADA1 got corrupted when it shut down during the power outage, so I took a spare hard drive and restored an image I had saved back in May. It seemed to work fine for about a day, and then the problem came back.
One strange thing I have noticed is that if I start FIX, go into the SCU and turn security off, then turn it back on, everything works fine. It will work until the software is shut down or the PC is rebooted.
These SCADAs communicate via Ethernet over our plant intranet, and they are on different subnets.
I've been working with FIX for a long time, and have seen and solved a number of problems, but this one has me stumped. So far I have tried completely reloading the software from scratch on a new PC, exporting and importing the process database and security configuration, swapping out the key, and connecting the two SCADAs directly via Ethernet using a 4-port switch and some Cat5 cable.
This almost seems to me like some sort of network or data packet problem. I did find a white paper on FIX that describes how tcptask.exe communicates with dbasrv.exe via some buffers called NDK/NOH. This seems to tie in with the error message I get:
NDK: n_receive_close() was not called!
Windows also gives me the standard error message about a program not being able to read a certain memory location.
Any ideas? I'd sure appreciate any help I can get. I couldn't find out much about this particular problem from the GE web site.
PLC/Electronic Controls Tech