Back in March of 2015, we underwent a DCS conversion from Foxboro I/A to Honeywell's Experion PKS, version R430. We have two servers, the primary and backup, along with a console station and five flex stations--three of which are a quad set of 1920 X 1200 screens, as is the console. The new DCS was implemented on a Kamyr Digester in a pulp mill, from the chip feed system through the washing and stock pumping system, just for context. We have an Allen Bradley PLC which we communicate with over SCADA, so it will remain out of this conversation as it is isolated from the issue.
On startup of the system, analog values linked to the single C300 controller (containing approximately 350 Control Modules) were flashing black on the new displays fairly frequently. We were unsure what was causing the issue. The C300 has plenty of free CPU and did not appear overloaded by Honeywell's standards.
After working with Honeywell TAC for a couple of months, it came down to a parameter known as the Total Responder Rate (TRR), which is the average number of Gets/Stores per second to/from the displays and data collection on our single C300. Since some of our displays are fairly dense, the number of parameters being updated across all the screens showing main L2 displays seemed to be the root issue. We investigated the custom shape designs and found that the displays were built about as efficiently as possible, with no excessive parameter calls.
One of the main contributors to the high TRR was our console station having a one-second update rate, while the flex stations were at five-second update rates. The operators had already fussed about how slow the new system was, but bringing all of the stations to a three-second update rate dropped the TRR by about 33%, which is where we are today.
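As a rough way to see why the console's one-second rate weighed so heavily, the display-driven request rate can be approximated as parameters on screen divided by update period, summed over stations. The sketch below is only a back-of-the-envelope model, not Honeywell's TRR calculation; the per-station parameter counts are made up for illustration, and it ignores data collection and peer-to-peer traffic, so it is not expected to reproduce the 33% figure.

```python
# Back-of-the-envelope model of display-driven requests per second.
# NOT Honeywell's TRR formula; the parameter counts below are hypothetical.

def display_request_rate(stations):
    """Sum of (parameters on screen / update period in seconds) over stations."""
    return sum(params / period_s for params, period_s in stations)

# Hypothetical loading: one console and five flex stations.
CONSOLE_PARAMS = 800   # assumed, not measured
FLEX_PARAMS = 400      # assumed, not measured

# Console at 1 s and flex stations at 5 s, then everything at 3 s.
before = [(CONSOLE_PARAMS, 1.0)] + [(FLEX_PARAMS, 5.0)] * 5
after = [(CONSOLE_PARAMS, 3.0)] + [(FLEX_PARAMS, 3.0)] * 5

rate_before = display_request_rate(before)
rate_after = display_request_rate(after)
reduction = 1.0 - rate_after / rate_before

print(f"before: {rate_before:.0f}/s, after: {rate_after:.0f}/s, "
      f"reduction: {reduction:.0%}")
```

With these made-up counts, the single console at one second accounts for two thirds of the display requests before the change, which is why slowing it has more effect than speeding up the flex stations costs.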
We also shut down our data collection into ParcView via OPC for about 20 minutes and we were able to reduce the TRR by another 40%.
Currently, we are averaging a TRR of around 1540; the overloaded "limit," according to those we work with, is around 2500. We get occasional spikes above 3000, and I've seen it max out at 7500. The "blackouts" are still occurring periodically, though less frequently--typically when operators are interacting heavily with the displays.
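To put the quoted numbers side by side, here is a small worked calculation. It assumes the 33% and 40% reductions each apply to the average TRR in effect when they were measured, which the figures above do not state explicitly.

```python
# Arithmetic on the TRR figures quoted in the post.
# Assumption: the 33% and 40% reductions are each relative to the
# average TRR in effect at the time they were measured.

AVG_TRR = 1540          # current average, from the post
LIMIT_TRR = 2500        # approximate "overloaded" limit, from the post
SPIKES = (3000, 7500)   # occasional spike level and observed maximum

# Implied average before the stations were moved to a three-second rate.
avg_before_rate_change = AVG_TRR / (1 - 0.33)

# Implied average with ParcView OPC collection shut off.
avg_without_opc = AVG_TRR * (1 - 0.40)

utilization = AVG_TRR / LIMIT_TRR
spike_ratios = [s / LIMIT_TRR for s in SPIKES]

print(f"average before rate change: ~{avg_before_rate_change:.0f}")
print(f"average without OPC collection: ~{avg_without_opc:.0f}")
print(f"average as share of limit: {utilization:.0%}")
print(f"spikes as multiple of limit: {spike_ratios}")
```

Under that assumption the average was near 2300 before the rate change and would be near 924 without OPC collection. The current average sits at roughly 62% of the quoted limit, while the spikes reach 1.2 to 3 times that limit, which would be consistent with the blackouts showing up during bursts of activity rather than in steady state.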
One of our main options is to stop collecting data directly from the C300 via OPC and instead use Honeywell's Historian (HDA) and pass the data through to ParcView, as we are already historizing approximately half of our parameters at the "fast" update rate of five seconds. Switching to the HDA would effectively cut the TRR down, but we would want to change the fast update rate to one second, which would increase the TRR once again.
Originally, our mill had a set of guidelines that the project team at Honeywell was supposed to follow (i.e. similar or improved graphics functionality, fast update rates into our site's historian, etc.) for a number of conversions in the mill.
If the Foxboro I/A CPs from the mid-1990s were capable of handling these display densities, and the Honeywell team was aware of this issue prior to our event, wouldn't this be a parameter to design around? Or at least to consider prior to the implementation of a new system?
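The trade-off here can be sketched with the same kind of rough arithmetic. In the model below the tag count and the ParcView OPC scan period are hypothetical; only the "about half at five seconds" split and the one-second target come from the description above, and it assumes the controller sees one request per tag per collection period.

```python
# Rough comparison of data-collection load on the controller.
# The tag count and OPC scan period are hypothetical; only the
# "about half at 5 s" split and the 1 s target come from the post.

def collection_rate(tag_count, period_s):
    """Requests per second to collect tag_count values every period_s seconds."""
    return tag_count / period_s

TAGS = 2000           # hypothetical number of collected parameters
OPC_PERIOD_S = 1.0    # assumed ParcView OPC scan period

# Today: ParcView reads every tag over OPC, and the Honeywell history
# separately collects about half of them at the 5 s fast rate.
today = collection_rate(TAGS, OPC_PERIOD_S) + collection_rate(TAGS / 2, 5.0)

# Option A: ParcView reads from the Honeywell history only, all tags at 5 s.
hda_5s = collection_rate(TAGS, 5.0)

# Option B: same path, but with the fast rate changed to 1 s.
hda_1s = collection_rate(TAGS, 1.0)

print(f"today: {today:.0f}/s, HDA at 5 s: {hda_5s:.0f}/s, HDA at 1 s: {hda_1s:.0f}/s")
```

Whatever the real counts are, going from five seconds to one second multiplies the history collection load by five, so most of the saving from dropping the direct OPC path would be given back.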
We've really come down to a limited number of options.
1) Reduce display density by splitting the displays.
2) Limit the number of dense displays that can be called up.
3) Slow down the station update rates.
4) Slow data collection.
5) Optimize balance between ParcView and HDA data collection.
I'm reluctant to implement numbers 1-4, mainly because the older system was more robust and because operators are already complaining that the new system is too slow. Has anybody else ever dealt with this in any form? If so, were you able to find a solution?
We have another identical Kamyr Digester we intend to convert to Experion PKS in April 2016, but this time we are doing it in house. The simplest solution to avoid this problem in the future would be to buy two C300 controllers instead of one, but I wanted to be sure that this is what we HAVE to do rather than just taking the easy way out. I am also aware that peer-to-peer connections play a part in the TRR, so with two C300s the TRR will be driven higher by the peer-to-peer traffic alone.
If you have any suggestions or advice, I am more than willing to listen.
Just out of curiosity, what motivated the switch to Experion instead of upgrading your CP30/40/60 to Foxboro's FCP270/280? If you had to replace Solaris workstations and the nodebus, you would have ended up with Windows and the "MESH" - which is fairly similar in terms of network architecture.
> Just out of curiosity, what motivated the switch to Experion instead of
> upgrading your CP30/40/60 to Foxboro's FCP270/280? If you had to replace Solaris
> workstations and the nodebus, you would have ended up with Windows and the
> "MESH" - which is fairly similar in terms of network architecture.
Well, that's a good question that I'm still attempting to find an answer for. Old management was in the process of doing so (our lime recovery system is all MESH), but it seems like all capital was cut when getting ready to convert the other areas (this was before my time).
Our paper machines and power boiler were both on Experion, so I believe that trying to have a single mill-wide DCS instead of three different ones was the rationale. That and it was a lot cheaper than the other offers. So here we are today, with two out of seven completed.
Thanks for the info - we automation people are often stuck with a decision based on cost. At least the plant is pushing for a single DCS :)
I asked a more experienced Honeywell friend about your issue, and he didn't have a solution. Sorry.
I don't know exactly how your system is configured, but is there a possibility that you could shuffle some of the work to other stations (or even a new server) instead of having the one server maintain controller communications, the database, and other tasks?
> I don't know exactly how your system is configured, but is there a possibility
> that you could shuffle some of the work to other stations (or even a new server)
> instead of having the one server maintain controller communications, the
> database, and other tasks?
That's something we are currently looking into, but the problem still seems to be present no matter how we shuffle the current load. It may come down to purchasing another controller and reassigning I/O in order to give us the capacity that we originally needed.
It is unfortunate that the new system is getting loaded up and causing graphics to be delayed. I think the design of the Honeywell Experion software is the issue. There is nothing that Honeywell TAC support will do about it. There are so many issues and pitfalls in setting up the EPKS server that it is a nightmare. Even Honeywell TAC support gets baffled by the issues that come up.
I think you are double dipping on your historian data. The Experion servers already get alarms, events and history. So instead of loading down the C300 with Experion server requests and your historian requests, have your historian get the data from the Experion server.
This is what our OSI PI system does.