I work in the Instrumentation Department of a paper mill with three paper machines, one of which is run and monitored by a DCS built on Honeywell Experion 410 PKS using C200 control processors. The pulp mill, paper machine, recovery boiler and power plant are all managed by this system. An issue has been plaguing us for quite some time, resulting in occasional breakdowns and huge production losses.
The main cause of concern is the frequent switchover of the primary controller to the secondary, redundant controller and back again, anywhere from 3 to 12 times a day, which is far more often than is normally acceptable. Our initial analysis and fault finding pointed to improper grounding practices, which have since been completely resolved with the installation of an isolation transformer, among other fixes.
Can any of you give your thoughts in general, based on your own experiences or otherwise, that could help us point out the root cause of this issue and which direction we could head in? Any help at all would be greatly appreciated. Also, feel free to ask for more information on specifics which I will provide to the best of my ability.
Thanks in advance and a happy new year to you all!
>Have you also contacted Honeywell TAC?
Thank you for the prompt reply. Yes, we have contacted Honeywell, and their suggestion is to upgrade to the C300 controllers. However, we only started facing this issue after the previous software upgrade, which cost us a hefty amount, so upgrading to new controllers does not seem feasible for us at the moment.
What do you think?
Once your DCS has switched to the backup rack it should not switch back automatically. A manual error clear and restart of the failed rack should be required and then a manual switchover.
What is your Experion I/O setup? Type of I/O? I/O interface? Remote I/O? Fibre optic type and converters?
We had a big problem with IOLIM failures with remote HPM I/O using MM/SM fibre converters. The distances were too short, 1-2 km, and the converters were too hot (too much optical power for so short a run), resulting in optical reflections and I/O errors.
Optical attenuators were only partially successful.
Solution was to install MM fibre which is what HPM I/O natively supports.
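To put rough numbers on the "too hot" converter problem, here is a minimal Python sketch of a link power budget. It estimates the power arriving at the receiver and how much extra attenuation would keep it between overload and sensitivity. The function names and every figure in the example are hypothetical and not taken from any converter datasheet, so substitute the values for your own hardware. It also says nothing about reflections, which may be why attenuators alone were only a partial fix.

```python
def received_power_dbm(tx_dbm, fibre_km, loss_db_per_km,
                       connector_loss_db=0.0, attenuator_db=0.0):
    """Optical power reaching the receiver, in dBm."""
    return (tx_dbm - fibre_km * loss_db_per_km
            - connector_loss_db - attenuator_db)


def attenuation_window_db(tx_dbm, fibre_km, loss_db_per_km, connector_loss_db,
                          rx_saturation_dbm, rx_sensitivity_dbm, margin_db=3.0):
    """Range (min_db, max_db) of added attenuation that keeps the receiver
    at least margin_db below saturation and margin_db above sensitivity."""
    unattenuated = received_power_dbm(tx_dbm, fibre_km, loss_db_per_km,
                                      connector_loss_db)
    min_db = max(0.0, unattenuated - (rx_saturation_dbm - margin_db))
    max_db = unattenuated - (rx_sensitivity_dbm + margin_db)
    return min_db, max_db


# Hypothetical example values, NOT taken from any datasheet.
low, high = attenuation_window_db(
    tx_dbm=-3.0, fibre_km=1.5, loss_db_per_km=0.4, connector_loss_db=1.0,
    rx_saturation_dbm=-8.0, rx_sensitivity_dbm=-30.0)
print(f"add between {low:.1f} and {high:.1f} dB of attenuation")
```

If the minimum comes out above zero, the receiver is being overdriven on that run without an attenuator.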
Thanks for the prompt reply. Firstly, I was not aware that switching back from the backup rack is supposed to be manual. However, this is the setup at the moment, and I will bring it up with the Honeywell people at our next meeting.
FYI, we have 22 DI, 15 DO, 17 AI and 11 AO cards in total, and we plan on increasing those by 1, 1, 4 and 2 cards respectively. Those for the power plant, paper machine and recovery plant are housed in and alongside the server room located at the power plant. Those for the pulp mill are located remotely in the pulp mill server room itself and communicate via SM FOCs. The failure and switching for us occurs at the IOLIM, though one point of difference is that we do not experience any I/O errors at the DI/DOs of the pulp mill. Everything works fine there.
I'd also like to ask whether this could have anything to do with temperature variations, since some of the less informed operators at the power plant leave the server room door open at times, or turn off the ACs for no apparent reason!
Anyway, do let me know your thoughts over the weekend.
Thanks and regards
What type are your I/O cards: Type A, HPM, or a mix?
We found that temperature changes during the day would cause IOLIM faults as would any movement of the SM fibre. So anything that affected the optical signal in the cable was a problem.
What make and model of MM/SM FOC are you using?
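One way to test the temperature theory against data is to look at how much the room temperature moved in the hour before each switchover. The Python sketch below assumes you can export a room temperature log and the switchover timestamps. The function names, the one-hour window and the sample values are all hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def temp_at(log, when):
    """Latest logged temperature at or before `when`; None if there is none.
    `log` is a time-sorted list of (datetime, degrees_C) tuples."""
    times = [t for t, _ in log]
    i = bisect_right(times, when)
    return log[i - 1][1] if i else None


def swing_before(log, event, window=timedelta(hours=1)):
    """Absolute temperature change over `window` leading up to `event`."""
    now, earlier = temp_at(log, event), temp_at(log, event - window)
    if now is None or earlier is None:
        return None
    return abs(now - earlier)


def mean_swing(log, events, window=timedelta(hours=1)):
    """Average swing over the events that have enough logged history."""
    swings = [s for s in (swing_before(log, e, window) for e in events)
              if s is not None]
    return sum(swings) / len(swings) if swings else None


# Hypothetical data for illustration only.
room_log = [(datetime(2024, 1, 5, h), c)
            for h, c in [(0, 20.0), (1, 20.5), (2, 24.0), (3, 24.2)]]
switchovers = [datetime(2024, 1, 5, 2, 10), datetime(2024, 1, 5, 3, 30)]
print(mean_swing(room_log, switchovers))
```

Running `mean_swing` over a set of arbitrary timestamps from the same period gives a baseline. A clearly larger swing before the switchovers would support the temperature theory, though it would not prove it.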
I was looking for the details you asked for yesterday and I'm afraid I couldn't come up with much. So far as I know, both SM and MM FOCs are being used. MM for remote IOs and SM for the rest. However, I don't know the make and model of either. Also, both LL AI/AOs and HL AI/AOs are being used. I don't know much about the Type A or HPM IOs. Could you expand on that a bit?
As for the temperature changes, they are there, but the weather here is quite cold, especially these days. I highly doubt, though, that any physical disturbance is causing movement of the FOCs...
You had also mentioned previously that the switchover from the secondary controller (after failure of the primary) back to the primary is supposed to be manual. I investigated this a bit and can confirm that Honeywell set it up this way because our production is continuous and goes on 24/7/365 (or at least is supposed to). The system generates no reports except a timestamp for each switchover, which we can view going back to any date, and I am looking for a pattern in those somewhere (like peak failure times, frequency of clustered occurrences etc.).
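For the pattern hunting, a minimal Python sketch like the one below could be a starting point, assuming the switchover timestamps can be exported as text. It counts switchovers per hour of day and per calendar day, and groups them into bursts. The timestamp format, the 30-minute gap and the sample values are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def per_hour_of_day(events):
    """How many switchovers fell in each hour of the day (0-23)."""
    return Counter(t.hour for t in events)


def per_day(events):
    """How many switchovers occurred on each calendar date."""
    return Counter(t.date() for t in events)


def clusters(events, max_gap=timedelta(minutes=30)):
    """Group events into bursts in which consecutive events are
    no more than max_gap apart."""
    groups = []
    for t in sorted(events):
        if groups and t - groups[-1][-1] <= max_gap:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups


# Hypothetical timestamps for illustration only.
raw = ["2024-01-05 02:10", "2024-01-05 02:25",
       "2024-01-05 14:00", "2024-01-06 02:05"]
events = [datetime.strptime(s, "%Y-%m-%d %H:%M") for s in raw]

print(per_hour_of_day(events).most_common(3))
print(per_day(events))
print([len(g) for g in clusters(events)])
```

A peak at particular hours would point towards something scheduled or environmental, such as shift changes, AC cycling or load changes, while tight bursts would suggest a marginal link flapping.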
You mentioned the habit of leaving the control room door open. If this was also occurring in the summer time, the higher humidity can tarnish electrical connections, but they are damp enough to conduct. When winter comes, the lower humidity dries the tarnished connectors and causes a lack of connectivity or intermittent behavior. I see this most frequently in low current DC connections using dissimilar metals. Usually cleaning the connectors and/or re-seating modules corrects the problem for a time.
Thank you for your reply. That's a good point you made! I will look into this, and hopefully this will improve things by quite a bit... Do keep giving any further opinions that you may have. All help is greatly appreciated!
Thanks and regards.
The Experion Type A I/O cards are the ones that fit into the C200 chassis. The HPM I/O is the larger TDC HPM I/O that TDC 3000 systems use. See the following Experion IO reference:
C200 will have a ControlNet or Ethernet server to controller link.
C200 Type A I/O will have a ControlNet link to the controller chassis.
HPM I/O will use a C200 IOLIM module to connect to the C200 controller chassis.
For long distance server to C200 controller connections, ControlNet to FO converters will be used. C200 ControlNet I/O networks use the ControlNet coax cable or, for much longer distances, the same ControlNet to FO converters.
For long distance remote HPM IO the HPM IOLink supports a MM FOC. The problem arises when additional FO converters are used. Say you already have a SM fibre installation and you want to use it for your remote IO. Then you will also need MM/SM converters at each end.
You need to determine what your network and I/O communication topology is. Are you using the C200 Type A I/O for digital and the HPM I/O for analogue, because the HPM I/O can be redundant?
You would set your C200 controller up for manual fail back so that the system failure that caused the switchover to the backup can be checked, reset and verified good before switching back to the primary chassis manually. Just because you run 24/7 does not mean that you want your control to flop back and forth between PRI and SEC. You want to be sure everything is OK before you re-enable that redundancy.
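The manual fail back idea can be shown with a toy state machine. This Python sketch is only an illustration of the latching logic. It is not how Honeywell implements or configures C200 redundancy, and the class and method names are made up.

```python
class RedundantPair:
    """Toy model of a redundant controller pair with latched fail-over."""

    def __init__(self):
        self.active = "PRI"
        self.fault_latched = False  # set by a fail-over, cleared only by a person

    def _swap(self):
        self.active = "SEC" if self.active == "PRI" else "PRI"

    def fault_on_active(self):
        """Automatic fail-over. Refused while an earlier fault is still latched,
        because the partner chassis has not been verified good."""
        if self.fault_latched:
            return False
        self._swap()
        self.fault_latched = True
        return True

    def clear_fault(self):
        """Operator action: failed chassis checked, reset and verified."""
        self.fault_latched = False

    def manual_switchover(self):
        """Operator action: only permitted once the fault has been cleared."""
        if self.fault_latched:
            raise RuntimeError("clear and verify the failed chassis first")
        self._swap()


pair = RedundantPair()
pair.fault_on_active()      # PRI fails, control moves to SEC and latches
pair.fault_on_active()      # refused: no flopping back to an unverified PRI
pair.clear_fault()          # technician has checked and reset PRI
pair.manual_switchover()    # deliberate return to PRI
print(pair.active)
```

The point of the latch is that each switchover leaves a fault that someone has to look at, rather than being hidden by the next automatic swap.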
I've been going through your post, and the link you provided made it clear exactly what you meant. Thanks for that! The HPM I/O you are referring to is exactly what we have, with the IOLIM module housed alongside the C200 control processor. By the way, we are using Fault Tolerant Ethernet (FTE), not ControlNet. There are of course FOC converters used, mostly for long distance, but in some cases also for short distance transmission. To make everything much clearer, I have uploaded the architecture diagram of the system for your reference.
I think it will answer most of your questions. It also highlights where FOCs are used, so you will have to work out from that where the FOC converters sit. FYI, the sections at the top left show the servers of the pulp mill and paper machine respectively.
Hope this helps!
Thanks and regards.
We had a similar problem with fiber optic comms using an FSC/SM, due to the I/O comms module being built in Europe and the fiber optic lines being purchased in the US. The diameter of the fibers is different. If the fiber plugged into the module is not EXACTLY centered, read errors occur, leading to comms failure and switchover to backup. Honeywell was supposed to have caught this detail.
As I mentioned above, I could not obtain any concrete information on the make and model of the FOCs being used in our systems and plant in general, except for the type. However, your comment confirms that this could be a critical point in finding the cause of the issue. Thanks for your feedback and do keep giving any more advice!
Thanks and regards.