So earlier today, one of the three VPRO boards (VPRO <Y>) at my site was found powered off. We discovered this after the following alarms:
* TCP Minor Trouble
* VPRO Communication Fault
* RST Slot-1 VCMI Diagnostic Alarm
Since I was not physically at the site today, I did not get to dig around further. I will be doing that tomorrow and will share the results accordingly. However, since the VPRO <Y> was found powered off, I don't think it would be wrong to assume that the board is done for and that a replacement is in order.
My question is therefore how I can go about replacing it safely while my turbine is running.
This is the procedure I found in GEH-6421 Vol. 1.
1. If a VCMI or VPRO has failed, the rack should be powered down, and the IONet connector unplugged from the board front, leaving the network still running through the T-fitting.
2. Loosen the top and bottom screws on the VCMI board.
3. Use the upper and lower ejector tabs to disengage the controller from the backplane.
4. Remove the VCMI and replace it with a spare VCMI.
5. Use the upper and lower ejector tabs to install the new VCMI board.
6. Tighten the top and bottom screws to secure the new VCMI or VPRO to the VME rack.
7. Power up the VME rack.
8. From the toolbox Outline View, under item Mark VI I/O, locate the failed rack. Locate the VCMI, which is usually under the simplex rack, and right-click the VCMI.
9. From the shortcut menu, click Download. The topology downloads into the new board.
10. Cycle power to the rack to establish communication with the controller.
The procedure above says to power down the entire rack. I figured I would just remove the power to VPRO <Y> from the PDM, but there is also the question of downloading after replacement. Wouldn't downloading cause the entire VPRO rack, i.e. <X>, <Y> and <Z>, to reboot? Also, please clarify step 10 if possible: why do I need to cycle the power after I have already powered up the rack and downloaded?
Please let me know whether it is wise to do this online, or whether I should get Operations to take the turbine to FSNL. The latter depends on site load.
Looking forward to your responses. Thanks.
Here's an update, guys. I did some further digging around and found that the fuses on the PDM across J7Y (VPRO <Y> power supply) were blown. After confirming that this was caused by a short circuit on the VPRO board's PCB, we decided to replace the board. To replace it, I followed the procedure outlined in my earlier post:
* Turned off power from PDM - toggle switch SW7 for VPRO <Y>.
* Removed the offending VPRO board.
* Installed the new one.
* Powered up the board.
* Carried out download from "Slot 1: VPRO." Selected both Firmware and Parameter.
* Make sure you only select the core you want to download. I selected <S> here which corresponds to <Y>.
* Did all of this while the turbine was operational.
* Rebooted the board in question by cycling power to it.
* It came back online after some time.
So, yes, you can carry out selective download and reboot.
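For anyone who wants to turn the sequence above into a checklist, here is a minimal Python sketch that only allows the steps to be ticked off in order. The step wording is paraphrased from my list, and the names `REPLACEMENT_STEPS` and `complete_step` are made up for illustration; this is not part of toolbox or any GE tool.

```python
# Hypothetical checklist helper; the step text is paraphrased from the post above.
REPLACEMENT_STEPS = (
    "Switch off VPRO <Y> power at the PDM (SW7)",
    "Remove the failed VPRO board",
    "Install the replacement board",
    "Power up the board",
    "Download Firmware and Parameter to the selected core only",
    "Cycle power to the replaced board",
    "Confirm the board comes back online",
)


def complete_step(completed, step):
    """Return a new list with `step` appended, but only if it is the next step in sequence."""
    if len(completed) >= len(REPLACEMENT_STEPS):
        raise ValueError("All steps are already complete")
    expected = REPLACEMENT_STEPS[len(completed)]
    if step != expected:
        raise ValueError(f"Out of order: expected {expected!r}, got {step!r}")
    return completed + [step]


if __name__ == "__main__":
    done = []
    for step in REPLACEMENT_STEPS:
        done = complete_step(done, step)
        print(f"[x] {step}")
```

Trying to record the download before the board has been powered up raises a `ValueError`, which is the whole point of the helper.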
I presume by "powered off" you mean that the board was "dead"--no LEDs were lit or flashing, yet the switch in the <PD> core was in the ON position, AND that the fuse(s) for <Y> were good. If so, then it's still a good idea to try cycling the power to <Y> to see if it re-starts and runs. If not, then, yes, it probably needs to be replaced.
It should be possible to replace the VPRO without shutting down the turbine as long as you unplug the IONet cable from the VPRO properly, without breaking the "loop." Since each VPRO is powered independently from the <PD>, and since each VPRO is designed to be entirely independent of the other VPROs and even of the control processor it is associated with, it should be possible to replace it while the unit is running. (I've never personally done it to a Mark VI, but I have done it several times to a Mark V, replacing the VPRO equivalent, the TCEA, with no problems.) BUT, I was extremely careful to check ALL the Diagnostic Alarms to be sure there was nothing which would cause a problem when doing so, and on a couple of occasions I chose not to attempt it because of certain Diagnostic Alarms.
I believe in a TMR Mark VI you can download individually to each VPRO, so that shouldn't be an issue, either. And, most likely, to get the download to become effective it will be necessary to cycle the power to the VPRO that was replaced and downloaded to. (Cycling power, or "re-booting," means opening the power supply switch to the card, waiting 10-30 seconds, then closing the switch again so that the downloaded information is loaded into RAM and becomes active.)
The VCMI is similar, and replacing it in a TMR control panel should also be possible, but it will require powering down the control processor to remove the failed card and install the new card. You will then have to power up the control processor, download to it when possible, and then cycle power to the control processor ("re-booting" it as described above) so that the information is loaded into RAM and becomes effective.
The thing to be careful of when doing this is to make sure that Diagnostic Alarms from other processors won't "combine" with a card/processor that is powered off to result in a turbine trip. I can't describe every possible condition or scenario, but there ARE certain Diagnostic Alarms which by themselves won't result in a turbine trip while running, but when "combined" with others WILL result in a trip, especially when powering down processors to replace cards. Just be aware that while the manual (and popular folklore) says that it's possible to shut down any single processor to replace a card, that statement comes with some qualifications. If there are numerous Diagnostic Alarms from other processors then it's not always possible to shut down one processor/card without tripping the turbine. The presumption is that the other processors/cards are reasonably "healthy" and working properly when powering down a processor to replace a card.
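To illustrate what I mean by alarms "combining," here is a minimal Python sketch of simplified two-out-of-three voting. It is NOT the actual Mark VI protection logic. The function names are made up, and the rule that an unhealthy or powered-off processor counts as a trip vote is an assumption (a fail-safe simplification) used only to show the idea.

```python
CORES = ("X", "Y", "Z")


def voted_trip(votes):
    """Return True if at least two of the three cores vote to trip."""
    return sum(bool(votes[c]) for c in CORES) >= 2


def effective_votes(healthy, powered_off=None):
    """Map each core to its trip vote.

    Assumption for illustration: an unhealthy or powered-off core
    fails safe, i.e. it counts as a vote to trip.
    """
    return {c: (not healthy[c]) or (c == powered_off) for c in CORES}


def safe_to_power_down(healthy, core):
    """Return True if powering down `core` would not produce a voted trip."""
    return not voted_trip(effective_votes(healthy, powered_off=core))


if __name__ == "__main__":
    all_healthy = {"X": True, "Y": True, "Z": True}
    x_faulty = {"X": False, "Y": True, "Z": True}
    print(safe_to_power_down(all_healthy, "Y"))  # True
    print(safe_to_power_down(x_faulty, "Y"))     # False
```

With all three healthy, taking <Y> down leaves only one trip vote. With <X> already compromised, the same action produces two votes and a trip in this simplified model, which is why checking the Diagnostic Alarms on the other processors first matters.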
Hope this helps! Looking forward to hearing how your efforts go!