Mark-V TMR Redundancy Failure - Trips when 1 Computer Reboots

GTENG

Hi!

I would really appreciate any opinions or possible solutions for the case below on a Mark-V TMR gas turbine controller.

Phenomenon: The unit tripped on gas fuel (loss of flame) when the "R" computer rebooted automatically. We have already found the cause of the automatic "R" reboot, but we are still investigating why the redundancy function failed; we suspect an abnormality in the "T" computer.

To confirm the TMR function, we reset/rebooted each computer in turn during GT operation. When we reset the "R" computer or the "S" computer, the GT tripped, but when we reset the "T" computer, the GT did not trip. From this investigation we suspect an abnormality in the "T" computer.

Continued troubleshooting on the "T" computer:

1. "T" Computer display status "A7" (normal operating condition)
2. Diagnostic Alarms <pre>
Core "C" ==> Voter Mismatch <S> L4ETR1
Core R/S/T ==> TCE1 TMR check trouble, ETR1
Core "S" ==> TCE1 loopback, Relay ETR1
These diagnostic alarms point to an "S/Y" ETR failure
(which does not match the reset test result)</pre>
3. Investigation of Trip Log<pre>
Time         L28FDC  L28FDD  DWATT  L4ETR1
11:03:21 AM  1       1       19.5   1
11:03:21 AM  1       1       19.5   1
11:03:21 AM  1       1       19.5   1
11:03:21 AM  1       1       19.5   1
11:03:21 AM  1       1       19.4   1
11:03:21 AM  1       1       19.4   0
11:03:21 AM  1       1       19.4   0
11:03:21 AM  1       1       19.3   0
11:03:22 AM  1       1       18.8   0
11:03:22 AM  1       1       17.7   0
11:03:22 AM  1       1       15.7   0
11:03:22 AM  1       1       13.1   0
11:03:22 AM  1       1       10.2   0
11:03:22 AM  1       1       7      0
11:03:22 AM  1       1       4.1    0
11:03:23 AM  1       1       1.4    0
11:03:23 AM  1       1       -0.8   0
11:03:23 AM  1       1       -2.8   0
11:03:23 AM  1       0       -5.9   0</pre>

The trip log shows that the L4ETR1 feedback changed from 1 to 0 about 2 seconds before the GT tripped on loss of flame.

4. Verified TCEA-to-TCTG cables JL-JLX, JL-JLY, JL-JLZ ==> No abnormality

5. Function test (contact output check) of TCTG relays K4, K5, K6 ==> Relays and contacts are all good

6. Replaced TCEA "Z" ==> Not solved

7. Replaced TCEA "Y" ==> Not solved

8. Replaced TCTG ==> Not solved

9. Replaced TCQA ==> Not solved
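
The "about 2 seconds" reading of the trip log in step 3 can be cross-checked with a short script. This is only a minimal sketch: the rows are a subset copied from the log above, the column order (time, L28FDC, L28FDD, DWATT, L4ETR1) is taken from the log header, and the helper names `parse` and `first_drop` are made up for illustration. Because the time-stamps have one-second resolution, the result is only good to about plus or minus one second.

```python
from datetime import datetime

# Subset of rows copied from the trip log in step 3.
# Columns: time, L28FDC, L28FDD, DWATT, L4ETR1
TRIP_LOG = """\
11:03:21 AM 1 1 19.4 1
11:03:21 AM 1 1 19.4 0
11:03:22 AM 1 1 18.8 0
11:03:23 AM 1 1 -2.8 0
11:03:23 AM 1 0 -5.9 0
"""


def parse(log):
    """Turn the whitespace-separated log text into a list of row dicts."""
    rows = []
    for line in log.strip().splitlines():
        clock, ampm, fdc, fdd, dwatt, etr = line.split()
        rows.append({
            "time": datetime.strptime(f"{clock} {ampm}", "%I:%M:%S %p"),
            "L28FDC": int(fdc),
            "L28FDD": int(fdd),
            "DWATT": float(dwatt),
            "L4ETR1": int(etr),
        })
    return rows


def first_drop(rows, signal):
    """Return the time-stamp of the first row where `signal` is 0, or None."""
    for row in rows:
        if row[signal] == 0:
            return row["time"]
    return None


rows = parse(TRIP_LOG)
gap = (first_drop(rows, "L28FDD") - first_drop(rows, "L4ETR1")).total_seconds()
print(f"L4ETR1 dropped {gap:.0f} s before flame detection (L28FDD) was lost")
```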

Thanks for any Help.
Regards
 
Hi!

GTENG,

The overwhelming majority of the time, when the turbine trips while testing the redundancy function or when re-booting processors, the problem is caused by incorrect polarity of the current being applied to one of the three coils of one of the servo-valves in operation at the time. There is a fail-safe spring in every servo-valve used on GE-design heavy duty gas turbines, and this spring is <b>always</b> trying to shut off the flow of fuel or air.

Let's presume that the servo-valve current from <T> processor to the GCV servo-valve was incorrect ("backwards"). This means that the current being applied from <T> to its GCV servo-valve coil was also trying to shut off the flow of fuel to the turbine--along with the fail-safe spring. During operation when all three control processors are supplying current to the servo-valve the currents from <R> and <S> would be combining to overcome the fail-safe spring AND to overcome the incorrect current polarity from <T>--which would be trying to shut off the flow of gas fuel. Further, to try to maintain correct fuel flow, <R> and <S> would also be putting out a little extra servo-valve current to try to keep the GCV at the required position.

Now, when <R> or <S> is re-booted (for whatever reason), then there is only one control processor trying to keep fuel flowing to the turbine--but there is one processor (<T>) trying to stop the flow of fuel to the turbine AND the fail-safe spring is also trying to stop the flow of fuel to the turbine. So, the GCV slams shut, and the alarm "LOSS OF FLAME TRIP" is annunciated--because there was no other trip condition detected which would cause flame to be lost, and flame was lost (because the GCV closed) so "LOSS OF FLAME TRIP" is the correct alarm in this case.
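
The reasoning above can be put into a toy model. This is a rough sketch under stated assumptions, not how the Mark V actually regulates a servo: every number is made up, "effort" is counted positive in the valve-opening direction regardless of the electrical sign of the current, and a processor wired to a reversed coil is assumed to end up pushing fully in the closing direction. The names `I_MAX`, `SPRING_BIAS`, `can_hold_open` and `reboot_test` are hypothetical.

```python
# Toy model: can the remaining coils keep the GCV open against the spring?
# All values are illustrative assumptions, not Mark V data.
I_MAX = 10.0        # hypothetical maximum effort one coil can contribute
SPRING_BIAS = 3.0   # hypothetical closing effort of the fail-safe spring


def can_hold_open(online, reversed_coils):
    """True if the on-line processors can produce a net opening effort."""
    effort = -SPRING_BIAS  # the spring always pushes toward closed
    for proc in online:
        # A reversed coil pushes toward closed instead of toward open.
        effort += -I_MAX if proc in reversed_coils else I_MAX
    return effort > 0


def reboot_test(reversed_coils, processors=("R", "S", "T")):
    """Re-boot each processor in turn; report whether the valve stays open."""
    return {
        proc: can_hold_open([p for p in processors if p != proc], reversed_coils)
        for proc in processors
    }


print(can_hold_open(("R", "S", "T"), {"T"}))  # all three on-line, <T> reversed
print(reboot_test({"T"}))                     # re-boot each one with <T> reversed
print(reboot_test(set()))                     # same test with no reversed coil
```

With the <T> coil reversed in this model, the valve holds with all three processors on-line, closes when <R> or <S> is re-booted, and stays open when <T> is re-booted, which is the same pattern as the reset test described in the first post. With no reversed coil, any single re-boot is tolerated.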

This is why it's <b>SO CRITICAL</b> to verify the polarity of the currents being applied to a servo-valve <b><i>every time a servo-valve is replaced</i></b>. Most sites <b>mistakenly</b> believe that an AutoCalibrate is required when a servo-valve is replaced--but <b>it's <i>NOT</i></b>. And, most sites fail to check servo-valve current polarities when replacing a servo-valve (and instead perform a needless AutoCalibrate), and this is when problems occur when one processor is re-booted, or shut down, when the turbine is operating.

Again--this is not the <b>only</b> cause of tripping when processors are re-booted, but it is the single most common cause.

I'm very skeptical of the problem with the PTR/ETR feedback from <S> causing the turbine to trip when <S> is re-booted because those alarms are saying that <S> has already determined the turbine should be tripped, and its PTR and ETRs are already in the tripped condition (which is to say they are de-energized). So, powering down <S>--which would cause the PTR and ETRs for that processor to de-energize--wouldn't have any effect on the PTR and ETRs for <S> because they are already de-energized. (If you haven't figured out already, the PTR and ETRs must be energized to energize the fuel stop valve solenoid to allow fuel to flow to the turbine. De-energizing the PTRs and/or ETRs of two (or more) processors when the turbine is running will de-energize the fuel stop valve solenoid, thereby tripping the turbine.)

So, there should definitely be other Diagnostic Alarms (usually voting mismatches on CSP logic signals) indicating why <S> thinks the turbine should be tripped. Or you could use the Pre-Vote Data Display, or Rung Display, or the Logic Forcing Display to view L4T and its permissives for <S> to find out why <S> thinks the turbine should be tripped. There may also be speed pick-up sensor problems for the speed pick-ups connected to <Y> (via the PTBA on <P>), and if this is a retrofit Mark V application that's jumpering the speed pick-up inputs to both <Q> and <P> I would expect some Process- and Diagnostic Alarms to indicate speed pick-up problems in both <S> and <P> (<Y>).

While SIFT (Software Implemented Fault Tolerance) is designed to prevent most instances of a single processor thinking the turbine should be tripped while it's running, there are some conditions which can result in a single processor thinking the turbine should be tripped--but that is <b>almost ALWAYS</b> accompanied by Diagnostic Alarms, which if investigated and resolved will prevent nuisance tripping.

In thinking about the problem with <S>'s PTR and ETRs being dropped out, it would seem to me that the turbine would trip when either <R> or <T> were powered down--because then two of the PTRs and two sets of ETRs would both be powered-down when <R> and <S> or <S> and <T> were trying to trip the turbine (which is what happens when a processor is powered-down/re-booted when the turbine is running). So, I'm gonna stick with my servo-valve current polarity issue causing the tripping.
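
To make that argument concrete, here is a small sketch of the two-out-of-three relay logic as described above (the fuel stop valve solenoid drops out when the relays of two or more processors are de-energized). It is a simplification for illustration only; the function names are hypothetical and the real PTR/ETR circuit has more to it than a simple count.

```python
def solenoid_energized(picked_up):
    """Fuel stop valve solenoid stays energized while at least two of the
    three processors still have their trip relays picked up."""
    return sum(picked_up.values()) >= 2


def reboot_outcomes(picked_up):
    """Re-boot each processor in turn (its relays drop out) and report
    whether the turbine would keep running or trip."""
    outcomes = {}
    for proc in picked_up:
        after = dict(picked_up)
        after[proc] = False  # a re-booting processor de-energizes its relays
        outcomes[proc] = "run" if solenoid_energized(after) else "trip"
    return outcomes


# Case discussed above: the relays for <S> are already dropped out.
print(reboot_outcomes({"R": True, "S": False, "T": True}))
# Healthy case: all relays picked up.
print(reboot_outcomes({"R": True, "S": True, "T": True}))
```

If the relays for <S> really were dropped out, this logic says re-booting <R> or <T> would trip the turbine and re-booting <S> would not. That is not what the reset test showed (trips on <R> and <S>, no trip on <T>), which is why a relay problem on <S> alone does not explain the behaviour.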

It's entirely possible that there is more than one problem--this happens quite frequently when Diagnostic Alarms are not attended to promptly.

These are the most common causes of problems with turbine tripping when processors are re-booted. And, when it always occurs when the same two processors are re-booted it's almost always a servo-valve current polarity problem. Servo-valves are bipolar devices, meaning the polarity of the current being applied to them causes distinctly different actions. For GE-design heavy duty gas turbines, negative servo current INCREASES the flow of fuel or air to the turbine, while positive servo-current DECREASES the flow of fuel or air to the turbine.

Under <b>normal and, basically, ideal</b> circumstances, the amount of current from each processor will be roughly identical. But, depending on the device and its function and the type of feedback for the device and the accuracy of the calibration of the feedback from the devices (I'm referring primarily to P2 pressure feedback(s) for the SRV servo regulator) there can be great differences in servo-valve output currents to a particular servo-valve.

In the case where one processor's servo-valve current is very low or opposite of what it should be for some reason and either of the other two processors were shut down or re-booted then that would cause the servo-valve to shut off the flow of fuel or air to the turbine, thereby tripping it. (There are far too many possibilities to list each one here--<b>BUT</b> if we knew all of the Diagnostic Alarms we might be able to narrow the possibilities down.)
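
As a first screening step before doing a proper polarity check, the per-processor servo current readings for one servo-valve can be compared against each other, since under ideal conditions they should be roughly identical. The sketch below is only an illustration: the readings, the tolerance and the function name `flag_suspect_processors` are all made up, and a wiring reversal at the coil may not show up in displayed values at all, so this complements rather than replaces checking polarity at the servo-valve.

```python
from statistics import median


def flag_suspect_processors(currents, tolerance):
    """Return the processors whose servo current has the opposite sign to,
    or differs by more than `tolerance` from, the median of the three."""
    mid = median(currents.values())
    suspects = []
    for proc, value in currents.items():
        opposite_sign = value * mid < 0
        far_off = abs(value - mid) > tolerance
        if opposite_sign or far_off:
            suspects.append(proc)
    return suspects


# Hypothetical readings for one servo-valve (units and values are assumptions).
readings = {"R": -4.1, "S": -3.8, "T": 6.5}
print(flag_suspect_processors(readings, tolerance=2.0))
```

Any processor this flags is only a candidate for closer investigation together with its Diagnostic Alarms; large differences can also come from feedback calibration, as noted above.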

So, these are some of the most likely causes--and some things you can do to try to investigate and resolve them. If you want to know how to check the polarity of the current being applied to three-coil servo-valves of GE-design heavy duty gas turbines--that procedure has been written several times on control.com, and the 'Search' feature cleverly hidden at the far right of the Menu Bar of every single control.com webpage can be used to help search for threads with the procedure. (I suggest you read the Search 'Help' before attempting searches; and I suggest you don't give up if the first search attempt doesn't yield the proper results. Try different terms, use double quotes around multiple-word search terms, etc. The information is definitely there, but sometimes it doesn't always show up in the first or second search--so keep trying.)

Please write back to let us know what you find. Many people read and follow these threads, and many people also read them weeks, and months and years later using the 'Search' feature. Feedback is what makes the threads here at control.com very helpful, so, please, provide feedback to let us know how your resolution is progressing. Even if the information provided was not the exact cause, if it led you in the direction to find and resolve the cause we want to know! Thanks!
 
GTENG,

Further to the above, the issue with L4ETR1 does concern me. But, something one needs to remember is that when L4 drops out, L4_XTP also goes to a logic "1"--which drops out the ETRs.

You provided some information from the Trip Log, and the time-stamps seem a little odd (in that they don't have the millisecond information). But you did not provide the alarms which occurred just prior to the trip, during the trip, and for the few seconds after the trip (which are usually included in the Trip Log information). And, they usually include time stamps to the millisecond.

I also don't quite understand the "shorthand" you used for describing the Diagnostic Alarms--or when the Diagnostic Alarm information was taken. Prior to the trip? Immediately prior to the trip? Milliseconds prior to the trip? After the trip?

The only thing(s) which cause the ETRs to drop out on their own is an overspeed--or a high rate of change of speed feedback (such as from an intermittent speed feedback signal) to the <P> core processors. But, again, that should be accompanied with other Process- and Diagnostic Alarms (because, usually, the CSP is comparing TNH_OS for each processor to TNH1 or TNH and generating Process Alarms when there are differences, in addition to Diagnostic Alarms).

Please, do write back with more information and how your troubleshooting progresses.
 
Hi ..

Thanks, CSA, for your kind help and good recommendations. As you suspected, we found a reversed-polarity connection on GCV servo-valve coil #3 (S computer), which was causing the redundancy problem.

Hope this update could be a reference for others.
 