MK5 Bad EEPROM Checksum

Charles

I have been troubleshooting a recurring problem on MK5 <T> core.

The HMI shows diagnostic “Bad EEPROM Checksum” alarms. MK5 <T> core also shows “Bad EEPROM Checksum.” A review of <T> core IO board status shows <01> TCQA A-4. All other boards are ****FF (loss of communications).

When <T> is rebooted from PD, the core boots to only A4. I performed EEPROM DOWN T4 T USER, then rebooted <T> core from PD. This action brings <T> core back to A7 status. All alarms clear.

However, the problem keeps returning after a few days, and I have to do another EEPROM download to clear it.

Is it possible that the DCC PROM is getting weak and not able to maintain data configuration?

Should I replace <T> core DCC PROM or troubleshoot other areas?
 
Unfortunately, we are facing a similar problem with the <R> core.
We don't actually know what is causing the problem, but we hope someone may have a solution. Is there a specific <R> card that should be replaced?

Regards,
 
Charles and indurai,

Yes; the EEPROM chip on the DCC/SDCC card (usually identified as being in the U9 chip socket, I believe) does occasionally go bad. They are a fairly generic EEPROM, though I don't have the manufacturer's name or part number. The EEPROM chip does not have a white "tigerhide" label on it; it's just a plain, large, socketed chip.

If replacing the EEPROM chip does not resolve the problem, then I would suggest replacing the entire DCC/SDCC card. The Mark V runs off information copied from the EEPROM chip to RAM during initialization (boot-up) and only writes to EEPROM when updating the timer and counter values. About the only other time I can think of when a processor accesses EEPROM after initialization (boot-up) may be when using the Control Constant Adjust Display, some versions of which actually display the EEPROM values of Control Constants. So, it's kind of interesting that, apparently, sometime during normal running operation the main microprocessor seems to "lose its memory"--and I'm not sure if it's losing its RAM or the EEPROM. I'm also not sure whether it's downloading to the EEPROM followed by re-booting the processor, or just re-booting the processor, that actually "populates" the RAM so the main microprocessor can operate properly.

Unfortunately, the OS (Operating System) used in the Mark V control panel by the processors (NOT IDOS) is not documented very well. All we know for sure is that the Mark V processors run off information in RAM, and that during initialization that information is copied from EEPROM to RAM. After initialization, the only times (that I'm aware of) that the processor communicates with EEPROM are when writing timer and counter values to EEPROM (which it does approximately once per hour) or when some versions of the Control Constant Adjust Display access EEPROM to check the values of Control Constants.

So, I'm kind of inclined to think the problem may be more with the RAM (volatile memory) than with EEPROM (non-volatile memory). But, that's just a guess.
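To make the boot-time check discussed above more concrete, here is a minimal sketch of what a checksum test during initialization generally looks like. Everything in it is an assumption for illustration: the actual Mark V checksum algorithm and EEPROM layout are not documented in this thread, so a simple 16-bit additive checksum and the hypothetical helpers `checksum16` and `boot_copy` stand in for whatever the DCC/SDCC firmware really does.

```python
# Illustrative only: NOT the Mark V algorithm, which is not documented here.
# A 16-bit additive checksum is assumed purely to show the idea.

def checksum16(data: bytes) -> int:
    """Sum all bytes modulo 2**16 (hypothetical algorithm)."""
    return sum(data) & 0xFFFF


def boot_copy(eeprom_image: bytes, stored_checksum: int):
    """Mimic initialization: verify the EEPROM image, then copy it to RAM.

    Returns (ram_copy, ok). ok is False when the computed checksum does
    not match the stored one, the kind of mismatch a "Bad EEPROM
    Checksum" diagnostic would report.
    """
    ok = checksum16(eeprom_image) == stored_checksum
    ram_copy = bytearray(eeprom_image) if ok else bytearray()
    return ram_copy, ok


good = bytes([0x10, 0x20, 0x30, 0x40])
stored = checksum16(good)  # value recorded when the image was downloaded

# A single changed bit in the stored image is enough to fail the check.
corrupted = bytes([0x10, 0x20, 0x31, 0x40])

print(boot_copy(good, stored)[1])       # True
print(boot_copy(corrupted, stored)[1])  # False
```

The point of the sketch is only that a check like this fails in the same way whether the stored data degraded or was misread, which is why a fresh download can clear the alarm without showing which part is actually at fault.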

I would first try replacing the EEPROM chip, formatting the new EEPROM chip, and re-booting the processor (it will only go to A4/A5), then downloading ALL (which, yes--includes another FORMAT command) and, when finished, re-booting the processor again, at which time it should go to A7. If the processor goes to A7 during downloading, it's a "false" A7; after downloading is complete the processor should still be re-booted to get a "true" A7.

And, no; I don't know which chip(s) on the DCC/SDCC are the RAM chips.

I do have two questions for the both of you: Do your sites experience lightning strikes/electrical storms?

And, do the units experience a lot of 125 VDC battery ground alarms?

Please write back to let us know how you resolve the problem(s).
 
CSA,

Wow, it's been a while since I made that MK5 Bad EEPROM Checksum posting. I don't remember all the details. I must have solved the problem quickly, because I did not make a personal maintenance log entry on it.

But I recall that replacing the DCC/SDCC solved the problem. It is interesting that you asked about "lightning strikes". Our plant is like the home of lightning strikes. They come here to roost.

We do get 125 VDC grounds occasionally, though I have not had any this year... I hate TS grounds. I have followed your older postings on TS grounds, which helped me greatly in my efforts.

Wish I could give you more, but for some reason I did not record my actions... things probably got busy. You know how it goes.

Charles
 
Charles,

Thanks for the feedback!

I don't know how I missed your original post; sorry about that.

Hopefully indurai will find your feedback helpful.

And, it's interesting to learn that your site experiences a lot of lightning strikes.... Hmmm....

Anything else you can tell us about the after-effects of the lightning strikes (Diagnostic Alarms; turbine tripping; etc.) would be greatly appreciated.

Do you think you have replaced an unusually high number of DCC/SDCC cards? Say, more than one every couple of years? Were there other Diagnostic Alarms annunciated, other than checksum failure, that prompted you to replace the cards?

In thinking about this, do you think there is any correlation between the high number of lightning strikes and the failure rate of DCC/SDCC cards? Or any other Mark V cards?

Thanks for any additional information you can provide!
 
CSA,
In my last posting I said there had been no lightning strikes this year; well, I need to correct myself. We did have one in March, which came to mind after posting.

As a result of that strike, I had to replace a CT1 CTBA card, a CO2 Fire System Controller card and a Security Camera control card.

Damage from a lightning strike is not always obvious, especially if the units are offline.
The first thing I check for is alarms and diagnostic alarms. (Hopefully there are no ground indications.) Then I check voltages on the Diagnostic Counters, i.e. TCQA, TCE1 power supply.

The second thing I do is reboot all my unit MK5 cores and see if they return to A7 status. If I don’t get A7, I begin checking the I/O card status. Problems in the past have been found with the TCQA, TCD1 and TCE1.

The last thing I do is check all 4-20 mA instrumentation for bad readings. For example, on this last strike the turbine evaporator cooler sump level transmitter would not read more than 22%, even after transmitter calibration with a 100% sump level. Measuring a 20 mA input at the CTBA-screw 70 (WLAC) from the transmitter proved that the card was receiving the proper current, but the MK5 showed only 22%. Replacing the CTBA solved the problem.
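For anyone checking a similar reading, the arithmetic behind that conclusion can be sketched as below. It assumes the usual linear scaling where 4 mA is 0% and 20 mA is 100% of span; the actual scaling configured for this sump level input is not stated in the thread, and the function names are made up for illustration.

```python
def ma_to_percent(current_ma: float, lo_ma: float = 4.0, hi_ma: float = 20.0) -> float:
    """Convert a loop current to percent of span (linear scaling assumed)."""
    return (current_ma - lo_ma) / (hi_ma - lo_ma) * 100.0


def percent_to_ma(percent: float, lo_ma: float = 4.0, hi_ma: float = 20.0) -> float:
    """Convert percent of span back to the loop current that would produce it."""
    return lo_ma + percent / 100.0 * (hi_ma - lo_ma)


# What a measured 20 mA at the terminal board should read on the display.
print(ma_to_percent(20.0))            # 100.0

# The loop current a healthy input would need in order to display 22%.
print(round(percent_to_ma(22.0), 2))  # 7.52
```

Under that assumption, a displayed 22% corresponds to roughly 7.5 mA, so a measured 20 mA at the terminal board with a 22% display points at the input circuitry rather than at the transmitter or the wiring.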

Experience over the years has taught me to keep two of each MK5 card on hand, along with ribbon cables. I hope this little bit of information helps others in their TS efforts.

Charles
 
Charles,

Yes; thanks for the clarification. (I understood your last post to read you hadn't had any 125 VDC battery grounds this year.)

Anyway, this should be helpful for others when TS (TroubleShooting) similar issues.

Continue to stay away from those stray lightning bolts!
 