Crash-proof computer

Michel A. Levesque, ing. · Dec 13, 2000

Hi all,
I read on ZDNET that NASA is going to build a "crash-proof" computer. This is for outer-space work (on the space station). I believe that such a beast already exists in our world of industrial automation.

My question is:
Can anybody out there tell me where I can find a "crash-proof" computer?

Here are the parameters I believe this beast would need:
-A computer which can run standard and custom applications.
-24/7 ZERO downtime!
-The OS does not really matter but Windows would be the 80% market share approved choice.
-extreme temperature and vibration operating conditions.
-long life (2-5-10 years?)

With all the industrial applications of PLC, DCS, SoftPLC, SCADA, etc., there is surely a computer available now that can cope with these parameters. And would it not be nice to tell NASA not to spend millions just to reinvent the wheel, because the wheel in question is actually decades-old technology?

Michel A. Levesque eng., mcp
Directeur Bureau Montreal
AIA Inc.
[email protected]

darcy oldfield · Dec 13, 2000

This idea of crash proof could cover not just what the computer can withstand but how it processes. If, for example, you were to use a computer as a safety device (not that you can), it would have to be three separate computers. Two computers would perform the work and one would check both of them. The processor of each would have to be different, so that if one processor faulted the other would not be stopped by the same error. A good example of this is the Honeywell programmable Laser Guard.

If you just want robust, which is what the industrial field uses, Allen-Bradley makes a bunch. Robust doesn't mean crash proof.
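The two-workers-plus-checker arrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the tolerance, and the `SAFE_STATE` value are all hypothetical, two different software routines stand in for two different processors, and nothing here describes how the Honeywell device actually works.

```python
import math

SAFE_STATE = None  # hypothetical stand-in for "drop to the safe output"

def channel_a(x: float) -> float:
    """First worker: computes sqrt(x^2 + 1) directly."""
    return math.sqrt(x * x + 1.0)

def channel_b(x: float) -> float:
    """Second worker: same quantity through a different routine,
    so one bug is less likely to hit both channels the same way."""
    return math.hypot(x, 1.0)

def checker(a: float, b: float, tol: float = 1e-9) -> bool:
    """Third unit: only compares the two results, does no work itself."""
    return math.isclose(a, b, rel_tol=tol, abs_tol=tol)

def guarded_compute(x, worker_a=channel_a, worker_b=channel_b):
    """Return a result only if both channels ran and agree."""
    try:
        a = worker_a(x)
        b = worker_b(x)
    except Exception:
        return SAFE_STATE  # a crashed channel is treated as a disagreement
    return a if checker(a, b) else SAFE_STATE
```

Note that with only two workers the checker can detect a disagreement but cannot tell which channel is wrong, so the only sensible reaction is the safe state.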

Hullsiek, William · Dec 13, 2000

> My question is:
> Can anybody out there tell me where can I find a
> "crash-proof" computer?

Are we talking crash proof or a fault-tolerant computer ???

There are a lot of industrial computers (CompactPCI, VME stuff) used in fault-tolerant architectures.

But then, as usual, there are the engineers from NIH (Not-Invented-Here) who don't like COTS (Commercial Off The Shelf) stuff. So they will invent the $100,000 component where a $10,000 component may meet the requirements.

Bill Hullsiek
MES Software Engineer
(Open systems bigot)

Slambo · Dec 13, 2000

Well, first off you're going to have to take the whole concept of "Windows" out of the picture and don't even think about running any Microsoft products on it.

My Etch A Sketch never crashes unless I shake it really hard, but it's still a lot more stable and more powerful than Windows.

I'd suggest UNIX in a mirrored or cluster type of setup.

Alex Pavloff · Dec 14, 2000

Running Windows? Zero downtime? Long life? The ability to run any sort of application? How much do you want to spend on this? You want it now? You can't have it all. NASA knows this. They say "Fast, cheap, good. Pick two."

If you want a machine that will not crash and has no downtime, what you have to do is throw out any current multipurpose operating system, for the main reason that any sort of upgrade or bug fix to the core OS could require a reboot.

Here at my company we make a "smart" operator interface panel that lets the system designer add BASIC code to perform simple to complicated tasks that one would normally use a PC for (because interfaces just display data). The system runs on top of DOS and can meet the "long life" and "extreme temperature" qualifications, but it is mainly crash-proof because we, Eason Technology, basically control the entire unit. The designer adds code and display items via our Windows program, yes, but we STILL control what's going on in the unit.

It's software that's the weak link in the chain here -- not the hardware.

Colin Walker · Dec 14, 2000

I have used a number of off-the-shelf PCs running MS-DOS as MMIs and controllers, and most of these have run for years without rebooting. I have yet to come across a GUI operating system that can do this. The operating system and application software are the key issue, not the hardware.

You can use watchdog timer cards to reboot a halted PC, though this is generally only useful if the PC is performing MMI functions only.
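The watchdog idea can be sketched in software. This is a minimal Python illustration, not a driver for any real card: the `Watchdog` class, its `kick` and `poll` methods, and the `on_expire` callback are hypothetical names, and on real hardware the timer and the reset line live on the card, so they keep working even when the PC has halted.

```python
import time

class Watchdog:
    """Fires on_expire if kick() is not called within timeout_s seconds."""

    def __init__(self, timeout_s, on_expire, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.on_expire = on_expire   # e.g. pulse the reset line (assumed)
        self.clock = clock           # injectable so the logic can be tested
        self.last_kick = clock()

    def kick(self):
        """Called by the healthy application on every pass of its main loop."""
        self.last_kick = self.clock()

    def poll(self):
        """Called by the independent timer; returns True if it had to fire."""
        if self.clock() - self.last_kick > self.timeout_s:
            self.on_expire()
            self.last_kick = self.clock()  # re-arm after the reboot request
            return True
        return False
```

The application calls `kick()` on every pass of its main loop; if the loop hangs, the kicks stop and the next `poll()` triggers the reboot. A reboot loses whatever state the machine held, which fits the point above that this mainly suits display-only duties.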

With regard to hardware, I have found the BIG name brands to be less reliable (maybe they have less to lose) than the mid-market PC-specific brands like Gateway. The ultimate in reliability is Triple Redundancy Fault Tolerance, as used in GE's Speedtronic turbine controllers.
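The core of triple redundancy is a voter that takes three independent results and passes on the one the majority agrees with. Below is a minimal Python sketch of that idea only; the function names are hypothetical, and it is not a description of how GE's controllers implement voting.

```python
from collections import Counter
from statistics import median

def vote_discrete(readings):
    """2-out-of-3 majority vote for discrete signals (e.g. trip / no trip)."""
    if len(readings) != 3:
        raise ValueError("expected exactly three channel readings")
    value, count = Counter(readings).most_common(1)[0]
    if count >= 2:
        return value
    raise RuntimeError("no two channels agree")

def vote_analog(readings):
    """Median select for analog signals: one wild channel is ignored."""
    if len(readings) != 3:
        raise ValueError("expected exactly three channel readings")
    return median(readings)
```

For example, `vote_discrete([True, True, False])` outvotes the single dissenting channel, and `vote_analog([100.0, 101.0, 999.0])` discards the faulty 999.0 reading by taking the middle value.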

Regards Colin

David Wooden · Dec 14, 2000

Have a look at http://www.marathontechnologies.com . Their system uses four interlinked computers to provide extremely high reliability under Windows.

E. Douglas Jensen · Dec 14, 2000

No such computer hardware and software exists today.

That's why this big consortium of academic, government, and private industry is going to try to solve that. It is an extremely difficult problem, so it will cost a lot of money to figure out how. Go take a look at some of the "almost-crash-proof" systems for flight control in commercial aircraft, weapons control in combat platforms, telephone systems, etc. There are few industrial automation systems that can afford the price of even almost-crash-proof computer systems, much less totally-crash-proof ones.

BTW, even open systems bigots discover that when your availability spec for your almost-crash-proof computer requires a lot of hardware fault detection on the VME single board computers you want to use, there are no such COTS products even for the military market, so you have to design your own custom VME SBC. And THEN the really HARD software problems still have to be solved.

Doug

--
E. Douglas Jensen (traveling)
[email protected], http://www.real-time.org
Home voice 508-653-5653, home fax 508-653-3342
Cell phone: voice 508-728-0809; text http://www.bam.com/send2.htm
Pager 888-916-9802 and http://www.skytel.com/paging (PIN 8889169802)
Office voice 781-271-2514, office Fax 781-271-4686

Johan Bengtsson · Dec 14, 2000

Well, I think you are missing one word:
-24/7 guaranteed zero downtime

I think that word is the one making the difference.

I would, for example, not install Windows, in any version. I don't think anyone could say: "I am willing to risk my life, and the lives of others, on this computer with Windows version xx never crashing". After all, that is not what it was designed for.

The fact that there are computers running Windows that are able to run for 10 years without being stopped or crashing is no guarantee. Oh, by the way, most Unix versions and other OSes are probably not much better at giving this guarantee, so this is not intended to be an OS war.

I think every part of the hardware needs a very detailed checklist for failure, as does every part of the OS and every part of all applications running on it.

In an industrial application it may be expensive to have a computer crash, even very expensive, but if someone is risking lives using a computer and installs Windows or something similar on it, then I think that should be considered a criminal act and result in a looong time in prison.

This is what I think about it. If I were invited to go to space and heard that whoever built the system considered Windows crash proof, I would not go if I could choose, or at least I would investigate for myself where those computers were used.

/Johan Bengtsson

----------------------------------------
P&L, Innovation in training
Box 252, S-281 23 Hässleholm SWEDEN
Tel: +46 451 49 460, Fax: +46 451 89 833
E-mail: [email protected]
Internet: http://www.pol.se/
----------------------------------------

Seib, Larry · Dec 14, 2000

It would seem that if you took just about any processor and controlled it from ROM rather than RAM, it would be crash proof; however, it certainly would not be general purpose. Even if it didn't crash, it could put out meaningless garbage given the wrong input. Of course, hardware can also fail, in which case even this could crash.

Larry Seib.

Edgar F. Hilton · Dec 15, 2000

Avoid moving parts, such as fans, hard drives, etc.

Try getting a PC104 and flash. These have no fans, and the flash replaces the hard drive. The PC104 should run any operating system that you like, although I personally am biased towards a hard real-time operating system such as RTLinux or miniRTL.

-Edgar

Edgar F. Hilton FSMLabs, Inc.
voice: 850.893.0300 www.fsmlabs.com
fax: 206.350.4EFH www.rtlinux.com

Curt Wuollet · Dec 15, 2000

Hi Doug

I agree that hi-rel systems present a challenge for us Open Systems bigots. But Open Systems come closer to certifiable, auditable solutions than secret proprietary systems do. There's an FAA-certified version of Linux in the works. And one-off custom hardware won't accumulate the kind of hours needed for high confidence. Linux runs on VMEbus systems, and I expect the hardware that services the existing hi-rel customers (telco, high-value online services) to creep towards these markets. Today's PCs with the right software are more reliable than the best mil/aerospace systems of not too long ago. Today's SOC offerings are the ones I'd bet on. Redundancy in silicon, supervision in firmware, and Open Source Software will decimate the failure mechanisms that limit reliability today: connections and bugs. The Smart Ship program would have been a success if they had started with Linux.

Regards

cww

E. Douglas Jensen · Dec 18, 2000

Here is a good link to what this consortium plans to do: http://www.cs.cmu.edu/~jhm/hdcc.htm

Doug
--
E. Douglas Jensen (at home)
[email protected], http://www.real-time.org
Home voice 508-653-5653, home fax 508-653-3342
Cell phone: voice 508-728-0809; text http://www.bam.com/send2.htm
Pager 888-916-9802 and http://www.skytel.com/paging (PIN 8889169802)
Office voice 781-271-2514, office Fax 781-271-4686

Alex Pavloff · Dec 18, 2000

> Here is a good link to what this consortium
> plans to do: http://www.cs.cmu.edu/~jhm/hdcc.htm

Lots of good stuff on that page. They've identified all the problems. The thing that jumped out at me was the following statement:

-----------------------------------
Embedded Systems: Embedded computer systems are arguably both more difficult to make dependable, and more in need of complete dependability. Because they often do not have a human operator acting as a safety net, embedded systems must achieve absolutely bulletproof operation over years or decades of time.
But, because the actual amount of computational power used is small, such systems are often perceived as easy to build and are often created by engineers or technicians with no formal training in software engineering or critical system design. Whereas desktop computers are built in the tens of millions per year, embedded microcontrollers are produced in the billions--soon to be tens of billions per year. The challenge is how to scale high assurance methods down to the budgets, timelines, and skill sets prevalent in the embedded system world.
-----------------------------------

Now, I've got a different view on this. As a programmer, I see embedded systems as *easier* to design software for. To steal a line from the Outer Limits here, the designer controls both the horizontal AND the vertical.
There's none of these pesky users showing up to punch buttons. <g> However, I have a computer science degree, and have had formal training in software engineering, so perhaps I'm not indicative of "everyone out there." For that matter, I suspect that this list has a greater proportion of the more highly-skilled embedded engineers out there than are actually present in the Real World.

This leads me to my question for you good people that have more experience in this field than I:

Has the need for properly trained software engineers kept pace with the increasing use of software for industrial use?

Curt Wuollet · Dec 19, 2000

Well, they say that defining the problem is half the solution. They don't mention the timeframe, and some of the companies that expressed interest were amusing. It will take a lot of horsepower to achieve these goals in the context of X-month design cycles and slim margins. Still, looking at what the auto industry has accomplished, for example, it's plausible. The thing that most troubles me is that reliability and stability rank absolutely last on the buying list, as evidenced by what is most commonly in use now. I don't see where the push is going to come from.

Regards

cww

Barry Cooper · Jul 19, 2001

Hi,
I may be a bit late for this discussion, but has anyone heard of RISC OS running on ARM hardware?

It is based in ROM and is very reliable.

Unfortunately I am not an expert but go to www.riscstation.com for more info.
