Culture Change at NASA

According to the creation myth, in the beginning, NASA was full of young, cocky, innovative, hard-charging folks who got us to the moon inside a decade.  They were brash, confident, and did not suffer fools gladly.  If they were worried, they didn’t show it.  Stories abound of 100+ hour work weeks end to end, almost impossible to believe.  Their theme, as posted on the factory walls, was ‘waste anything but time’.  Going to the moon was the cliché for doing the impossible and they were going to be the ones to do it.  They were the epitome of risk-taking, innovative, creative, flexible, nimble achievers.

[Image caption: Gene Kranz’s School for Young Gentlemen, circa 1967]

On the way to the moon, the Apollo 1 fire happened.  It was a tragedy.  It was beyond awful.  With 20-20 hindsight, the root cause of the fire was obviously sheer stupidity.  There were investigations and panels and recommendations.  As in every accident investigation, the investigation board found that communications between people and organizations were faulty.  Management culture was poor.  And the safety organization was strangely silent on dangerous situations which they had been warned about.  So the recommendations, in addition to technical things, included improving communications, changing management culture, and reinvigorating the safety organization.  And even though everybody at NASA believed the fire was a one-time thing, NASA tried to improve.  Some bureaucratic checks took a little of the nimbleness out of the system in the name of safety, but mostly NASA got a pass because we had to beat the Russians.  The Eagle landed, the mission was accomplished, and time passed.

On the way to exploiting the space frontier with our new space shuttle, 19 years and one day after the Apollo 1 fire, the Challenger and her crew were lost during launch.  It was a tragedy.  It was beyond awful.  With 20-20 hindsight, the root cause of the accident was obviously sheer stupidity.  There were investigations and panels and recommendations.  As in every accident investigation, the investigation board found that communications between people and organizations were faulty.  Management culture was poor.  And the safety organization was strangely silent on dangerous situations which they had been warned about.  So the recommendations, in addition to technical things, included improving communications, changing management culture, and reinvigorating the safety organization.  And even though everybody believed that the accident was a one-time thing, NASA tried to improve.  More methods to communicate were added, more bureaucratic checks were added, the system slowed down and became more costly in the name of safety, but mostly NASA got a pass because we still had to beat the Russians, this time to build a permanent space station, and they were ahead of us.  The Hubble was launched, the assembly of the Space Station started, and time passed.

Seventeen years and four days after the loss of Challenger, Columbia disintegrated during reentry and her crew was lost.  With 20-20 hindsight, the root cause of the accident was obviously sheer stupidity.  There were investigations and panels and recommendations.  As in every accident investigation, the investigation board found that communications between people and organizations were faulty.  Management culture was poor.  And the safety organization was strangely silent on dangerous situations which they had been warned about.  So the recommendations, in addition to technical things, included improving communications, changing management culture, and reinvigorating the safety organization.

This time, nobody inside or outside of NASA believed that the Columbia accident was a one-time thing.  So we tried to change the very root culture at NASA.  Strangely, I found myself at the epicenter of the culture change; one of the least likely managers ever to participate in touchy-feely human relations changes.  We got trained by professional counselors on how to play nice and communicate affirmingly.  At the end of seven years, some change is evident.  Safety is reinvigorated; the management culture has bent toward more safety; and communications, well, need more work and probably always will.  Dissenters must be heard and understood, and mostly placated; much more bureaucracy has been added in the name of safety, and everybody now has a “stop work” card to play if they have a concern.  NASA did not get a pass, the Russians are no longer our competition but our partners, and the debate intensifies as to whether America should send humans into space.  Meanwhile, the Space Station has nearly been completed, the shuttle is about to be retired, its mission accomplished, and time has passed.

Now conventional wisdom says NASA is risk averse.  Afraid of failure, afraid to take risks, requiring draconian and expensive safety insight for even mundane tasks.  They say that NASA depends too much on extensive testing and expensive analysis to prove that every operation is as safe as humanly possible before undertaking it.  That is the conventional wisdom proffered by the media, the pundits, and those who want to be in the space business.  To be successful in space, we hear, risks must be taken, fear must not inhibit innovation.  The possibility of failure must be deeply discounted and the consequences of failure should not be contemplated very hard lest we waver from our goals.  We need organizations that are nimble, flexible, innovative, and risk taking to be successful in space.

In short, NASA should turn to private enterprise for a ride to space.

So how can a staid, grey, old, inflexible bureaucracy approve flying its people on somebody else’s rocket?  Experience has been a hard teacher; everybody at NASA has been instilled with a great personal responsibility for safety; the knowledge that if the widget that they are responsible to monitor causes failure it will be their own personal fault.  Do you untrain the culture of the last seven, no, forty years as drilled into every NASA engineer and manager?  Probably not.  But if American astronauts are to ride to the International Space Station on a rocketship that NASA did not build, there will have to be a tectonic shift in NASA culture.  Regardless of who builds the ship or operates it or what shape it takes, one thing is certain; NASA’s role will have to be different.  That will take a tremendous amount of energy, and time must pass.

In the middle of the last culture change I sent the following paragraph to the shuttle troops.  I still stand by it and it rings strangely true for the future, too.

Life is full of gray choices.  Deciding the work completed is good enough because more will not make it perfect.  Ten thousand gray choices; doing what we must do, and not a bit more because that would take away from other work that is absolutely critical to be done right.  When we have done what we can do, when we have driven the risk to the lowest practical level where it can be driven, then we have to accept the fact that it is time to make a decision and move on.  Because history is waiting for us.  But history will not wait forever, and it will judge us mercilessly if we fail to face tough choices and move ahead.

Thoughts on Commercial Human Orbital Spaceflight

Shortly after I moved into the Shuttle Program office, I was very surprised to learn that NASA did not own the blueprints for the space shuttle!  The government never purchased the intellectual property and the design details of the vehicle.  The blueprints are all proprietary information belonging to Boeing.

 

NASA never really built any big rockets; NASA hires contractors to do that.  For example, the Saturn V was built in pieces, the mighty first stage by Chrysler (how times have changed!), the second stage by North American Aviation, the third stage by McDonnell Douglas, the lunar module by Grumman, and the command/service module by North American. 

 

North American Aviation was an innovative, nimble, flexible, efficient, small commercial aircraft company led by the legendary “Dutch” Kindelberger.  NAA designed and built many classic aircraft including the P-51 Mustang.  After Kindelberger passed, corporate mergers changed NAA to North American Rockwell, then Rockwell International (which can claim credit as the designer and producer of the Space Shuttle orbiter), and now to merely a division of the Boeing corporation.  The historic site in Downey which saw production of the P-51 Mustang, the Apollo CSM, and much of the Shuttle Orbiter was sold, sadly, to commercial interests who couldn’t turn a profit on the land as a strip mall but rent the property out for movie making.  Sic Transit Gloria Mundi.  After Boeing bought out RI, the workforce moved a few miles over to Huntington Beach.  It’s just business, as they say.

 

So I am quite amused by the current debate about whether or not NASA should build rockets or contract that work out to commercial firms.  NASA per se has never built rockets of any size.  But that statement is so simplistic as to be disingenuous.  There is a marked difference between the “old” way of doing business and what is being proposed as a “new” way of doing space business.

 

Simply put, in the old days (or even today’s days), NASA (the government) was in control; made all the big decisions, required complete insight into all the details of the design, manufacturing, testing, and production of the space flight vehicle.  Eye-watering amounts of documentation were required for every step.  The contractor might do the detailed work, but the government folks got to see everything, review everything, and approve everything.  The contractors worked on a “cost plus” basis and charged for every change.  Somewhere along the line, the small, nimble, flexible, innovative, efficient company that was North American Aviation became a cog in a bureaucratic, military-industrial, giant corporation (no offence, Boeing).

 

The “new space” model is that one or more nimble, flexible, innovative, efficient commercial companies will provide reliable, safe, economical launch vehicles and spacecraft that American astronauts can ride to the International Space Station.  Getting to low earth orbit is so easy that practically anybody can do it!  Large government programs are no longer required and NASA should concentrate its efforts on deep space exploration and doing the “hard” things like landing on the Moon or Mars.

 

Except that in the early part of the 21st century, getting to low earth orbit is neither routine nor easy.  Anybody who has really tried to do it (past the viewgraph engineering stage) can attest that getting to LEO is hard.  It requires precision, care, extremely good engineering, quality control, etc., etc., etc.  Landing on the moon may be “hard”, but getting to LEO and back is hardly a cakewalk.  Recently I have read several statements from some “new space” entrepreneurs concerning space flight safety.  They acknowledge that an accident would be devastating for the commercial crew launch business, so they profess that each of the companies attempting to put human spacecraft in orbit (or sub-orbit) is committed to safety.  I believe that statement.  However, intentions are not enough; remember whither the road leads which is paved with good intentions.  In my mind, I can hear entrepreneurial mortgage lenders claiming that giving loans to people who cannot repay those loans is bad for business and could cause the mortgage company to fail.  Surely nobody would do that, right?  There are pressures to compromise safety everywhere and to think that a commercial business won’t be subject to those pressures is naive.  How do you know when you have gone from being “efficient” to having cut the corner too close?

 

I do believe that commercial human space flight can be accomplished much more economically and efficiently than the government and our “cost plus” contractors do it today.  And it can be done with a reasonable level of safety, even in this low margin, high energy, dangerous business.  But how to accomplish these competing goals is the question. 

 

It is entirely one thing for a wealthy adventurer to personally choose to go into space on a new and untried rocket.  After all, nobody stops you from climbing Mt. Everest or parachuting into the wild outback for a ski adventure on a pristine mountain; it’s your own skin, your own risk.  But if the goal is to put U.S. Government civilian employees who are on official U.S. Government business on a commercial rocket, it will be the responsibility of some government agency (NASA?  FAA?) to ensure that the “conveyance” is reasonably safe.  NASA knows only one way to attempt to ensure safety, and that is very invasive.  In this case, synonyms for ‘invasive’ include:  costly, slow, bureaucratic.  Won’t help the business to be nimble.

 

In the 1990’s, NASA turned over the management of the space shuttle subsystems to the Boeing contractor.  In effect NASA relinquished a modicum of control and insight, a huge change in NASA culture at the time.  Going to a commercial launch vehicle will require a bigger change in NASA culture.  This level of culture change is not impossible, but it is hard.  We’re currently studying how to make commercial human space flight work, safe and economical at the same time.  As always, the devil is in the details.  And the hardest part will be the culture change.  Changing NASA’s culture is a topic for another day.

Sine Qua Non

I have been pondering the Augustine report (at least the executive summary) which has been released.  There are a couple of sentences up front that have been on my mind:

 

“Human safety can never be absolutely assured, but throughout this report, it is treated as a sine qua non.  It is not discussed in extensive detail because any concepts falling short in human safety have simply been eliminated from consideration.”  As panel members commented (more than once) during the public sessions, ‘we assume NASA will build safe systems’.

 

I’m not a Latin scholar so I had to look it up.  Sine qua non means something or someone indispensable.  So safety is indispensable.  I’d agree with that.  As a matter of fact, I have spent my entire career based on making spaceflight as safe as possible while still actually flying.

 

Actually, the assumption that NASA will build safe systems is poorly demonstrated by our history.  Our failures are painful to enumerate.  Early after the Columbia accident, we engaged Dr. Charles Perrow of Yale University to talk to us about his book (and theory) titled “Normal Accidents”.  In summary, Dr. Perrow believes that accidents are unavoidable in complex systems.  Very depressing to read.  Nothing you can do will ultimately prevent a fatal flaw from surfacing and causing catastrophe.  Life is hard and then you die.  Not very motivational, but perhaps true.  So all of us who listened to Dr. Perrow determined to prove him wrong.

 

In any event, safety in space flight is a relative term.  A launch vehicle with a 98% success record is considered very safe, but you would never put your children on a school bus that only had a 98% chance of getting them safely to school.  It is a high risk, low safety margin endeavor.  Probabilistic Risk Analysis has made great strides in recent years but the only statistic I put any faith in is the demonstrated one.  The shuttle has failed 2 times in 125 flights.  That is not good enough.
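A rough sketch of the arithmetic behind that comparison, in Python.  The 2 failures in 125 flights is the demonstrated record quoted above; the function names and the 180-day school year are illustrative assumptions, and the trips are assumed to be independent, which real flights are not.

```python
def demonstrated_success_rate(flights: int, failures: int) -> float:
    """Fraction of flights that succeeded, from the observed record only."""
    if flights <= 0 or not 0 <= failures <= flights:
        raise ValueError("need flights > 0 and 0 <= failures <= flights")
    return (flights - failures) / flights


def chance_of_no_failure(per_trip_success: float, trips: int) -> float:
    """Probability that every one of `trips` independent trips succeeds."""
    if not 0.0 <= per_trip_success <= 1.0 or trips < 0:
        raise ValueError("need 0 <= per_trip_success <= 1 and trips >= 0")
    return per_trip_success ** trips


if __name__ == "__main__":
    # Demonstrated shuttle record quoted in the text: 2 failures in 125 flights.
    shuttle = demonstrated_success_rate(flights=125, failures=2)
    print(f"Shuttle demonstrated success rate: {shuttle:.1%}")

    # Hypothetical school bus with a 98% per-trip success rate, ridden once
    # a day for an assumed 180-day school year (both numbers illustrative).
    bus_year = chance_of_no_failure(0.98, trips=180)
    print(f"Chance of a school year with no failed trip: {bus_year:.1%}")
```

Under those assumptions, a per-trip figure that sounds excellent for a rocket leaves only a few percent chance of getting through a school year without a failed trip, which is the point of the school bus comparison.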

 

Six years after the loss of Columbia, I’m not sure that we can make a spacecraft safe, but I have empirical evidence that proves beyond a shadow of a doubt that we can make it expensive.  The cynical part of me says that is what we do at NASA: demand extraordinary proof that things are safe.  ‘Proof’ means a series of tests (a large enough number of tests to be ‘statistically significant’) and/or very complex analysis which examines every facet of each part of a system in detail to demonstrate that in the worst possible set of circumstances the system will perform as required.  Trouble is, there is no end to imaginative tests, and there is always something else to throw into the analysis.  And it all must be extensively peer reviewed, debated at length, documented to the nth degree, briefed to multiple layers of management, and signed off by virtually everybody in the organization.

 

This is a very expensive process.

 

History indicates that attention to safety doesn’t seem to last.  Sooner or later the people charged with making a system safe retire or die off, the bean counters get their knives out and the organization gets trimmed in the name of efficiency and cost savings, and somewhere along the way an invisible line is crossed.   And Dr. Perrow is proved right again. 

 

Not to be too depressed, but the report’s two sentences on safety are counterbalanced by many more sentences describing how space systems must be made cheaper and should accomplish their goals sooner.  ‘Faster, better, cheaper’ was the rallying cry of management over a decade ago.  The wags soon added ‘pick any two’.  My experience has been that a project manager is lucky to get two, and many projects end with having failed on all three counts.

 

I found another Latin phrase which may apply here, from Horace:  Splendide mendax.  It means ‘splendidly untrue’.  Safety at low cost, that is. 

 

So as we look to the future, it is going to take a great deal of careful management to ensure that commercially provided crew transportation systems are adequately safe and yet not drive the cost (and schedule) through the roof.  This balance is not easy to accomplish.  Careful and thoughtful management attention will be required.  No doubt you will hear some debate about this topic in days to come.

 

Which brings me back to sine qua non.  About a year after the loss of Columbia, NASA had a conference on risk and exploration.  A number of folks who do dangerous exploratory work talked with the NASA leadership about these issues.  Probably the most memorable thought of the whole conference came from James Cameron.  After almost two days of people repeating the phrase “safety first, safety is the most important thing”, Mr. Cameron made this observation:  “While safety is very important and must be considered at all times, in exploration safety is not actually the most important thing.  In exploration, the most important thing is to go.”

 

If I were writing the report, it would echo those words.  Actual exploration is not safe.  Actual exploration does not take place on PowerPoint slides.  Actual exploration takes courage.  Actual exploration takes action.  Actual exploration requires going.

 

Actually going is sine qua non.

Factors of Safety

     Old joke:  “You see the glass as half empty, I see the glass as half full, but an engineer sees the same glass and says ‘it is overdesigned for the amount of fluid it holds.’”

 

     When an engineer starts out to build something, one of the first questions to be answered is how much load must it carry in normal service?  The next question is similar:  how much load must it carry at maximum?  An engineer can study those questions deeply or very superficially, but having a credible answer is a vital step at the start of a design process.

 

     Here is an example.  If you design and build a step ladder which just barely holds your weight without breaking, what will happen after the holidays when your weight may be somewhat more than it was before you ate Aunt Martha’s Christmas dinner?  You really don’t want to throw out your stepladder in January and build a new one, do you?  Obviously you should build a stepladder that can hold just a little bit more.  And don’t forget what might happen if you loan your stepladder to your couch-potato neighbor who weighs a lot more than you do.  Can you say lawsuit?

 

     So how do you determine what your stepladder should hold?  Do you find out who is the heaviest person in the world and make sure it will hold that person?  Probably not.  Better, pick a reasonable number that covers, say, 95% of all folks, design the ladder to that limit and put a safety sticker on the side listing the weight limit.  Yep, that is how most things are constructed.

 

     But that is not all.  Once you determine normal or even the maximum load it is a wise and good practice to include a “factor of safety”.  That means that you build your stepladder stronger than it needs to be.  This helps with the idiots that don’t read the safety sticker; it also helps protect for some wear and tear, and it also can protect if the actual construction of your stepladder falls somewhat short of what you intended.  So you might build your stepladder with a FS of 2.  That would cover 95% of all folks with plenty of margin for foolish people that try to accompany their friend climbing the ladder; or when your ladder has been in service for 25 years (like mine), or when your carpenter buddy builds the stepladder with 1/4” screws rather than ½” screws like you told him to.
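The stepladder reasoning can be sketched in a few lines of Python.  This is only an illustration: the 250 lb rated load, the 200 lb climbers, and the function names are assumptions made up for the example, not values from any ladder standard.

```python
def required_strength(max_expected_load: float, factor_of_safety: float) -> float:
    """Load the structure must survive: the design load times the FS."""
    if max_expected_load <= 0 or factor_of_safety < 1.0:
        raise ValueError("need a positive load and a factor of safety >= 1")
    return max_expected_load * factor_of_safety


def realized_factor(actual_strength: float, applied_load: float) -> float:
    """Margin actually left for a given load; below 1.0 means it breaks."""
    if applied_load <= 0:
        raise ValueError("applied load must be positive")
    return actual_strength / applied_load


if __name__ == "__main__":
    rated_load_lb = 250.0  # assumed sticker limit covering ~95% of folks
    strength_lb = required_strength(rated_load_lb, factor_of_safety=2.0)
    print(f"Ladder must hold {strength_lb:.0f} lb")

    # Two assumed 200 lb friends ignore the sticker and climb together.
    print(f"Two climbers: FS = {realized_factor(strength_lb, 400.0):.2f}")

    # One person right at the sticker limit still has the full margin.
    print(f"One climber at the limit: FS = {realized_factor(strength_lb, rated_load_lb):.2f}")
```

The same two functions show how wear or a construction shortfall eats the margin: lower the `actual_strength` argument and watch the realized factor drift toward 1.0.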

 

     Factors of safety are not pre-ordained.  They have been developed over the years through experience and unfortunately through failures.  Some factors of safety are codified in law, some are determined by professional societies and their publications, and some are simply by guess and by golly.  Engineering is not always as precise as laypeople think.

     

     It’s a dry passage but I’d like to quote from one of my old college textbooks on this subject (Fundamentals of Mechanical Design, 3rd Edition, Dr. Richard M. Phelan, McGraw-Hill, NY, 1970, pp 145-7):

 

“ . . . the choice of an appropriate factor of safety is one of the most important decisions the designer must make.  Since the penalty for choosing too small a factor of safety is obvious, the tendency is to make sure that the design is safe by using an arbitrarily large value and overdesigning the part.  (Using an extra-large factor of safety to avoid more exacting calculations or developmental testing might well be considered a case of “underdesigning” rather than “overdesigning.”)   In many instances, where only one or very few parts are to be made, overdesigning may well prove to be the most economical as well as the safest solution.  For large-scale production, however, the increased material and manufacturing costs associated with overdesigned parts result in a favorable competitive position for the manufacturer who can design and build machines that are sufficiently strong but not too strong.

            As will be evident, the cost involved in the design, research, and development necessary to give the lightest possible machine will be too great in most situations to justify the selection of a low factor of safety.  An exception is in the aerospace industry, where the necessity for the lightest possible construction justifies the extra expense.”

            “Some general considerations in choosing a factor of safety are  . . . the extent to which human life and property may be endangered by the failure of the machine . . . the reliability required of the machine . . . the price class of the machine.”

 

            Standards for factors of safety are all over the place.  Most famously, the standard factor of safety for the cables in elevators is 11.  So you could, if space allowed, pack eleven times as many people into an elevator as the placard says and possibly survive the ride.  For many applications, 4 is considered to be a good number.  In the shuttle program the standard factor of safety for all the ground equipment and tools is 4.  

 

            When I was the Program Manager for the Space Shuttle, there were a number of times when a new engineering study would show that some tool either could be exposed to a higher maximum load than was previously thought, or that the original calculations were off by a small factor, or for some reason the tool could not meet the FS of 4.  In those circumstances, the program manager – with the concurrence of the safety officers – could allow the use of the tool temporarily – with special restrictions – until a new tool could be designed and built.  These “waivers” were always considered to be temporary and associated with special safety precautions so that work could go forward until the standard could once again be met with a new tool.

 

            In the aircraft industry, a factor of safety standard is 1.5.  Think about that when you get on a commercial airliner some time.  The slim factor of safety represents the importance of weight in aviation.  It also means that much more time, engineering analysis, and testing has gone into the determination of maximum load and the properties of the parts on the plane.

 

            For some reason, lost in time, the standard FS for human space flight is 1.4, just slightly less than that for aviation.  Shaving that 0.1 off the FS costs a huge amount of engineering work, but pays dividends in weight savings.  This FS is codified in the NASA Human Ratings Requirements for Space Systems, NPR 8705.2.  Well, actually, that requirements document only references the detailed engineering design requirements where the 1.4 FS lives.

 

            Expendable launch vehicles are generally built to even lower factors of safety:  1.25 being commonplace and 1.1 also used at times.  These lower factors of safety are a recognition of the additional risk that is allowed for cargo but not humans and the extreme importance of light weight.

 

            It is common for people to talk about human-rating expendable launch vehicles with a poor understanding of what that means.  Among other things, it means that the structure carrying the vast loads which rockets endure would have to be significantly redesigned to be stronger than it currently is.  In many cases, this is tantamount to starting over in the design of the vehicle.
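To make the size of that redesign concrete, here is a minimal Python sketch using only the factors of safety quoted in this post.  It assumes the maximum load stays fixed and that required strength scales directly with the FS; it says nothing about how much mass, cost, or requalification a real vehicle would need.  The dictionary and function names are illustrative.

```python
# Factors of safety as quoted in the text above.
FACTORS_OF_SAFETY = {
    "elevator cables": 11.0,
    "shuttle ground equipment and tools": 4.0,
    "aircraft": 1.5,
    "human space flight": 1.4,
    "expendable launch vehicle (commonplace)": 1.25,
    "expendable launch vehicle (used at times)": 1.1,
}


def strength_increase(current_fs: float, target_fs: float) -> float:
    """Fractional increase in required strength when the FS is raised
    from current_fs to target_fs at the same maximum load."""
    if current_fs <= 0 or target_fs <= 0:
        raise ValueError("factors of safety must be positive")
    return target_fs / current_fs - 1.0


if __name__ == "__main__":
    target = FACTORS_OF_SAFETY["human space flight"]
    for name, fs in FACTORS_OF_SAFETY.items():
        if name.startswith("expendable"):
            extra = strength_increase(fs, target)
            print(f"{name}: FS {fs} -> {target} needs {extra:.0%} more strength")
```

An increase of that size applies to every load-carrying part at once, which is why it tends to ripple through the whole structure rather than being a bolt-on fix.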

 

            So to the hoary old punch line:  Would you want to put your life on the top of two million parts, each designed and manufactured by the lowest bidder?

Black Zones – part 1

In the 1950’s it seemed like almost all of our rockets exploded during the launch.  There were a lot of spectacular failures in those days and successes seemed rare.  As we considered putting a man in a capsule on top of one of those rockets it was obvious that something was needed to get the pilot out of a bad situation in a hurry.

During the Gemini program, that method of “crew escape” consisted of ejection seats which were only slightly modified from those found in that era’s military jet fighter aircraft.  This left a lot to be desired as we shall see.

But Max Faget, the innovative genius behind much of the engineering progress in NASA’s early days, had a brilliant idea.  He invented something called the launch escape rocket system.  A cluster of solid rockets attached to the top of the crew capsule could be activated in an emergency to pull the capsule and crew away from a disaster and let them use their normal recovery parachutes to land safely. 

This was such a good idea that even other countries adopted this plan.  On September 26, 1983, with their rocket exploding below them on the launch pad, the crew of Soyuz T-10-1 was whisked away from almost certain death to fly again another day.  Gennady Strekalov and Vladimir Titov owe their lives to Max Faget . . . and a whole bunch of Russian rocket designers who built that launch escape system for the Soyuz spacecraft.

So Mercury and Apollo and the in-design Orion spacecraft use Launch Escape towers.  In fact there is a test of the new launch escape rocket system for the Orion scheduled for next week out in Utah. 

The shuttle, of course, adopted a different philosophy; a philosophy that, like a commercial airliner, “passenger” safety was provided by bringing the entire ship home safely.  More about that in a later post. 

Today I want to talk about ejection seats.  Gemini had ejection seats and so did the shuttle for the initial flights.  I don’t know much about the Gemini seats but the shuttle ejection system used on the first four flights was the best there was at the time.  And it wouldn’t have done much good.

The shuttle ejection seats were taken from those used on supersonic military aircraft.  Ejection at supersonic speeds has always been dangerous, probably life threatening.  It is best if the ship holds together to get to subsonic speeds where survival is much more likely.  At supersonic speeds, hitting the airstream is like hitting a brick wall.  Not good.  It may be the best option if you are facing certain death by riding a disintegrating ship, but even then it is not a great choice.  The shuttle ejection seats were really there for the late stages of landing.  If that big glider of an orbiter couldn’t make it to the runway, better to eject and bail out than try to crash land on rough territory.  In that scenario having ejection seats actually made sense.  In a later post I’ll talk about the entire entry regime, but just note that from the altitude of about 100,000 ft or lower and speeds from Mach 3 on down, the seats would probably have worked as advertised.  An ejection at, say, 10,000 feet and subsonic speed would have been a very good bet in that situation.

How about using the shuttle ejection seats on ascent? 

Not good. 

For example, an ejection on the launch pad would not get high enough for the parachute to open in time.  Yep, you’d hit the ground from a few hundred feet altitude with the chute still unfurling.  Not recommended.  If your rocket was in the process of blowing up (remember Titov and Strekalov?) the blast overpressure would still be fatal at the distance the ejection seat would push you.  As a final insult, the “landing” would be in the flame trench.  So, an ejection off the launch pad was not a good idea for a shuttle crew.

During ascent, the capcom made the call “negative seats”.  This occurred as the shuttle climbed above 80,000 feet.  At that altitude the ejection seats would still work, and the pressure suit had sufficient oxygen to get back down, so you may ask, why was that a limit?  Because an analysis of the speed and trajectory above that point showed enough air friction heating to melt the plastic faceplate of the helmet.  And probably other things we didn’t analyse.  But the basis for the call was the melting of the faceplate.  So about 90 seconds into flight the ejection seats were useless and until at least 10 seconds into the flight there was not enough altitude for the chutes to open.  So if you ejected in those “safe” 80 seconds?  Toasted by the solid rocket booster plumes going past you.  And that is if the stack held together and didn’t have “an overpressure event” or send shrapnel headed your way.
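The timeline in that paragraph can be laid out as a small Python sketch.  The two thresholds are the approximate figures given above (about 10 seconds and about 90 seconds into flight); the constant names, function names, and status strings are illustrative assumptions, not flight rules.

```python
CHUTE_MIN_TIME_S = 10.0       # before this: too low for the chute to open
NEGATIVE_SEATS_TIME_S = 90.0  # about when the shuttle passed 80,000 feet


def window_length_s() -> float:
    """Length of the nominal ejection window during ascent, in seconds."""
    return NEGATIVE_SEATS_TIME_S - CHUTE_MIN_TIME_S


def ejection_status(seconds_into_flight: float) -> str:
    """Classify an ascent time against the two limits described in the text."""
    if seconds_into_flight < 0:
        raise ValueError("time into flight cannot be negative")
    if seconds_into_flight < CHUTE_MIN_TIME_S:
        return "too low: parachute cannot open in time"
    if seconds_into_flight < NEGATIVE_SEATS_TIME_S:
        return "nominal window: seats function, but booster plumes remain a hazard"
    return "negative seats: heating would melt the helmet faceplate"


if __name__ == "__main__":
    print(f"Nominal window: {window_length_s():.0f} seconds")
    for t in (5, 45, 120):
        print(f"T+{t}s: {ejection_status(t)}")
```

Even inside the nominal window the status string carries a caveat, because the limits only describe when the seat and chute could function, not whether the crew would survive the plumes or a breakup.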

Nope, ejection seats during shuttle launch were not a good way to get out of a tight spot.

The Gemini situation was probably better in some ways, but still not great. Some retired Gemini engineer will probably post a comment with that information.

So all you future rocket designers please note:  launch escape rockets are the way to get out of a bad launch situation.

Of course, the best thing is that your rocket should never explode.  But what are the odds of that?

Stay tuned for more discussion of this fascinating subject.