With the investigation into the underlying cause of the BP oil spill in the Gulf of Mexico underway, people are starting to scrutinise the software systems that controlled the blow-out preventer amongst other things. What steps can be taken to make sure control systems are foolproof?
As always happens, the news media have lost interest in a story with which they were totally obsessed only a few weeks ago. The catastrophic effects of the BP oil spill in the Gulf of Mexico will be felt for a long time, perhaps for more than a decade, but the media have moved on to the next headline. Maybe the long-term effects aren’t as obvious and dramatic as a flock of oil-sodden sea birds struggling pathetically to survive in their ruined habitat. They are felt by the proprietors of, and workers in, a devastated tourist industry. They are felt by pensioners whose investments have shrunk because billions that would have been profits must be diverted into reparations and damages. They are felt by all of us for whom prices will go up as a result of a diminished appetite for deepwater drilling.
Although the media may have moved on, more responsible interested parties will be spending a long time and a lot of effort trying to figure out what caused the Deepwater Horizon explosion in April 2010, an explosion that, lest we forget, not only caused an environmental disaster but also claimed the lives of 11 people. Perhaps, despite their best efforts, investigators will never be able to tell us what happened, in which case we’ll simply have to be satisfied with speculation, or educated guesswork.
Such speculation has already started, and it has certainly struck a chord with us. How many people know what’s involved in drilling the seabed for oil? Far from being a purely mechanical process, it actually depends on a lot of software-intensive control systems. It’s not widely appreciated, but most of the sophisticated technology that shapes all our lives depends on a lot of software. Sometimes, software failures are an inconvenience. So you had to restart your PC? Big deal. But what if the pilot’s ‘glass cockpit’ packs up in the middle of your holiday flight? That gives a whole new meaning to the ‘blue screen of death’!
In the case of Deepwater Horizon, it’s clear from the Transocean interim report to the Waxman committee that control system software is falling under suspicion [1]. Reports have already surfaced in the Houston Chronicle [2] that “display screens at the primary workstation used to operate drill controls on the Deepwater Horizon, called the A-chair, had locked up more than once before the deadly accident.” Given the amount of embedded software in oil rig systems, and the dozens of operations that are carried out under software control, it’s no wonder that software is getting the third degree.
Software is relatively easy to write. Reliable, safety-critical software can be complex and challenging; in truth, however, it often isn’t much harder to write than the software powering the browser that you’re probably using to read this. Getting really close to perfection requires independent testing, so that the developers’ assumptions, and even egos, are not allowed to stand in the way of the quest for those last few elusive bugs.
Independent testing of something that’s already been tested in the normal way, by its developer, is undeniably an extra expense. It’s not a prohibitive expense, though; it’s just the one that’s most likely to be cut when money’s tight and financial control is wielded by those who don’t really understand the true value of what they’re cutting. Our experience is that cost pressures are all too often allowed to bear on the safety-critical part of the software development process. Do we skip physical safety checks on trains and boats and ’planes? Not likely! So how is it OK to let finance directors and others of their ilk cause the cutting of corners when it comes to the more abstract and less tangible factors in the safety equation?
Until independent testing, by truly qualified testers, is recognised as sacrosanct within safety-critical developments, we’ll continue to have aircraft falling out of the sky, runaway cars, space-launch disasters and, yes, oil rig disasters. At the dawn of a new era of nuclear power generation, it’s time to start changing attitudes now.
Author: Brian Luff, Chairman, Critical Software
1. "Deepwater Horizon Incident—Internal Investigation," draft report, Transocean, 8 June 2010, p. 15; http://energycommerce.house.gov/documents/20100614/Transocean.DWH.Internal.Investigation.Update.Interim.Report.June.8.2010.pdf.
2. B. Clanton, "Drilling Rig Had Equipment Issues, Witnesses Say—Irregular Procedures also Noted at Hearing," Houston Chronicle, 19 July 2010; www.chron.com/disp/story.mpl/business/7115524.html.