Louise Smyth meets the author of a new book on the Deepwater Horizon oil spill.
The 2010 Deepwater Horizon disaster, in which 11 people died, dozens were injured and around four million barrels of oil were spilled into the Gulf of Mexico, has gone down in history as one of the worst events of its kind.
When the crew of Transocean’s Deepwater Horizon floating drill rig lost control of the Macondo oil well and the escaping gas and oil ignited, this was the culmination of a staggering catalogue of errors. It was, as Earl Boebert and James Blossom describe it, a failed engineering project.
The two senior systems engineers have written a book that provides a highly detailed account of the oil spill. In this book, entitled Deepwater Horizon: A Systems Analysis of the Macondo Disaster, the authors challenge the commonly accepted explanation that the crew, operating under pressure to cut costs, made mistakes that were compounded by the failure of a key safety device. Instead, the book reveals that the blowout emerged from corporate and engineering decisions that combined to create the disaster. It provides a fascinating and thorough account of the disaster that explores the complex interplay between technology and the people who operate it.
Destined to fail
When planning the project, its operator BP had estimated the cost to drill at $120.6 million and for the work to take 98 days. Yet barely anything actually went to plan. The book explores how technical compromise and what the authors label ‘go fever’ combined to create the disaster. It also highlights how both the original drilling plan and its peer review failed. Part of the reason for such failure is bluntly explained in the book: “’Risk’ clearly meant the same thing to the [safety] reviewers as it did to those who produced the register: risks to the timely and economical completion of the project, not to the Horizon, the lives of those aboard or the ecology of the Gulf.”
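The distinction the authors draw can be made concrete. The sketch below is our own construction, not anything from the book or from BP's actual register: a risk register whose entries are scored only on schedule and cost consequences will, by design, never surface the risks that mattered at Macondo. The example entries and scoring scale are hypothetical.

```python
# Hypothetical risk-register sketch. A register that tags risks only with
# SCHEDULE/COST consequences implicitly filters out threats to the rig,
# the crew and the environment - the pattern the authors describe.
from dataclasses import dataclass, field
from enum import Enum

class Consequence(Enum):
    SCHEDULE = "schedule"
    COST = "cost"
    SAFETY = "safety"            # absent from a cost-focused register
    ENVIRONMENT = "environment"  # likewise absent

@dataclass
class Risk:
    description: str
    likelihood: int              # 1 (rare) .. 5 (almost certain)
    severity: int                # 1 (negligible) .. 5 (catastrophic)
    consequences: list = field(default_factory=list)

    @property
    def score(self) -> int:
        # Conventional likelihood-times-severity scoring
        return self.likelihood * self.severity

register = [
    Risk("Lost circulation in fragile formation", 4, 3,
         [Consequence.SCHEDULE, Consequence.COST]),
    Risk("Gas influx during displacement", 2, 5,
         [Consequence.SAFETY, Consequence.ENVIRONMENT]),
]

# A review that only asks about schedule/cost never sees the second entry.
safety_critical = [r for r in register
                   if Consequence.SAFETY in r.consequences]
```

The point of the sketch is that the filter is structural: it is not that reviewers scored safety risks badly, but that the register's vocabulary never asked the question.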
It’s unsurprising that an inadequate plan didn’t work out well. But this was far from the only problem. The book also explores the delays that left the crew up against a far tighter deadline than planned, and mentions equipment that was ageing, in need of repair or just plain not fit for purpose. It also cites a more human factor, that “the crew of the Horizon had a tendency toward complacency when not actually ‘making hole’.”
It’s well worth reading the book to get to grips with the immense technical detail it covers on the drilling operations. And it’s also worth asking what we can learn from this disaster to prevent such engineering failures in the future. Which is where the book’s co-author, Earl Boebert, comes in.
When asked to break down such a complex scenario into just a few bullet points on the key engineering lessons to be learned, Boebert proffers a plain-speaking answer. “Actually, we think there is just one lesson: over the decades engineers concerned with emergent properties such as safety have evolved a process that for lack of a better term can be called ‘engineering discipline’. This process involves planning, reviews, testing, management of change, updating of risk registers and other techniques to maintain control of the interactions between elements. The lesson of the Macondo disaster is that in high-consequence environments, ignoring this process places everybody in that environment in very great peril.”
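One element of that discipline, management of change, lends itself to a toy illustration. The sketch below is our own construction (the review roles and helper names are assumed, not taken from the book): a change to a high-consequence system simply cannot proceed until it has been risk-assessed and signed off by every required discipline.

```python
# Toy management-of-change gate (illustrative only). A change is blocked
# until it has been risk-assessed and approved by all required reviewers.
from dataclasses import dataclass, field

REQUIRED_REVIEWS = {"engineering", "safety"}  # assumed review roles

@dataclass
class Change:
    description: str
    risk_assessed: bool = False
    reviewers_approved: list = field(default_factory=list)

def may_proceed(change: Change) -> bool:
    """True only when the change is assessed and fully signed off."""
    return (change.risk_assessed
            and REQUIRED_REVIEWS <= set(change.reviewers_approved))

swap = Change("Substitute material in cement slurry")
blocked = may_proceed(swap)   # False: no assessment, no reviews yet

swap.risk_assessed = True
swap.reviewers_approved = ["engineering", "safety"]
```

The design choice worth noting is that the gate is a hard precondition, not a reminder: skipping it is impossible by construction, which is the opposite of how the Macondo changes were actually handled.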
In this case, though, there was a huge list of factors that contributed to the disaster and it’s clearly difficult for firms to manage so many variables. Will there always be weak links in huge engineering projects such as this? “Yes, and that is why you have to be diligent in seeking them out and compensating for them,” says Boebert. “The first place to look is at the interfaces, especially between different technologies. And it’s important to understand that there is more to systems integration than just putting all the boxes in the same room.”
On reading the book, one thing that’s truly shocking is the sheer volume of major errors that led to the disaster. When other comparable disasters occur, we tend to hear of one almighty mistake – such as somebody lighting a cigarette during a gas leak – but this incident featured major errors across pretty much every conceivable parameter that can be measured. Surely that is not common? Boebert says that, “It’s not common, but it happens. Examples are Chernobyl, Fukushima, the Korean ferry disaster, and the Texas City refinery explosion.”
The skills gap
The human factor was evidently a contributor to the overall disaster – and part of the explanation here links into a worrying trend across the oil & gas engineering sector: a growing skills gap, as older, experienced engineers leave the sector and are not replaced by enough smart new minds. Boebert acknowledges the intuition of drillers – acquired through their vast experience on rigs – in the book. And he also offers some thoughts on how to counter this skills gap. “We think it would be very useful if some industry entity - API, a university, whatever - would institute a ‘knowledge capture’ project that went out and interviewed senior and retired drillers and put those interviews up on the internet,” he details. “In the trial evidence there is one long deposition that is essentially a masterclass in well control. The emails from our peer review team contained a wealth of real-world information that you won’t find in textbooks. This stuff needs to be saved.”
A more cynical way of looking at this is to ask whether we can in fact remove the humans from the equation entirely as a way to mitigate their negative influence on projects. Does Boebert believe that a move towards more automation and less human intervention will lead to fewer mistakes being made? Not exactly, he says. “If the automation is done in the slow, painstaking way that flight control software is developed and applied to individual solutions, then it should be a help. If it is done in the ‘first to market’, ‘disruptive’ approach beloved of Silicon Valley, it stands a good chance of making things worse.”
Learning from our mistakes
In the book, Boebert and Blossom discuss that if we humans have any chance of learning from past mistakes and accidents then more data recording is needed across the drilling sector. The challenge here is how to persuade the likes of BP to invest in such technology. Boebert explains that, “The investment in a survivable forensic data recorder is minimal, as is shown by their widespread application everywhere else. It’s really just a question of will.
“It would be interesting to hear from vendors why cyber chairs or blowout preventer (BOP) controls are not equipped with ‘black boxes’ like the voyage data recorders attached to the bridge of these vessels. It’s pretty amazing that in the 21st century experts have to expend hundreds of hours trying to figure out from minimal pressure, pit volume and flow telemetry whether a particular valve was open or closed - and in the end not be able to conclude with confidence.”
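The core of what Boebert is asking for is simple to sketch. The following is a minimal illustration of the idea, not any vendor's API or a real recorder design (the channel name and timestamps are invented): an append-only event log that timestamps every discrete state change, so that whether a valve was open at a given moment can be replayed directly rather than inferred from pressure telemetry.

```python
# Minimal sketch of a forensic "black box": an append-only recorder of
# discrete state changes. Real recorders would write to survivable,
# write-once media; a list stands in for that here.
import json
import time

class ForensicRecorder:
    def __init__(self):
        self._log = []

    def record(self, channel, value, t=None):
        """Append one immutable event; events are never overwritten."""
        self._log.append({
            "t": time.time() if t is None else t,
            "channel": channel,
            "value": value,
        })

    def state_at(self, channel, t):
        """Replay the log to recover a channel's state at time t."""
        state = None
        for event in self._log:
            if event["channel"] == channel and event["t"] <= t:
                state = event["value"]
        return state

    def dump(self):
        """One JSON line per event, for post-incident analysis."""
        return "\n".join(json.dumps(e) for e in self._log)

rec = ForensicRecorder()
rec.record("bop.annular_preventer", "open", t=0.0)
rec.record("bop.annular_preventer", "closed", t=120.0)
```

With such a log, the question investigators spent hundreds of hours on - was the valve open at a given moment? - becomes a single replay query.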
The unknown factors
Boebert’s book also highlights that, as well as what we do know, there were a number of unknowns at play in the Deepwater Horizon disaster. How can we learn how to prevent mistakes if we’re not entirely sure what all the mistakes were? Boebert states: “This is where systems engineering practices such as the Concept of Operations come in. Instead of trying to counter past mistakes one by one (a process known in computer security as ‘patch and pray’), you try and build a sound structure that makes it hard to make a mistake. It would be interesting to try a ‘Drillship 2200’ project: start with a clean sheet of paper and design a hypothetical system using the top-down approach. And then compare and contrast with what’s being done now. I get the impression (possibly mistaken) that the industry just assumes that their traditional bottom-up approach will be adequate to handle the risks of ultra-deep HPHT wells. That could be a dangerous assumption.”
The blowout puzzle
The following is an extract from the book:
“Data from the Sperry Sun system suggests that the production casing passed the positive pressure test. However, about four hours later, when the crew began the first displacements of spacer and seawater, some fluid began flowing out of the production casing and into the formation – contradicting the results of the positive pressure test. These lost returns continued through the second displacement until the well became unbalanced and hydrocarbons began flowing into the production casing.
This leaves us with one of the central puzzles of the blowout, one that any acceptable explanation of what happened at the bottom of the well must solve: what changed in the well between the positive first pressure test and the first displacement? We will probably never know.”