04 Sep The Single Point of Failure that IT Architects probably never considered…
I spent almost 20 years working in the intense and fast-moving industry that is IT, selling UNIX-based systems across a range of major industry sectors. During that time, one of the key drivers for my customers was to maximise the reliability and stability of the system we were providing from end to end. To achieve this, the mantra was to avoid “Single Points of Failure” at all costs – and in most cases, cost it certainly did.
We had multiple processors, memory, disks, IO connections, cooling fans, power supply units, UPS units and multiple reserves of pretty much every conceivable component within the data centre.
What was given very much less attention, and indeed in my experience, pretty much ignored, were the risks associated with power loss. The data centres were all protected by rows of UPS batteries which would power down in an elegant manner if really required, but in reality, the UPS systems were only really expected to bridge any gap between a power failure and the back-up generator systems firing into life.
These generator back-up systems lived down in the basement, and were very much on the “outside” of the data centre, usually under the management of facilities, rather than IT, and often outsourced to a building Facilities Management company as part of the everyday business of managing a major modern building.
In short, we just assumed that if the power went off, as long as the UPS systems could run for a few seconds, or at worst minutes, the generators would roar into life and power would flood back through the wall and keep the lights flickering on the vast server farms.
In those 20 years, and countless customer implementations, I can’t remember a case of us considering the risk of the generator failing to power up. I think this was simply because it sat outside of IT, was not very technologically sexy and was just not on the radar as a “Single Point of Failure”.
Having left the industry several years ago and moved into the Energy industry, specifically the provision of fuel and oil, I now see the issue from the other side of the data centre wall. What I realise now, but didn’t at the time, is that the fuel sitting in standby generators changed in April of 2008 and again in 2011, creating a significant shift in risk for those needing to guarantee the long-term quality of the fuel.
One of the last actions of Tony Blair before leaving office was to agree to strict reductions in the UK use of fossil fuels for transport and energy generation. This EU-wide agreement became law in 2008, with further changes in 2011. One of the key ways to reduce fossil fuel usage was simply to add an average of 5% of “non-mineral” diesel or petrol, principally vegetable-based biofuels. To give the fuel refiners some wiggle room, the rules allowed up to 7% of biofuel to be added to any fuel, but the average leaving the refinery had to be 5%.
This means that our cars, buses, lorries, tractors, and stand by generators all now burn an average of 5% less fossil fuel, which most people would probably think to be a good thing.
The biofuels are pretty much identical to the mineral versions in terms of how they burn. They are a bit less friendly to various rubber and steel components, but diluted down at this level, they have been successfully added to the UK network over the past few years with little or no impact.
However, there are a few properties of the bio fuel mixes that can cause a few headaches in certain applications. The first issue is that the fuel is less stable than its mineral cousin. The older diesel based fuels could be left for long periods and still be relied upon to burn as expected, but the bio fuel will start to separate and laminate after a few months of storage. An even bigger issue is that the bio element has a very high propensity to attract moisture from the atmosphere, so over time, water builds up at the bottom of tanks if the fuel is left for any significant length of time. This water also promotes the growth of an algal bug, which is naturally present in the diesel fuel but clumps together to form a black, slimy sludge which slowly grows in the bottom of the tank.
Now for most fuel applications, the fuel simply does not sit long enough for this to be a problem. Vehicles, plant equipment and some heating systems are all used regularly and refuelled a number of times per year, but standby generators are a rather different beast. Most large organisations that have high risk operations that depend on electricity will have significant standby generation facilities on site. Hospitals, prisons, banks, corporate data centres, airports, emergency services, major retail units and large office blocks will all have major generators either down in the basement, or tucked away on campus. These will generally have a reasonably large fuel tank so that the generator can run for a few days in the worst case.
The problem is that they hardly ever get used. Good housekeeping will ensure that the generator is fired up on a monthly, or quarterly basis and runs for a short while, a box is ticked on a sheet, and then the machine is switched off to wait for the day in which it is needed in anger. The fuel usage for these tests is tiny, and it could be that less than 1% of the fuel in the tank is used in any single year.
And all the time, the bio mixed diesel sits, attracts water, laminates and allows black sludge to build up in the bottom of the tank. The problem is that if this situation is left unchecked, the first time that it is discovered is when the generator is called upon to do some real work, in other words in an emergency – just when you don’t need it to fail due to blocked fuel filters or damage to the high pressure injectors within modern diesel engines that are the heart of the generator.
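To put rough numbers on how little fuel those routine test runs consume, here is a minimal sketch in Python. Every figure in it (tank size, burn rate, test length, test frequency) is a made-up assumption for illustration, not a measurement from any real site, and the function names are my own. Swap in the numbers for your own generator to see how long the fuel in your tank is likely to sit.

```python
def annual_test_burn_fraction(tank_litres, burn_litres_per_hour,
                              test_minutes, tests_per_year):
    """Fraction of a full tank consumed by routine test runs in one year."""
    litres_used = burn_litres_per_hour * (test_minutes / 60) * tests_per_year
    return litres_used / tank_litres


def years_to_turn_over(fraction_per_year):
    """Years needed to burn through one full tank at the given annual rate."""
    if fraction_per_year <= 0:
        return float("inf")  # the fuel is never used up by testing alone
    return 1 / fraction_per_year


# Hypothetical site: 20,000 litre tank, 50 litres/hour on a lightly loaded
# test run, 15 minute tests carried out monthly. All values are assumptions.
fraction = annual_test_burn_fraction(
    tank_litres=20_000,
    burn_litres_per_hour=50,
    test_minutes=15,
    tests_per_year=12,
)
years = years_to_turn_over(fraction)

print(f"Fuel used by testing per year: {fraction:.2%}")
print(f"Years to turn over the tank on tests alone: {years:.0f}")
```

With these assumed figures the tests burn 150 litres a year, well under 1% of the tank, so without a real outage the same fuel could be sitting there for decades.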
There is good news, however, in that removing the contaminated fuel which is almost guaranteed to be sitting at the bottom of large fuel tanks is relatively simple, and compared to removing single points of failure from the systems on the other side of the data centre wall, comes at a much lower cost.
The first stage is to take samples of the fuel in the tank. These should be drawn from multiple levels, which also shows how contamination varies through the fuel (water is denser than diesel, so most of the contamination is at the bottom of the tank). Once this is done, the water and sludge layer can be removed via a vacuum pump, classified as waste and disposed of. The remaining fuel can then be put through a filtration and conditioning process until it reaches the levels of quality that would be expected from freshly delivered fuel.
We have seen instances of very high risk environments replacing the entire fuel stock each year to minimise the risk of failure, but in the vast majority of cases, and we have done hundreds of tests and treatments across the UK within NHS, MOD, Retail, Finance, Ministry of Justice and a range of other high risk environments, removal of the higher contaminated fuel and filtering of the remaining fuel has been more than sufficient to keep the fuel within specification.
So what should IT staff do to make sure that the generator systems are fit for purpose in the case of an emergency? Well, a simple fuel test costing a few hundred pounds will give you a complete analysis of levels of contamination within your generator tank. This should be done annually to keep an eye on the issue before it causes a major problem. If you need your fuel testing, contact us and we will let you know if you are at risk – and if you are, we can send the team in to clean up the fuel.
Chris Bingham – LCM Environmental (Craggs Environmental Ltd), CEO