FT: Data Deluge
Thursday, March 12, 2009
When George W Bush left the White House in January, 140 terabytes of data – a terabyte being 1,000 gigabytes – were transferred to the National Archives. The electronic legacy of the Bush administration was 50 times larger than the data archive left by Bill Clinton just eight years earlier. IT experts are confident that, at the end of Barack Obama’s administration, the archives will run to many petabytes (1,000 terabytes) of data.

But dealing with mounting volumes of data is by no means a problem limited to governments. Businesses generated between 30 and 35% of the estimated 281 exabytes (1,000 petabytes) of data created worldwide in 2007, according to the Storage Networking Industry Association, but they are responsible for managing as much as 85% of the total.

“The world is digitising at a faster rate than ever, irrespective of the economic or political climate,” says Mark Vargo, chief strategy officer for systems storage at IBM and a member of the global strategy team. “We have seen medical scans that were 1MB and are now 1GB – a 1,000-fold increase. You might get better healthcare from that, but there is also more data, and the hospital’s IT system has to store it.” Similar or even greater increases are to be found in industries from financial services to telecoms, suggests Mr Vargo.

Not only are businesses gathering more data; individual consumers are creating more digital information and storing more copies, in more places. Governments and regulators, for their part, are requiring organisations to keep more information, and for longer. The EU Data Retention Directive, which is currently being applied to internet service providers, will require companies to keep large amounts of information about e-mail and web traffic. In fields such as medicine and pharmaceuticals, longer life spans are forcing organisations to keep test and other data for longer.
Among all businesses, the need to keep records in case of a legal investigation, as well as to make back-up copies for disaster recovery, means that the same pieces of information are being stored many times over. It is widely accepted in the IT industry that a typical PowerPoint presentation is stored at least seven times across a company’s computer systems. “Businesses are asking for more storage capacity, and capacity cannot keep up,” explains Joe Tobolski, a senior executive at Accenture Technology Labs.

Data management tools, especially for larger organisations with dedicated data centres and specialist storage management teams, have improved. The move from holding data on disk drives built into servers to storage area networks – where the drives are shared between a number of server systems – has gone a long way towards coping with data growth. “Without storage area networks, the situation would have been a lot worse,” says IBM’s Mr Vargo.
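The same-presentation-stored-seven-times problem is what content-addressed de-duplication attacks: fingerprint each file’s bytes and keep identical content only once. A minimal sketch in Python, with the in-memory dictionaries and hashing scheme chosen here for illustration rather than reflecting any particular vendor’s product:

```python
import hashlib

def content_hash(data: bytes) -> str:
    # Identical bytes always produce an identical fingerprint.
    return hashlib.sha256(data).hexdigest()

def deduplicate(files: dict) -> tuple:
    """Collapse identical file contents to a single stored copy.

    Returns (store, index): store maps fingerprint -> one copy of the
    bytes; index maps each original path -> its content's fingerprint.
    """
    store, index = {}, {}
    for path, data in files.items():
        digest = content_hash(data)
        store.setdefault(digest, data)  # keep only the first physical copy
        index[path] = digest
    return store, index

# Seven copies of the same presentation scattered across file shares:
files = {f"share{i}/quarterly.pptx": b"<identical slide deck>" for i in range(7)}
store, index = deduplicate(files)
# Only one physical copy remains; all seven logical paths point at it.
```

Production de-duplication appliances typically work at block rather than file granularity, which also catches near-duplicate files that share most of their content.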
But statistics from the large IT companies suggest that, on average, large businesses use only around 20 to 30% of their available storage capacity, even with storage area networks – even though most modern IT systems can function well with as little as 30% or even 20% of spare capacity. Much of the installed capacity, in other words, sits idle.

To increase utilisation, IT departments are introducing more sophisticated techniques: storage virtualisation, where software creates pools of storage that, as far as an application is concerned, have the appearance of physical disk drives; and data de-duplication, where those multiple versions of presentations and spreadsheets are reduced to one, or possibly two, copies.

But with IT budgets under pressure and seemingly no end in sight to the demands for more storage, even such sophisticated measures might not be enough. For most large enterprises, the IT department will already have introduced the most effective efficiency measures. Typically, an efficient IT department will have consolidated storage by moving to higher-capacity drives, increased storage utilisation, attempted to reduce the data held through de-duplication, and looked at how much active data is stored on systems, according to Hu Yoshida, chief technology officer of storage vendor Hitachi Data Systems. Techniques such as moving “stale” data to cheaper systems – either lower-speed disk or tape – at least ensure that storage costs come down, he says. “Once stale data is put into a managed archive, IT operations no longer has to worry about it,” he points out.

Once these measures are in place, however, the only real option open to the CIO is to tackle the demand for storage. “The business doesn’t understand that [storage] is not just a bunch of disks and there are some quite complex processes in the background to move the data around, and especially to ensure resilience,” says Alastair McAulay, a senior IT consultant at PA Consulting Group.
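Mr Yoshida’s point about moving “stale” data to cheaper systems amounts to a simple age-based tiering policy. A sketch of such a rule, assuming illustrative thresholds (90 and 360 days) and tier names that are not from the article:

```python
import time

# Illustrative thresholds; real policies are set per business, not fixed here.
STALE_AFTER_DAYS = 90
ARCHIVE_AFTER_DAYS = 360

def tier_for(last_access: float, now: float) -> str:
    """Pick a storage tier from how recently the data was last touched."""
    age_days = (now - last_access) / 86400
    if age_days < STALE_AFTER_DAYS:
        return "primary-disk"    # fast, expensive
    if age_days < ARCHIVE_AFTER_DAYS:
        return "low-speed-disk"  # slower, cheaper
    return "tape-archive"        # cheapest: the managed archive Mr Yoshida describes

# A file untouched for roughly 200 days lands on the cheaper disk tier:
now = time.time()
print(tier_for(now - 200 * 86400, now))  # low-speed-disk
```

A batch job applying this rule nightly is what lets “IT operations no longer worry about” archived data: once a record crosses the final threshold, it moves off expensive primary storage for good.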
But at the same time, there is growing awareness, at least, of the need to safeguard and secure corporate data, he says.

If the organisation has no specific legal or regulatory requirement to store the information, it may well be cheaper and safer to destroy it, suggests Mark Lagodinski, a senior manager in the technology and security risk services group at Ernst & Young. “In the current environment, the priority for records management is around cost reduction. Are we retaining more data than we need, and can we make some decisions about disposing of data?” he says. “Retain what you need to retain, but information that does not have business value or a regulatory or legal requirement to be kept should be disposed of.”

Other organisations are considering still more radical steps, such as charging business units for the data they store, or even outsourcing storage altogether to a “cloud” provider offering storage via the internet. Companies such as Amazon, EMC, Symantec and, most recently, archive specialist Iron Mountain have moved to offer online data storage, or vaults, connected to customers via the internet.

CIOs have concerns, though, about whether security and service levels for cloud-based storage would be adequate for critical business data. “Companies will not be cavalier about data that is truly critical,” says PA Consulting’s Mr McAulay. “But there will be greater appetite for the processing side of cloud computing, to deal with peaks in demand. But they are making sure they are holding on to their data. Storage is too important, and for now too difficult, to move over to a cloud architecture.”
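Mr Lagodinski’s rule (retain anything with business value or a legal or regulatory reason to be kept; dispose of the rest) is simple enough to express almost verbatim. A sketch, with the record fields chosen here purely for illustration:

```python
from dataclasses import dataclass

@dataclass
class Record:
    name: str
    has_business_value: bool
    legal_hold: bool            # e.g. subject to litigation or investigation
    regulatory_retention: bool  # e.g. covered by the EU Data Retention Directive

def disposition(rec: Record) -> str:
    """Retain what must or should be kept; everything else is disposed of."""
    if rec.legal_hold or rec.regulatory_retention or rec.has_business_value:
        return "retain"
    return "dispose"

print(disposition(Record("stale marketing draft", False, False, False)))  # dispose
print(disposition(Record("customer contract", True, False, False)))       # retain
```

The hard part in practice is not the rule but the classification: deciding, record by record, which of those flags is actually true.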