Inner space still expanding
Don't be daunted by the unceasing data deluge: emerging advanced storage technologies are ready to help mop up the flood.
A crunch in data storage, unlike credit, has long been predicted but never quite seems to materialise. This is partly because, until recently, ongoing improvements in disk density and cost per bit have matched the inflationary growth of storage itself (see 'generation games', p58). Of one prediction we can be certain: this felicitous situation cannot continue. There are already signs that growth in disk drive densities, and the accompanying fall in cost per bit, are decelerating.
Meanwhile, enterprises are facing rising storage costs, of which the actual hardware only accounts for around 25 per cent. The increasing amount and complexity of data, of which a rising proportion is unstructured information such as emails, has led to higher storage management costs, and also greater 'information latency', or navigation time (finding and retrieving required data).
Coping with these demands has kept many enterprises locked into a fire-fighting approach to storage, with a 'rip and replace' culture predominating over a strategy of incremental deployment. It is often hard to escape from this trap because data just keeps on accumulating at ever greater rates, as noted by Greg Papadopoulos, CTO and executive vice president of R&D at Sun Microsystems, whose acquisition of StorageTek in 2005 turned the company into a major force in storage systems.
"Unlike other IT infrastructure decisions such as increasing server capacity or considering network upgrades, the capability to store data is not optional," Papadopoulos says. "Enterprise data simply cannot just fall on the data centre floor."
Against this background, with the objective being to serve applications as efficiently as possible for the least cost, remedies fall loosely into two classes: storage hardware, and data management/organisation.
Both are equally important, according to Peter Williams, practice leader for IT infrastructure management at Bloor Research, because hardware technology alone merely creates more data that, in turn, needs new tools and strategies to manage effectively: "Ultimately, the new technologies need to simplify management because the greater the volumes the more difficult they are to manage," he says.
Some observers, such as Forrester Research senior analyst Andrew Reichman, go further by suggesting that hardware technology will never tame the data mountain, and that users' habits will have to change. "Call me a pessimist, but I just do not see hardware density ever catching up with the explosion of data. I think efficiency and organisation are always going to be important."
One important change, Reichman believes, will be that innovations no longer come just from storage vendors, with applications vendors increasingly calling the shots: "I think that applications know more about data context and usage, so innovations in data management are likely to come from, or at least be closely tied to, the applications that manage and move data, rather than storage vendors who sit on the outside, and don't know much about the importance of the data within their systems."
In the meantime, the spotlight is on storage vendors, with the recession increasing pressure on them to improve use of available capacity, and reduce costs.
Virtualisation, thin provisioning and de-duplication
On use, the main prongs of attack are virtualisation, thin provisioning, and de-duplication, with the major storage vendors offering solutions spanning all three.
Virtualisation has been on the table longest, achieving an aggregate saving in space by allocating it on demand, mapped across several disk drives, exploiting all the available capacity throughout a storage network. It can be combined with hierarchical storage to optimise cost and performance, keeping live 'hot' data on the fastest drives, migrating data accessed less often to slower, fatter, less expensive devices.
With virtualisation alone, however, once capacity is allocated it is locked out to other applications or users, even if it is not fully used - which is obviously inefficient. Thin provisioning fixes this by only allocating space the moment it is needed. "The value to customers is that they do not have to purchase nearly as much spare capacity so they can, for instance, buy hardware more incrementally," explains Bloor's Williams.
"Also, it carries out space allocation automatically, reducing the management overhead." The only downside to consider is that thin provisioning, by its nature, uses disks to the full, which can increase contention and cramp performance. A compromise may be needed, with 'fatter' provisioning for high performance live applications.
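The allocate-on-first-write behaviour that distinguishes thin from thick provisioning can be sketched in a few lines of Python (class and variable names here are illustrative, not any vendor's API):

```python
# Illustrative sketch of thin provisioning: physical blocks are only
# consumed when a virtual block is first written, so 'promised' but
# unwritten capacity remains available to the whole pool.

class ThinPool:
    def __init__(self, physical_blocks):
        self.free = physical_blocks   # physical blocks actually installed
        self.mapping = {}             # (volume, virtual block) -> physical block
        self.next_block = 0

    def create_volume(self, name, virtual_blocks):
        # Provisioning is just a promise: no physical space is consumed yet.
        return {"name": name, "size": virtual_blocks}

    def write(self, volume, virtual_block):
        key = (volume["name"], virtual_block)
        if key not in self.mapping:   # first write -> allocate on demand
            if self.free == 0:
                raise RuntimeError("pool exhausted: time to buy more disk")
            self.mapping[key] = self.next_block
            self.next_block += 1
            self.free -= 1

    def used(self):
        return len(self.mapping)

pool = ThinPool(physical_blocks=100)
vol_a = pool.create_volume("mail", 500)  # promised 500 blocks, backed by none
vol_b = pool.create_volume("web", 500)
for blk in range(10):
    pool.write(vol_a, blk)
print(pool.used())   # 10 physical blocks consumed against 1,000 promised
```

The exhaustion check is the flip side Williams alludes to: because far more capacity is promised than installed, the pool must be monitored and grown before writes start failing.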
De-duplication can achieve a huge space saving by storing information only once, with the additional advantage of eliminating inconsistencies and improving data integrity. But, as with thin provisioning, it can increase contention, since different applications are now sharing the same data.
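The core of de-duplication is content addressing: identical data hashes to the same fingerprint and is stored once. A minimal sketch, assuming whole-object chunks and SHA-256 fingerprints (real products typically chunk at finer, often variable, granularity):

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""
    def __init__(self):
        self.chunks = {}   # sha256 digest -> chunk bytes (one physical copy)
        self.refs = {}     # digest -> reference count (logical copies)

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.chunks:      # new content: store it
            self.chunks[digest] = data
        self.refs[digest] = self.refs.get(digest, 0) + 1
        return digest                      # callers keep the fingerprint

store = DedupStore()
attachment = b"quarterly-report.pdf contents"
for _ in range(50):          # the same attachment mailed to 50 inboxes
    store.put(attachment)
print(len(store.chunks))     # 1 physical copy stored, not 50
```

The reference counts also illustrate the contention point: fifty logical copies now depend on one physical chunk, concentrating reads on the same blocks.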
While these technologies will help reduce storage hardware costs substantially, their impact on total cost of ownership (TCO) will be less impressive, according to Claus Mikkelsen, chief scientist at Hitachi Data Systems (HDS). "Disk price is only 25 per cent of the total cost of owning the disk over three to four years," Mikkelsen estimates.
Identifying the other 75 per cent of TCO, made up of management and software development costs, has been difficult, leading HDS to develop a model for quantifying TCO with the aim of reducing some of the underlying cost components.
"We employ Storage Economics tools to identify and characterise the remaining 75 per cent of the TCO, and to present how new storage architectures can be instrumental in improving storage costs with excellent return on investment," says Mikkelsen. "This methodology is about helping customers reduce long-term storage ownership costs and risk and ultimately getting a better return on assets."
Mikkelsen believes that the field of TCO tools will be fertile over the next few years, with HDS taking an early lead.
IBM argues that there is more scope yet for reducing the TCO through storage technology, and in particular by unifying it with mainstream IT. An important development will be the convergence of the SAN with the LAN and the WAN, according to Steve Legg, IBM's CTO of storage.
SAN and LAN convergence is being driven by the Converged Enhanced Ethernet (CEE) effort, comprising a series of additions to the protocol for incorporating Fibre Channel SANs within future 10GbE networks, according to Legg. Since Ethernet is also becoming a predominant data transmission mechanism for the WAN, the CEE project will drive convergence here too, enabling the iSCSI storage protocol to be extended over long distances.
IBM also makes the point that application trends need to be considered when plotting future storage technology.
If the only issue were proliferation of data, then the world would continue with disk drives, as they combine online access with low cost; but it is clear that disk drives cannot sustain ever-faster access speeds and greater storage bandwidth. This is promoting solid-state storage for front-line applications, with disks gradually relegated to second-tier storage duties.
The idea of replacing disk drives with solid-state 'drives' (they are not really 'driven' like hard drive platters are) in laptops, PCs, and even enterprise servers, is attractive: not only on performance grounds, but also because it would simplify the architecture and operating system, with just one type of storage to cope with in the box.
The drawback is cost. Solid-state memory costs about $2 per GB, compared with $0.38 for disk drives. That difference is narrowing, but must come down further to make solid-state memory attractive, according to Mark Watkins, senior technologist specialising in storage at HP. "There is some evidence to suggest that solid-state density improvements are increasing relative to magnetic disk, and if this is the case then over the next decade solid-state drives will replace all magnetic drives when the price per GB is probably within three times," predicts Watkins. "It doesn't have to be the same or cheaper, because it costs less to run and is faster."
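A back-of-envelope calculation shows how quickly that threshold could be reached. The starting prices are the article's figures; the annual price-decline rates are purely illustrative assumptions, not numbers from Watkins:

```python
# How long until solid state is "within three times" the price of disk?
# Starting prices from the article; yearly decline rates are assumed
# for illustration only (solid state falling faster than disk).
ssd_per_gb, disk_per_gb = 2.00, 0.38     # US$ per GB
ssd_decline, disk_decline = 0.40, 0.25   # assumed annual price drops

years = 0
while ssd_per_gb / disk_per_gb > 3.0:
    ssd_per_gb *= (1 - ssd_decline)
    disk_per_gb *= (1 - disk_decline)
    years += 1
print(years, round(ssd_per_gb / disk_per_gb, 2))  # 3 years, ratio 2.69
```

Under these assumed rates the five-times starting gap falls below three times within a few years; the real timetable depends entirely on how the two decline rates actually diverge.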
IBM is attempting to reduce this cost differential through its Racetrack project at its Almaden Research Centre, elegantly combining elements of solid-state and magnetic disk drives.
A Racetrack system still encodes data in magnetic domains and still requires a read/write head. But while the read/write head moves in a disk drive, in a Racetrack system it is the domain that moves. This is achieved by using spin-coherent electric current to shift the magnetic domains along a nanoscopic wire only 200nm long and 100nm thick - dimensions only around 1,000 times the diameter of a hydrogen atom.
This small scale enables bits to be read at speeds comparable with solid-state memory and, with no mechanical moving parts, there is little wear and tear and storage densities are high.
Racetrack still does not address what some pundits describe as the impending storage crunch, as the rate of increase in magnetic storage densities slows down. "Since 2003 the growth rate for data density in magnetic disk drives has slowed to around 25 per cent to 30 per cent per annum, and it is expected to stay at that rate for the foreseeable future," says IBM's Legg. "This will increasingly create a gap between data capacity demand and density increase (therefore cost take down) supply."
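The arithmetic behind Legg's gap is easy to sketch. Here the 27.5 per cent density growth is the midpoint of his 25-30 per cent range, while the 50 per cent annual demand growth is an assumed figure for illustration only:

```python
# Illustrative projection of the capacity-demand gap: if data demand
# grows faster than areal density, the number of drives needed (and so
# the hardware bill) rises every year instead of holding steady.
density_growth = 0.275   # midpoint of Legg's 25-30 per cent range
demand_growth = 0.50     # assumed annual data growth, for illustration

drives = 1.0             # drives needed today, as a baseline of 1
for year in range(5):
    drives *= (1 + demand_growth) / (1 + density_growth)
print(round(drives, 2))  # relative drive count after five years: 2.25
```

Even a modest mismatch compounds: under these assumptions the drive count needed more than doubles in five years, which is exactly the cost-take-down shortfall Legg describes.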
In the immediate future, this gap will be filled by technologies such as de-duplication and, for video in particular, data compression, which cut down the number of bits produced by applications, says Legg, and may reverse the trend away from tape for lower storage tiers.
In the longer term, radical new technologies are required, based perhaps on associative storage using holographic techniques (see 'holographic storage', p57). The attraction here is not just a leap forward in capacity but also much faster search and retrieval of complex multimedia objects such as video and, in future, 3D holographic images.
It has long been known that holographic memory allows data patterns, such as an image of the title 'E&T', to be located almost instantly when stored as a holographic pattern on a crystal. This has huge potential for storing large databases in holographic memory in future, with exciting possibilities such as searching long sequences of video for a specific image.