Solid state drive innovation - data's new home?
Solid state drives offer extra capacity, speed - and a potentially bewildering range of specifications
Innovative: OCZ Technology’s VeloDrive PCI-Express card
STEC’s Zeus SSD works with Nexenta’s storage management software
Bill Ross, Nexenta: ‘As SSD pushes further into the data centre it will add complexity in several ways’
Solid state drives now offer an attractive storage alternative for enterprise users - but is the range of options and pace of innovation proving more of a headache than a help to IT specifiers?
Solid-state drives (SSDs) have now become readily affordable, but with manufacturers offering several different models each, how do you decide which is right for enterprise needs? And to help inform a purchasing decision, what is going on under the covers that makes them more interesting and/or different for business applications?
Current SSDs are but a first glimpse of the future of mass data storage, both for the desktop and in the data centre, but they are also something of a fudge: corseted into the trappings of a spinning hard disk, they are limited by the need to conform with what went before. The corset will eventually loosen, but in the meantime a lot of innovative engineering is needed to extract the most from the technology as it stands.
Examples include PCI-Express cards such as OCZ Technology's VeloDrive. These sidestep the SATA/SAS bottleneck by talking directly to the system bus, and can therefore hit data rates in excess of 1Gbyte/s. After all, why bother with a 6Gbit/s disk protocol when there is no disk?
Then there are hybrids such as Seagate's Momentus family, combining comparatively high latency spinning disk for cheap capacity with Flash for faster access. Others include the use of digital signal processing to improve the readability of stored bits, and therefore give Flash a longer life - a leader here is Anobit, which was bought by Apple earlier this year for a reported $390m. Alongside this is the development of NAND Flash able to encode eight states in a single cell, slashing the cost-per-megabyte by allowing each cell to hold three bits.
What these all have in common is reliance on a sophisticated controller chip. Indeed, much of the differentiation between SSDs today - even between different consumer grades, never mind between consumer and enterprise drives - comes from which controllers and firmware they use. "The secret sauce is your controller - we are on the eighth generation of our RAMSAN controller," says Erik Eyberg, a senior analyst with enterprise solid-state storage pioneer Texas Memory Systems (TMS).
While some, such as TMS itself, and presumably Apple now as well, make their own controllers, others buy in chips from specialists such as SandForce, Indilinx, and JMicron. Joost van Leeuwen, EMEA director of marketing for SSD specialist OCZ Technology, says that while much of the decision comes down to price and speed, there are more subtle performance differences too.
"SandForce controllers, for example, use compression to push more data through the lanes in the hardware," he explains. "The downside is if your data is already compressed, such as JPEG, MPEG, MP3. Indilinx provides a much more consistent performance regardless of file type or size: they use a small DRAM cache to achieve high speed and do not rely on compression. This is a big difference for most types of use in multi-application segments."
Similarly, Louis Kaneshiro, an SSD engineer with Kingston Technology, says his company uses SandForce in its high-performance lines, but switches to cheaper JMicron for its budget-priced notebook SSDs. "The controller is really the engine of the drive," he reports. "It has a lot to do with how efficient the wear-levelling is and the write-amplification, which is how much NAND is actually written to." Write amplification occurs because Flash is written a page at a time but can only be erased in larger blocks of many pages, so if you need to update a file, you may first have to move the still-current data to a new block, erase the old block and then write the updated file all over again.
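Write amplification is usually expressed as a ratio: the amount physically written to the NAND divided by the amount the host asked to write. The sketch below is a rough illustration only, assuming the common NAND arrangement in which data is written in small pages but erased in larger blocks of many pages. The figure of 128 pages per block is hypothetical and not taken from any datasheet.

```python
def write_amplification(host_pages: int, valid_pages_moved: int) -> float:
    """Ratio of pages physically written to NAND to pages the host asked for."""
    if host_pages <= 0:
        raise ValueError("host_pages must be positive")
    return (host_pages + valid_pages_moved) / host_pages


# Hypothetical geometry, chosen only for illustration.
PAGES_PER_BLOCK = 128

# Worst case: the host changes 1 page in a full block, so the other 127
# still-valid pages are copied out before the block can be erased.
worst = write_amplification(host_pages=1, valid_pages_moved=PAGES_PER_BLOCK - 1)

# Best case: nothing else in the block is still valid, so nothing is copied.
best = write_amplification(host_pages=1, valid_pages_moved=0)

print(worst, best)  # 128.0 1.0
```

Real drives land somewhere between the two extremes, and much of a controller's job is to keep the ratio as close to 1 as it can.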
Where there's a wear
Wear-levelling is the process of evenly distributing usage across the chip, and it is needed because the way Flash works means that each cell can only stand a certain number of erase cycles before it begins to become unreliable. So the controller must not only re-map bad blocks, it must keep track of empty pages, pre-emptively erase stale pages for re-use once it starts to run out of new ones, and avoid creating usage hot-spots. Incidentally, when you rewrite a page, time and power are saved by not erasing the original immediately; instead it is marked 'stale', and the data written to a new page. This perforce creates opportunities for data filchers and forensic analysts alike.
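As a toy illustration of the bookkeeping described above, the following Python sketch keeps an erase count per page, always writes to the least-worn free page, and marks the old copy stale instead of erasing it at once. It is a simplification under stated assumptions, not any vendor's algorithm: the `WearLeveller` class is hypothetical, it treats a page as the unit of erasure for brevity, and it ignores bad-block re-mapping altogether.

```python
class WearLeveller:
    """Toy flash translation layer: one logical page maps to one physical page."""

    def __init__(self, physical_pages: int) -> None:
        self.erase_counts = [0] * physical_pages
        self.free = set(range(physical_pages))  # erased, ready to write
        self.stale = set()                      # old data, awaiting erase
        self.mapping = {}                       # logical page -> physical page

    def write(self, logical: int) -> int:
        """Write a logical page and return the physical page chosen."""
        if not self.free:
            self._reclaim()
        # Pick the least-worn free page to avoid creating hot-spots.
        target = min(self.free, key=lambda p: (self.erase_counts[p], p))
        self.free.remove(target)
        old = self.mapping.get(logical)
        if old is not None:
            self.stale.add(old)  # the old copy is only marked, not erased
        self.mapping[logical] = target
        return target

    def _reclaim(self) -> None:
        """Erase every stale page so that it can be reused."""
        if not self.stale:
            raise RuntimeError("drive full: no free or stale pages")
        for page in self.stale:
            self.erase_counts[page] += 1
        self.free |= self.stale
        self.stale = set()


# Hammer a single logical page on a drive with 8 physical pages.
ftl = WearLeveller(physical_pages=8)
for _ in range(1000):
    ftl.write(0)
print(ftl.erase_counts)
```

Although the host rewrites the same logical page 1,000 times, the erase counts end up within one of each other across all eight physical pages. Note that the stale copies stay readable until they are reclaimed.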
All this, and Flash's lack of mechanical latency, will make it seem very odd stuff to anyone or anything that expects spinning disk - including operating systems and applications. For example, says Kaneshiro: "You don't need defragmentation or write optimisation - the drive actually works better fragmented! If you defrag, you are messing with the wear-levelling."
Fortunately, modern versions of Linux, Apple MacOS, Unix and Microsoft Windows all understand SSDs and can adjust appropriately; and some SSD specialists, such as Plextor, have been developing proprietary techniques for working round some of the 'legacy' memory controller issues that date from previous-generation storage technologies. "We do a lot of benchmarks in the lab that I run," Kaneshiro says. "The operating systems all benefit, but we do see some taking better advantage than others - MacOS is very snappy with SSD, Linux is pretty nimble as well. Microsoft Windows 7 makes better use of SSD than Windows XP and Vista combined."
He adds: "With older operating systems you would need to do more tuning, but as long as you've got AHCI [Advanced Host Controller Interface] support, you are stomping what a hard disk is capable of. The other thing though is that Windows is only going to go so fast, regardless of the SSD, the processor and so on. It's when we get to specialist stuff that you can get gains, for example with Adobe Premiere you might want more IOPS [I/O Operations Per Second]."
Flash's Achilles heel, so to speak, has been the endurance issue - how many erase cycles each cell can take on average. Sadly, this is not getting any better as the industry shifts to finer silicon processes - from 34nm to 25nm, and then under 20nm - and to chips that store multiple bits per cell. "The wear levelling problem has only gotten worse," says Eyberg. "As you shrink the die, you introduce more problems for the controller to deal with. Each extra level of writes requires more error correction. There is some capacity loss, but the bigger problem is the more error correction you apply, the more steps the controller has to go through and the larger it has to be."
Price points a driving factor
Of course, other varieties of solid-state storage exist and offer better endurance. Battery-backed dynamic RAM and MRAM are used in certain applications, for instance, while NOR Flash is preferred for embedded uses. There are other non-volatile memory technologies under development, too. However, most observers agree that SSDs based on dense and cheap NAND Flash will dominate the market for many years to come.
"Then with NAND you have different categories," notes Eyberg. "Single level cells (SLC) with one bit per location, or multi-level cells (MLC) with two or three bits per cell. You have commodity MLC which goes into mobile phones and the like, and enterprise MLC (EMLC) - the difference is the number of writes it can take."
"The grade of NAND is higher for SSD that for other Flash lines such as SD cards or USB," says Ariel Perez, Kingston's SSD product manager. "Early on we used SLC for enterprise drives, but SLC is more and more going by the wayside in favour of MLC and specifically EMLC."
"The newer vendors who try to push all-SSD [storage subsystems] are using MLC to get the price right," agrees Bill Ross, marketing VP with enterprise storage developer Nexenta. "That type of Flash will become prevalent, but it is a little slower and you will also see wear and performance issues appearing over time." Nexenta sources its SSD hardware from memroy engineering OEM STEC.
The trade-off between endurance, performance, and price is a clear one. "SLC can handle 100,000 erase cycles. Current MLC is 3,000 to 5,000 depending on the grade, and EMLC comes in between," says Perez. "SLC is still needed for some niche write-heavy applications - industrial applications that have 24x7 loads, for example - but EMLC with 30,000 is fine for most. Then triple-level cell (TLC) might have 1,500 or 2,000 erase cycles. Consumers demand the fastest and cheapest, so it is TLC, but I do not know any vendor using TLC for enterprise."
OCZ Technology's van Leeuwen adds that Flash can also be architected as synchronous or asynchronous memory. "Asynchronous is a more affordable type of flash mostly used in our value range: read speeds are comparable to synchronous Flash, but write speed is lower," he says. "Synchronous is more costly, but will be used in our performance models because it is more durable and has better write performance."
And van Leeuwen further argues that a few thousand erase cycles is plenty for light-to-medium usage in the enterprise. By his estimate, if you write an average of 5GB per day, a 60GB SSD could take almost 100 years to reach 3,000 erase cycles.
Long before then though we can expect to see a lot more from Flash, as it finally breaks free of the hard disk heritage. As Kaneshiro at Kingston Technology puts it: "The SSDs we see now are the tip of the iceberg. Putting them in a 2.5in hard disk format was something we could all understand, but moving forward it will indeed be amazing what more we will see with Flash."
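Van Leeuwen's 100-year endurance estimate can be reproduced with a few lines of Python. This is a back-of-envelope sketch, not a lifetime guarantee: it assumes perfect wear-levelling, and its `write_amplification` parameter (an assumption added here, defaulting to an ideal 1.0) shows how quickly the figure shrinks when the drive writes more to the NAND than the host requested. The erase-cycle figures are those quoted by Perez, taking the lower end where he gave a range.

```python
def years_to_wear_out(capacity_gb: float, erase_cycles: int,
                      writes_gb_per_day: float,
                      write_amplification: float = 1.0) -> float:
    """Years until every cell has used up its rated erase cycles."""
    total_host_writes_gb = capacity_gb * erase_cycles / write_amplification
    return total_host_writes_gb / writes_gb_per_day / 365.25


# The case quoted above: 60GB drive, 3,000 cycles, 5GB written per day.
print(round(years_to_wear_out(60, 3000, 5), 1))  # 98.6

# The same drive and workload with each grade of Flash.
CYCLES = {"SLC": 100_000, "EMLC": 30_000, "MLC": 3_000, "TLC": 1_500}
for grade, cycles in CYCLES.items():
    print(f"{grade}: {years_to_wear_out(60, cycles, 5):,.0f} years")
```

With a write amplification of 10 rather than 1, the same 3,000-cycle drive would last just under 10 years on these assumptions, which is why the controller matters as much as the grade of NAND.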
Construction changes: SSD is driving system architecture rethinks
The arrival of SSD in the data centre promises more than just faster boot times. If you are currently striping data across multiple hard disks for maximum I/O, enterprise SSD could do the same job with fewer drives. As well as reducing footprint, power consumption and, of course, noise levels, that could even make it the cheaper option.
"SSD is absolutely transformational," says Erik Eyberg, a senior analyst with enterprise storage developer Texas Memory Systems (TMS). "Most traditional enterprise applications are centralised, so we go in there with a 1U Flash array that takes all the I/O intensive stuff. When you figure out that I/O can be eliminated as a bottleneck, it is possible to go from a whole rack of equipment to one or two rack units."
One area where I/O has become a major problem is virtualisation, and especially desktop virtualisation, because when you consolidate workloads you also mix up the I/O from all the virtual machines, turning several sequential streams into one big very random one.
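This blending effect is easy to demonstrate. The Python sketch below is an idealised model with made-up numbers: four hypothetical virtual machines each read 1,000 consecutive blocks from their own region of a shared volume, and the requests are interleaved round-robin, which is the simplest possible stand-in for what a hypervisor actually does.

```python
from itertools import chain


def sequential_fraction(addresses: list[int]) -> float:
    """Share of requests that land on the block right after the previous one."""
    hits = sum(1 for prev, cur in zip(addresses, addresses[1:]) if cur == prev + 1)
    return hits / (len(addresses) - 1)


# Hypothetical workload: each VM reads 1,000 consecutive blocks.
streams = [list(range(base, base + 1000))
           for base in (0, 100_000, 200_000, 300_000)]

# Seen on its own, each VM's stream is perfectly sequential.
alone = sequential_fraction(streams[0])

# Interleave the four streams one request at a time.
blended = list(chain.from_iterable(zip(*streams)))
together = sequential_fraction(blended)

print(alone, together)  # 1.0 0.0
```

Every request in the blended stream lands far from the one before it, so a hard disk would have to seek each time, whereas Flash has no such mechanical penalty.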
"In general, highly random workloads are where SSD media stands out, because it has no mechanical overhead," says Eyberg. "I see it transforming business processes. For example, if a bank used to run account reconciliations overnight, with TMS it can shrink that to an hour, so it can provide more accurate account information to customers."
However, as SSD pushes further into the data centre it will add complexity in several ways, warns Bill Ross, marketing VP at storage system designer Nexenta. Expensive all-SSD architectures might be preferred in areas such as financial trading, but for most data centres the standard model will be a hybrid of SSD and disk, he says, "so you'll probably have to deal with a couple more vendors, unless you have an unlimited chequebook and can afford EMC or NetApp prices".
He adds: "The architectures will get more complicated. You need tiers of memory or storage: sub-nanosecond within the system box, the cache can be a little slower, then SSD maybe for level-two cache, and in the storage system first SSD and finally disks for capacity. We'd recommend fast SSD for transaction logging, say, then cheaper SSD or hard disk."
Cheaper is a relative term though, and it doesn't mean you can use consumer SSD in the data centre, warns OCZ's Joost van Leeuwen. "Except that they both have the purpose of storing data, client SSD and enterprise SSD are completely different products," he says.
Ross adds that hybrid architectures also pretty much mandate an SSD-aware enterprise file system that handles bad block mapping well and with very little performance loss - Nexenta uses ZFS, for example - which means more to learn.
"You will require a greater knowledge of the file system, so you know where best to put what SSD you can afford," he says.
He concludes: "With enterprise file systems on SSD you really must understand the implications of extra features such as de-duplication - make sure you test before turning them on. All the time we see people who think the extra performance means they can turn on all these features, and are then surprised when performance tanks."
Timeline: a brief history of data storage technology evolution
1932 Magnetic drum - capacity of around 10Kb
1946 Selectron tube (capacity: 256-4,096 bits, or 32-512 bytes)
1951 UNIVAC 1 UNISERVO magnetic tape storage device
1956 IBM RAMAC 305 hard drive: 5MB, at $10,000 per megabyte
1973 IBM 3340 'Winchester' drive: sealed assembly
1976 Dataram 'Bulk Core' SSD for minicomputers
1978 RAID (Redundant arrays of independent disks)
1980 Seagate ST-506 5.25in hard drive
1982 'Musicassettes' used for 'home' PCs
1983 Rodime RO352 3.5in hard drive
1985 Philips CM100 CD ROM drive
1990 NEC 5.25in SCSI SSDs
2006 SimpleTech SSD with USB interface
2012 OCZ demos SSDs capable of transfer speeds of 6.5 Gbit/s and 1.4m Input/output Operations Per Second