The Virtualisation journey

I started out in IT straight from school as a trainee ICL VME operator.  VME was a mainframe system consisting of large cabinets of hardware produced by ICL.  I joined a company that had just got rid of its punch card process – the machine still sat in the corner, but I never actually saw it in action.  The terminals were green screens with light green text, the CRT monitors we all knew before flat screens became the norm.  Each had an on/off switch and a keyboard attached.  You turned it on and it loaded instantly.  There was no mouse, and everything you did involved command-line text whose syntax you had to learn.  The job involved running jobs or scripts to produce printed output on green-and-white ruled paper, cheques or other pre-printed stationery.  The jobs would run for hours to generate the output, and involved mounting large 24-inch reel-to-reel tapes to read in the data or to make backups.


Picture courtesy of IBM

Loading the tape took a bit of skill to line it up.  Tape later moved to the smaller cartridge (as in the above shot), which held more data and required no skill at all (other than putting it in the right way round – though it would only go in the correct way).  The cartridges were even housed in a huge StorageTek robot which loaded them with a robotic arm.

The mainframe computer took up the floor space of a 5-a-side football (soccer) pitch and cost millions of pounds.

The users connected to the mainframe via their “dumb terminals” to interact with the hardware and get the information they needed to complete their tasks.

Big, blue-chip companies ran their entire business process like this.  Finance, manufacturing, personnel – the mainframe even had its own email system.

Then the PC came along, with its own operating system in Windows 3.1, and terminal emulation software was used to connect to the mainframe, replacing the dumb terminals.  Now employees used a mouse, and could play Solitaire or create pictures with MS Paint while the mainframe jobs ran their course.  Of course the PC also had word processing and spreadsheets, using software such as Lotus 1-2-3, so more could be achieved while sitting at the system.  However, the lowly 286 PC could not process and create 10,000 cheques a night or do anything close to what the mainframe was capable of.  It was simply a means to connect to it.

However, I could see that the PC was the future and managed to get out of mainframe operations into a PC support role.

The PC has been through quite a lot since then, and though the common OS has gone from Windows 3 to 7, and shortly Windows 8, its role has not really changed all that much.  It is the PC server that has grown up in a more significant way, to the extent that groups of servers are now capable of replicating the mainframe environment that was prevalent from the 1970s through to the 1990s.  So much so that they are now powerful enough to host the operating systems themselves, and the PC becomes unnecessary.  Thin client systems, which have only keyboard, video and mouse (KVM) connectors plus a network port, are now becoming commonplace in the SME market.

We have returned to a central system running on the server, with a “dumb terminal” on the desk of the user.  However, the options available are far greater than in those golden days of the green screen.  Businesses can now deploy entire systems at the click of a mouse – they can spend less money and achieve a whole lot more.

Traditional Rollout process

Take a typical rollout for a company of 10,000 people.  Following the infrastructure build, the actual rollout would involve a lot of people developing a Common Operating Environment (COE), installing it onto new hardware, backing up current systems’ data, physically box-shifting the equipment to each and every desk, then working through the gremlins that inevitably occur following such a process.  Three to five years later, the company will go through it all again.

Once the systems are rolled out there is the need for a call centre, 2nd and 3rd line support, field services and hardware replacement costs through failure.

Cheaper solution through virtualisation

The new way is not only cheaper in hardware costs, but in personnel.  Thin client systems that sit on a desk are available for as little as £100 (even less in some cases).  They require no maintenance or upgrades – they are “dumb terminals”.  The operating system runs on the server, using software from companies such as VMware or Citrix: hypervisors such as ESX or XenServer run on powerful servers and make the server’s hardware available to the many guest operating systems running on them.

Installation is just a matter of unbox and plug in.

 

Above shows a very basic server which hosts the hypervisor software.  This can host multiple guest operating systems, the number being dependent on the host’s resources (RAM, storage, processors).

Most host systems will have more than a single network connection, as well as faster options such as fibre optic connections through Host Bus Adapters (HBAs).  These physical network connections are available for use by the guest operating systems, and virtual connections can also be created to allow guests to communicate with each other.  Virtual guest operating systems can even be moved from server to server while they are running, over a dedicated network (this functionality requires its own network connection and VLAN).

Virtual Storage Solutions

In the above basic example, the host hypervisor server is shown with only an internal disk.  Considering the largest disk available is 4TB (4000GB), and you need at least 1GB for the host OS, installing guest operating systems at 100GB each would give approximately 39 guests on the single server, purely due to the space on the disk.  This would push the server to its limit on storage alone, and of course if the single server failed it would take all the guests down in one go.

A robust virtualised solution would therefore look to have a mirrored system (either Active/Backup or Active/Active) as well as storage external to the host server.  The host system’s local storage would hold the hypervisor OS (ESXi, XenServer etc.) and the guests would be held on a SAN.

The Storage Area Network device contains storage using either purely spinning hard disks, Solid State Disks or a combination of the two (more on that later).  The disks within the SAN can be configured in a RAID configuration or as Just a Bunch of Drives (JBOD).  They are presented to the hypervisor server as a Logical Unit Number (LUN) in Fibre Channel connected systems, or as connected drives in a JBOD setup.  The connection between the server and SAN is achieved using either 2Gb fibre optic cabling (Fibre Channel) or 1Gb Ethernet copper cabling (iSCSI).  I won’t go into the argument between the two technologies, however check out http://features.techworld.com/storage/3231982/fibre-channel-vs-iscsi-the-war-continues/ for a good description of them.

Once configured, the host hypervisor server or servers can hold as many guests as their hardware allows.  Carve up the resources of the host to allocate to the guests.  It’s a simple mathematical equation of RAM and storage: how much resource does each virtual guest require to run comfortably?  Divide the amount available on the host by that figure and you have the number of guests each host server can run.
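As a rough back-of-the-envelope sketch of that division (the RAM figures below are illustrative assumptions; the disk figures follow the 4TB/100GB example above):

    # Rough guest-count estimate for a single host: divide what the host has
    # by what each guest needs, and take the most constrained resource.
    # The RAM figures are illustrative assumptions, not recommendations.

    host_ram_gb = 256           # assumed RAM left for guests after the hypervisor's share
    host_storage_gb = 4000 - 1  # 4TB internal disk minus ~1GB for the host OS

    guest_ram_gb = 4            # assumed per-guest allocation
    guest_storage_gb = 100

    guests_by_ram = host_ram_gb // guest_ram_gb
    guests_by_storage = host_storage_gb // guest_storage_gb

    max_guests = min(guests_by_ram, guests_by_storage)
    print(guests_by_ram, guests_by_storage, max_guests)  # 64, 39, 39 – storage is the limit here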

This brings us back to the need for the most powerful host servers.  Today’s servers have a small physical footprint in comparison to the old mainframes.  Link them together in racks, and even clusters (linked servers where failures are seamlessly accommodated without disruption), and the power and performance once available only in football-pitch-sized computer rooms is available in a single rack.  This technology is faster and cheaper than ever before.  Moore’s law has allowed chips to get more powerful in smaller and smaller physical packages.

IOPS

So now we have the fast host server, and the bottleneck is IOPS (Input/Output Operations Per Second).  In short, this is reading data from a disk and writing data to it.  The host now has multi-core processors and high-bandwidth networking; the problem lies with the hard disk.  In order to achieve ultra-fast IOPS with spinning disks, multiple disks in large RAID arrays are required.  A typical 15K SAS spinning hard disk can achieve 200 IOPS.  A typical Microsoft operating system (Windows Server 2003 or Windows 7) requires at least 40 IOPS in order to run comfortably.  A good basic blog on this is available at IOPS: Performance Capacity Planning Explained and the pertinent text from that page is:

  • Microsoft Exchange 2010 Server
    Assuming 5000 users that send/receive 500 emails a day, an estimated total of 3000 IOPS is needed
  • Microsoft Exchange 2003 Server
    Assuming 5000 users are sending 60, receiving 150 emails a day, an estimated total of 7500 IOPS is needed
  • Microsoft SQL 2008 Server, cited by VMware
    3557 SQL TPS generates 29,000 IOPS
  • Various Windows Servers
    Community Discussion: between 10-40 IOPS per Server
  • Oracle Database Server, cited by VMware
    100 Oracle TPS generates 1,200 IOPS

Therefore, to achieve 7500 IOPS (taking the MS Exchange 2003 Server example listed above), a RAID array containing 38 disk drives would be required (based on 200 IOPS per 15K SAS disk).  The amount of storage is almost irrelevant: 38 x 72GB drives give 2.7TB of storage, possibly more than the Exchange software needs, but that number of disks is required to deliver the IOPS performance.  For the SQL 2008 Server, 145 drives are required using the above figures.
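To make that arithmetic explicit, here is a minimal sketch of the sizing calculation (the workload IOPS figures are the ones quoted above; the 200 IOPS and 72GB per disk are likewise the article’s figures):

    import math

    # Spinning-disk sizing: how many drives are needed to hit a target IOPS
    # figure, assuming ~200 IOPS per 15K SAS disk as quoted above.
    DISK_IOPS = 200
    DISK_SIZE_GB = 72

    def drives_for_iops(target_iops):
        return math.ceil(target_iops / DISK_IOPS)

    for name, iops in [("Exchange 2003", 7500), ("SQL 2008", 29000)]:
        drives = drives_for_iops(iops)
        capacity_tb = drives * DISK_SIZE_GB / 1000
        print(f"{name}: {drives} drives, {capacity_tb:.1f} TB raw")
        # Exchange 2003: 38 drives, 2.7 TB raw
        # SQL 2008: 145 drives, 10.4 TB raw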

Here we can see why storage costs have risen: so many drives are required simply to provide the performance demanded by the software running on the hardware.

Solid State Disks to the rescue

We have explored the problem of the need for high IOPS from storage technology.  Here is where the Solid State Disk (SSD) solves that problem.  SSDs are achieving as much as 80,000 IOPS in 4K random read and write measurements (latest SATA III MLC drives).

This technology means fewer drives are required to achieve the performance; though SSDs are currently more expensive than their spinning-disk cousins, solutions can be built using far fewer disks.

SLC or MLC

Single-Level Cell (SLC) based SSDs have always been labelled Enterprise drives, on the rule of thumb that writing 500GB of data to the drive every day will cause write wear to the point where the drive fails after around 10 years.  Multi-Level Cell (MLC) based SSDs have been labelled Consumer drives, on the rule of thumb that writing 200GB of data to the drive every day will wear it out after around 6 years.

SLC drives are generally limited to 128GB (larger 240GB drives are shortly to come on to the market).  MLC drives are now commonly available at 480GB and even 960GB sizes.  Using the above rule of thumb, an application writing a 480GB drive in its entirety every day will cause the drive to fail after approximately 3 years.  In reality, though, most software applications will not behave in such a way – there will be a mixture of reads and writes.  Analysis of an application’s storage behaviour is required before selecting which technology is suitable, which can make choosing a storage solution a complex process.
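Taking that simple rule of thumb at face value (the endurance figures are the ones quoted above; the calculation itself is only an illustration, not a real endurance model):

    # Crude write-endurance estimate from the rule-of-thumb figures above:
    # MLC: 200GB/day for ~6 years, SLC: 500GB/day for ~10 years.
    # Total lifetime writes implied by those figures:
    MLC_TOTAL_WRITES_GB = 200 * 365 * 6    # ~438,000 GB
    SLC_TOTAL_WRITES_GB = 500 * 365 * 10   # ~1,825,000 GB

    def years_until_worn(total_writes_gb, daily_writes_gb):
        return total_writes_gb / daily_writes_gb / 365

    # Writing a 480GB MLC drive in its entirety every day:
    print(round(years_until_worn(MLC_TOTAL_WRITES_GB, 480), 1))
    # ~2.5, in line with the "approximately 3 years" quoted above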

New Breed of Storage Devices from GreenBytes

All the major storage companies in the market have recognised this new Solid State Disk technology.  EMC, NetApp, HP and Dell all offer SSD-based storage devices, either fully solid state or a new hybrid.  The hybrid uses traditional spinning disks as the medium to store the data and uses SSD as a read/write cache device.  To the host server environment, the SSD is presented as the front end, giving the fantastic IOPS performance.  The SSD reads and writes at amazing speeds and then trickles the reads and writes down to the storage behind it.  This allows a cost-effective solution without using a fully populated solid state device.
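As a purely conceptual sketch of that idea (not any vendor’s actual implementation; the class and names below are invented for illustration), a cache in front of slower storage behaves roughly like this:

    from collections import OrderedDict

    # Conceptual sketch of a hybrid tier: a small, fast "SSD" cache in front
    # of a large, slow "spinning disk" backing store.
    class HybridStore:
        def __init__(self, cache_blocks):
            self.cache = OrderedDict()    # block_id -> data (the fast SSD tier)
            self.cache_blocks = cache_blocks
            self.backing = {}             # block_id -> data (the slow disk tier)

        def write(self, block_id, data):
            # Writes land on the fast tier first...
            self.cache[block_id] = data
            self.cache.move_to_end(block_id)
            # ...and are trickled down to the slow tier when the cache is full.
            while len(self.cache) > self.cache_blocks:
                old_id, old_data = self.cache.popitem(last=False)
                self.backing[old_id] = old_data

        def read(self, block_id):
            if block_id in self.cache:        # fast path: served from the SSD tier
                self.cache.move_to_end(block_id)
                return self.cache[block_id]
            data = self.backing[block_id]     # slow path: fetched from disk
            self.write(block_id, data)        # and promoted into the cache
            return data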

GreenBytes HA-3000 High Availability (HA) Inline Deduplicating iSCSI SAN

GreenBytes have produced a revolutionary storage device called the HA-3000, a High Availability (HA) Inline Deduplicating iSCSI SAN.  The device uses 2x Solid State Disks to hold an intelligent cache as well as the metadata and deduplication tables.  The data is stored on either 2TB or 3TB 2.5-inch SAS drives, achieving between 26TB and 39TB of storage and as much as 150,000 IOPS – all from a single 3U device.  A single expansion shelf can be added to each HA-3000, doubling its capacity.


GreenBytes HA-3000 High Availability (HA) Inline Deduplicating iSCSI SAN

Features include:

Dual Controllers

These Xeon-based controllers can operate independently in an Active/Active configuration, giving the high availability that today’s virtual desktop and virtual server environments demand from an iSCSI SAN.

Redundant Networking Capability

Equipped with 4x 1GbE and 2x 10GbE network ports, the HA-3000 has fully flexible connectivity and fully redundant capability.

True Hybrid based performance

The GreenBytes HA-3000 has an intelligent dual-SSD cache layer, giving low-latency performance without the current high cost of an all-SSD array.  With its flexible virtual pools and thin provisioning, the system can be configured to use additional SSD in the cache layer to accelerate performance further.

Inline Deduplication

Deduplication is a method of storing a single instance of a file that is commonly used across the guest operating systems hosted on a hypervisor platform.  This allows a huge saving in the amount of data stored on a storage platform, and applies to operating system files as well as software applications.

For example, a 1MB email attachment sent to all recipients in the address book would normally mean the mail server software storing multiple copies of that attachment.  Deduplication stores a single instance of the file, with the emails containing only markers pointing to that stored copy.

Above is a very simplified example of where storage savings can be made with deduplication: storing a single version of a file (notepad.exe) once, and storing only a pointer to the actual file in subsequent instances held on the device.
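As a heavily simplified illustration of the principle (nothing like GreenBytes’ actual on-disk format; the names and data below are invented), identical content can be kept once and referenced by its hash:

    import hashlib

    # Simplified deduplicating store: identical content is kept once and
    # subsequent copies only store a pointer (the content hash).
    class DedupStore:
        def __init__(self):
            self.blocks = {}   # hash -> actual data, stored once
            self.files = {}    # filename -> hash (the "pointer")

        def put(self, name, data):
            digest = hashlib.sha256(data).hexdigest()
            if digest not in self.blocks:   # only the first copy costs space
                self.blocks[digest] = data
            self.files[name] = digest

        def get(self, name):
            return self.blocks[self.files[name]]

    store = DedupStore()
    attachment = b"1MB of attachment data..."
    for user in ("alice", "bob", "carol"):
        store.put(f"{user}/inbox/attachment.pdf", attachment)

    print(len(store.files), "copies referenced,", len(store.blocks), "actually stored")  # 3, 1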

The lookup and processing involved in this translation of data – read the pointer, look up the location of the single file instance, open the file – places an additional load on the storage.  However, as this metadata is held on the Solid State Disks within the HA-3000, access to it is very fast.  The GreenBytes HA-3000 is specifically designed and optimised for this function, which is key to how its performance is achieved.

Inline Compression

This functionality allows the GreenBytes HA-3000 to store more data than its physical size.  A typical compression ratio of 10x theoretically expands the device to store 10 times its physical storage: a 26TB SAN can therefore store 260TB (yes, 260TB from a 3U device).  Inline compression and deduplication together expand the storage beyond its physical footprint.
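As a quick illustration of that claim (the 10x ratio is the article’s example figure; real-world ratios depend entirely on the data being stored):

    # Effective capacity under inline compression (ratio is workload-dependent;
    # 10x is the article's example figure, not a guarantee).
    compression_ratio = 10
    for raw_tb in (26, 39):
        print(raw_tb, "TB raw ->", raw_tb * compression_ratio, "TB effective")  # 260 / 390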

Enterprise Management and Features

The GreenBytes HA-3000 includes built-in snapshot and replication management software which requires no additional licensing costs.  This allows the system to be configured with Active/Passive or Active/Active access to the storage pool(s).  When configured as Active/Active with the optional HA-3000 storage expansion unit, the controller design accelerates achievable IOPS, which broadens the range of target storage applications within the single product family.  The HA-3000 can easily be paired with other GreenBytes SAN systems (including the GB-X Series), all managed from a single software screen.

Performance
The HA-3000’s performance capabilities are significantly greater than those of other iSCSI SANs in the same price range. With GreenBytes’ Hybrid Storage Architecture (HSA), an intelligent SSD-enabled cache layer accelerates read and write IOPS for the most demanding virtual infrastructure projects, including the largest enterprise VDI installations.

Scalability
The HA-3000 series offers a wide range of capacities aimed directly at the challenges of virtual infrastructure projects and other business-critical applications requiring highly available (HA) SAN storage.  The HA-3000 has raw capacities ranging from 26 TB to 78 TB. With additional optimizations of deduplication and compression, the actual capacities of the systems are typically much greater, especially in the area of desktop virtualization.

Buy GreenBytes High Availability (HA) Inline Deduplicating iSCSI SAN from Future Storage
