Avoiding the Infrastructure “Blame Matrix”

    By Matt Swinbourne, systems engineering manager, NetApp

    Whilst I was working for a university some years ago, we had a "problem app." It suffered inexplicable slow-downs at random times throughout the day, and we chased the problem all over the place trying to figure out the cause. We met to discuss it so often, in fact, that we created what I call a "Blame Matrix."

    The Blame Matrix was our attempt to figure out where problems were, and it consisted of representatives from all the layers in our infrastructure. The typical meeting would involve the virtualization engineer, storage engineer, network engineer and security engineer each insisting the issue was not in their domain and not their fault, with no conclusions ever reached. Sound familiar?

    Now, at least from a storage perspective, innovations have come to market that have the potential to bring an end to the Blame Matrix. This is especially true of Flash technology.

    Because of the nature of mechanical storage devices like hard disk drives, there is an inherent latency for any request going to or from the disk. This latency, combined with the latency of the network delivering the application, the time taken to process the request on the server, and the time taken to render the response on the end user's device, adds up to the "perceived performance" of any application as the end user experiences it.
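
    To make that sum concrete, here is a minimal sketch of a latency budget. The class name and every millisecond figure are hypothetical values chosen for illustration; none of them are measurements.

```python
from dataclasses import dataclass


@dataclass
class RequestLatency:
    """One request's latency budget, in milliseconds (all values hypothetical)."""
    storage_ms: float  # time for the disk or Flash device to serve the I/O
    network_ms: float  # time spent delivering the application over the network
    server_ms: float   # time taken to process the request on the server
    render_ms: float   # time taken to render the response on the user's device

    def perceived_ms(self) -> float:
        # What the end user experiences is the sum of every layer.
        return self.storage_ms + self.network_ms + self.server_ms + self.render_ms

    def storage_share(self) -> float:
        # Fraction of the perceived latency that the storage layer is responsible for.
        return self.storage_ms / self.perceived_ms()


# Assumed figures: same network, server and device, different storage medium.
disk_backed = RequestLatency(storage_ms=8.0, network_ms=2.0, server_ms=5.0, render_ms=5.0)
flash_backed = RequestLatency(storage_ms=0.5, network_ms=2.0, server_ms=5.0, render_ms=5.0)

for name, req in (("disk", disk_backed), ("flash", flash_backed)):
    print(f"{name}: {req.perceived_ms():.1f} ms perceived, "
          f"{req.storage_share():.0%} of it from storage")
```

    Breaking the total down per layer like this is also a simple way to see which seat at the Blame Matrix table actually owns the delay.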

    In the past, we architected data storage systems with hundreds or thousands of hard disk drives so that their aggregated performance would reduce the part of this latency introduced by the data storage solution. Then along came Flash. One Flash drive (SSD, EFD or whatever you'd like to call it) delivers roughly the performance of 32 normal spinning disks.
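
    As a rough sketch of what that ratio means for sizing, the snippet below counts the drives needed to reach a performance target. The 32:1 ratio is the rough figure quoted above; the per-disk IOPS number and the target are assumptions made up for the example.

```python
import math

# Assumed performance of one spinning disk (hypothetical figure).
HDD_IOPS = 150
# Rough equivalence quoted in the article: one Flash drive ~ 32 spinning disks.
FLASH_TO_HDD_RATIO = 32
FLASH_IOPS = HDD_IOPS * FLASH_TO_HDD_RATIO


def drives_needed(target_iops: int, iops_per_drive: int) -> int:
    """Smallest number of drives whose aggregated IOPS meets the target."""
    return math.ceil(target_iops / iops_per_drive)


# Hypothetical workload target.
target = 48_000
hdd_count = drives_needed(target, HDD_IOPS)
flash_count = drives_needed(target, FLASH_IOPS)
print(f"{target} IOPS needs {hdd_count} spinning disks or {flash_count} Flash drives")
```

    This counts drives for performance only; capacity, redundancy and controller limits are deliberately left out.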

    While computational power has drastically increased, hard disk drive performance has never followed the same trend. In fact, relative to capacity it has been going the other way: disk storage density has been doubling every two years while the performance of an individual drive has barely moved. This means hard disk performance has actually been halving every two years when measured per gigabyte, and we need a new storage medium to keep up with the explosion in computational power. As it happens, the finger of blame has been pointing squarely at the disk array for some time.
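
    The per-gigabyte arithmetic can be sketched as follows. It assumes, purely for illustration, that a drive's IOPS stay flat while its capacity doubles every two years; the starting capacity and IOPS figures are hypothetical.

```python
def project_iops_per_gb(drive_iops: float, start_capacity_gb: float,
                        years: int, doubling_period_years: int = 2):
    """Return (year, capacity_gb, iops_per_gb) rows for a drive whose capacity
    doubles every `doubling_period_years` while its IOPS stay constant."""
    rows = []
    for year in range(0, years + 1, doubling_period_years):
        doublings = year // doubling_period_years
        capacity_gb = start_capacity_gb * 2 ** doublings
        rows.append((year, capacity_gb, drive_iops / capacity_gb))
    return rows


# Hypothetical starting point: a 500 GB drive delivering 150 IOPS.
rows = project_iops_per_gb(drive_iops=150.0, start_capacity_gb=500.0, years=8)
for year, capacity_gb, per_gb in rows:
    print(f"year {year}: {capacity_gb:>6.0f} GB, {per_gb:.5f} IOPS per GB")
```

    Each row has twice the capacity and half the IOPS per gigabyte of the row before it, which is the halving described above.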

    What sort of Flash should we be considering? There are two approaches: hybrid Flash or all-Flash.

    Hybrid Flash solutions, which consist mostly of spinning hard disks coupled with some Flash, can offer high capacity per dollar and moderate to high performance per dollar. This approach is perfectly suitable for the majority of enterprise workloads. The specific proportion of Flash in the mix varies according to the application requirements, the amount of active data and the size of the working set, but a small amount of Flash goes a very long way.

    Hybrid solutions offer an excellent economic balance between capacity and performance, as long as they can provide performance exactly when it's needed and no additional burden is placed on the storage system administrator. If there is an additional burden, the benefit is lost because a human has to be "in the loop" of storage performance. The best hybrid Flash solutions will move hot data into Flash immediately, because in many enterprises, a delay of even 15 minutes can be incredibly costly.
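
    The idea of promoting hot data immediately, with no administrator involved, can be sketched with a toy model. This is an illustration only and not how any particular product works: the `HybridTier` class, its least-recently-used eviction and the block numbers are all assumptions.

```python
from collections import OrderedDict


class HybridTier:
    """Toy hybrid store: a small Flash tier in front of a large disk tier."""

    def __init__(self, flash_blocks: int):
        self.flash_blocks = flash_blocks  # how many blocks fit in Flash
        self.flash = OrderedDict()        # block ids, ordered coldest to hottest
        self.flash_hits = 0
        self.disk_reads = 0

    def read(self, block_id: int) -> str:
        if block_id in self.flash:
            # Already hot: serve from Flash and mark as most recently used.
            self.flash.move_to_end(block_id)
            self.flash_hits += 1
            return "flash"
        # Cold read: serve from disk, then promote straight away,
        # without waiting for a scheduled job or an administrator.
        self.disk_reads += 1
        self.flash[block_id] = None
        if len(self.flash) > self.flash_blocks:
            self.flash.popitem(last=False)  # evict the coldest block
        return "disk"


# Hypothetical workload: a working set of three blocks plus one stray read.
tier = HybridTier(flash_blocks=3)
workload = [1, 2, 3, 1, 2, 3, 1, 2, 3, 9]
served_from = [tier.read(block) for block in workload]
print(served_from)
print(f"flash hits: {tier.flash_hits}, disk reads: {tier.disk_reads}")
```

    Once the working set fits in Flash, repeat reads stop touching disk, which is why the size of the working set matters more than the total capacity when deciding how much Flash to buy.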

    Alternatively, all-Flash solutions offer low to moderate capacity per dollar and very high performance per dollar. This is the solution reserved for those "special" applications that seem to constantly require more and more performance. Typical examples are data warehouses and their ETL processes, reporting servers, or core applications like CRM or ERP. In this category, latency is paramount. All-Flash solutions tend to be sized around the capacity requirements of the specific application to ensure latency is not an issue.

    As cloud takes hold of the IT world, many enterprises believe they will never need to buy a storage array again once they move to the cloud. It's understandable that infrastructure managers dream of a world where they will no longer need to worry about storage. However, Flash also has an important role to play in the world of cloud. The move to cloud-based infrastructure is undoubtedly a positive thing, but even in the magical new world of cloud, there will still be IT infrastructure on-site. The reason for this is simple: the closer our data is to our devices, the better the performance and the lower the latency. This is where Flash (and some other technologies still under development) comes in. Imagine applications and services that push their working set of data to a Flash device that sits less than 1 millisecond away from your device, with the capacity tier off in "the cloud" somewhere. It takes hybrid Flash to another level.

    At NetApp, we're passionate about helping organisations reach the bottom of the Blame Matrix when it comes to that "problem app", and we believe Flash is already shaping up to be one of the key technological trends that will make this vision a reality. Talk to your local NetApp representative about how Flash can enhance your cloud, or remove the problem from that problem app.

    BIO: Matt Swinbourne, systems engineering manager, NetApp Australia and New Zealand

    Matt Swinbourne is the Technical Sales Manager for NetApp's QLD, WA, SA and New Zealand districts. He has been involved in the IT industry for the past 17 years with a background in engineering, solutions architecture and consulting across a variety of verticals and technologies.

    Since joining NetApp in early 2012, Matt has designed infrastructure solutions for high performance computing, mission critical computing, public and private cloud, and large-scale content repositories. As a systems engineering manager, Matt works with his team, customers and partners to develop innovative solutions for today's rapidly changing data management world.