Server 2012 and 2012 R2 introduced the concept and tool of Windows “Storage Spaces”. To be frank, I was royally confused by what they were and what purpose they served until I sat through a 2012 R2 Jumpstart. Now I “see the light” so I hope to be able to share the light with you. (Oooo, I get to be evangelical about something Windows … imagine that!).
Storage Spaces, in a nutshell, is Microsoft's method of giving you all of the benefits of SAN's (iSCSI, FibreChannel, you name it) without all of the attendant costs and headaches. When combined with other Microsoft technologies like SMB3 and clustering, Storage Spaces can give you an 800 pound gorilla from a storage perspective that you could NEVER have had before at such a low price point. The key to all of this is understanding what Storage Spaces are and how they work.
So, what IS a Storage Space??? A Storage Space is basically a "group" of storage that you put together from directly-attached storage (DAS) that the Windows Server can see (preferably on a SAS connection, but SATA works too). That storage gets sliced up, provisioned for specific tasks and then shared out. It really is no different in basic concept from what you do with SAN or NAS storage, but it is all controlled from Windows and is specifically sliced up to support very specific Windows functions. The really big thing to get your head wrapped around is that you hand over RAW storage to Windows and let it manage things. In other words, if you have a server full of disks, or one connected to a box of external disks, you DON'T RAID the disks on the server controller EXCEPT for the twin disks you will use for the O/S drive! Storage Spaces will manage the RAID tasks for everything except the C: drive; in fact, Storage Spaces CANNOT be used for the O/S drive, hence the requirement that you RAID your O/S disks at the controller level (if you want the redundancy and safety of RAID for your O/S drive – and you DO want this, don't you?). You can add any disks you want into a Storage Space, including disks on separate controllers as well as different sizes and types of disks, including HDD's and SSD's.
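As a quick illustration, here is how you could check which raw disks Windows considers eligible for pooling. This is just a minimal PowerShell sketch; the output will obviously depend on your hardware:

    # List the raw, unused disks that can be handed over to a Storage Pool
    Get-PhysicalDisk -CanPool $true |
        Select-Object FriendlyName, MediaType, BusType, Size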
2012 R2 Storage Spaces have been enhanced to provide "tiering" capabilities when you build a Storage Space with both HDD and SSD drives. The tiering capability will then manage "hot" and "cold" data automatically and move "hot" data to the SSD's and "cold" data to the HDD's, all "on the fly". In fact, the tiering is smart enough to only move what is truly "hot" to the SSD's, meaning it is block oriented rather than file oriented. This means that if you have space from a Storage Space shared out via SMB3 and have Hyper-V VM's VHD/VHDX stored on that share, only the actual "hot data" from the VHD/VHDX will be on the SSD! Therefore, a relatively small amount of SSD space can radically speed up a number of server operations, as precious SSD space is not wasted on warm or cold data. This is pretty revolutionary stuff when you consider that a RAID card manufacturer like LSI will charge you anywhere from $5,000 to $10,000 for a controller card that does something somewhat similar. (There are a few caveats about choosing your disk types in this case; more on this later in the post).
When you add disks to a Storage Space there is no RAID just yet; the Storage Space just groups together the RAW disks. The Storage Space will list the aggregate of all the available disk space as the space you can play with, without any sort of RAID applied. Therefore, if you add two 1TB drives to a Storage Space the available space will display as 2TB. This is also the case when you add in SSD's as well as HDD's; the Storage Space at this point is just the big "blob" of disk that you have given it. To use the space you need to create a "Virtual Disk". When you create a virtual disk you define how much space you want, you define the RAID level you want (mirrored [RAID1], parity [RAID5] or simple [no RAID]) and, if using 2012 R2, you can also define whether the Virtual Disk is tiered (if you have SSD's and HDD's). Once you do this the "carving up" of the space is performed and a useable "drive" of the appropriate size and RAID type is available for sharing or for use on the local server. In theory and in practice, this Virtual Disk is not much different from the RAIDset or virtual disk that you can create at the controller level. The "goodness" comes from the ease with which you can build the virtual disk out of the various building blocks you supply (the disks and SSD's), as well as from the tiering capability, because all of this is "inbox" with Server 2012/2012 R2; there is nothing extra that has to be purchased. And given an appropriate amount of CPU and memory resources on the server, Storage Spaces can do quite a lot with disks and SSD's attached to a cheap, dumb controller.
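Once you have a pool built (we'll do that below), the layout choices it supports can be listed from PowerShell as well; a quick sketch, assuming a pool named "LabPool" (that name is just my example):

    # Show the Simple / Mirror / Parity layouts the pool supports
    Get-StoragePool -FriendlyName "LabPool" | Get-ResiliencySetting |
        Select-Object Name, NumberOfDataCopies, PhysicalDiskRedundancy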
Let’s take a look at a Storage Space on my home lab boxen:
Just to refresh things, I have two whitebox servers, each with an eval copy of Server 2012 R2 loaded. One of the servers has a bunch of SATA disks attached (a smallish boot disk and three 1TB drives) as well as a single SATA Crucial M500 SSD, and it has the Storage role enabled. This is the box that I am using to demo Storage Spaces. The other box has a single 1TB boot drive and the Hyper-V role enabled, and I am using it to host Hyper-V VM's that will reside within the Storage Space on the first host and will be accessed over SMB3 links.
Here’s how the storage looks before I start to build a Storage Space:
The item in yellow is the SSD. At this point I have a 149GB boot drive as well as one of the 1TB drives split up into an E: and F: drive. The SSD and two of the 1TB drives are currently not used. Looking at Storage Pools I see the following:
Note the Storage Spaces and the term "Primordial"; this indicates that a Storage Space can be created, as there is unallocated disk space available to be used in a Storage Space. In the Physical Disks pane you can see that there are 4 physical disks available. Disk0 is the SSD and Disks 1 & 2 are the completely unused 1TB SATA drives (I know this from the previous screen). Disk4 has some unused space available but the disk is not completely unused. I'm now going to build out the Storage Space.
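Incidentally, the same information is visible from PowerShell before you ever touch the GUI; a quick sketch:

    # The Primordial pool holds the disks that have not yet been pooled
    Get-StoragePool -IsPrimordial $true |
        Select-Object FriendlyName, Size, AllocatedSize

    # And the physical disks sitting inside it
    Get-StoragePool -IsPrimordial $true | Get-PhysicalDisk |
        Select-Object FriendlyName, MediaType, CanPool, Size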
Right-click on the blue Primordial Pool and select New Pool:
Now you need to name the pool and give it a description:
Now you add the disks:
Note how the system knows the MEDIA TYPE and also note the total capacity. As I stated earlier, this is simply an aggregate total of all the disk space that is being added to the pool; it is not necessarily the amount of disk space that will be made available for use.
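For reference, the wizard steps above boil down to a single cmdlet. Here is a rough PowerShell equivalent (the pool name "LabPool" is just an example, not what you'll see in my screenshots):

    # Create a pool out of every disk that is eligible for pooling
    New-StoragePool -FriendlyName "LabPool" `
        -StorageSubSystemFriendlyName "Storage Spaces*" `
        -PhysicalDisks (Get-PhysicalDisk -CanPool $true)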
At this point the Storage Pool is created as can now be seen on the Storage Pool pane:
Note the Capacity and the Free Space for the pool are both 1.93TB and there is nothing showing as allocated. Now we have to carve a Virtual Disk out of the space in order to be able to create a drive that can be shared out. To do this we have to run the New Virtual Disk Wizard from the Tasks drop down in the Virtual Disks pane:
You need to highlight the desired Storage Pool (I only have one but there could be many to choose from):
This next screen is where things get "interesting". I can name the disk and give it a description AND I can choose to create "Storage Tiers". For the sake of example I'm going to go through this piece twice, as there is a very important concept that comes into play at this point that you will need to grasp in order to understand what happens in the next few steps. The thing to remember is that when you create a virtual disk you are essentially creating a RAIDset from the building blocks in the storage pool itself. You can create a "Simple" layout, meaning data is striped across disks (joining them together for more disk space) but without ANY redundancy. You can create a Mirror (two-way, meaning you need matching pairs of disks, or three-way, meaning matching trios of disks) which gives you redundancy in case of a disk failure (think RAID1). Finally, you can create a Parity set using at least three disks (think RAID5). The bigger thing to remember is that as soon as you add tiering into the mix you need to ensure you have enough matching SSD's to fit the RAID model you are building. In other words, if you are going to mirror two HDD's and you want to tier them, then you will need to have two matching SSD's in the storage pool that are selectable in order to be able to build the tiered mirror. You cannot build a tiered mirror or parity set with only a single SSD. To see what I mean let's go through two examples.
Example 1: I will build a 1TB mirror out of my current storage pool that is NOT tiered
On the following screen I can provision space as THIN or FIXED. I won’t go into detail here as it is a whole other discussion but for this example I will pick Fixed.
If you are eagle-eyed you’ll note that I created the space smaller than what is actually available. This is so there is room available for a write-back cache as noted on the screen. I found by trial and error that it is a good idea to leave space for the cache.
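For the record, here is roughly what that same fixed, non-tiered mirror looks like when created from PowerShell instead of the wizard (a sketch; "LabPool" and "Mirror1" are just example names):

    # 900GB two-way mirror, fixed provisioning, no tiering
    New-VirtualDisk -StoragePoolFriendlyName "LabPool" `
        -FriendlyName "Mirror1" `
        -ResiliencySettingName Mirror `
        -ProvisioningType Fixed `
        -Size 900GB
    # The resulting virtual disk then gets initialized, partitioned and formatted like any other disk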
OK, I cancelled the Volume Creation wizard. Now the Storage Pool pane shows this:
I now have a 900GB mirror that I can do something with. I'm going to delete the mirror and rerun the above steps, this time adding in tiering. Keeping in mind my note about matching disk availability, and given that I have two 1TB HDD's and a single SSD, what do you think I'll be able to create? Let's see …
Lots happening here! The system is only listing Simple and Mirror as my choices and it won't let me create a Mirror. Why not?? Because I don't have multiple SSD's of matching capacity in the Pool! While I have the required number of HDD's I'm woefully short on SSD's, so the system has blocked me. If I switch to a Simple layout (no redundancy) I should be able to proceed.
Now the next screen gets interesting!
Wow! This is cool! I can set the size of each tier. Keep in mind that you are not ADDING SSD space to an overall volume size; you are setting the amount of SSD space to use for "hot data" within the tier. In other words, the Standard Tier (HDD) size will actually determine the overall size of the volume that you are creating. So, if you want a 1600GB useable volume (remember to save room for the cache) then you set that on the HDD tier. I am going to set 109GB on SSD and 1600GB on HDD:
And now I see I have a tiered disk:
And while it does list the Allocated space as 1.67TB the actual useable space will be 1.56TB (the HDD space) as the SSD is strictly for HOT data within the disk. So, allocated space is not the same as useable space.
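For reference, here is roughly how the same tiered virtual disk could be built from PowerShell on 2012 R2. This is a sketch only; the pool and tier names are examples and the sizes match the wizard run above:

    # Define the two tiers out of the pool's SSD's and HDD's
    $ssdTier = New-StorageTier -StoragePoolFriendlyName "LabPool" -FriendlyName "SSDTier" -MediaType SSD
    $hddTier = New-StorageTier -StoragePoolFriendlyName "LabPool" -FriendlyName "HDDTier" -MediaType HDD

    # Simple (non-redundant) tiered disk: 109GB of SSD for hot data, 1600GB of HDD
    New-VirtualDisk -StoragePoolFriendlyName "LabPool" `
        -FriendlyName "TieredSimple" `
        -ResiliencySettingName Simple `
        -StorageTiers $ssdTier, $hddTier `
        -StorageTierSizes 109GB, 1600GB `
        -WriteCacheSize 1GB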
Would I use a Simple tiered disk like this in production? No way! It could break far too easily (no redundancy). However, it is good enough for lab use and I will blog some more about the whole hot/cold data premise behind the tiering in follow up posts. Also, is it a good idea to mix super fast SSD’s in a tier with relatively slow 7200 RPM SATA disks? In terms of a production system, probably not, as the performance of the tier will be decidedly unbalanced. In a production environment you would probably want nothing slower than 10,000RPM SAS HDD’s mixed in your tier with your SSD’s in order to have a reasonable balance of performance. However, for a lab test system my tier is “good enough” to prove out the concept.
I hope this post helps with your basic understanding of Storage Spaces in Server 2012/2012 R2. As usual, I'm only scratching the surface but I hope it helps you to see the possibilities of what you can do with this "in box" feature of Server 2012/2012 R2.
Very instructive. Thanks a lot – the guys at technet could learn from this!
han
OK, now I’m blushing! 🙂 Thanks for letting me know the post was helpful.
Very good explanation of Storage Spaces.
Thanks! Hope it was helpful.
Thank you for this informative entry. I'm thinking about building it for my company as a cheap alternative to a SAN. Could you provide some benchmarks of this setup? I wonder how big the difference would be between, for example, SSD + SATA vs SSD + SAS.
Hi! I don’t have benchmarks available but I’m sure that you can find them online. Certainly everything I’ve read and seen (lots of info inside MVA) indicates smoking performance. And you must understand that if you are going to do this you need to run very fast disk in order to not end up with a massive imbalance in the storage pool between the SSD and the disk. My test bench is all SATA and it is OK for a test, but in production you really need to be SAS and you need at least 10K RPM drives; 15K are preferred. Keep in mind that the Tiered Storage Pool presents ALL of the space (SSD + disk) as one “object” to whatever system is accessing it. A big imbalance between speeds and feeds of the underlying components will just end up making a hash of the desired performance. In the end, with a proper design you’ll have incredible performance at a cost significantly lower than that of a SAN, and you won’t have the admin headaches of a SAN and an iSCSI or Fibre network. But this is not a way to build a super-cheap alternative to a SAN; there is a cost to building it out correctly.
At last!
Straight to the Point, pedagogic and screenshots 😉
Thx for sharing!
I like it, great job.
Thank you
Thank you for the information; I’ve learned a lot from your posts and they helped me a lot. Can you please differentiate between storage pools and storage spaces?
Hi, Ronny! Glad to be of some help.
A Storage Pool is a chunk of storage that you have built out of the raw storage that is available on your system, for whatever use you want to put it to. So, if you think of the disks as the lowest building block, a Storage Pool might be the bunch of disks that you have in the box or, at least, the disks that sit behind a single disk controller. This means there are no filesystems or anything else on the disks; they are empty and raw. In my post I talk about the empty space being available in the Storage Pool and the fact that the space is “primordial”. This means that the space is available for you to create a Storage Space. A Storage Space is a chunk of space that you carve out of an available Storage Pool and then let Windows do all sorts of fancy management with that space!
Other vendors might use the term Storage Pool to define something that would be similar to the Windows Storage Space; I have seen a vendor or two call the RAID space created behind a smart disk controller a “Storage Pool”. So I can see your confusion.
So, once again … Storage Pool is RAW disk, Storage Space is Windows-managed space.
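If it helps, the two also show up as separate objects in PowerShell (a quick sketch):

    # Pools: groups of raw physical disks
    Get-StoragePool | Select-Object FriendlyName, Size, AllocatedSize

    # Spaces (virtual disks): what you carve out of a pool
    Get-VirtualDisk | Select-Object FriendlyName, ResiliencySettingName, Size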
Hope this helps!
Robert
Great walk through!! I do have a question once everything is built. I need to add more SSD drives to my Tier. Right now, I have two 256G drives in my Tier Storage and I just bought two 512G drives. I have put them into the Tier and everything appears correct. However, if I manually run the Optimization (Defrag d: /g /h /#) it states that the current size of the faster (SSD) tier hasn’t grown.
Travis: I haven’t had the opportunity to change the sizes of SSD’s or add additional ones, so I’m not able to speak to this from experience. I’d suggest going back and rechecking your settings and ensuring the devices are identified correctly. I’d also check the Tier specifications, as I’m betting you have to explicitly tell it to expand into the additional space. It seems to me that it does NOT auto-add new space.
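If it’s any use, the PowerShell route would look something like the sketch below. I haven’t tested this exact scenario, so treat the cmdlets, names (pool “Pool1”, tier “SSDTier”, drive D:) and sizes as a starting point rather than gospel:

    # Add the new SSD's to the existing pool
    Add-PhysicalDisk -StoragePoolFriendlyName "Pool1" -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

    # Explicitly grow the SSD tier of the virtual disk
    Resize-StorageTier -FriendlyName "SSDTier" -Size 700GB

    # Re-run the tier optimization afterwards (the PowerShell equivalent of defrag /g)
    Optimize-Volume -DriveLetter D -TierOptimize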
Hopefully this helps a little bit!
Robert
Thanks for the post, I do have a question. I have already setup my storage pool (server 2012 without tiers) with 4x1TB SAS drives and 1 hot spare, I am now needing to add more drives/space to my pool and am ready to buy new drives. The current setup is that all my data is in a 2 column setup. If I buy 2x2TB drives, will that increase my pool by 2TB or just 1TB because of my original hard drives? If I want to increase by 2TB, do I need to buy 4x1TB or will 2x2TB work the same?
Storage pools do work on basic RAID principles. While I can’t be 100% sure as I don’t have a lab machine with disks available for testing, I believe you need to scale out using similar drives to what you have now in order to keep the columns the same. I *think* you might be able to add another column using the larger drives but, again, that is conjecture on my part. Again, remember that in the end you are dealing with software RAID and RAID always likes similar size drives; it generally will NOT use the “additional” space of larger drives when those drives are combined with smaller existing drives.
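One thing that might help before you buy: you can check the column count of the existing virtual disk from PowerShell (a quick sketch) and then plan on adding disks in multiples of that number:

    # NumberOfColumns tells you how many disks each write is striped across
    Get-VirtualDisk |
        Select-Object FriendlyName, ResiliencySettingName, NumberOfColumns, NumberOfDataCopies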
Hi,
This is a good post but it is incorrect to refer to Storage Spaces as RAID. RAID operates at the whole disk level whereas Storage Spaces is block based.
This is a distinction worth making as it underpins much of the uniqueness of Storage Spaces when compared to prior software RAID in Windows.
To answer the last question, a virtual disk requires new physical disks to be added in a matching number to the column count. For a 2 column virtual disk, the storage pool will need a further two drives added to allow it to be extended.
Thanks for pointing this out. I’ll still use “RAID” as the generic description but the distinction is important and noted. It is one of the reasons that Storage Spaces is so versatile.
Robert
Thank you for the useful post.
Is it possible to use 2x SSD (say 512GB) and 4x 15K SAS HDD (say 600GB) and create a mirror set? Or do you need 4 SSD disks then?
You bet, this is pretty much what I describe in the post. The SAS HDD’s will create the big pool of storage and then the SSD’s will overlay for the “fast” piece which allows hot data to bubble up into the SSD’s and cold data to bubble out to the HDD’s. The thing to keep in mind is that you require a minimum of two HDD’s and two SSD’s to create the hybrid storage space as each type of device needs to be mirrored. Going past this you can start to get crazy with the Column count which translates into the number of devices you use.
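For what it’s worth, in PowerShell that two-SSD/four-HDD tiered mirror would look roughly like this (a sketch only; the pool name, tier names and sizes are placeholders):

    # Tiers carved from the pool's SSD's and HDD's, as described in the post
    $ssdTier = New-StorageTier -StoragePoolFriendlyName "Pool1" -FriendlyName "SSDTier" -MediaType SSD
    $hddTier = New-StorageTier -StoragePoolFriendlyName "Pool1" -FriendlyName "HDDTier" -MediaType HDD

    # Two-way mirror across the HDD's with an SSD tier for hot data
    New-VirtualDisk -StoragePoolFriendlyName "Pool1" `
        -FriendlyName "TieredMirror" `
        -ResiliencySettingName Mirror `
        -StorageTiers $ssdTier, $hddTier `
        -StorageTierSizes 450GB, 1000GB `
        -WriteCacheSize 1GB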
Robert
Hi!
Great article, one of many that I have read, and it definitely very informative!
Question regarding caching:
I noticed in your second example you used up “Maximum Size” for the SSD & used 1600 of 1860GB on your HDD, saving space for the cache.
I thought Write Back cache would have been/is allocated on SSDs and not necessarily HDDs?
Therefore would we instead save space on the SSDs and not the HDDs?
Thanks!
Ed
Hi, Ed:
No, in a Tiered Storage Space the write-back cache is NOT created on the SSD’s (weird, I know, because you would assume the cache would be on the fast disk). The SSD’s in the tier just provide the overlay of “hot block storage” and you actually don’t have much direct control over them other than being able to create the tier itself. As I said in my post, I had to play around with sizing to find the max size I could allocate, the rest of the space is the write-back cache. It’s possible that some of the data in the writeback cache could “migrate” up to the SSD’s, I suppose, but I cannot say for sure. Keep in mind that the SSD layer is really for “hot data” and not a layer of additional “space”. You can run commands to flush all of the data out of the SSD layer and back down to the HDD’s; this would be a disaster if you didn’t have the free space!
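For anyone who wants to poke at this themselves, the cache size that actually got allocated is visible from PowerShell (a quick sketch):

    # Shows the write-back cache carved out for each virtual disk
    Get-VirtualDisk | Select-Object FriendlyName, WriteCacheSize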
Hope this helps somewhat. As I said in my post, it took a while for me to get my head wrapped around the whole architecture and I still don’t claim to understand it fully.
Robert