Is RAID 1 an Outdated Way of Thinking?
As the price of flash storage has come down over the years, we thought it would be worth wondering out loud whether there is a better alternative to the old RAID 1 setup. I want to preface this article by stating that no matter what hard drive configuration you choose, you should always, always, always maintain an off-site backup of your data. I will also state that we have deployed over 10,000 custom-configured servers over the last 10 years. Although we offer numerous RAID configurations, many people do not add them; although we offer automated backup solutions, many people do not pay for those either; and although we always recommend that our customers keep their own backups, many do not. Lastly, I will point out that we offer both fully managed and self-managed servers. In our experience, self-managed customers do not always use the tools available to them to check on the health of their RAID arrays, tools that would let them get ahead of failures. So this article is meant to discuss what the best bare-minimum acceptable server configuration is with today’s technology.
For years, two spinning disks in a RAID 1 mirror configuration have been the go-to setup for people wanting a bit of comfort from data loss without breaking the bank. With the advent of the much more reliable SSD (solid-state drive), should this still be our go-to, better-than-nothing hard drive configuration? Since SSDs are considerably more reliable than spinning disks, would simply having an SSD as the primary drive with a second SATA3 drive set up for local backups be a more reliable, better-performing, low-cost solution?
To answer the question we must first understand the benefits and details of each option.
SSD (Primary Drive) with SATA (For Nightly Backup Storage)
Due to falling costs, excellent reliability, and super-fast read speeds, SSDs are an increasingly popular storage option for servers. What we are proposing as an alternative to RAID is to have an SSD as your primary drive, coupled with a SATA3 drive mounted for backups. Then it is just a matter of running a simple cron job each night to back up your data to that SATA disk. While SSDs are more expensive than SATA, their price continues to drop, narrowing that price gap. Additionally, this configuration drops the expense of a RAID card, pretty much bringing the drive costs to parity and resulting in the server costing about the same per month at the end of the day.
SSDs offer significant performance improvements over RAID 1 on spinning disks. SSDs offer transfer speeds of up to 550 MB/s, while HDDs offer transfer speeds of up to 180 MB/s. The MTBF (Mean Time Between Failures) of an SSD is 2 million hours, while a spindle hard drive’s MTBF is ~1.5 million hours.
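As a rough illustration of the nightly job described above, here is a minimal Python sketch. It is not our production tooling: the function name `nightly_backup`, the `keep` retention count, and the dated-folder layout are all assumptions made for the example. It copies a source directory into a folder named after the current date on the backup drive and prunes older snapshots.

```python
import os
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def _is_snapshot(path: Path) -> bool:
    """True only for directories named like YYYY-MM-DD (our own snapshots)."""
    if not path.is_dir():
        return False
    try:
        date.fromisoformat(path.name)
    except ValueError:
        return False
    return True


def nightly_backup(source: Path, backup_root: Path, keep: int = 7,
                   today: Optional[date] = None,
                   require_mount: bool = True) -> Path:
    """Copy `source` to backup_root/YYYY-MM-DD and keep the newest `keep` copies.

    Hypothetical helper for illustration; names and layout are assumptions.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Refuse to run if the backup disk is not mounted; otherwise the copy
    # would silently land on the primary SSD.
    if require_mount and not os.path.ismount(backup_root):
        raise RuntimeError(f"{backup_root} is not a mount point")

    today = today or date.today()
    target = backup_root / today.isoformat()
    if target.exists():
        shutil.rmtree(target)  # re-running on the same day replaces the copy
    shutil.copytree(source, target, symlinks=True)

    # ISO dates sort chronologically, so everything before the last `keep`
    # entries is an old snapshot that can be removed.
    snapshots = sorted(p for p in backup_root.iterdir() if _is_snapshot(p))
    for old in snapshots[:-keep]:
        shutil.rmtree(old)
    return target
```

A script that calls `nightly_backup(Path("/var/www"), Path("/mnt/backup"))` could then be scheduled with a crontab entry such as `0 2 * * * /usr/bin/python3 /usr/local/bin/nightly_backup.py`; both paths and the script location are placeholders. A plain file copy like this is not a consistent backup of a live database, so dump databases to files first and back up the dumps.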
RAID 1 Reliability & Performance
RAID 1 (or mirroring) is a simple solution for reducing the risk of data loss. While RAID is not a backup solution, it is an insurance policy: if one drive fails, you have better odds of retrieving your data from the remaining operational hard drive.
In a perfect world, when a drive fails in a RAID 1 array you simply swap out the failed drive with a new one, rebuild the array and all is well in the world again. This doesn’t always work, however, and sometimes the data on both drives gets corrupted, and you end up losing your data.
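Getting ahead of a failure depends on noticing the first dead drive before the second one goes. The sketch below is one way to do that check, under the assumption that the mirror is Linux software RAID (md) and exposes its state in `/proc/mdstat`; hardware RAID cards need the vendor's own utility instead. The function name `degraded_md_arrays` is made up for this example.

```python
import re
from typing import List

# Matches the array name at the start of a line, e.g. "md0 : active raid1 ..."
_ARRAY_LINE = re.compile(r"^(md\w+)\s*:")
# Matches the member status, e.g. "[2/1] [U_]" (an underscore marks a missing member)
_STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_md_arrays(mdstat_text: str) -> List[str]:
    """Return the names of md arrays that report a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name = _ARRAY_LINE.match(line)
        if name:
            current = name.group(1)
            continue
        status = _STATUS.search(line)
        if status and current and "_" in status.group(3):
            degraded.append(current)
    return degraded


SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdb2[1](F) sda2[0]
      1048576 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""

print(degraded_md_arrays(SAMPLE))
```

On a real server the input would come from reading `/proc/mdstat`, and a non-empty result would be the cue to send an alert and schedule a drive swap.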
Two spindle hard drives in a RAID 1 array will be slightly faster than a single spindle hard drive, but nowhere near the performance of today’s SSDs. Spindle hard drives also have a much higher failure rate: they experience an annual failure rate of ~5%, while SSDs enjoy a much lower annual failure rate of ~1.5%.
Another issue with RAID arises when an add-in RAID controller card is used, because the card introduces another point of failure. While most RAID cards will last a lifetime, approximately 3% fail per year. In the event of a RAID card failure, you run the risk of your data being corrupted or entirely lost. Using on-board RAID is risky as well: if the mainboard fails, you risk losing the array and your data. Either one will likely end up causing you a lot of downtime and misery.
So Which is the Better Option?
SSDs offer undoubtedly superior dedicated performance over RAID 1 using SATA or SAS spinning disks. So the performance question is really no contest.
Now let’s revisit the reliability numbers quickly:
- SSD annual failure rate: ~1.5%
- SATA annual failure rate: ~5%
- RAID card annual failure rate: ~3%
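To put those three figures side by side, here is a back-of-the-envelope Python sketch. It assumes every component fails independently of the others, which is a simplification (drives bought together often fail together), and it treats the approximate rates listed above as exact. The helper names are invented for the example.

```python
# Annual failure rates quoted in the article (approximate figures).
SSD_AFR = 0.015
HDD_AFR = 0.05
RAID_CARD_AFR = 0.03


def p_any_failure(rates):
    """Chance that at least one component fails in a year (independence assumed)."""
    survive = 1.0
    for rate in rates:
        survive *= 1.0 - rate
    return 1.0 - survive


def p_all_fail(rates):
    """Chance that every listed component fails in the same year (independence assumed)."""
    p = 1.0
    for rate in rates:
        p *= rate
    return p


# Two spinning disks behind an add-in RAID card.
raid1_any = p_any_failure([HDD_AFR, HDD_AFR, RAID_CARD_AFR])
raid1_both = p_all_fail([HDD_AFR, HDD_AFR])

# One SSD plus one SATA backup drive, no RAID card.
ssd_sata_any = p_any_failure([SSD_AFR, HDD_AFR])
ssd_sata_both = p_all_fail([SSD_AFR, HDD_AFR])

print(f"RAID 1 + card, some component fails: {raid1_any:.1%}")
print(f"SSD + SATA, some component fails:    {ssd_sata_any:.1%}")
print(f"RAID 1, both drives fail:            {raid1_both:.3%}")
print(f"SSD + SATA, both drives fail:        {ssd_sata_both:.3%}")
```

Under these assumptions, the RAID 1 server has roughly a 12.5% chance per year of some hardware incident, against roughly 6.4% for the SSD plus SATA server. The both-drives figures are only a crude proxy for data loss, since what matters in practice is whether the second drive fails before the first has been replaced.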
It seems the question of reliability is also undisputed. When an SSD is backed up nightly to SATA, your data is safer than if you use two spindle drives in RAID 1.
A few things to note: SATA and SAS drives often begin to fail gradually, with a decline in performance, while SSDs can be prone to sudden failure without prior warning in certain environments. So if an SSD dies a sudden death with no warning, you at least have that backup drive to restore from. Spinning disks often die a slow death with warning signs that the end is near, but during that gradual death cycle the symptoms of the failing drive(s) might very well lead to data corruption on both drives. Under this scenario, you are (hopefully) restoring from a backup anyway. Additionally, if one of the drives in a RAID array is failing, it raises the likelihood that the other drive will also fail soon.
SSDs can survive extreme temperatures as well as increased wear and tear due to a lack of moving parts. Spinning disks can become damaged from slight vibrations or jostling during hands-on maintenance.
Our conclusion is that when used in a secure environment, SSDs are more likely to enjoy increased reliability over SATA RAID 1. This is reflected in their mean time between failures of over two million hours. When used in conjunction with a SATA backup, the risk of data loss is significantly less than when using RAID 1 alone.
5 thoughts on “SATA RAID 1 vs SSD + SATA, Which Makes More Sense?”
I’ve been wondering about the comparison of RAID card failure vs newer SSD failure. For years, I’ve used RAID 1 on a pair of HDDs, but now it seems that the RAID card itself introduces a failure rate in excess of SSDs.
Single SSD is better than 2 HDD RAID 1
We recently had 2 backup servers that comprised our backup solution. They were put online within a day of each other and both had SSDs in a RAID 1. One drive failed on one, then 8 hours later the other drive failed before I could replace the first drive. We completely lost that location. Oh well, at least we have one location working. Three days later the first drive failed at that location and just a couple hours later the other drive failed. We were hard down for two months with no way to back up or restore. Luckily the data volumes were mechanical HDDs so we did eventually get the data back. I’ve known other people who have had this happen. SSDs have an exact number of writes. Putting them in a RAID 1 makes no sense because they do not have the bell curve failure rate mechanical disks do. If you have done this not thinking (like I did), go ahead and pull out your redundant RAID disk and replace it with a new one to stagger the failures and save your sanity. You can argue theory all day or say that you are a very lucky person or whatever, but this is real life. SSDs do not have the same characteristics as the old spindle disks so we need to stop using old safeguards for new technologies.
SSDs also need to be overprovisioned by a certain percent, to increase their lifetime.
Basically only format it to 80% capacity and it will last 20% longer. Even better, 60%.
SSDs are really only useful, imho, in the RAID world as a caching drive, rather than as a primary. They will die quicker.
I’m retired now but used to provide servers to my customers. Servers were always either RAID1 for smaller customers or up to eight drive RAID 10 arrays. Sometimes software RAID 1 (ecch! because of budgets). Always had backup drives and overnight backup copies to an internal backup drive plus multiple daily backups to be taken offsite. We never lost data.
After retiring, my own server, consisting of two three drive RAID 5 arrays (primary and backup) started to get corrupted data on both sets of drives. I suspected the RAID controller. I got through to the MegaRAID (PERC, etc.) engineers and from the symptoms they said the memory in the controller was bad. I asked how to run a memory test. Turns out there was no memory test! Worse, they did not even use ECC Memory!!!
I replaced the primary drives with an enterprise class Micron SSD and just a fast Barracuda Pro backup drive. The machine was way faster of course and backups took a little longer but they were overnight, so it wasn’t critical.
I recently (last year) upgraded my server to a 4TB Micron 7300 PRO. It has internal RAIN (similar to RAID) with a heck of a lot of overprovisioning already. It doesn’t really make sense to put in a pair of highly reliable SSDs that have to run under crappy controllers. It’s one thing for your drive to die all of a sudden. It’s absolutely much worse for a RAID controller to slowly destroy little bits of files for days or weeks before anyone realizes it, then try to find all the backups for them, and rebuild them one by one. I was lucky I had multiple backups from days, weeks, and months before to restore from.
So I’m glad to see these failure rates put into perspective. It confirms I made the best choice for me. I still keep multiple backup copies though.
Those that have to use RAID to ensure/hope data entered during the day is not lost, now have a difficult choice.
Thanks,
Bob.