Storage Reliability

Recently I’ve written about storage strategies designed to future-proof access to your files. Other than the question of whether future software can still play your files, the biggest issue is whether or not the media will be readable at all in a number of years. Unfortunately, there are simply no guarantees. All media can and does fail. Let’s look at the various options.

Everyone touts “the cloud” as the ultimate solution. Although cloud-based storage space is relatively cheap, the data charges for massive uploads and downloads, along with local internet speeds, are the stumbling blocks. There’s very little in the near term to change that. Remember, too, that cloud storage is a subscription service that never ends if you want to keep that media in the cloud.

The LTO (Linear Tape Open) data tape format is considered the “gold standard” for physical back-up and retrieval, but it’s really a format designed for long-term industrial and financial data applications. In other words, back it up once and forget it unless you need to restore from a backup tape in the future.

While many studios require original camera footage for major feature films to be archived onto LTO, the format doesn’t fit well into the needs of most small-to-medium production companies and post houses. There are three reasons for this: 1) As file capacities grow, LTO barely keeps up in equivalent capacity and transfer speeds. 2) The LTO standards keep evolving with limited forward or backward version compatibility. 3) If you need to continually go back to your archive to revise and update older projects, the linear design of LTO isn’t very attractive. In addition, frequent shuttling back and forth on LTO tapes to retrieve materials from random sections of the tape will cause an LTO tape to prematurely fail before its rated life.

One alternative to LTO is Sony’s Optical Disc Archive. It’s essentially a videotape deck-sized unit that records on writeable optical media (like a Blu-ray disc). They offer a robotic juke-box type of system for automated retrieval with large library systems. It’s a robust solution, but is mainly relevant to large facilities, such as at broadcast networks.

Storing on a large, RAID-protected array is a good, short-term idea, but it won’t be very cost-effective as your storage needs mount. I don’t recommend small 2-drive or 4-drive RAID enclosures for extended storage. These are more likely to have the RAID structure (whether hardware or software) fail and leave you with nothing accessible on that array. In my experience, single, enterprise-grade drives are more reliable. I buy these as raw drives (so I’m not paying extra for a power supply and interface with every drive) and mount them in a drive dock when I need to use them.

Hard drives do carry a manufacturer’s warranty for a rated lifespan, but I will reiterate that there are no guarantees. A 3-year-warranted drive may last as long as a 5-year drive, and either one could fail in one year or last 10 years or longer. I currently have some drives that are as old as that. With drive failure always a looming possibility, the reasonable strategy is to maintain multiple copies of any media of value. Three duplicate copies are recommended.
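To see why I lean on multiple copies, here is a rough back-of-the-envelope sketch. It assumes each drive fails independently with the same yearly probability, which is a simplification (copies sitting on the same shelf share fire, theft, and power risks). The function name and the 2% figure are illustrative assumptions, not measured values.

```python
def chance_all_copies_fail(annual_failure_rate: float, copies: int) -> float:
    """Chance that every copy fails within the same year.

    Assumes failures are independent and equally likely for each drive,
    which real-world archives only approximate.
    """
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return annual_failure_rate ** copies


if __name__ == "__main__":
    assumed_rate = 0.02  # illustrative assumption: 2% of drives fail per year
    for copies in (1, 2, 3):
        risk = chance_all_copies_fail(assumed_rate, copies)
        print(f"{copies} copy/copies: {risk:.6%} chance of losing everything")
```

The point is the trend rather than the exact numbers: under these assumptions each additional copy multiplies the risk of total loss by the failure rate, so it shrinks very quickly.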

Let’s address how to select the drive to buy. Most of these types of drives come in several speeds and warranty levels. 5400 or 7200 RPM are the normal speed offerings. Both are fine for archiving, but 7200 is preferred if you occasionally need to edit directly from them. Warranties are usually three or five years. As with any physical media, the warranty covers the replacement of the product, but not the value of the data stored, which you may have permanently lost.

A warranty is like life insurance. A 5-year drive isn’t necessarily better than a 3-year drive. The company has developed actuarial tables that tell them that, statistically, enough of the 5-year drives last to the 5-year mark, so they won’t lose too much money by replacing the few drives that do fail. Sometimes the difference between three and five years may simply be that drives that tested with more minor errors end up in the 3-year pile, while the ones with fewer errors go into the 5-year pile. I haven’t looked into the manufacturing specifics too deeply, but that’s generally how product warranties work.

With those two criteria in mind, I usually purchase 7200 RPM enterprise-grade drives with 5-year warranties. These are drives intended to be used in servers and shared storage systems running 24/7/365. There has been a lot of consolidation in the hard drive business, so regardless of the brand name, there are really only a handful of companies manufacturing the media.

One source to track which drives to buy is Backblaze. They are a cloud provider that publishes their testing results, based on a current pool of over 100,000 drives that they have in operation. Right now the front-runners are Toshiba, HGST (Hitachi enterprise), and Seagate. The HGST brand has been absorbed by Western Digital. All of these are good options. I also hold back on the largest drives rather than be on the bleeding edge. For example, you can now purchase 14 TB drives, but I’ll tend to stick with 8 TB for a while.
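When reading those reports, it helps to know how a failure rate is typically normalized. As I understand it, Backblaze reports an annualized failure rate based on drive days (one drive running for one day) rather than a simple count of drives. The sketch below follows that idea; the function name and the sample numbers are made up for illustration and are not taken from any report.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return failures per drive-year of operation, as a percentage.

    drive_days is the total number of days all drives of a model were
    in service, so 1,000 drives running for a full year is 365,000.
    """
    if failures < 0:
        raise ValueError("failures cannot be negative")
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


if __name__ == "__main__":
    # Hypothetical model: 10 failures across 365,000 drive days.
    print(f"{annualized_failure_rate(10, 365_000):.2f}%")
```

Normalizing by drive days makes it fairer to compare a model that has been in service for years against one that was deployed a few months ago.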

Mechanical hard drives are meant to spin and not to sit on a shelf indefinitely. Periodically load each drive into a dock and spin it up. Make sure the contents are still retrievable and files can be opened. This process should happen no less than once a year. More frequent is even better. And yes, if you have 100 drives in your archive, don’t get lazy. This needs to be done. If a drive sounds odd, has difficulty spinning up or mounting, or has a lot of vibration, then clone and replace it ASAP, because it’s likely to fail soon.
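Opening a few files by hand doesn’t scale to a large archive, so one way to make the yearly check more systematic is a checksum manifest. This is only a minimal sketch, assuming you record a SHA-256 hash for every file when the drive is first written and store that record somewhere other than the drive itself. The function names are my own and hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files never load fully into RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hash."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Re-read every file listed in the manifest and report any problems."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            problems.append(f"MISSING  {rel_path}")
            continue
        try:
            actual = sha256_of(target)
        except OSError as exc:  # read errors are a warning sign in themselves
            problems.append(f"UNREADABLE  {rel_path}: {exc}")
            continue
        if actual != expected:
            problems.append(f"CHANGED  {rel_path}")
    return problems
```

Run build_manifest once when the drive is filled, then run verify_manifest against the mounted drive at each checkup. An empty list means every file was read back in full and still matches. Because the check reads every byte, it also exercises the drive far more thoroughly than a quick spin-up.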

Many spinning drives and solid state drives employ S.M.A.R.T. technology, which is designed to predict drive failure. A drive fails the S.M.A.R.T. test when diagnostics determine that enough sectors on the drive are no longer writeable. Other drive issues, like excessive heat and slow spin-up, can also cause errors. When that happens, the drive may outwardly act and seem fine, but it’s time to clone and replace it. Shared storage servers monitor for S.M.A.R.T. errors in their RAID drives, but you can also get some diagnostic applications to test individual drives.
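One such diagnostic tool is smartctl, part of the free smartmontools package. The sketch below assumes smartctl is installed and that you have the rights to query the drive (this usually means running as an administrator). It calls smartctl -H, which reports the drive’s overall health, and pulls out the verdict. The function names and the device path are hypothetical, and some USB docks may not pass S.M.A.R.T. data through without extra options.

```python
import subprocess


def parse_health(report: str) -> str:
    """Extract the overall verdict from the text output of `smartctl -H`.

    ATA drives report a line such as
    "SMART overall-health self-assessment test result: PASSED",
    while SCSI/SAS drives report "SMART Health Status: OK".
    """
    markers = (
        "overall-health self-assessment test result",
        "SMART Health Status",
    )
    for line in report.splitlines():
        if ":" in line and any(marker in line for marker in markers):
            return line.split(":", 1)[1].strip()
    return "UNKNOWN"


def drive_health(device: str) -> str:
    """Run `smartctl -H` on a device and return its verdict.

    smartctl uses its exit status as a set of flags, so a non-zero
    status is not treated as a failure to run here.
    """
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return parse_health(result.stdout)


if __name__ == "__main__":
    # "/dev/sda" is a placeholder; substitute the docked drive's device path.
    print(drive_health("/dev/sda"))
```

Anything other than a passing verdict is a cue to clone and replace the drive. A passing verdict is not proof of health, though, so it doesn’t replace the yearly spin-up and read check.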

The final level of security is to develop a plan to routinely transfer your entire library to the current format of the day. If you use hard drives, then plan on migrating your library to replacement drives within five to ten years. Many feature film operations, like ILM, have done that for years, because they sit on a library of material with a ton of value. Your media files might not be that valuable, but this should be a strategy you follow to future-proof your production investment.

©2019 Oliver Peters