ARCHIVING DATA

One of the challenges in the digital era is coping with the exabytes (EB) of data that flood servers almost daily. NASA had one of the biggest headaches. It historically used magnetic tape in vast amounts, and sadly many of the racks of tapes were damaged and could not be read. One report we saw spoke of a 75% failure rate.

Over time hard disks exploded in capacity and prices fell just as dramatically. As of 2016, Seagate offers a 10TB hard disk to consumers.

Tape has improved in capacity too, but its prices put it out of reach of consumers, who now favor low cost USB disks. Servers have to consider alternatives. Tape libraries have long used robotic pick and place machines to eliminate the need for operators.

After decades of proprietary solutions the tape industry finally agreed on some standards. The group behind Linear Tape-Open (LTO) has worked to improve speed and capacity. In 2016 the latest LTO-7 tape drives cost close to $10,000 and tapes are about $150.

Backblaze and others build large capacity servers with as many as 4 rows of hard disks in them. 29″ EIA racks have room for 3 rows. A few rows of racks filled with these servers and 10TB disks can easily reach 100PB of storage. Tape has found a niche for backup, but with multiple data centers and geo-redundant replication available, tape does not have much of a future.
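To put the 100PB figure in perspective, here is a rough back-of-the-envelope sketch. The 10TB disk size comes from the text above; the figure of 60 disks per server is purely an assumption for illustration, and the result is raw capacity with no allowance for redundancy or spares.

```python
import math

TB_PER_PB = 1000  # decimal units, as drive vendors quote capacity


def disks_needed(target_pb, disk_tb):
    """Number of disks required to reach target_pb of raw capacity."""
    return math.ceil(target_pb * TB_PER_PB / disk_tb)


def servers_needed(disks, disks_per_server):
    """Number of servers required to house the given number of disks."""
    return math.ceil(disks / disks_per_server)


# 100PB built from 10TB disks (figures from the article).
disks = disks_needed(100, 10)

# ASSUMPTION: 60 disks per server is a hypothetical chassis size.
servers = servers_needed(disks, 60)

print(f"{disks} disks in {servers} servers")
```

Under those assumptions the job takes 10,000 disks in about 167 servers, which is why a few rows of racks is a believable footprint.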

Backup is not an archive. An archive has to be far more durable, because it may need to be kept for many decades or more. Hard disks and tape both have shortcomings that make them barely suitable. Consumer Blu-ray disks are fine for modest amounts of data; CD and DVD disks are less reliable. BD disks have held up well in consumer use, which means that photographs and other personal data can be relatively safe.

Games are now largely delivered by content delivery networks (CDNs) over the internet. The main problem is the solvency of the CDN. When GameSpy folded, its multiplayer services were lost. Many cloud storage servers were closed when rights holders seized them. New ones have emerged but the same risk of failure remains.

Larger amounts of data can be a real challenge. An archive may not be looked at for 50 years, as when NASA attempted to find Apollo images for an anniversary. Failing to test archives is one of the biggest mistakes possible.
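One way to test an archive on a schedule is to record a checksum for every file when it is written, then recompute and compare later. The sketch below is a minimal illustration only; the function names and the choice of SHA-256 are our assumptions, not a description of how NASA or anyone else does it.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return (path, problem) pairs for files that are missing or changed."""
    root = Path(root)
    problems = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file():
            problems.append((rel, "missing"))
        elif sha256_of(p) != expected:
            problems.append((rel, "changed"))
    return problems
```

The manifest would be saved somewhere other than the archive itself, and an empty list from verify_manifest means every recorded file is still present and unchanged.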

PC hard disk interfaces have changed several times over the years. It's not a bad idea to set up new machines periodically and carefully copy the data across. It's very advisable to use integrity checking to be sure the copy is accurate. Tape drives have changed faster than most other technologies. It's almost a full time job keeping up with tape changes.
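As a rough illustration of integrity checking during a migration, the sketch below copies a file and then compares SHA-256 digests of the source and the copy. The helper names are hypothetical, and reading the copy back immediately may be served from the operating system's cache rather than the new disk, so treat this as a starting point and not a guarantee.

```python
import hashlib
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then raise OSError if the two digests differ."""
    shutil.copy2(src, dst)  # copy2 also preserves timestamps where it can
    src_hash = file_digest(src)
    dst_hash = file_digest(dst)
    if src_hash != dst_hash:
        raise OSError(f"copy of {src} to {dst} failed verification")
    return dst_hash
```

Keeping the returned digest alongside the copy gives something to check against the next time the data moves to new hardware.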

Rack mounted servers are now being sold by the millions as corporations galore are building more and more large data centers. Smaller businesses need to consider their own strategies.