Guest Column | April 22, 2021

Why ‘Golden Copies’ And Air Gaps Are The Future Of Data Storage

By Brian Murphy, Datadobi


Most of us have heard the age-old phrase, “A penny saved is a penny earned.” Every generation hears that it’s just as important to save as it is to work hard and make a living – rightfully so. That same philosophy applies to data: it’s just as important for a business to store data safely as it is to create those critical assets.

Organizations have no problem creating data – 2.5 quintillion bytes of data are created every single day worldwide – but things can get problematic when that data needs to be stored, protected, and recalled quickly and reliably. Files can be created and stored all day long, but when they’re not protected, stored, and backed up properly, the risk of data loss skyrockets. This is especially true for unstructured data, such as emails, invoices, or video/audio files.

Traditionally, companies relied on Network Data Management Protocol (NDMP) backups as their primary data protection strategy. But with the amount of unstructured data growing within organizations today, data protection requirements are shifting. In fact, the volume of unstructured data has grown so much in the past several years that companies are forced to get creative to comply with data protection mandates.

Rather than relying solely on traditional NDMP, many companies are implementing snapshots and replication as their data protection strategy. This is troublesome, though, as ransomware can now target snapshots and sit idle for extended periods, turning replication into ransomware propagation. Because of this, it’s critical to keep several point-in-time copies to maintain business continuity.

Take the financial services industry for example. These types of organizations depend on the ability to retain data long term on a reliable source with authentication requirements that meet regulatory mandates. These financial services customers must be able to granularly recover their files from any point-in-time image. They need to be able to choose what to protect down to the file level. These heightened demands on seemingly endless amounts of data demonstrate how important immutability and integrity are to so many organizations.

A promising solution is to air gap “golden copies” and store them in a bunker site, either in a remote data center or in the cloud. This approach is the best way to protect business-critical NAS and object data against cyber threats, ransomware, accidental deletions, and software vulnerabilities, especially in the heterogeneous vendor environment we know today.

Many people think it’s a trivial task to take a file and store it as an object. As it turns out, the characteristics of file systems do not particularly map well to object storage. How the files are named and how that translates into an object key can be problematic. How do you maintain permissions? What about alternate data streams? How do you maintain settings like SMB share and NFS export definitions? All of these are important to maintain in the form of a “golden copy” in the event of a catastrophe.
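To make the mapping problem concrete, here is a minimal Python sketch of one way to turn a file into an object key plus a metadata record. It is an illustration only, not how any particular product does it: the function names, the "golden-copy" prefix, and the choice to percent-encode each path component are all assumptions, and it captures only basic POSIX attributes. NTFS permissions, alternate data streams, and share or export definitions would each need their own handling.

```python
import stat
from pathlib import Path
from urllib.parse import quote


def object_key_for(root: Path, path: Path, prefix: str = "golden-copy") -> str:
    """Build an object key from a file's path relative to the protected root.

    Each path component is percent-encoded so that characters which are legal
    in a file name but awkward in an object key survive a round trip.
    (Hypothetical naming scheme, chosen for illustration.)
    """
    relative = path.relative_to(root)
    encoded = [quote(part, safe="") for part in relative.parts]
    return "/".join([prefix, *encoded])


def capture_metadata(path: Path) -> dict:
    """Record POSIX attributes that a plain object upload would otherwise lose."""
    info = path.lstat()
    return {
        "mode": oct(stat.S_IMODE(info.st_mode)),  # permission bits only
        "uid": str(info.st_uid),
        "gid": str(info.st_gid),
        "mtime_ns": str(info.st_mtime_ns),
        "size": str(info.st_size),
    }
```

Storing the record alongside the object, for example as object metadata or as a companion object, is what lets a later recall restore the file rather than just its bytes.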

Consider a library that has hundreds of thousands of books on its shelves but no numbering system to locate a specific book. Most would consider this library quite useless. Similarly, recalling data without its permissions, associated data streams, and access points makes the backup somewhat meaningless.

Furthermore, immutable copies of data must be air gapped. The term air gap refers to limited network connectivity between the source and the target sites. Rather than being constantly available, the network connection is activated periodically to pull the incremental updates made at the source since the last transfer session was initiated.
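The incremental pull can be pictured with a short Python sketch. It assumes, purely for illustration, that changes are detected by comparing each file's modification time with the start time of the previous session; the function name and that detection method are hypothetical, and a production replicator would likely rely on something more robust than timestamps alone.

```python
import os
from pathlib import Path


def changed_since(root: Path, last_session_ns: int) -> list[Path]:
    """Return files under root modified after the previous session began.

    last_session_ns is the start time of the previous transfer session,
    in nanoseconds since the epoch (an assumed bookkeeping value).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            candidate = Path(dirpath) / name
            if candidate.lstat().st_mtime_ns > last_session_ns:
                changed.append(candidate)
    return sorted(changed)
```

Only the files returned here would cross the link while it is open; once the session ends, the connection is closed again until the next scheduled window.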

With this approach, every object must be checksummed to verify the integrity of the data being synced. Any point-in-time image created from any given source system can be recalled to any desired target system. This removes the restrictions imposed by outdated protocols like NDMP, which require parity between the original source system and the target of the recall.
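As a rough sketch of what checksum verification involves, the Python below hashes a source file and its copy and compares the digests. SHA-256 and the function names are assumptions made for this example; the choice of algorithm, and the point in the transfer where the check runs, will vary by product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source: Path, copy: Path) -> bool:
    """Return True only if the copy is byte-for-byte identical to the source."""
    return sha256_of(source) == sha256_of(copy)
```

A copy that fails the comparison should be transferred again rather than accepted into the golden copy.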

Once a bucket or container is created on the object store and presented to the replication service, the software creates a unique, multi-level abstraction layer that allows the replicator to protect all file systems and their associated metadata. Just as a library is filled with unique books, each with its own author, title, publication date, and so on, datasets have their own highly critical characteristics, such as NTFS permissions, alternate data streams (ADS), POSIX mode bits, NFSv4 ACLs, and timestamps. These must be maintained accurately and in an immutable fashion to avoid the disaster ransomware can inflict.

Ninety-five percent of businesses cite the need to manage unstructured data as a problem for their business. While snapshots and replication are fine tools for quick, isolated recovery and for failover to a secondary system, they are not an appropriate way to manage unstructured data in the face of rising ransomware threats. I’d encourage all business owners to seek out end-to-end coverage of their unstructured datasets to add an extra line of defense to their organization.

About The Author

Brian Murphy is senior systems engineer at Datadobi.