There’s little doubt that data drives business and is perhaps the most important asset financial institutions possess. To protect this precious resource, network administrators are tasked with installing any number of security controls and backups. However, IT decision makers often neglect one component in their data management strategy: data archival.
Too often the terms “archival” and “backup” are used interchangeably. While related, these two terms represent very different concepts. With data storage costs at an all-time low, it is easy to see why archival efforts might be relegated to the back burner. This article aims to clear up the confusing world of data archival.
No discussion of archival is complete without a comparison to data backups. Think of backups primarily as redundancy: a short-term tool to ensure that data is not permanently lost to hardware failure, virus outbreak, natural disaster, or similar events. A backup is essentially a copy of production data stored in a protected way. Data being backed up should be, for the most part, actively used and somewhat dynamic. The frequency of backups can vary depending upon how many minutes, hours, or days’ worth of potential data loss is deemed tolerable. No matter the frequency, backups should take place with rigid regularity. Backup solutions should put a premium on the speed of writing data and on the reliability of the stored copy.
Archival, on the other hand, refers to a systematic process of retiring static data from production data stores. Think of data archives as the long-term airport parking of the data management realm: it may not be quite as convenient to access, but it is at least a little cheaper. Archived data should be static in nature and rarely accessed, criteria that may describe a majority of the data on any given network. An organization may introduce data archival to fulfill an internal business data retention requirement or to comply with government regulation. Given the regulations that apply to community financial institutions, archival can be a useful tool to help segment and organize the volumes of data that must be retained. Many financial institutions already use some form of archival, though it may be branded with a different name. Imaging systems, for example, are excellent candidates for implementing archival, and many imaging systems offer built-in archival capabilities or companion archival software.
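As a rough illustration of the "static and rarely accessed" criterion, the sketch below walks a directory tree and lists files that have not been modified within a chosen period. It is a minimal example under stated assumptions, not a recommended tool: the function name find_archive_candidates and the one-year default threshold are hypothetical, and last-modified time is only one possible signal of whether data is still in active use.

```python
import os
import time

SECONDS_PER_DAY = 86400


def find_archive_candidates(root, min_age_days=365, now=None):
    """Return sorted paths under `root` not modified in `min_age_days` days.

    The threshold is an illustrative assumption; a real retention policy
    would set it per category of data.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * SECONDS_PER_DAY
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                modified = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if modified < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

A list like this is only a starting point for review. Retention requirements and business value still have to be weighed before anything is moved out of production.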
Implementing archival technology can be as simple as an organized system of taking old data offline to external media (tape, DVD-R, external HDD), or as complex as adopting a full hierarchical storage management solution. One of the first decisions to make when introducing data archival is whether to use an online solution for ease of access or a more cost-effective and tamper-resistant offline option. Selection of an archival system should also take into account the expected functional lifespan of the storage media and long-term accessibility concerns.
No matter which archival solution is used, archived data can stack up quickly. A data archival solution must have the capacity or scalability to handle expected data growth. With offline media, this may be as simple as adding another tape or drive. It is also important to keep in mind that an enormous, unsorted stack of data is hardly useful, so organization is essential to effective archival. For offline media, this is likely a labor of media labeling and tracking; more expensive online solutions, however, are built with organization in mind and may also offer robust search capabilities.
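To make the growth question concrete, the following sketch projects archive size under a constant annual growth rate and estimates how many media units that volume would occupy. It assumes simple compound growth, which real data rarely follows exactly. The function names and all figures are hypothetical, chosen only for illustration.

```python
import math


def projected_archive_size(current_tb, annual_growth_rate, years):
    """Project archive size in TB, assuming constant compound growth."""
    return current_tb * (1 + annual_growth_rate) ** years


def media_units_needed(total_tb, unit_capacity_tb):
    """Number of tapes or drives needed to hold `total_tb` of data."""
    return math.ceil(total_tb / unit_capacity_tb)


# Illustrative figures only: 10 TB today, growing 50% per year.
size_in_two_years = projected_archive_size(10, 0.5, 2)  # 22.5 TB
units = media_units_needed(size_in_two_years, 12)       # 2 units of 12 TB
```

Running the same projection per data category, rather than for the network as a whole, tends to show which categories will drive capacity needs.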
Another guiding principle of archival is data authenticity. Archived data should be genuine and complete. A good example of this principle in action is email archival: all messages sent and received are chronicled through mail archival, including anything that an end user may have immediately deleted. Collecting all email chains and making this repository write-protected ensures the data is authentic. Similarly, other categories of archived data should be protected from modification by essentially rendering them read-only.
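One simple way to approximate this for file-based archives is to record a checksum when a file is archived, remove write permission, and recompute the checksum later to confirm nothing has changed. The sketch below is a minimal illustration under those assumptions. The helper names seal_file and verify_file are hypothetical, and file permissions alone do not stop an administrator from altering data, so this is not a substitute for write-once media or a dedicated archival product.

```python
import hashlib
import os
import stat


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def seal_file(path):
    """Record the file's checksum, then mark the file read-only.

    The returned checksum should be stored separately from the archive.
    """
    checksum = sha256_of_file(path)
    os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return checksum


def verify_file(path, expected_checksum):
    """Return True if the file still matches its recorded checksum."""
    return sha256_of_file(path) == expected_checksum
```

Keeping the recorded checksums somewhere other than the archive itself matters here. If both live in the same place, anyone able to change a file could change its checksum as well.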
There are a number of factors to consider when discussing data archival strategies, and proper planning is key to the long-term usability of any archival implementation. That planning starts with a thorough analysis of the types of data on your network to determine the perceived value, retention requirements, and expected growth of each category. Administrators should strive to know not just the technology used in the network, but also the data that resides on it.