What are the active measures I need to put into place to address issues around data archiving and storage?
1. What is data archiving and why is it important?
Working in the heritage sector, you know that some of the data your organisation creates today will become tomorrow’s historical artefact. But creating data is the easy part – you need to be able to find, access and use that data in the future.
The problem is that digital data is fragile, as are the systems used to store it. You’re probably familiar with how to preserve physical files, but without a proactive and long-term strategy for storing and archiving your digital files, they won’t last for as long as you need them.
In this guide, our information specialist, Elissa Truby, will show you how to choose a suitable storage solution for your data based on:
- how long you want to keep it
- how you want to access it
- whether you have the resources to manage it yourself.
She will also explain how to select data for archiving and how to document your decisions.
What exactly is data archiving?
The technology industry has borrowed the term ‘archiving’ from the heritage sector and changed its meaning slightly. For example, when you ‘archive’ an email it means moving it out of your inbox, not that you’ve done enough to preserve it forever. Archiving is different to backup, which means making copies of your data for disaster recovery, although backup also plays an important role in protecting your data.
Data archiving here means moving the older data you want to keep for future reference from the area you use to store your current work to a location designed for long-term preservation. Separating the storage of your current and archival data will help you to:
- reduce storage costs by keeping data you don’t access often on less expensive media
- free up space in your active work area to increase efficiency and productivity
- protect your most valuable data from accidental modification or deletion so that it remains authentic over time.
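As a rough illustration of the idea above, the following Python sketch moves files that haven’t been modified for a set number of days from an active work folder to a separate archive folder, and records a checksum for each file so that later changes can be detected. The folder layout, the function names and the age threshold are all assumptions made for this example; they aren’t part of any particular archiving product.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_old_files(active_dir: Path, archive_dir: Path, max_age_days: int) -> dict[str, str]:
    """Move files not modified within max_age_days into archive_dir.

    Returns a mapping of each archived file name to its checksum.
    Only files directly inside active_dir are considered (hypothetical layout).
    """
    cutoff = time.time() - max_age_days * 86400
    archive_dir.mkdir(parents=True, exist_ok=True)
    checksums: dict[str, str] = {}
    for path in sorted(active_dir.iterdir()):
        if not path.is_file() or path.stat().st_mtime >= cutoff:
            continue  # skip folders and recently modified files
        target = archive_dir / path.name
        if target.exists():
            continue  # never overwrite something already in the archive
        checksum = sha256_of(path)
        shutil.move(str(path), str(target))
        # Confirm the archived file is identical to the original.
        if sha256_of(target) != checksum:
            raise OSError(f"Checksum mismatch after moving {path.name}")
        checksums[path.name] = checksum
    return checksums
```

Keeping the returned checksums alongside the archive gives you a simple record of what was archived and what each file looked like at the time.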
2. Steps towards storing and archiving your data
Step 1: Understand your data archiving needs
The first step is to identify the different types of data your organisation creates and collects. Not all your data is equal – think of the number of emails you send a day for short-term activities. Could it be that some of the data you’re storing is no longer useful and is only taking up space?
At the same time, all companies need to keep certain types of data to comply with the law. You might only need to keep this data for a set time without needing instant access to all of it.
As a heritage professional, you also want to consider the lasting historical, cultural or social value of your data, and preserve some of your data for future generations or share it with external researchers. Many heritage funding bodies will ask for evidence of how you plan to store your data over time before they’ll award financial support.
No matter what data you have, think of your archive as a treasure trove of information, not just ‘old stuff’. Focus your time, effort and money on archiving your most valuable data.
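If your data already sits on a shared drive, a quick inventory can help you see what types of data you hold before you decide what is worth archiving. The sketch below counts files and total size by file extension. The function name and the choice to group by extension are assumptions for illustration; a file extension is only a rough guide to the type of data.

```python
from collections import defaultdict
from pathlib import Path


def summarise_by_type(root: Path) -> dict[str, tuple[int, int]]:
    """Return {extension: (number of files, total size in bytes)} under root."""
    totals = defaultdict(lambda: [0, 0])
    for path in root.rglob("*"):
        if path.is_file():
            # Treat ".JPG" and ".jpg" as the same type.
            extension = path.suffix.lower() or "(no extension)"
            totals[extension][0] += 1
            totals[extension][1] += path.stat().st_size
    return {ext: (count, size) for ext, (count, size) in totals.items()}
```

A summary like this won’t tell you which data has lasting value, but it can show where most of your storage is going.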
Step 2: Choose your storage solutions
No storage option is a permanent fix because no technology is built to last forever. That’s why digital preservation experts recommend using a range of storage media in case something goes wrong. Whichever option you choose, you need to regularly monitor your storage hardware and have the resources in place to manage and maintain it – both in terms of budget and technical skills.
Several storage solutions are available, depending on your organisation’s needs. The main options are described below.
Cloud storage
Cloud-based storage is a popular choice for data archiving because you can access your data quickly over the internet wherever and whenever you need it. This works well if your organisation functions across multiple sites, if staff and volunteers work remotely or if you want to provide external access.
Your data is stored offsite and managed by a third party. Many providers offer built-in archiving services. In fact, several commercial digital preservation and repository systems use cloud technology. This makes cloud storage a practical solution if you don’t have technical staff on hand.
Cloud storage is usually cost-effective. You can store large volumes of data and grow your storage capacity as needed. You’re normally billed monthly for the amount of storage and services you use, but beware of extra charges for data retrieval.
A big drawback of the cloud for archiving is that by outsourcing responsibility to a third party, you lose some control over your data. Always have an exit strategy so you can get your data back should you change your mind. Select a trusted provider who has experience of working with heritage organisations like yours and who understands digital preservation principles.
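To see how retrieval charges can change the picture, it may help to work through the arithmetic. The short Python sketch below adds a monthly storage charge to a retrieval charge. The function name and the per-gigabyte pricing model are assumptions for illustration only; real providers publish their own prices and may charge for other services as well.

```python
def monthly_cloud_cost(
    stored_gb: float,
    storage_price_per_gb: float,
    retrieved_gb: float = 0.0,
    retrieval_price_per_gb: float = 0.0,
) -> float:
    """Estimate one month's bill: storage charge plus retrieval charge.

    All prices are hypothetical inputs supplied by the caller.
    """
    storage_charge = stored_gb * storage_price_per_gb
    retrieval_charge = retrieved_gb * retrieval_price_per_gb
    return storage_charge + retrieval_charge
```

For example, with made-up prices of 0.004 per gigabyte for storage and 0.02 per gigabyte for retrieval, storing 2,000 GB costs 8.00 a month, but retrieving 500 GB in the same month adds another 10.00, which more than doubles the bill.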
Local disk drives
Disk storage is a type of onsite storage that’s attached to your organisation’s network. There are different types of disk drives available, which vary in speed and cost. If you already use network drives, you know it’s easy to directly access your data, organise it and find it.
However, disk storage can be expensive, especially if you have a lot of data or very large files. Disk drives need to be kept in a climate-controlled environment, meaning high energy costs and regular cleaning. They also take up a lot of physical space, which isn’t always a luxury available to heritage organisations.
The biggest downside of disk storage for archiving is that the hardware has a short lifespan and can fail without warning. Disk drives need to be managed by staff with IT knowledge who can monitor the hardware for preservation issues and migrate data as systems age.
Tape drives
Tape drives are storage devices that read and write data on magnetic tape. Your data is normally stored offline, but you can attach tape drives to a computer. Companies have used tape for archiving since the 1950s, so it’s considered relatively reliable. It’s hard to modify data on tapes, making this a highly secure option.
The main issue with tape storage is that it can take a long time to access your data and it comes with limited search and indexing capabilities. This means tapes aren’t suitable if you need instant access, and you should be extra careful in documenting which tapes you use to store what data.
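One simple way to document which tapes hold which data is to keep a catalogue file that lives outside the tapes themselves. The sketch below uses a CSV file for this. The column names, the function names and the idea of labelling tapes with short codes are assumptions for illustration; a spreadsheet or collections database could serve the same purpose.

```python
import csv
from pathlib import Path

# Hypothetical catalogue columns.
FIELDS = ["tape_label", "file_path", "sha256", "date_written"]


def record_on_tape(catalogue: Path, tape_label: str, file_path: str,
                   sha256: str, date_written: str) -> None:
    """Append one entry to the catalogue, writing the header for a new file."""
    is_new = not catalogue.exists()
    with catalogue.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "tape_label": tape_label,
            "file_path": file_path,
            "sha256": sha256,
            "date_written": date_written,
        })


def tapes_holding(catalogue: Path, file_path: str) -> list[str]:
    """Return the labels of every tape recorded as holding file_path."""
    with catalogue.open(newline="", encoding="utf-8") as handle:
        labels = {row["tape_label"] for row in csv.DictReader(handle)
                  if row["file_path"] == file_path}
    return sorted(labels)
```

Because the catalogue can be searched without loading a tape, it makes up for some of tape’s limited search and indexing capabilities.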
Don’t be fooled by tape’s reputation as a guaranteed solution for long-term preservation – no such thing exists! While tape drives normally last longer than disk drives, there are still risks of damage and data corruption.
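Whichever storage medium you use, one common way to detect damage or corruption is to record a checksum for each file when you archive it and compare it against the stored file at regular intervals. The sketch below assumes you already hold a record of file names and SHA-256 checksums; the function names and the shape of that record are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(recorded: dict[str, str], root: Path) -> dict[str, str]:
    """Compare files under root with their recorded checksums.

    Returns {file name: "missing" or "changed"} for every problem found.
    An empty result means every recorded file is present and unchanged.
    """
    problems: dict[str, str] = {}
    for name, expected in recorded.items():
        path = root / name
        if not path.is_file():
            problems[name] = "missing"
        elif sha256_of(path) != expected:
            problems[name] = "changed"
    return problems
```

A check like this only tells you that something has gone wrong. To put it right, you still need a second good copy of the file stored elsewhere.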
Step 3: Create a data storage and archiving policy
Once you’ve looked at the different types of data you have, and explored your options for storing your archival data, develop a policy which sets out:
- criteria for what to archive and where
- the type of storage media you’ll use
- responsibilities for archiving data and managing storage
- rules for data access.
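If you want to keep the policy in a form that can be checked or shared between systems, the four headings above can be captured as structured data. The Python sketch below is one possible layout. The class name, field names and the completeness check are assumptions for illustration, and a written document is just as valid.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ArchivingPolicy:
    """One field per policy heading (hypothetical structure)."""
    selection_criteria: list[str]     # what to archive and where
    storage_media: list[str]          # the type of storage media used
    responsibilities: dict[str, str]  # task -> role responsible
    access_rules: list[str]           # rules for data access

    def missing_sections(self) -> list[str]:
        """Return the names of any headings that are still empty."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialise the policy so it can be stored with the archive."""
        return json.dumps(asdict(self), indent=2)
```

Calling missing_sections() on a draft shows at a glance which parts of the policy still need to be written.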
3. Set your strategy in motion
Now that you’ve learnt what data archiving involves, and considered some of the options for storing your data, reflect on this by downloading our ‘Questions to help you choose the right archival storage solution’ (PDF file, 329KB).
Further resources
You can find out more about digital archiving and preservation by visiting the Digital Curation Centre or Digital Preservation Coalition and reading their useful guidance.
Please attribute as: "What are the active measures I need to put into place to address issues around data archiving and storage? (2022) by Elissa Truby supported by The National Lottery Heritage Fund, licensed under CC BY 4.0"