A data storage strategy outlines where and how you'll store your data as you work through your project. There are many things to consider when planning your data storage approach, including access and security requirements, the level of data sensitivity, storage costs and any relevant legislation.
Having a well-thought-out plan for your data storage will ensure that you can avoid:
Where you store your research data is a key component of responsible research and good research data management practice.
As outlined in the Australian Code for the Responsible Conduct of Research, 2018, you have a responsibility for storing and managing your research data in a manner appropriate to the risks associated with any confidentiality or sensitivities within the data and materials.
When you are selecting where to store research data, consider the following:
When deciding on storage and access systems, it's crucial to consider how sensitive the data is. Unauthorised disclosure of sensitive information can cause serious damage to individuals and organisations, so mitigating that risk is critical to the data management planning process and to the ethical conduct of research.
It may be useful to note that for most Notre Dame research involving humans, the classification should be Sensitive or Highly Sensitive.
In practice, examples of research data with a high degree of sensitivity which would place it into the Highly Protected classification include:
Information security specialists have developed a useful acronym, CIA, for assessing your data storage plan:
C - Confidentiality: Your data shouldn't be made available to people who aren't authorised to view it.
I - Integrity: Your data should be kept accurate and complete - no-one should be able to edit it without your knowledge and permission.
A - Availability: Your data should be accessible to the appropriate people, in a usable form, when they need it.
Balancing these three considerations can be difficult. Keeping all your research on a piece of paper in a tightly locked office might provide a good level of confidentiality, but it might not be very available to your collaborators. Putting all your working data up on an open Google Sheet that anyone can edit might result in great availability, but poor integrity. It's important for you to understand how confidential your data should remain; how many people should be allowed to edit/modify it; and how people will get to the data when needed.
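One way to check the integrity consideration above in practice is to record a checksum for every file and compare the records later. The following is a minimal sketch only, not a Notre Dame procedure: the function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical and the folder you point it at is up to you.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to the folder) to its checksum."""
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files that were added, removed or modified between two manifests."""
    return sorted(
        name
        for name in old.keys() | new.keys()
        if old.get(name) != new.get(name)
    )
```

Building a manifest when data is first stored, and again at a later date, lets you spot any file that has changed without your knowledge. It does not prevent edits; it only makes them visible.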
Datasets should be identified and protected in a manner that is appropriate to their sensitivity and importance. Not all research projects will require the same approach, and not all data within a research project will require the same treatment. Some access requirements will also make particular approaches more or less suitable for your research project. That's why it's good to consider your approach to these issues alongside each other.
In addition to the information security requirements of Notre Dame, you may also need to adhere to requirements from industry or governmental regulatory bodies, funder/grant bodies or from external collaborators. Careful consideration of all these demands early on can help avoid problems at later stages.
Access to research data should be controlled by appropriate electronic safeguards and/or physical access controls, including secure passwords. Each password should be both secure and unique (see figure below). This is best achieved by using a password manager with a memorable passphrase as the master password, together with two-factor authentication.
Password advice and image from ARDC IU RDM policy - Swinburne v0.12.1.docx https://zenodo.org/record/6859513/files/ARDC%20IU%20RDM%20policy%20-%20Swinburne%20v0.12.1.docx?download=1
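As an illustration of the memorable passphrase mentioned above, the sketch below joins randomly chosen words using Python's `secrets` module. It is an example under stated assumptions, not University advice: the `make_passphrase` function and the eight-word `WORDS` list are hypothetical, and a list this small is far too short for real use.

```python
import secrets

# Tiny illustrative word list (an assumption for this sketch). A real
# passphrase needs a list of several thousand words to be hard to guess.
WORDS = [
    "harbour", "violet", "lantern", "gravel",
    "orbit", "meadow", "cobalt", "thistle",
]


def make_passphrase(words: list[str], count: int = 4, separator: str = "-") -> str:
    """Join `count` words chosen with a cryptographically secure random source."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return separator.join(secrets.choice(words) for _ in range(count))


# Prints a different four-word passphrase on each run.
print(make_passphrase(WORDS))
```

The `secrets` module is used rather than `random` because it is designed for security-sensitive values. A password manager can then generate and hold the unique passwords for each individual service.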
It is recommended that the University's approved cloud storage options are used for active and archival data where possible.
Storage option | Purpose and advantages* | Storage size | Access | Instructional material
---|---|---|---|---
OneDrive | Share and store active sensitive and non-sensitive data. Allows synchronous document editing; internal and external collaboration; multifactor authentication | Individual <1TB; Groups >1TB | Office.com | OneDrive quick start guide (downloadable pdf)
Teams | Share and store active sensitive and non-sensitive data. Allows synchronous document editing; internal and external collaboration; multifactor authentication; additional collaboration functions | Individual <1TB; Groups >1TB | Office.com | Teams quick start guide (downloadable pdf)
Azure | Store archived data^ | Please contact IT | Please contact IT |
*See further details on storage options for active data in the table below
^The University’s long term storage solution for data for completed projects is Microsoft Azure. To archive: create a dataset in ResearchData@ND (Dataset Record section), and email the location of your data to itservicedesk@nd.edu.au.
Data loss, whether caused by technical or human error, can set your research back years. Backups and safeguarding refer to the steps and plans you put in place to minimise the risk of loss or destruction of your data. The specifics of your plan will depend on your dataset size, the software or instrumentation used and your research process, but some recommendations are universal:
When considering the suitability of various storage methods for your research data, you should keep in mind that reliable storage methods come with a cost - and usually the greater the storage space required, the greater the cost.
Some cloud services may be free of charge if your data size requirements are low. Others may charge you by the gigabyte or terabyte once you reach a certain dataset size; others will charge by how often you upload and download the data.
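The charging models described above can be turned into a rough monthly estimate. The sketch below is illustrative only: the `monthly_storage_cost` function is hypothetical, every rate in the example is made up, and upload and download charges are modelled as a price per gigabyte transferred, which is an assumption rather than how any particular provider bills.

```python
def monthly_storage_cost(
    size_gb: float,
    free_gb: float,
    price_per_gb: float,
    transfer_gb: float = 0.0,
    price_per_transfer_gb: float = 0.0,
) -> float:
    """Estimate a monthly bill: storage above the free tier plus data transfer."""
    # Only the storage above the free allowance is charged.
    billable_gb = max(size_gb - free_gb, 0.0)
    return billable_gb * price_per_gb + transfer_gb * price_per_transfer_gb


# Made-up example: a 500 GB dataset with the first 100 GB free, 0.02 per GB
# stored, and 50 GB transferred in the month at 0.10 per GB.
estimate = monthly_storage_cost(500, 100, 0.02, transfer_gb=50, price_per_transfer_gb=0.10)
print(f"Estimated monthly cost: {estimate:.2f}")
```

Under these made-up rates the estimate is 400 x 0.02 + 50 x 0.10 = 13.00 per month. Substituting a provider's published rates shows quickly whether storage size or transfer volume dominates the cost.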
You may be asked to follow certain guidelines and processes to help Notre Dame control the cost of providing you with large amounts of storage. These might include:
Whatever your choice of storage location, by staying aware of what data you need to store, you can control the cost to yourself or your institution.