Research Data Management

Storage considerations

A data storage strategy outlines where and how you'll store your data as you work through your project. When planning your approach, consider access and security requirements, the level of data sensitivity, storage costs and any relevant legislation.

A well-thought-out data storage plan helps you avoid:

  • data loss from human or technical error
  • data breaches, which may be malicious or accidental
  • breaching funder or legislative requirements
  • collaborator frustration from unforeseen access issues

Storage

Where you store your research data is a key component of responsible research and good research data management practice.

As outlined in the Australian Code for the Responsible Conduct of Research, 2018, you have a responsibility for storing and managing your research data in a manner appropriate to the risks associated with any confidentiality or sensitivities within the data and materials.

When you are selecting where to store research data, consider the following:

Research data:
  • What are the security classifications of the data to be stored?
  • How does the data need to be accessed and used?
  • Who will require access to the data?
Storage option:
  • Is it appropriate for the security classification of your data?
  • What security systems are in place?
  • What recovery procedures are in place?
  • What is the availability of support by professional IT staff?

When deciding on storage and access systems, it's crucial to consider how sensitive the data is. Unauthorised disclosure of sensitive information can cause serious damage to individuals and organisations, so mitigating that risk is critical to the data management planning process and to the ethical conduct of research.

Note that for most Notre Dame research involving humans, the classification should be Sensitive or Highly Sensitive.

Highly sensitive research data

In practice, examples of research data sensitive enough to fall into the Highly Protected classification include:

  • health information that allows identification of individuals
  • identifiable information relating to an individual's ethnic or racial origin, political opinions, religious or philosophical beliefs, trade union membership, or sexual activities
  • information about the nature and location of a place or an item of Aboriginal and/or Torres Strait Islander cultural significance, including secret or sacred practices
  • confidential commercial agreements and other commercially sensitive information 
  • ecological information that may place vulnerable species or ecological communities at risk
  • data that must comply with regulations, such as data that is subject to the Defence Trade Controls Act or the Defence and Strategic Goods List.

Notre Dame sensitivity table:

Storage sensitivity table

Information security specialists use a helpful acronym, CIA, to assess a data storage plan:

C - Confidentiality: Your data shouldn't be made available to people who aren't authorised to view it.
I - Integrity: Your data should be kept accurate and complete. No one should be able to edit it without your knowledge and permission.
A - Availability: Your data should be accessible to the appropriate people, in a useful form, when they need it.

Balancing these three considerations can be difficult. Keeping all your research on a piece of paper in a tightly locked office might provide a good level of confidentiality, but it might not be very available to your collaborators. Putting all your working data up on an open Google Sheet that anyone can edit might result in great availability, but poor integrity. It's important for you to understand how confidential your data should remain; how many people should be allowed to edit/modify it; and how people will get to the data when needed. 

Datasets should be identified and protected in a manner appropriate to their sensitivity and importance. Not all research projects require the same approach, and not all data within a project requires the same treatment. Some access requirements will also make particular approaches more or less suitable for your project, which is why it is good to consider confidentiality, integrity and availability alongside each other.

In addition to the information security requirements of Notre Dame, you may also need to adhere to requirements from industry or governmental regulatory bodies, funder/grant bodies or from external collaborators. Careful consideration of all these demands early on can help avoid problems at later stages.

Strong password advice

Access to research data should be controlled by appropriate electronic safeguards and/or physical access controls, including secure passwords. Each password should be both secure and unique (see figure below). This is best achieved by using a password manager, protected by a memorable passphrase as the master password (see the strong password advice) and by two-factor authentication.

Security using a password manager, multi-factor authentication and passphrase

Password advice and image from ARDC IU RDM policy - Swinburne v0.12.1.docx https://zenodo.org/record/6859513/files/ARDC%20IU%20RDM%20policy%20-%20Swinburne%20v0.12.1.docx?download=1
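
As an illustration of the passphrase idea above, the following is a minimal Python sketch that joins randomly chosen words into a passphrase. It is not part of the ARDC or Notre Dame advice. The function name make_passphrase and the tiny WORDS list are hypothetical and are used here for demonstration only; a passphrase for real use should be drawn from a much larger word list, and many password managers can generate one for you.

```python
import secrets

# Hypothetical, deliberately tiny word list for illustration only.
# A passphrase for real use should be drawn from a list of thousands of words.
WORDS = [
    "harbour", "violet", "granite", "lantern", "meadow", "copper",
    "orbit", "willow", "saddle", "thunder", "pebble", "canyon",
    "marble", "falcon", "ribbon", "timber",
]


def make_passphrase(n_words=4, separator="-", words=WORDS):
    """Return n_words randomly chosen words joined by separator."""
    if n_words < 1:
        raise ValueError("n_words must be at least 1")
    # secrets.choice draws from the operating system's secure random source,
    # unlike the random module, which is not intended for security purposes.
    return separator.join(secrets.choice(words) for _ in range(n_words))


if __name__ == "__main__":
    print(make_passphrase())
```

More words, and a larger list to draw them from, make the passphrase harder to guess.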

Where possible, use the University's approved cloud storage options for active and archival data.

Data storage solutions

OneDrive
  • Purpose and advantages*: Share and store active sensitive and non-sensitive data. Allows synchronous document editing; internal and external collaboration; multifactor authentication
  • Storage size: Individual <1TB; Groups >1TB
  • Access: Office.com
  • Instructional material: OneDrive quick start guide (downloadable pdf); OneDrive video training

Teams
  • Purpose and advantages*: Share and store active sensitive and non-sensitive data. Allows synchronous document editing; internal and external collaboration; multifactor authentication; additional collaboration functions
  • Storage size: Individual <1TB; Groups >1TB
  • Access: Office.com
  • Instructional material: Teams quick start guide (downloadable pdf); Microsoft Teams video training

Azure
  • Purpose and advantages*: Store archived data^
  • Storage size: Please contact IT
  • Access: Please contact IT


*See further details on storage options for active data in the table below
^The University’s long-term storage solution for data from completed projects is Microsoft Azure. To archive: create a dataset in ResearchData@ND (Dataset Record section), and email the location of your data to itservicedesk@nd.edu.au.

Data loss, whether caused by technical or human error, can set your research back years. Backup and safeguarding refers to the steps and plans you put in place to minimise the risk of loss or destruction of your data. The specifics of your plan will depend on your dataset size, the software or instrumentation used and your research process, but some recommendations are universal:

  1. Backups should occur at regular intervals and whenever major changes are made.
  2. Store your backups on multiple types of storage in multiple locations. The 3-2-1 Backup Method, linked below, describes this in more detail.
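
The two recommendations above can be sketched in a few lines of Python. The 3-2-1 approach is commonly summarised as three copies of the data, on two different types of storage, with one copy kept off-site. The sketch below is an illustration only, not a University-provided tool: the function names sha256_of and back_up are hypothetical, and the backup locations are whatever folders you pass in (for example a synced cloud folder and an external drive). It copies a file to each location and confirms that every copy matches the original by comparing SHA-256 checksums.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, backup_dirs):
    """Copy source into every folder in backup_dirs and verify each copy.

    backup_dirs is an assumption of this sketch: a list of folders that
    should sit on different types of storage, with at least one off-site.
    """
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for directory in backup_dirs:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Backup at {target} does not match {source}")
        copies.append(target)
    return copies
```

Running a script like this on a schedule, and again after major changes, covers the first recommendation; choosing where the backup folders live covers the second.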

When considering the suitability of various storage methods for your research data, you should keep in mind that reliable storage methods come with a cost - and usually the greater the storage space required, the greater the cost.

Some cloud services may provide their services free of charge if your data size requirements are low. Others may charge you by the gigabyte or terabyte once you reach a certain dataset size; others will charge by how often you upload and download the data.
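
To compare options, it can help to put rough numbers on these charging models. The Python sketch below is a simplified illustration only: the function name estimate_monthly_cost, the free allowance and both prices are hypothetical placeholders, not the rates of any real provider or of Notre Dame, and real pricing schemes are often tiered or more complex.

```python
def estimate_monthly_cost(stored_gb, downloaded_gb, free_gb,
                          price_per_gb, price_per_gb_downloaded):
    """Estimate a monthly storage bill under a simple two-part model.

    All parameters are assumptions supplied by the caller:
      stored_gb               - total data held in the service
      downloaded_gb           - data downloaded during the month
      free_gb                 - storage included free of charge
      price_per_gb            - charge per GB stored beyond the free allowance
      price_per_gb_downloaded - charge per GB downloaded
    """
    billable_storage = max(0, stored_gb - free_gb)
    storage_cost = billable_storage * price_per_gb
    transfer_cost = downloaded_gb * price_per_gb_downloaded
    return storage_cost + transfer_cost


if __name__ == "__main__":
    # Hypothetical figures: 500 GB stored, 50 GB downloaded, 100 GB free.
    cost = estimate_monthly_cost(500, 50, free_gb=100,
                                 price_per_gb=0.02,
                                 price_per_gb_downloaded=0.10)
    print(f"Estimated monthly cost: {cost:.2f}")
```

Substituting a provider's published rates and your own expected data volumes gives a first estimate to include in project planning.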

You may be asked to follow certain guidelines and processes to help Notre Dame control the cost of providing you with large amounts of storage. These might include:

  • Archiving inactive files with archiving/compression software (such as WinZip or 7-Zip) to reduce the size of the data stored.
  • Moving the original files to an archival storage location when you convert your data to a different file format, to save space.
  • Keeping only the files that are relevant to your work when using downloaded datasets, and deleting the unnecessary files/folders.
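
The first of these steps can also be scripted. The following is a minimal Python sketch using the standard zipfile module rather than WinZip or 7-Zip; the function name archive_folder is hypothetical. It compresses every file in a folder into a single .zip archive and then checks the archive for corruption before you consider removing the originals.

```python
import zipfile
from pathlib import Path


def archive_folder(folder, archive_path):
    """Compress all files under folder into a zip archive and verify it.

    Returns the list of file names stored in the archive.
    """
    folder = Path(folder)
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w",
                         compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            # Skip sub-folders and, if it lives inside folder, the archive itself.
            if not path.is_file() or path.resolve() == archive_path.resolve():
                continue
            archive.write(path, arcname=path.relative_to(folder).as_posix())
    # Re-open the archive and test every member's checksum.
    with zipfile.ZipFile(archive_path) as archive:
        corrupt = archive.testzip()
        if corrupt is not None:
            raise OSError(f"Archive member {corrupt} failed its integrity check")
        return archive.namelist()
```

How much space this saves depends on the data: plain text and CSV files usually compress well, while formats that are already compressed (such as JPEG images or video) shrink very little.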

Whatever your choice of storage location, by staying aware of what data you need to store, you can control the cost to yourself or your institution.