Notre Dame 5 Star University
University Library

Research Data Management

Reuse

Starting your own research from the existing data collected by other researchers can have some major benefits:

  • much of the background work has already been completed, making it easier to undertake further research
  • it saves time and money because you avoid the cost of collecting data that already exists
  • the data comes with a degree of pre-established validity and reliability
  • potential for collaboration opportunities

However, careful consideration is required before reusing data. Make your search for data as efficient as possible by thinking about the following questions before you get started:

  • Is there enough description about the content of the data? Is the context of the research relevant?
  • Is the source trustworthy? Is it produced by a reputable organisation or researcher in the field?
  • Do you know how long the data will be stored and made available?
  • Which file formats are you able to work with? If you plan to use analytical software, check which formats it supports.

Licences and user agreements

  • Are there restrictions or conditions on reusing the data? For instance, if you plan on commercialising your research, you should avoid datasets that have non-commercial conditions in their reuse licence.
  • What will be the impact of these restrictions on your research?
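
As a rough illustration of the licence check described above, the following sketch flags candidate datasets whose licence identifier contains a non-commercial element. It assumes each dataset's licence is recorded as a Creative Commons identifier in the SPDX style (for example CC-BY-NC-4.0); the function name and the dataset names are hypothetical. It is only a first-pass filter and does not replace reading the licence or user agreement itself.

```python
def has_noncommercial_clause(licence_id: str) -> bool:
    """Return True if a Creative Commons style identifier contains the NC element.

    Assumption: identifiers look like "CC-BY-NC-4.0", with elements
    separated by hyphens. Other licence types are not interpreted.
    """
    elements = licence_id.upper().split("-")
    return "NC" in elements


# Hypothetical candidate datasets and their recorded licences.
candidates = {
    "survey_a": "CC-BY-4.0",
    "survey_b": "CC-BY-NC-4.0",
    "survey_c": "CC-BY-NC-SA-4.0",
}

# Datasets to set aside if the research is to be commercialised.
restricted = [name for name, lic in candidates.items()
              if has_noncommercial_clause(lic)]
print(restricted)
```

Splitting on hyphens, rather than searching for the substring "NC", avoids false matches on identifiers that merely contain those two letters inside another element.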

Methodology

  • What is the relationship between existing and new data?
  • How will the data be integrated?
  • How will any format differences be managed?

Considering these aspects will help you determine if the data is suitable for you to reuse and help you avoid investing effort and time in analysing data that is unsuitable. 

When reusing the data of others, it's critical to give proper attribution to the work of the original creator. This is called data citation and refers to the practice of referencing data to acknowledge its source, in the same way as referencing a book or journal article.

Citing data is important because it:

  • Acknowledges and provides credit to the originator of the data
  • Allows replication or verification of the new results and data, improving their reliability and validity
  • Enables the collection of statistics on the impact of the data (data citation metrics)

However, because the citation of data is a relatively new practice, the standards to follow are often unclear. Referencing software such as EndNote includes a template for datasets, but the generated references may need to be modified to meet other requirements.

Where requirements differ, follow this order of precedence:

  1. Any guidelines from your editor or publisher
  2. Any guidelines from your Style Guide or Publication Manual
  3. Any guidelines from the data source (either the dataset creator or the data repository)

If these requirements are unclear or informal, DataCite recommends including the following elements (note that this format follows the APA 7 style guidelines):

Creator(s). (Publication Year). Title. Version. Publisher. ResourceType. Identifier

  • Publication Year is the year in which the dataset was published (not the collection or coverage date)
  • Publisher refers to the repository or data centre where the data is stored
  • Identifier should be displayed as a linkable, permanent URL or DOI. The DOI ensures that data can be discovered online, regardless of where it is located.
  • Version and ResourceType may be added where desirable.
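
The element order above can be sketched as a small formatting helper. This is a minimal illustration rather than an official DataCite or APA tool: the class and function names are hypothetical, the sample record is invented, multiple creators are joined in an APA-like way as an assumption, and a bare DOI is turned into a link by prefixing https://doi.org/. Always check the result against your publisher's or style guide's requirements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    creators: list[str]          # e.g. ["Nguyen, A.", "Patel, R."]
    year: int                    # year the dataset was published
    title: str
    publisher: str               # repository or data centre holding the data
    identifier: str              # DOI or other permanent URL
    version: Optional[str] = None
    resource_type: Optional[str] = None


def format_citation(record: DatasetRecord) -> str:
    """Build: Creator(s). (Year). Title. Version. Publisher. ResourceType. Identifier"""
    creators = record.creators
    if len(creators) > 1:
        # Assumption: APA-like joining, "A, B, & C".
        names = ", ".join(creators[:-1]) + ", & " + creators[-1]
    else:
        names = creators[0]

    parts = [names, f"({record.year})", record.title]
    if record.version:
        parts.append(f"Version {record.version}")
    parts.append(record.publisher)
    if record.resource_type:
        parts.append(record.resource_type)

    # Display a bare DOI (starting "10.") as a linkable URL.
    identifier = record.identifier
    if identifier.startswith("10."):
        identifier = "https://doi.org/" + identifier

    # Strip trailing full stops so initials do not produce "..".
    body = ". ".join(part.rstrip(".") for part in parts)
    return f"{body}. {identifier}"


# Invented record, for illustration only.
example = DatasetRecord(
    creators=["Nguyen, A.", "Patel, R."],
    year=2021,
    title="Example coastal survey data",
    publisher="Example Data Repository",
    identifier="10.0000/example",
    version="2.0",
    resource_type="Dataset",
)
print(format_citation(example))
```

Version and ResourceType are optional in the record, so they are left out of the citation when not supplied.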

Finding datasets