“Research data are data that are used as primary sources to support technical or scientific enquiry, research, scholarship, or creative practice, and that are used as evidence in the research process and/or are commonly accepted in the research community as necessary to validate research findings and results. Research data may be experimental data, observational data, operational data, third party data, public sector data, monitoring data, processed data, or repurposed data. What is considered relevant research data is often highly contextual, and determining what counts as such should be guided by disciplinary norms.”– Frequently Asked Questions Tri-Agency Research Data Management Policy
Use: FileNm_Guidelines_20140409_v01.docx
Don’t Use: FileNm_Guidelines_20140409_Review.docx AND FileNm_Guidelines_20140409_Investigation.docx
Why? Because two years from now, you won’t remember what you meant.
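The naming pattern in the example above (project, topic, date stamp, two-digit version) can be assembled programmatically so that every file follows the same convention. The following is a minimal sketch, not part of the UBC guide; the function name `build_file_name` and its parameters are hypothetical.

```python
from datetime import date


def build_file_name(project, topic, when, version, extension):
    """Assemble Project_Topic_YYYYMMDD_vNN.ext from its parts.

    Hypothetical helper: the field order mirrors the example file name
    in the guide, but the function itself is an illustration only.
    """
    stamp = when.strftime("%Y%m%d")  # date stamp that sorts chronologically
    return f"{project}_{topic}_{stamp}_v{version:02d}.{extension}"


print(build_file_name("FileNm", "Guidelines", date(2014, 4, 9), 1, "docx"))
# prints FileNm_Guidelines_20140409_v01.docx
```

Zero-padding the version number keeps v02 sorted before v10 in a file listing.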
A good file naming system will replace an extensive folder hierarchy. Limit the number of nested folders and strive to make hierarchies as simple as possible. Complex folder hierarchies are harder to navigate and offer more opportunities for filing errors. System back-ups may take longer.
Use: F:/Env/LIBR/DataMgmt_FileFormats_20140409_v01.docx
Don’t Use: F:/Environment/Library/Woodward/Data/Education/Materials/Draft/2014/04/-DataMgmt_FileFormats_20140409_v01.docx
Why? Because complex folder hierarchies are harder to navigate and offer more opportunities for filing errors. System back-ups may take longer.
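One way to spot overly deep hierarchies is to count how many folders sit between the drive and the file. The sketch below is an illustration under stated assumptions: `folder_depth` is a hypothetical helper, and the paths are treated as Windows-style paths because the examples use a drive letter.

```python
from pathlib import PureWindowsPath


def folder_depth(path_text):
    """Count the folders between the drive root and the file itself."""
    parent = PureWindowsPath(path_text).parent
    # parent.parts includes the drive anchor (e.g. 'F:\\'), so leave it out.
    return len(parent.parts) - (1 if parent.anchor else 0)


shallow = "F:/Env/LIBR/DataMgmt_FileFormats_20140409_v01.docx"
deep = ("F:/Environment/Library/Woodward/Data/Education/Materials/"
        "Draft/2014/04/-DataMgmt_FileFormats_20140409_v01.docx")

print(folder_depth(shallow))  # prints 2
print(folder_depth(deep))     # prints 9
```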
From the UBC Guide for Organizing data
Open (i.e., non-proprietary) file formats are preferred when possible because anyone can use them, which supports interoperability and helps ensure that others can access and reuse your data in the future. The UK Data Service provides a table of recommended and acceptable file formats for various types of data.
TIFF version 6 uncompressed (.tif)
Free Lossless Audio Codec (FLAC) (.flac)
It is important to keep track of different copies or versions of files, files held in different formats or locations, and information cross-referenced between files. This process is called 'version control'. Logical file structures, informative naming conventions, and clear indications of file versions all contribute to better use of your data during and after your research project.
File names should contain information (e.g., date stamps, participant codes, version numbers, location) that helps you sort and search for files and identify their content and correct version. In practice, version control means saving a new version of each file you modify and retaining the older versions.
Good data organization practices minimize confusion when changes to data are made across time, from different locations, and by multiple people. Read more on file naming and version control at UBC Library, and UK Data Service.
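Saving a new version while retaining the older ones amounts to finding the highest existing version number and incrementing it. The sketch below assumes file names that end in `_vNN` before the extension, as in the earlier example, and that all names passed in belong to the same document; `next_version_name` is a hypothetical helper, and keeping the date stamp unchanged is an assumption made for simplicity.

```python
import re

# Matches names such as FileNm_Guidelines_20140409_v01.docx
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<num>\d{2})(?P<ext>\.[^.]+)$")


def next_version_name(existing_names):
    """Return the name for the next version, leaving older versions untouched."""
    matches = [m for m in map(VERSION_PATTERN.match, existing_names) if m]
    if not matches:
        raise ValueError("no names ending in _vNN.ext were found")
    latest = max(matches, key=lambda m: int(m.group("num")))
    number = int(latest.group("num")) + 1
    return f"{latest.group('stem')}_v{number:02d}{latest.group('ext')}"


versions = [
    "FileNm_Guidelines_20140409_v01.docx",
    "FileNm_Guidelines_20140409_v02.docx",
]
print(next_version_name(versions))
# prints FileNm_Guidelines_20140409_v03.docx
```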
Here are some recommended conventions:
Metadata is data about data or “documentation that describes data” (Cornell University). It is “structured data about anything that can be named, such as Web pages, books, journal articles, images, songs, products, processes, people (and their activities), research data, concepts, and services.” (DCMI website). Metadata makes it possible for others to understand how your data was collected, what it means, what it can be used for, and how to interpret it.
Documentation involves recording important pieces of information about the dataset structure and contents. Project-level metadata can include basic information about your project (e.g., title, funder, principal investigator, other people involved in the project and their roles, etc.), research design (e.g., background, research questions, aims, artists or artwork informing your project, etc.) and methodology (e.g., description of artistic process and materials, interview guide, transcription process, etc.). Item-level metadata should include basic information about creative outputs and their documentation (e.g., creator, date, subject, copyright, file format, equipment used for documentation, etc.). This information can be entered in a _README file in the root folder of your dataset.
A README is a portable, durable way to provide information to other researchers about how to use your dataset.
A README is a guide to your dataset and is usually a plain text file to maximize its usability and long-term preservation potential. The purpose of a README is to help other researchers understand your dataset, its contents, provenance, and licensing, and how to interact with it. README files are generally named _README, _readme.txt or _read-me.md and are included as a component of a dataset.
A README complements but does not replace the metadata that data repositories ask you to provide when you deposit your data. The best practice is to record information in both the repository’s metadata and the README. The repository’s metadata will support findability within and between data repositories while the README is portable and continues to describe the dataset after it has been separated from its original context. In all cases, you should use any conventions appropriate to your discipline to record the information about your dataset.
Content from the UBC Quick Guide to Creating a README File, version 1.2 (CC-BY)
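As an illustration of the README guidance above, project-level information can be written to a plain text file in the root folder of a dataset. This is a minimal sketch and not a template from the UBC guide: the helper names, the field labels, and all example values are hypothetical placeholders, and the file name `_readme.txt` is one of the naming options mentioned above.

```python
from pathlib import Path
import tempfile


def render_readme(fields):
    """Render label/value pairs as plain 'Label: value' lines."""
    return "\n".join(f"{label}: {value}" for label, value in fields.items()) + "\n"


def write_readme(dataset_root, fields):
    """Write the README into the dataset's root folder and return its path."""
    path = Path(dataset_root) / "_readme.txt"
    path.write_text(render_readme(fields), encoding="utf-8")
    return path


# Placeholder values only; replace them with your own project details.
example_fields = {
    "Title": "Example project title",
    "Principal investigator": "Name of the principal investigator",
    "Funder": "Name of the funder",
    "Research design": "Background, research questions, and aims",
    "Methodology": "Description of process and materials",
    "Licensing": "Licence that applies to the dataset",
}

# A temporary folder stands in for the dataset root in this example.
with tempfile.TemporaryDirectory() as root:
    readme_path = write_readme(root, example_fields)
    print(readme_path.read_text(encoding="utf-8"))
```

Plain text keeps the file readable without special software, which matches the preservation goal described above.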
A good README guide is available from Cornell University.
Core elements of any README include:
Controlled vocabularies are a kind of metadata standard that features a set of expert-curated preferred terms used for indexing or searching within a particular subject domain. Some forms of controlled vocabularies are term lists, authority files, taxonomies, and thesauri.
Controlled vocabulary terms improve search results in two ways:
Using controlled vocabularies when creating data or metadata supports accuracy, consistency, and interoperability. There are well-established vocabularies for a variety of subjects, including personal and corporate names, geographic names, topics, concepts, resource types and genres, and languages.
Content from the KPU Research Data Management Guide.
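To illustrate how a controlled vocabulary supports consistency, the sketch below maps free-text variants to a single preferred term. The term list is invented for this example and is not taken from any published vocabulary; `PREFERRED_TERMS` and `normalize_term` are hypothetical names.

```python
# Hypothetical term list: each variant points to one preferred term.
PREFERRED_TERMS = {
    "photo": "photographs",
    "photos": "photographs",
    "photograph": "photographs",
    "photographs": "photographs",
    "painting": "paintings",
    "paintings": "paintings",
}


def normalize_term(term):
    """Return the preferred term, or None when the term is not in the list."""
    return PREFERRED_TERMS.get(term.strip().lower())


print(normalize_term(" Photo "))    # prints photographs
print(normalize_term("sculpture"))  # prints None
```

Returning None for unknown terms makes it easy to flag entries that need review instead of silently accepting them.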
SOURCE | CONTENT | URL |
---|---|---|
Cataloging Cultural Objects (CCO) | describe, document, and catalog cultural artifacts (like art and architecture) and visual media that represent them | |
Dublin Core (DCMI Schemas) | general purpose, widely used schema that can be used in combination with metadata terms from other, compatible vocabularies in the context of application profiles | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/ |
Getty Research Institute Vocabularies | geographic names, art & architecture, cultural objects, artist names | |
VRA Core | a data standard for the description of images and works of art and culture | |
A metadata standard is a set of established categories you can use to describe your data. It’s recommended that you use one to help ensure your metadata is consistent, structured, and machine-readable, which is essential for depositing data in repositories and making them easily discoverable by search engines.
For more help finding a suitable metadata standard, you can contact the ECU library or reach out to the Portage DMP Coordinator at support@portagenetwork.ca.
The ECU Library uses the DataCite metadata schema, which maps to Dublin Core and DDI, two widely used general metadata standards.
Additionally, there are discipline-specific standards used by museums and galleries that may be useful to describe artworks or design objects, etc. at the item level (e.g., CCO, VRA Core). You can also explore arts-specific data repositories at re3data.org to see what metadata standards they use.
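To show what structured, machine-readable metadata can look like, the sketch below records item-level information using Dublin Core element names (title, creator, date, subject, format, rights). It is an illustration only: the wrapping `record` element, the helper name `dublin_core_record`, and all values are hypothetical, and a repository may require a different schema or serialization.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element with one Dublin Core child per field."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Placeholder values only; replace them with your own item details.
item = {
    "title": "Example artwork title",
    "creator": "Name of the creator",
    "date": "2014-04-09",
    "subject": "Example subject term",
    "format": "image/tiff",
    "rights": "Copyright statement for the item",
}

print(ET.tostring(dublin_core_record(item), encoding="unicode"))
```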
library@ecuad.ca 604-844-3840 520 East 1st Avenue, Vancouver, BC