Working in a reproducible and replicable manner first requires examining the research workflow and finding ways to improve the organization and shareability of its materials, ephemera, data, instruments, and reasoning.
The vast majority of the research workflow is obscured. There is little or no detail about the workflow that would allow others to reproduce the results of a study. Similarly, the reasoning behind decisions to use specific computational tools or methods may never be communicated outside the project team. Without the accumulated unpublished workflows and research outputs (e.g. data), it is difficult to perform additional analysis that expands on the research or to apply the same approach to a new idea.
Documentation is a necessary part of reproducibility and replicability in research; however, creating useful documentation means addressing some key issues, including:
- Version Control – Version control is a system for recording changes made to a file or set of files over time so that specific versions can be recalled later. Version control is especially important in reproducibility and replicability because it allows researchers to track when issues were introduced into the research and the reasoning behind each change. To learn more about version control, review the module on Open Software.
- Including appropriate meta information – From a README to a codebook, documentation about your project describes the content and structure of the project and the data and files within it. Developing README files and following metadata best practices help make projects more accessible; codebooks are intended as a self-explanatory guide to the variables in a data file. This type of documentation supports the research team while also giving those interested in replicating or reproducing the study detailed information about the data analysis within the project.
- File organization and sharing – Housing all materials related to the research study in one place that team members can access, organized in a meaningful way and following best practices, helps the team during the project and improves your ability to share it with others afterwards.
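The documentation and organization points above can be sketched in a few lines of Python. This is only an illustration, not a prescribed layout: the folder names, the `README.txt` and `codebook.csv` file names, the `scaffold_project` function, and the example variables are all assumptions made for this sketch. It creates one project folder, writes a README that describes the structure, and writes a codebook with one row per variable.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical folder layout; adapt the names to your own project.
FOLDERS = ["data/raw", "data/processed", "instruments", "scripts", "docs"]


def scaffold_project(root, title, variables):
    """Create the folders, a README, and a codebook for a project.

    `variables` maps each variable name to a (type, description) pair.
    Returns the path of the codebook that was written.
    """
    root = Path(root)
    for folder in FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)

    # README: describes the content and structure of the project.
    readme_lines = [title, "", "Folders:"]
    readme_lines += [f"- {folder}" for folder in FOLDERS]
    (root / "README.txt").write_text("\n".join(readme_lines) + "\n", encoding="utf-8")

    # Codebook: a self-explanatory guide to the variables in a data file.
    codebook = root / "docs" / "codebook.csv"
    with codebook.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["variable", "type", "description"])
        for name, (var_type, description) in variables.items():
            writer.writerow([name, var_type, description])
    return codebook


if __name__ == "__main__":
    # Made-up variables, used only to show the shape of a codebook.
    example_variables = {
        "participant_id": ("string", "Anonymized participant identifier"),
        "age": ("integer", "Age in years at time of survey"),
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = scaffold_project(Path(tmp) / "survey_study", "Survey study (example)", example_variables)
        print(path.read_text(encoding="utf-8"))
```

A folder created this way could then be placed under version control, so that changes to the README, codebook, and scripts are recorded alongside the reasons for them.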
Think back on a project, research or otherwise, from two years ago. Do you remember what you did? Do you remember the details of the project and how you completed it? Do you remember why you made the decisions you did around the project? Could you tell someone how to complete the project themselves with the same results?
Even if you have a very good memory, the answer is probably no.
We will explore increased documentation and organization in the next part of this module, Best Practices for Organizing.
Openly share materials as well as data
The most important factor in workflows for reproducibility and replicability is access to the materials that make up the research. It is difficult for research to be interrogated, reproduced, or replicated without access to the tools that created the data (e.g. research instruments such as surveys), the scripts for manipulation and analysis, a record of the decisions about the specific tools and methods used, and the data used or collected. Sharing all the elements of the research project in one location with persistent identifiers gives researchers the opportunity to engage in open workflows while supporting the scientific ideal of reproducible results. One example of a persistent identifier is the Digital Object Identifier (DOI), a unique identifier associated with an electronic object; its metadata makes the object significantly easier to find and makes it easier to track how the object has been cited.
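As a small illustration of how a persistent identifier is used in practice, a DOI can be turned into a resolvable link by prefixing it with the `https://doi.org/` resolver. The sketch below assumes a hypothetical helper named `doi_to_url` and uses the DOI from the Grüning et al. entry in the reading list; it only normalizes the written form and does not check that the DOI is actually registered.

```python
DOI_RESOLVER = "https://doi.org/"

# Common ways a DOI is written in reference lists.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi):
    """Return a resolvable link for a DOI given in any common written form."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    # Every DOI begins with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return DOI_RESOLVER + cleaned


print(doi_to_url("10.1016/j.cels.2018.03.014"))
# prints https://doi.org/10.1016/j.cels.2018.03.014
```

Storing the full link alongside shared materials means a reader can follow it directly, whichever way the DOI was originally written.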
To learn more, review the following:
- Open Textbook chapter on open science practice.
- Grüning, B., Chilton, J., Köster, J., Dale, R., Soranzo, N., et al. (2018). Practical computational reproducibility in the life sciences. Cell Systems, 6(6), 631-635. https://doi.org/10.1016/j.cels.2018.03.014
- Stodden, V. C. (2011). Trust your science? Open your data and code. https://academiccommons.columbia.edu/doi/10.7916/D8KD27BK/download
- Leek, J. T. (2016). How to be a modern scientist.
- Go further with the free “Reproducible Research” Massive Open Online Course from Johns Hopkins.
Adapted from Reproducible Research Practices Slides by Bowman, Sara D, Brian A Nosek, Andrew Sallans et al., licensed under CC0 1.0 Universal.