In the Open Access unit we discussed making publications, often one of the primary outputs of research, freely accessible. In this module, we will take a step back and explore how increasing the openness of research workflows – the processes and methodologies used to conduct the research – can help improve research outcomes by increasing the amount and quality of information placed on the public record. Having more information available about how research will be, is, or was conducted can help other researchers validate and reproduce published research results, understand how a scholar developed an idea, increase evaluation from the research community, and allow for increased public engagement. It also buffers against future loss of scholarship due to a lack of ownership, digital decay, software deprecation, and missing files.
In other sections of this unit you will learn about best practices surrounding the use of open-source software to help buffer against this loss and around open data to ensure that core research artifacts are recorded and shared in meaningful ways. But for now we will consider the steps we take to complete research work. Small decisions such as a consistent file-naming convention or making open backups of data stored in a proprietary tool can make it possible to continue work even a few years or decades down the line. These interventions are small but intentional, and they can make all the difference to future scholars.
Defining a research workflow
A research workflow is a series of steps that processes and transforms information in some way, be it through a study design or deep reading. This series of steps should ideally be repeatable or understandable by others and, in the STEM fields, must be reproducible within disciplinary expectations. In the humanities and social sciences a research workflow may be more individual and less reliant on reproducibility by others but the process one took to reach an argument, and sources that were consulted, should be transparent to others.
A workflow may be straightforward and linear or can be complex, involving the addition of subsequent steps depending on how the process is going. Being able to communicate your workflow in all of its complexity is essential for supporting the validation of published findings or enabling other scholars to understand the context behind your reasoning. It is also essential for new scholars engaging with your discipline for the first time.
A research workflow has many stages from ideation to output, and many tools come into play throughout this process. The 101 Innovations in Scholarly Communication Project surveyed tools used by researchers for different aspects of their work and has generally sorted academic workflows across disciplines into six higher-level categories: discovery, analysis, writing, publication, outreach, and assessment. Their circle of tools demonstrates the rapid growth in available workflow tools for each stage of this process.
Tools and workflow stages
Different tools support each stage of a research workflow. Some categories have a higher number of proprietary tools, which can limit our ability to see how work was done a decade or two later; others have more open tools. On the spectrum of what makes a tool open, we generally refer to “open-source” tools as the most open, but a proprietary tool that uses open formats is more open than a proprietary tool with a custom file format. Sometimes the decision on what tool to use is determined in advance for you by best practices in your discipline. We will expand on the value of open-source tools later in this unit, but for now it’s important to know that their ability to be referenced or even included wholesale, with all source code, in the record of a project provides rich context for how work was done and allows future scholars to repeat and build on what was done.
As we can see from the 101 Innovations project, for every stage of the research process there are many equally valid and useful tools to choose from, and it is possible to plan for a more open workflow by considering alternatives to what you might normally rely on. Beyond disciplinary considerations, open might also mean accessible to your community or easier to access within your project team. Using tools that enable internal openness can help you transition to open publications in the future and make sure that you keep a meaningful record of important information, whereas tools that are difficult for your team to use might get less use. Ultimately, whether a tool makes sense for you depends heavily on your context. No matter what tool you end up using, you can go a long way toward having an open workflow by ensuring that your metadata is sufficiently rich, that the tools you use are accessible to your team and your community, that you are able to export your work out of them in an open format, and that your files are predominantly stored in open file formats.
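One way to act on the point about open file formats is to list the files in a project folder that may still need an open-format export. The following is a minimal sketch, not a definitive tool: the function name `files_needing_export` and the `OPEN_EXTENSIONS` set are hypothetical, and the set is only an illustrative assumption that you would adapt to the formats your discipline treats as open.

```python
from pathlib import Path

# Illustrative assumption: extensions treated as "open" for this project.
# Adjust this set to match your own discipline's expectations.
OPEN_EXTENSIONS = {
    ".csv", ".tsv", ".txt", ".md", ".json", ".xml",
    ".html", ".odt", ".ods", ".png", ".svg",
}


def files_needing_export(project_dir):
    """Return relative paths of files whose extension is not in OPEN_EXTENSIONS."""
    root = Path(project_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")  # walk the folder and all subfolders
        if path.is_file() and path.suffix.lower() not in OPEN_EXTENSIONS
    )
```

Calling `files_needing_export("my-project")` on a hypothetical project folder returns a sorted list of candidates to export or back up in an open format. A file appearing in the list is a prompt to check, not proof that its format is closed.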
Workflows and “Open”
Now that we have a sense of what we mean when we say “workflow”, let’s consider what makes an open workflow. At its simplest, an open workflow is one in which each step of the research process is openly shared through clear documentation that makes the research project transparent and reproducible. Clear documentation includes using best practices around file naming conventions, project metadata, file formats, etc. In fact, well-curated metadata and open file formats go far in enabling the clarity and longevity of your work.
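A file naming convention is easier to keep when it can be checked automatically. The sketch below assumes a hypothetical convention of `YYYY-MM-DD_project_description_vNN.ext`, with lowercase words joined by hyphens; the pattern and the function names are illustrative only, and your team or discipline may well use a different scheme.

```python
import re

# Hypothetical convention: YYYY-MM-DD_project_description_vNN.ext
# Example: 2024-03-15_soil-survey_field-notes_v01.csv
NAME_PATTERN = re.compile(
    r"\d{4}-\d{2}-\d{2}"            # ISO-style date
    r"_[a-z0-9]+(?:-[a-z0-9]+)*"    # project name, hyphen-separated
    r"_[a-z0-9]+(?:-[a-z0-9]+)*"    # short description, hyphen-separated
    r"_v\d{2}"                      # two-digit version number
    r"\.[a-z0-9]+"                  # lowercase file extension
)


def follows_convention(filename):
    """Return True if the whole file name matches the assumed pattern."""
    return NAME_PATTERN.fullmatch(filename) is not None


def nonconforming(filenames):
    """Return the file names that do not follow the assumed convention."""
    return [name for name in filenames if not follows_convention(name)]
```

For instance, `nonconforming(["2024-03-15_soil-survey_field-notes_v01.csv", "Final draft (2).docx"])` would flag only the second name. Note that the pattern checks the shape of the date, not whether it is a real calendar date.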
It’s important to remember that openness in workflows does not mean “exposed”. Rather, it is about making sure that enough information is shareable to communicate the complex context of your work: how you got from an idea to a conclusion, an inspiration, or an output, which tools you used, and enough of the interim steps. It’s a way of reinforcing your work so that it won’t go away or become inaccessible in a few years when a tool or format stops being available.
In practice, open workflows may be context- and discipline-specific, but there are some key pieces that apply across the board. Openness for workflows generally comes down to how information is stored and used in every step of your work. For one project team, an open workflow may mean using best practices within the team with the eventual goal of opening the workflow publicly. For another, it could mean working in a fully open way where individuals outside of the project team are able to find and view project files and activity. At either end of this spectrum, the use of best practices around file naming conventions, project metadata, file formats, tools used, and many other nuances of workflow management helps to enable meaningful long-term preservation of research outputs.
Steps for getting started with open workflows
Small interventions go far. To set yourself up for an open workflow, begin planning at the outset of your project.
Consider some of the following questions:
- Are there existing open projects that can inspire your workflow?
- Are there tools with templates that can help you create a research plan or protocol; if your discipline does not create protocols, are there tools for open composition, citation sharing, or peer review that you can incorporate into your work?
- For each step in your process, are there relevant open tools and open file formats that you should keep in mind?
- As your project moves forward and you start creating outputs, how will you keep track of things? Document everything and preregister important study design and analysis information to increase transparency and counter publication bias against negative results.
- Consider which open licenses you can add to your project outputs as you collect research data, create code, and compile materials. What are the best practices in your discipline?
As you plan the steps of your work, it might be useful to centralize and organize your project management using an online platform, a central repository, or a folder for all research files. You can also self-host your files, which can be more secure but takes more planning. As soon as you are ready, plan to report and publish your methods explicitly and transparently to allow for replication. Using a central online tool such as OSF for your project can help make this process of sharing your steps easier. We’ll explore the practicalities of working with open workflows further in this module.
To learn more about open workflows, read the following:
- Weiland, S. (2018). The Scholarly Workflow in the Digital Age: What Do We Know? What Should We Do? https://docs.lib.purdue.edu/charleston/2017/scholcomm/4/
- Two pieces from 2015 on the 101 Innovations in Scholarly Communications project.
Some material in this module was adapted from the Open Science Training Handbook, CC-0.