FAIR Computational Workflows
Adapted from the article FAIR Computational Workflows https://doi.org/10.1162/dint_a_00033:
Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products.
Workflows can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance.
These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right.
We argue that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.
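As an illustration of the provenance point above, the following is a minimal sketch (not taken from the article) of a two-step workflow that records, for each step, a checksum of the data it used and of the data it generated. The names `Workflow`, `ProvenanceRecord`, `clean` and `count_words` are hypothetical and exist only for this example; real workflow systems capture far richer provenance.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, List


def digest(value: str) -> str:
    """Return a SHA-256 checksum that identifies a piece of data."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


@dataclass
class ProvenanceRecord:
    """One provenance entry (hypothetical structure for this sketch)."""
    step: str       # name of the workflow step that ran
    used: str       # checksum of the step's input
    generated: str  # checksum of the step's output


@dataclass
class Workflow:
    """Runs steps in order and records provenance as a side effect."""
    steps: List[Callable[[str], str]]
    provenance: List[ProvenanceRecord] = field(default_factory=list)

    def run(self, data: str) -> str:
        for step in self.steps:
            result = step(data)
            self.provenance.append(
                ProvenanceRecord(step.__name__, digest(data), digest(result))
            )
            data = result  # the output of one step feeds the next
        return data


def clean(text: str) -> str:
    """Data preparation step: trim whitespace and lowercase."""
    return text.strip().lower()


def count_words(text: str) -> str:
    """Analysis step: count whitespace-separated words."""
    return str(len(text.split()))


workflow = Workflow(steps=[clean, count_words])
output = workflow.run("  FAIR Computational Workflows  ")
print(output)
print(json.dumps([asdict(r) for r in workflow.provenance], indent=2))
```

Because each record links an input checksum to an output checksum, the chain of records can be followed from the final data product back to the original input.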
This page gathers community resources and literature on FAIR Computational Workflows. Feel free to suggest a change to help improve this page!
Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020):
FAIR Computational Workflows.
Data Intelligence 2(1):108–121
https://doi.org/10.1162/dint_a_00033
Events, presentations and activities are available on WorkflowHub under the FAIR Computational Workflows team.
Related past and upcoming events:
- 2022-01-10/2022-01-13: Semantic Web Infrastructures and Resources in Healthcare and Life Science (SWAT4LS), Leiden, The Netherlands
- 2021-11-03/2021-11-11: Research Data Alliance (RDA) 18th Plenary Meeting (virtual)
- 2021-12-07/2021-12-09: FORCE2021, FORCE11 annual conference (virtual)
- 2021-11-15: 16th Workshop on Workflows in Support of Large-Scale Science, SC21: The International Conference for High Performance Computing (virtual)
  - Carole Goble: FAIR Computational Workflows (keynote)
- 2021-10-24: DaMaLOS 2021 – 2nd Workshop on Data and research objects management for Linked Open Science, co-located with ISWC 2021
  - Programme (proceedings to appear)
- 2021-06-16: GO FAIR US webinar: FAIR Workflows
  - Carole Goble (2021): Towards FAIR Workflows
  - Denis Yuen (2021): Dockstore and FAIR
  - [Video recording]
- 2021-04-07: WorkflowsRI Workflows Community Summit: Advancing the State-of-the-art of Scientific Workflows Management Systems Research and Development
- 2021-03-30/2021-04-01: Collaboration Workshop CW21 – themes FAIR Research Software, Diversity and Inclusion, Software Sustainability
- 2021-01-13: WorkflowsRI Workflows Community Summit: Bringing the Scientific Workflows Community Together
- DaMaLOS 2020 – Workshop on Data and research objects management for Linked Open Science, co-located with ISWC
- 2020-11-30: FAIR Workflows workshop at the International FAIR Convergence Symposium, 2020-11-27/2020-12-04
- 2020-09-12: Workshop on FAIR Computational Workflows, 19th European Conference on Computational Biology (ECCB 2020)
  - Combined slide deck
  - Video recording incl. talks:
    - Carole Goble: Introducing FAIR Computational Workflows [slides]
    - Sarah Cohen: A Review on the FAIR principles for computational workflows [slides]
    - Mateusz Kuzak: Toward defining and implementing FAIR for research software [slides]
    - Carole Goble: WorkflowHub and the Bioschemas profile [slides]
    - Michael Crusoe: The Common Workflow Language and CWLProv
    - Stian Soiland-Reyes: Packaging workflows with RO-Crate [slides]
    - Simone Leo: Testing workflows: Life Monitor and OpenEBench [slides]
    - Salvador Capella-Gutierrez: OpenEBench [slides]
    - Björn Grüning: FAIR computational data analysis with Galaxy [slides]
    - Alexander Peltzer: Nextflow & nf-core [slides]
    - Carole Goble: Wrapup, FAIR for workflows [slides]
- 2019-09-24: Workshop on Research Objects (RO2019) at IEEE eScience 2019
- 2018-10-29: Workshop on Research Objects (RO2018) at IEEE eScience 2018
Registries:
- WorkflowHub – a repository and registry of life science workflows
- Dockstore – sharing Docker Tools and Workflows for the Sciences
- bio.tools – registry of tools in life sciences
- nf-core – curated Nextflow workflows for bioinformatics
- Published Galaxy workflows on usegalaxy.eu and usegalaxy.org
Related projects and initiatives supporting FAIR Computational Workflows aims:
- EOSC-Life
- ELIXIR Europe
- BioExcel
- WorkflowsRI – Towards an Infrastructure for Enabling Systematic Development and Research of Scientific Workflow Management Systems
- FAIR Workflows – an NWO/eScience Center project
- FAIR for Research Software (FAIR4RS) – working group at Research Data Alliance
- UseGalaxy.eu – a European-wide Galaxy workflow platform
- BioExcel Building Blocks (biobb) – software library for interoperable biomolecular simulation workflows
Related standards for FAIR computational workflows:
- Common Workflow Language – Interoperable execution of computational workflows, supported by multiple engines and with strong support for workflow metadata
- RO-Crate – FAIR packaging of research outputs and metadata, including workflows
- Bioschemas – improve findability of FAIR life science resources on the Web, including computational workflows and computational tools
- BioCompute Objects and IEEE 2791-2020 – standard for describing workflows in regulatory sciences
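To give a flavour of what packaging a workflow with RO-Crate involves, here is a minimal sketch that builds an `ro-crate-metadata.json` document in the style of RO-Crate 1.1 and the Workflow RO-Crate conventions. The helper `workflow_crate_metadata`, the file name `workflow.cwl` and the workflow name are assumptions made for this example, and the output has not been validated against the specification, which asks for further properties (for example the workflow's programming language).

```python
import json


def workflow_crate_metadata(workflow_path: str, name: str) -> dict:
    """Build a minimal RO-Crate-style metadata document for one workflow file.

    Hypothetical helper for illustration; not part of any RO-Crate library.
    """
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: points at the root dataset.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: the crate itself, with the workflow as main entity.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "hasPart": [{"@id": workflow_path}],
                "mainEntity": {"@id": workflow_path},
            },
            {
                # The workflow file, typed as a computational workflow.
                "@id": workflow_path,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": name,
            },
        ],
    }


metadata = workflow_crate_metadata("workflow.cwl", "Example workflow")
print(json.dumps(metadata, indent=2))
```

The resulting JSON-LD would sit next to the workflow file in the crate's root directory, so that the description travels with the workflow it describes.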
Articles below are published as Open Access, or with green open access preprints where gold open access is not possible. Please let us know if you are unable to access any of our publications. To add to this list, please suggest a change.
Robin A Richardson, Remzi Celebi, Sven van der Burg, Djura Smits, Lars Ridder, Michel Dumontier, Tobias Kuhn (2021):
User-friendly Composition of FAIR Workflows in a Notebook Environment.
The Eleventh International Conference on Knowledge Capture (K-CAP 2021).
https://arxiv.org/abs/2111.00831
Paul Brack, Peter Crowther, Stian Soiland-Reyes, Stuart Owen, Douglas Lowe, Alan R Williams, Quentin Groom, Mathias Dillen, Frederik Coppens, Björn Grüning, Ignacio Eguinoa, Philip Ewels, Carole Goble (2021):
10 Simple Rules for making a software tool workflow-ready
(submitted) Zenodo
https://doi.org/10.5281/zenodo.5636487
Stian Soiland-Reyes, Peter Sefton, Mercè Crosas, Leyla Jael Castro, Frederik Coppens, José M. Fernández, Daniel Garijo, Björn Grüning, Marco La Rosa, Simone Leo, Eoghan Ó Carragáin, Marc Portier, Ana Trisovic, RO-Crate Community, Paul Groth, Carole Goble (2021):
Packaging research artefacts with RO-Crate.
Data Science (accepted)
arXiv:2108.06503v1 [cs.DL]
https://doi.org/10.5281/zenodo.5146228
Jeremy Leipzig, Daniel Nüst, Charles Tapley Hoyt, Karthik Ram, Jane Greenberg (2021):
The role of metadata in reproducible computational research
Patterns 2(1):100322
https://doi.org/10.1016/j.patter.2021.100322
Sveinung Gundersen, Sanjay Boddu, Salvador Capella-Gutierrez, Finn Drabløs, José M. Fernández, Radmila Kompova, Kieron Taylor, Dmytro Titov, Daniel Zerbino, Eivind Hovig (2021):
Recommendations for the FAIRification of genomic track metadata [version 1; peer review: 2 approved]
F1000Research 10(ELIXIR):268
https://doi.org/10.12688/f1000research.28449.1
Anna-Lena Lamprecht, Magnus Palmblad, Jon Ison, Veit Schwämmle,
Mohammad Sadnan Al Manir, Ilkay Altintas, Christopher J. O. Baker, Ammar Ben Hadj Amor, Salvador Capella-Gutierrez,
Paulos Charonyktakis, Michael R. Crusoe,
Yolanda Gil, Carole Goble, Timothy J. Griffin,
Paul Groth, Hans Ienasescu, Pratik Jagtap,
Matúš Kalaš, Vedran Kasalica, Alireza Khanteymoori,
Tobias Kuhn, Hailiang Mei, Hervé Ménager, Steffen Möller, Robin A. Richardson,
Vincent Robert, Stian Soiland-Reyes, Robert Stevens, Szoke Szaniszlo,
Suzan Verberne, Aswin Verhoeven, Katherine Wolstencroft (2021):
Perspectives on automated composition of workflows in the life sciences [version 1; peer review: awaiting peer review].
F1000Research 10:897
https://doi.org/10.12688/f1000research.54159.1
Rafael Ferreira da Silva, Henri Casanova, Kyle Chard, Dan Laney, Dong Ahn, Shantenu Jha, Carole Goble, Lavanya Ramakrishnan, Luc Peterson, Bjoern Enders, Douglas Thain, Ilkay Altintas, Yadu Babuji, Rosa Badia, Vivien Bonazzi, Taina Coleman, Michael Crusoe, Ewa Deelman, Frank Di Natale & Paolo Di Tommaso (2021):
Workflows Community Summit: Bringing the Scientific Workflows Community Together.
Workflows RI Technical Report. arXiv:2103.09181
https://doi.org/10.5281/zenodo.4606958
Stian Soiland-Reyes, Genís Bayarri, Pau Andrio, Robin Long, Douglas Lowe, Ania Niewielska, Adam Hospital (2021):
Making Canonical Workflow Building Blocks interoperable across workflow languages.
Extended abstract (in prep for Data Intelligence), Zenodo.
https://doi.org/10.5281/zenodo.4602855
Carole Goble, Stian Soiland-Reyes, Finn Bacall, Stuart Owen, Alan Williams, Ignacio Eguinoa, Bert Droesbeke, Simone Leo, Luca Pireddu, Laura Rodriguez-Navas, José Mª Fernández, Salvador Capella-Gutierrez, Hervé Ménager, Björn Grüning, Beatriz Serrano-Solano, Philip Ewels, Frederik Coppens (2021):
Implementing FAIR Digital Objects in the EOSC-Life Workflow Collaboratory.
Extended abstract (in prep for Data Intelligence), Zenodo
https://doi.org/10.5281/zenodo.4605654
Daniel S. Katz, Morane Gruenpeter, Tom Honeyman, Lorraine Hwang, Mark D. Wilkinson, Vanessa Sochat, Hartwig Anzt, Carole Goble, FAIR4RS Subgroup 1 (2021):
A Fresh Look at FAIR for Research Software.
arXiv:2101.10883 [pdf]
Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020):
FAIR Computational Workflows.
Data Intelligence 2(1):108–121
https://doi.org/10.1162/dint_a_00033
Janno Harjes, Anton Link, Tanja Weibulat, Dagmar Triebel, Gerhard Rambold (2020):
FAIR digital objects in environmental and life sciences should comprise workflow operation design data and method information for repeatability of study setups and reproducibility of results
Database 2020:baaa059
https://doi.org/10.1093/database/baaa059
Anna-Lena Lamprecht, Leyla Garcia, Mateusz Kuzak, Carlos Martinez, Ricardo Arcila, Eva Martin Del Pico, Victoria Dominguez Del Angel, Stephanie Van De Sandt, Jon Ison, Paula Andrea Martinez, Peter Mcquilton, Alfonso Valencia, Jennifer Harrow, Fotis Psomopoulos, Josep Ll. Gelpi, Neil Chue Hong, Carole Goble, Salvador Capella-Gutierrez (2020):
Towards FAIR principles for research software.
Data Science 3(1) pp. 37–59.
https://doi.org/10.3233/DS-190026
Farah Zaib Khan, Stian Soiland-Reyes, Richard O. Sinnott, Andrew Lonie, Carole Goble, Michael R. Crusoe (2019):
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv.
GigaScience 8(11):giz095
https://doi.org/10.1093/gigascience/giz095
Jeffrey M. Perkel (2019):
That's the way we flow.
Nature 573:149–150.
https://doi.org/10.1038/d41586-019-02619-z
Natalie J Stanford, Finn Bacall, Fatemeh Zamanzad Ghavidel, Martin Golebiewski, Inge Jonassen, Rune Kleppe, Olga Krebs, Hadas Leonov, Stuart Owen, Kjell Petersen, Maja Rey, Stian Soiland-Reyes, Kidane Tekle, Andreas Weidemann, Alan Williams, Ulrike Wittig, Katy Wolstencroft, Anders Goksøyr, Jacky L. Snoep, Jon Olav Vik, Wolfgang Müller, Carole Goble (2018):
FAIR Bioinformatics computation and data management: FAIRDOM and the Norwegian Digital Life initiative.
NETTAB 2018 Network Tools and Applications in Biology.
[preprint]
[preprint server]
Gil Alterovitz, Dennis A Dean II, Carole Goble, Michael R Crusoe, Stian Soiland-Reyes, Amanda Bell, Anais Hayes, Anita Suresh, Charles Hadley S King IV, Dan Taylor, KanakaDurga Addepalli, Elaine Johanson, Elaine E Thompson, Eric Donaldson, Hiroki Morizono, Hsinyi Tsang, Jeet K Vora, Jeremy Goecks, Jianchao Yao, Jonas S Almeida, Konstantinos Krampis, Krista Smith, Lydia Guo, Mark Walderhaug, Marco Schito, Matthew Ezewudo, Nuria Guimera, Paul Walsh, Robel Kahsay, Srikanth Gottipati, Timothy C Rodwell, Toby Bloom, Yuching Lai, Vahan Simonyan, Raja Mazumder (2018):
Enabling Precision Medicine via standard communication of NGS provenance, analysis, and results.
PLOS Biology. 16(12):e3000099
https://doi.org/10.1371/journal.pbio.3000099
(bioRxiv:191783)
Pablo Carbonell, Adrian J. Jervis, Christopher J. Robinson, Cunyu Yan, Mark Dunstan, Neil Swainston, Maria Vinaixa, Katherine A. Hollywood, Andrew Currin, Nicholas J. W. Rattray, Sandra Taylor, Reynard Spiess, Rehana Sung, Alan R. Williams, Donal Fellows, Natalie J. Stanford, Paul Mulherin, Rosalind Le Feuvre, Perdita Barran, Royston Goodacre, Nicholas J. Turner, Carole Goble, George Guoqiang Chen, Douglas B. Kell, Jason Micklefield, Rainer Breitling, Eriko Takano, Jean-Loup Faulon, Nigel S. Scrutton (2018):
An automated Design-Build-Test-Learn pipeline for enhanced microbial production of fine chemicals.
Communications Biology 1:66
https://doi.org/10.1038/s42003-018-0076-9
Stephen J Eglen, Ben Marwick, Yaroslav O Halchenko, Michael Hanke, Shoaib Sufi, Padraig Gleeson, R Angus Silver, Andrew P Davison, Linda Lanyon, Mathew Abrams, Thomas Wachtler, David J Willshaw, Christophe Pouzat, Jean-Baptiste Poline (2017):
Toward standard practices for sharing computer code and programs in neuroscience.
Nature Neuroscience 20, 770–773.
https://doi.org/10.1038/nn.4550 [bioRxiv preprint]
Steffen Möller, Stuart W. Prescott, Lars Wirzenius, Petter Reinholdtsen, Brad Chapman, Pjotr Prins, Stian Soiland-Reyes, Fabian Klötzl, Andrea Bagnacani, Matúš Kalaš, Andreas Tille, Michael R. Crusoe (2017):
Robust cross-platform workflows: how technical and scientific communities collaborate to develop, test and share best practices for data analysis.
Data Science and Engineering 2:232–244.
https://doi.org/10.1007/s41019-017-0050-4