\documentclass[10pt,oneside]{article}
% \input{bio}
\usepackage[english]{babel}
%%%%%%%%%%%%%
\setlength{\textheight}{8.75in} %Letter is 11in, less 2 for margins, less 0.25 for footer
\setlength{\oddsidemargin}{0.0in} %gets +1inch
\setlength{\evensidemargin}{0.0in} %gets +1inch
\setlength{\textwidth}{6.50in} %Letter is 8.5, less 2 inches for margins
\setlength{\topmargin}{0.5in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\parindent}{0.25in}
%%%%%%%%%%%%
% \usepackage[numbers]{natbib}
% \usepackage{graphicx}
\usepackage[colorlinks=true,citecolor=black,urlcolor=blue]{hyperref}
\title{Reproducible bioinformatics software with GNU Guix}
\author{
\underline{Pjotr Prins}\footnote{University Medical Center Utrecht, The Netherlands, Email: [email protected]},
Ben Woodcroft\footnote{University of Queensland, Australia},
Ricardo Wurmus\footnote{Max-Delbr\"{u}ck-Centrum f\"{u}r Molekulare Medizin (MDC), Germany }
}
\begin{document}
%\pagestyle{empty}
%\thispagestyle{empty}
\maketitle
%\pagestyle{empty}
\thispagestyle{empty}
\vspace{-0.2in}
\noindent
Website: \url{https://www.gnu.org/software/guix/packages/} \\
Repository: \url{https://git.savannah.gnu.org/cgit/guix.git/} \\
License: GPL3 \\

Anyone who has been bitten by dependency problems and would like a
fully reproducible software stack should take note. With GNU Guix we
lost the fear of combining programming languages and of binary
deployment, because all dependencies, including command-line
invocations of tools, are guaranteed to work.

In this talk I will share our experience of packaging, deploying,
publishing and distributing, via GNU Guix, a complex web service with
hundreds of dependencies that runs on multiple servers under
\hbox{\url{http://genenetwork.org/}}. With GeneNetwork we are creating
an environment in which people can do genetics on their laptop through
a front-end API, e.g., for R and Python, or through the browser. It
uses content-addressable storage, such as Arvados Keep, reproducible
software deployment with GNU Guix, and analysis through reproducible
pipelines, such as PBS and CWL. We are even using GNU Guix to deploy
pipelines on the ORNL Beacon supercomputer.

HPC environments, and supercomputers in particular, come with their
own bag of challenges when it comes to software deployment. As
scientists we often do not get root access, which means that we either
depend on whatever software is available or build software in a
dedicated directory, using tools such as Brew or Conda, or even from
source. Unfortunately, these solutions depend on tools already
installed by the underlying distribution, often proprietary or dated
compilers, and, for example, modules. Any binary that gets produced
therefore tends to be unique, both in the generated binary itself and
in its set of dependencies. This is bad, both for troubleshooting and
for pursuing reproducible science.

With GNU Guix we have packaged more than 300 R packages, 400 Python
packages, 500 Perl packages and 150 Ruby packages, including some 200
bioinformatics-specific packages, among them some rather
difficult-to-package tools such as Sambamba. Thanks to the GNU Guix
community it is the largest ongoing bioinformatics packaging effort
next to Debian BioMed and Bioconda.

I will discuss the work on GNU Guix `channels', reproducible build
systems and non-root installations, and the way forward on putting
Guix in containers and using workflow engines, so that jobs can run on
distributed systems such as Arvados. I will also explain how GNU Guix
differs from distributions such as Debian BioMed, how it can happily
be deployed on any existing distribution, why it does not actually
need containers, and how it can be part of Bioconda.
\end{document}