Writing own Dockerfile example
grantmcdermott committed May 17, 2021
1 parent 185bb25 commit efae1ba
Showing 3 changed files with 434 additions and 99 deletions.
152 changes: 114 additions & 38 deletions 13-docker/13-docker.Rmd
@@ -61,7 +61,7 @@ knit_hooks$set(
- [Base R](#r-base)
- [RStudio+](#rstudio+)

4. [Building your own Dockerfiles & images](#building)
4. [Writing your own Dockerfiles & images](#writing)

5. [Sharing files with a container](#share)

@@ -248,9 +248,9 @@ name: examples

It should now be clear that Docker is targeted at (and used by) a bewildering array of software applications.

In the realm of economics and data science, that includes every major open-source programming language and software stack.<sup>1</sup> For example, you could download and run a [Julia container](https://hub.docker.com/_/julia/) right now if you so wished.
In the realm of economics and data science, that includes every major open-source programming language and software stack.<sup></sup> For example, you could download and run a [Julia container](https://hub.docker.com/_/julia/) right now if you so wished.

.footnote[<sup>1</sup> It's possible to build a Docker image on top of proprietary software ([example](https://github.com/mathworks-ref-arch/matlab-dockerfile)). But license restrictions make this complicated. I've rarely seen it done in practice.]
.footnote[<sup></sup> It's possible to build a Docker image on top of proprietary software ([example](https://github.com/mathworks-ref-arch/matlab-dockerfile)). But license restrictions make this complicated. I've rarely seen it done in practice.]

--

@@ -286,7 +286,7 @@ A quick note on these `docker run` flags:

# Example 1: Base R (cont.)

As promised, here is a GIF of me running the command on my system. The whole thing takes about a minute and takes me directly into an R session.
As promised, here is a GIF of me running the command on my system. The whole thing takes about a minute and launches directly into an R session.

```{r docker_r_base_gif, eval = TRUE, echo=FALSE, out.width="75%"}
knitr::include_graphics("pics/docker-r-base.gif")
@@ -372,6 +372,7 @@ Let's try the [`tidyverse`](https://hub.docker.com/r/rocker/tidyverse) image fro
*Again, this next line will take a minute or three to download and extract the first time. But the container will be ready for immediate deployment on your system thereafter.*

---
name: tverseinit
count: false

# Example 2: RStudio+ (cont.)
@@ -468,26 +469,27 @@ I can also load the **tidyverse** straight away. (We can ignore those warning me

# Example 2: RStudio+ (cont.)

To stop this container, open up a new terminal window. Grab the container ID if you've forgotten it with `$ docker ps`. Then run:
To stop this container, you would grab the container ID (e.g. with `$ docker ps`) and then run:

```{bash}
docker stop <containerid>
```

Please don't do this yet, however! I want to continue using this running container in the next section.

--

</b>
</b></br>

**Aside:** Recall that we instantiated this container as a detached/background process (`-d`).

```{bash}
docker run `-d` -p 8787:8787 -e PASSWORD=pswd123 rocker/tidyverse:4.0.0
```

If you dropped the `-d` flag and re-ran the above command, your terminal would stay open as an ongoing process. (Try this yourself.)
- Everything else would remain the same. You'd still navigate to `<IPADDRESS>:8787` to log in, etc.
- However, I wanted to mention this non-background process version because it offers another way to shut down the container: Simply type `CTRL+c` in the (same, ongoing process) Terminal window. Again, try this yourself.
- Confirm that the container is stopped by running `$ docker ps`.
If you dropped the `-d` flag and re-ran the above command, your terminal would stay open as an ongoing process. (Try this for yourself later.)
- Everything else would stay the same. You'd still log in at `<IPADDRESS>:8787`, etc.
- However, I wanted to mention this non-background process version because it offers another way to shut down the container: Simply type `CTRL+c` in the (same, ongoing process) Terminal window. Again, try this for yourself later.

---

@@ -525,37 +527,43 @@ All of which provides a nice segue to our next section...

---
class: inverse, center, middle
name: building
name: writing

# Building your own Dockerfiles & images
# Writing your own Dockerfiles & images
<html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html>

---

# Add to an existing container

The easiest way to start building our own Docker images is by layering on top of an existing container.
The easiest way to start writing our own Docker images is by layering on top of existing containers.
- Remember: [Like ogres](https://www.youtube.com/watch?v=aJQmVZSAqlc), Docker containers are all about layers.

--

Let's see a simple example where we add an R library to our `tidyverse:4.0.0` image. First, make sure that the container is running again:
Let's see a simple example where we add an R library to our `tidyverse:4.0.0` image. First, make sure that the container is still running. You should see something like:

```{bash}
docker ps ## Check that the tidyverse:4.0.0 container isn't still running
docker run -d -p 8787:8787 -e PASSWORD=pswd123 rocker/tidyverse:4.0.0 ## Run it
docker ps
```
```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
802dbd3841c7 rocker/tidyverse:4.0.0 "/init" 8 minutes ago Up 8 minutes 0.0.0.0:8787->8787/tcp, :::8787->8787/tcp sweet_maxwell

```

*(If you don't see something like the above, please re-start your container and then log-in to RStudio Server, using the same steps that we saw [previously](#tverseinit).)*
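
As a rough sketch, the check-and-restart logic described above could be wrapped in a small shell function. The function name `ensure_tidyverse_running` is made up for illustration; the `docker run` flags are the same ones used earlier in these slides, and the `ancestor` filter asks `docker ps` for containers started from a given image.

```bash
# Hypothetical helper (not part of Docker): start the tidyverse container
# only if one is not already running.
ensure_tidyverse_running() {
  image="rocker/tidyverse:4.0.0"
  # IDs of running containers that were created from this image
  running=$(docker ps -q --filter "ancestor=$image")
  if [ -n "$running" ]; then
    echo "Container already running: $running"
  else
    # Same command as before: detached, port 8787, password set via -e
    docker run -d -p 8787:8787 -e PASSWORD=pswd123 "$image"
  fi
}
```

Calling `ensure_tidyverse_running` either reports the existing container ID or launches a fresh container. You would still log in to RStudio Server through your browser afterwards.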

--

Then log into RStudio Server through your browser, exactly like we did [earlier](#login). Once you are in RStudio, install **data.table** like you would any normal R library.
Once you are in RStudio, install **data.table** like you would any normal R library.
- I'm not going to show this with a GIF, but either use RStudio's library installer or run `install.packages("data.table")`.
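
If you prefer to stay in the terminal, a third option is to send the same R command to the running container with `docker exec`. This is only a sketch: the helper name `install_r_package` is hypothetical, and it assumes that `Rscript` is on the container's path and that the image already has a default CRAN mirror configured (which I believe is true for the Rocker images).

```bash
# Hypothetical helper: install an R package inside a running container.
#   $1 = container ID (from `docker ps`), $2 = package name
install_r_package() {
  container="$1"
  pkg="$2"
  # Run Rscript inside the container; -e evaluates the R expression
  docker exec "$container" Rscript -e "install.packages(\"$pkg\")"
}
```

For the container shown above, the call would be `install_r_package 802dbd3841c7 data.table`. Swap in your own container ID.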

---

# Add to an existing container (cont.)

You should now have data.table installed on your running container.
Okay, data.table should now be installed on your running container.

**Question:** If you stopped your container and restarted it, would data.table still be there?

@@ -575,14 +583,14 @@ I'm going to show you how on the next slide to keep everything in one place...

# Add to an existing container (cont.)

**Step 1:** Run `docker ps` again to ID the running container. You should see something like:
**Step 1:** Run `docker ps` to ID the running container. We've already done this, but...

```{bash}
docker ps
```
```bash
```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
802dbd3841c7 rocker/tidyverse:4.0.0 "/init" 2 minutes ago Up 2 minutes 0.0.0.0:8787->8787/tcp, :::8787->8787/tcp sweet_maxwell
802dbd3841c7 rocker/tidyverse:4.0.0 "/init" 8 minutes ago Up 8 minutes 0.0.0.0:8787->8787/tcp, :::8787->8787/tcp sweet_maxwell

```
--
@@ -634,32 +642,92 @@ root@802dbd3841c7:/# htop <span class="hljs-comment">## Show all available CPU c

--

(Obviously, you'd have to commit this change to keep it.)
(Obviously, you'd now have to commit this change to add `htop` to your image.)

---

# Write your own Dockerfile(s)
# Aside: Stop your container(s)

Okay, now is a good time to stop your container if you haven't done so already. Grab your container ID and run:

```{bash}
docker stop <container-id>
```

Alternatively, you can stop all running containers with the following command:

```{bash}
docker stop $(docker ps -q)
```
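
One caveat: if no containers are running, `docker ps -q` returns nothing and `docker stop` will complain that it needs at least one argument. Here is a slightly more defensive sketch. The function name `stop_running_containers` is my own invention, not a Docker command.

```bash
# Hypothetical helper: stop every running container, but only if there is one.
stop_running_containers() {
  ids=$(docker ps -q)   # IDs of running containers, one per line
  if [ -z "$ids" ]; then
    echo "No running containers to stop."
    return 0
  fi
  # Leave $ids unquoted on purpose so that each ID becomes its own argument
  docker stop $ids
}
```

Running `stop_running_containers` is then safe regardless of whether anything is currently up.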

---

Recall that Dockerfiles are the "sheet music" of the whole operation. These are simple text files that provide the full set of instructions for building our Docker images.
# Write your own Dockerfile

Recall that `Dockerfiles` are the "sheet music" of the whole operation. These are simple text files that provide the full set of (shell) instructions for building our Docker images.

--

There is a whole host of [commands and considerations](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) for writing your own Dockerfiles... all of which I am going to elide over for this lecture. (We simply don't have the time.)
There is a whole host of [commands and considerations](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) for writing your own Dockerfiles &mdash; all of which I am going to skip for this lecture. (We simply don't have the time.)

--

BUT... I will briefly say that the [Rocker Project](https://github.com/rocker-org/rocker-versioned2#modifying-and-extending-images-in-the-new-architecture) again has our back with a bunch of ready-made scripts for building on and extending their Docker images.
BUT... I will briefly say that the Rocker Project again has our backs with a bunch of [ready-made scripts](https://github.com/rocker-org/rocker-versioned2#modifying-and-extending-images-in-the-new-architecture) for building on and extending their Docker images.

--

For example, if we wanted to modify the `tidyverse:4.0.0` image so that it also included Python, our Dockerfile would be as simple as the following two lines:
For example, if we wanted to modify the `r-ver:4.0.0` image so that it also included Python, our Dockerfile would be as simple as the following two lines:

```docker
FROM rocker/tidyverse:4.0.0
FROM rocker/r-ver:4.0.0
RUN /rocker_scripts/install_python.sh
```
</br></br>

.pull-right[*Continues on next slide.*]

---

# Write your own Dockerfile (cont.)

```docker
FROM rocker/r-ver:4.0.0
RUN /rocker_scripts/install_python.sh
```
--

Try this yourself by creating a file called `Dockerfile`<sup>†</sup> comprising the above lines.

.footnote[<sup>†</sup> By convention, every `Dockerfile` is called exactly that, so you would normally keep only one `Dockerfile` per (sub)directory.]

--

Next, build your Docker image from this `Dockerfile` using the following shell command. I'm going to call my image `r_py` and give it the "4.0.0" version stamp (both choices being optional). **Important:** Make sure that your shell/terminal is in the same directory as the `Dockerfile` when you run this command.

```{bash}
# docker build --tag <name>:<version> <directory>
docker build --tag r_py:4.0.0 .
```

--

This will take a minute to pull everything in. But your `r_py` image will be ready for immediate deployment thereafter, and now includes Python and [reticulate](https://rstudio.github.io/reticulate/).

```{bash}
docker run -it --rm r_py:4.0.0
```
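
For reference, the steps on this slide can be strung together in one short script. Treat it as a sketch under a few assumptions: the scratch directory from `mktemp -d` stands in for whatever project folder you actually use, and `BUILD_IMAGE` is a made-up switch that I've added so that the script only writes the `Dockerfile` unless you explicitly ask it to build.

```bash
# Scratch directory to act as the build context (any empty folder would do)
build_dir=$(mktemp -d)

# Write the two-line Dockerfile from the slide
cat > "$build_dir/Dockerfile" <<'EOF'
FROM rocker/r-ver:4.0.0
RUN /rocker_scripts/install_python.sh
EOF

# Show what we just wrote
cat "$build_dir/Dockerfile"

# Build and launch only when asked to (BUILD_IMAGE is a hypothetical switch)
if [ "${BUILD_IMAGE:-no}" = "yes" ]; then
  # Pass the directory explicitly instead of relying on "."
  docker build --tag r_py:4.0.0 "$build_dir"
  docker run -it --rm r_py:4.0.0
fi
```

Passing `"$build_dir"` as the final argument plays the same role as the `.` in the command above. It just means that you don't have to `cd` into the directory first.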

---

# Docker Hub: Share your Docker images

You can share your Dockerfiles and images in various ways.

- I sometimes provide Dockerfiles on the GitHub repo associated with a particular research project. This provides a convenient way for others to reproduce the same computing environment that I used for conducting my analysis. (Example [here](https://github.com/grantmcdermott/sceptic-priors#docker).)


The most popular way to share Docker images is by hosting them on [**Docker Hub**](https://hub.docker.com/).
- I'm not going to show you how to do that here. But the good news is that it's very straightforward. See [here](https://jsta.github.io/r-docker-tutorial/04-Dockerhub.html) for a quick walkthrough.

---
class: inverse, center, middle
@@ -670,6 +738,21 @@ name: share

---

# Prep: Stop all running containers


*This next section is all about sharing files and folders between your computer and a container. To avoid unexpected behaviour, it would be best to stop all running containers before continuing.*

```{bash}
docker stop $(docker ps -q)
```

--

*You're good to continue now...*

---

# Share files by mounting volumes

Each container runs in a sandboxed environment and cannot access other files and directories on your computer unless you give it explicit permission.
@@ -861,7 +944,7 @@ R users are spoilt, thanks to the Rocker Project. Easy to build our own Dockerfi
docker run -it --rm rocker/r-ver:4.0.0
```

(See next page for a list of key commands.)
*(See next slide for a list of key commands.)*

---

@@ -877,7 +960,7 @@ docker run -it --rm rocker/r-ver:4.0.0

- `docker ps` list of currently running containers

- `docker stop <containerids>` stop one or more running containers
- `docker stop <container-ids>` stop one or more running containers

- `docker images` list all installed images

@@ -900,14 +983,7 @@ docker run -it --rm rocker/r-ver:4.0.0
- [Using Docker for Data Science](https://www.robertmylesmcdonnell.com/content/posts/docker/) (Very thorough walkthrough, with a focus on composing your own Dockerfiles from scratch.)
- [ROpenSci Docker Tutorial](http://ropenscilabs.github.io/r-docker-tutorial) (Another detailed and popular tutorial, albeit outdated in parts.)

---
class: inverse, center, middle

# Next class: Google Compute Engine
<html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html>


```{r gen_pdf, include = FALSE, cache = FALSE, eval = FALSE}
```{r gen_pdf, include = FALSE, cache = FALSE, eval = TRUE}
infile = list.files(pattern = '.html')
pagedown::chrome_print(input = infile, timeout = 100)
```