In this chapter we will look at how to run a live system in a production environment and how to release your code for that system.
In order to have reliable 24/7 server we need to beable to release new features and bugfixes with as little disruption as possible to the running system. In this chapter we will learn about the tools provided by the runtime to handle releases.
The runtime comes with a very powerful set of primitives, tools and libraries for releasing and upgrading. In this chapter we will go over most of these primitives and then learn how to use some automated tools for building and deploying releases.
We will first look at how to start the Erlang runtime system in a live environment. Then we will look at the building blocks for release handling, before moving on to how to do releases in Elixir.
If you mostly use Elixir you might want to skip the information on the underlying Erlang implementation and jump to the section on Exrm (see xxx). There will probably come a time when you need to peek under Exrm and the Elixir abstractions and then you will need to know about how the underlying Erlang system handle releases. Fell free to come back then and read the following sections.
When running a live server there are a number of basic settings that
you need to get right. You do not want to start the system with a shell
attached, since the system then would go down when you close the shell.
Instead you want to start in detached mode and use one of the techniques
we learned in xxx to connect a shell to the system.
This is done by giving the flag -detached
to the erl
command.
In almost all circumstances you will also want to give a name to your
system so that it becomes a live distributed system so that you can
connect to other nodes and attach a remote shell to the node. As we saw
in the previous chapter this is done with either the flag -name
or
the flag -sname
.
In order to start your own application you also need to give at least
the module name of your application with the flag -s MODULE
.
The module init
will then start your application. We will look a
little bit more at what init does soon.
Imagine we have a small server like the one in example_server.erl
Then we can start our server node as:
> erl -detached -sname server01 -s example_server
Nice! Now we have a server running in the background. In a real world example we would probably have started this form an secure shell on a remote machine and now we can log out of the ssh session and keep our Erlang application running in the background.
One problem with this way of starting the node is that the output
to stdout
now is just redirected to /dev/null
. The tick printouts
in oue example is nowhere to be seen. This is the reason we want
to use run_erl
to get the output redirected to a log file.
We want two directories for output and logs that we create under /tmp/
for now.
> mkdir /tmp/erl_pipe
> mkdir /tmp/erl_log
> run_erl -daemon /tmp/erl_pipe/ /tmp/erl_log/ \
"exec erl -sname server01 -s example_server"
In a live server environment you might want to consider creating a
user which runs your server and then also create a directory for your
application under /var/log/
where your server can write its logs.
Note that when redirecting the output from the runtime with run_erl to
a pipe will cause a call to fsync
on each line of output. This is
often what you want, since there is not supposed to be much output to
standard out and standard error from the system and any such output
should not be lost on a crash. If your application is very chatty and
writes to these pipes all the time this might be costly, consider
rewriting your app so it writes to a proper log instead.
Now you can log in to your remote machine and start your system and then log out, go and take a break and a coffee, and your server will keep on running serving your customers, until it crashes.
You do not want to skip your coffee break just because your server crashes or hangs, fortunately the heart beat monitor Heart has you covered.
In Erlang you have supervisor that can monitor your processes, and outside the runtime you have Heart which is an external process that can monitor your node.
In most live environments you do want to use Heart, but there are some circumstances where Heart can cause problems. If your node takes a long time to restart, for example because it has to repair and load large Mnesia tables, and if the node sometime get so overloaded that the Heart ping goes unanswered. In this case the system might be just temporarily unresponsive but then Heart will kill the node and it will then be offline for the complete restart.
In most cases though, you do want to use Heart, which will act as an supervisor and restart the node if it hangs. By setting the environment variable HEART_COMMAND you can control how the system is restarted. You might want to use a separate function to restart the system from the one you use when you start the system.
We rewrite our server to save its state to disk and implement a restart function which reads the state from disk.
Now we need to set up the environment variable HEART_COMMAND
which
Heart use to restart the node if it hangs.
We want to use our new restart function when restarting:
export HEART_COMMAND='run_erl -daemon /tmp/erl_pipe/ /tmp/erl_log/ \
"exec erl -heart -sname server01 -s example_server2 restart"'
Now we can start our server:
run_erl -daemon /tmp/erl_pipe/ /tmp/erl_log/ \
"exec erl -heart -sname server01 -s example_server2"
We can now see that the counter is written to our save file, and if we kill the node it will be restarted and continue to write to the node.
> cat /tmp/example_server_counter.eterm
1.
> cat /tmp/example_server_counter.eterm
2.
> cat /tmp/example_server_counter.eterm
4.
> killall beam.smp
> cat /tmp/example_server_counter.eterm
20.
Taking a look at the log file will show the restart.
> cat /tmp/erl_log/erlang.log.1
=====
===== LOGGING STARTED Mon Feb 13 12:32:51 CET 2017
=====
heart_beat_kill_pid = 30009
Erlang/OTP 19 [erts-8.1] [source-0567896] [64-bit] [smp:10:10]
[async-threads:10] [hipe] [kernel-poll:false]
ERTS is starting in /home/happi/eserlang/Book/code
Eshell V8.1 (abort with ^G)
(server01@GDC08)1> Tick 0
Tick 1
Tick 2
Tick 3
Tick 4
Tick 5
Tick 6
Tick 7
Tick 8
Tick 9
Tick 10
Tick 11
Tick 12
Tick 13
Tick 14
Tick 15
Tick 16
heart: Mon Feb 13 12:33:09 2017: Erlang has closed.
heart: Mon Feb 13 12:33:09 2017: Executed "run_erl -daemon /tmp/erl_pipe/
/tmp/erl_log/ "exec erl -heart -sname server01
-s example_server2 restart"" -> 0. Terminating.
=====
===== LOGGING STARTED Mon Feb 13 12:33:09 CET 2017
=====
heart_beat_kill_pid = 30084
Erlang/OTP 19 [erts-8.1] [source-0567896] [64-bit] [smp:10:10]
[async-threads:10] [hipe] [kernel-poll:false]
ERTS is starting in /home/happi/eserlang/Book/code
Eshell V8.1 (abort with ^G)
(server01@GDC08)1> Tick 17
Tick 18
Tick 19
Tick 20
Tick 21
Tick 22
Tick 23
To stop a system monitored by heart you must either shut down the system
in a controlled way using for example init:stop
or you will have to kill
the heart process first for example with killall heart
.
Now we have a way to start our application and have it automatically restarted by Heart if it crashes, but what happens if the whole machine restarts. We do not want to stop our nice coffee break just to log in and restart the Erlang node. We want our application to start when the system start, also the whole command for starting is becoming a bit long. We want to wrap this into a script and add it as a service on the OS level. We will look at how to do this in the next chapter xxx.
Now that we can start and stop our node and run our server we are almost ready to make a release for our server. Before we do that we have to turn our server into a proper Erlang application though, since all release tools work on applications.
We will not go into the details of what an Erlang application is and
how to write your code as an application. Let us just assume that you
or at least the developer who wrote the server in the first place,
knew how to make an Erlang application and we have the following code.
(We could let Rebar3 generate these files for us: rebar3 new app example_server
. Then we just copy example_server.erl
into the src
directory.)
Now that we have an application that follows the OTP application format we can create a release for this application.
First we create a release file:
The release file lists which version of the Erlang runtime we want to
use and the applications and the versions we need. For a full
documentation of the release file see
erlang.org/doc/man/rel.html. You
can find the version numbers of the applications by starting Erlang
and calling the function application:which_applications().
. The
application need to be started in order for this function to list the
application so if you are starting a vanilla system without for
example sasl (System Architecture Support Libraries) and you want to
include that application you need to first start it
application:start(sasl).
Then you can list the version of the
application.
When the runtime starts it calls boot(BootArgs)
in the preloaded
module init
. The argument BootArgs
contains the command line
arguments used when starting the runtime. If theses arguments contains
-boot Name
init will execute the boot commands in the binary file
'Name.boot'. This file is generated from the human readable script
file Name.script
. The boot
file contains the commands that loads and starts all applications that
your system needs. Even if the boot script is human readable it is not
something you want to write by hand, instead we let systools
generate the boot script from the release file:
> erl
Erlang/OTP 19 [erts-8.1] [source-0567896] [64-bit] [smp:10:10]
[async-threads:10] [hipe] [kernel-poll:false]
Eshell V8.1 (abort with ^G)
1> systools:make_script("example_server", [local]).
ok
Now you should have two new files in your code directory,
example_server.script
and example_server.boot
.
With a boot script in place we do not need the flag -s Module
to tell the runtime where to start execution, instead we tell
the runtime which boot script to run. The boot script will then
start all the needed applications in the right order.
Now we can start Erlang with erl -sname server01 -boot example_server
In our tiny example with the example_server
it was quite straight forward to
write the release file by hand, but as your project grows the dependencies
will also grow quickly. Soon it will be very tedious to keep the release file
up to date. Fortunately there are a number of tools to help us with this.
There is a release tool that comes bundled with Erlang called
reltool
and
there is a good tutorial on how to use reltool
in "Learn you some
Erlang for great
good". One nice
thing with reltool
is that it contains a graphical user interface
where you can see dependencies between modules. Still, reltool
is
quite complex and far from fully automated. Instead most modern tools
and installations uses relx
.
Relx is also the tool used by modern Erlang build systems such as
erlang.mk
and Rebar3
. It is quite straight forward to generate a
release from Reabar3 if you have set up your build process with Rebar3
from the beginning. (With rebar3 new app example server
) All you need to do
is add the name of your release and the application in your release to
the rebar.config
file. (see the Rebar3
documentation) A minimal
rebar.config file could look like this:
{erl_opts, [debug_info]}.
{deps, []}.
{relx, [{release, {server01, "1.0.0"},
[example_server]}
]}.
Running rebar3 tar
would create a tarball that can be
copied to the target machine. This tarball also include
start scripts for starting and upgrading your server.
If you unpack the tarball you will get a bin directory
and in that directory you have an executable shell script
named as your release (server01
). Calling this script
as it is will start the runtime and your application with
an attached shell. As we noted before we want to start
the node in detached mode. By adding the configuration
{extended_start_script, true}
to the relx config in
rebar.config
you will get a start script that takes
an argument to let you start and stop the node.
{erl_opts, [debug_info]}.
{deps, []}.
{relx, [ {release, {server01, "1.0.0"},
[example_server]}
, {extended_start_script, true}
]}.
Now we can start the server in the background as:
> bin/server01 start
You can also stop the node and attach a shell to the node with
bin/server01 stop
and bin/server01 attach
respectively (but not in
that order of course).
[aside note The relx start scripts are somewhat fragile]
If you followed the example earlier in this book and added a printout in your `.erlang` file you will probably have to remove that printout in order to get the start scripts to work flawlessly.
[/aside]If you want to use Heart to monitor your node you need to supply the
command line parameters for the runtime in a file called vm.args
.
When you built the release Rebar3 created a default vm.args
in
_build/default/rel/server01/releases/1.0.0/vm.args
. The way easiest
way to supply your own version is to copy that file to
config/vm.args
and edit it. For example by adding the line -heart
somewhere in the file. It should then look something like:
Then you tell the build process in the relx
configuration to use this file by adding {vm_args, "config/vm.args"}
to rebar.config.
Now rebar.config looks like:
{erl_opts, [debug_info]}.
{deps, []}.
{relx, [ {release, {server01, "1.0.0"},
[example_server]}
, {extended_start_script, true}
, {vm_args, "config/vm.args"}
]}.
We will see that many of the performance tuning parameters that can
change the behavior of the runtime are passed to the virtual machine
as command line parameters. Putting these parameters in the vm.args
file will not only keep everything nicely in one (hopefully) source
controlled file, but it will also give you the opportunity to document
why you are setting this parameter.
This whole exercise might feel like quite a lot of work just to get a command to start your server. The thing is that we have really just scratched a little bit at the surface of this complexity and tried to take a very straight path through all the configuration files and settings. The reason for all this complexity is to make it possible to run Erlang in many different environments and still be able to do safe and controlled upgrades. It is actually possible to run Erlang on an tiny embedded device without a hard disk and do an upgrade over the network, as well as on a cluster of Windows machines.
If you do not want to deal with all this complexity right now there is an even easier way to do releases, by using the release tool for Elixir, the Elixir release manager - Exrm.
The Elixir Release Manager (Exrm) is built on top of relx so if you didn't skip all of the last chapter you should know a little bit about what it will produce. The nice thing with Exrm is that it uses your mix configuration to find out what your release should contain.
As long as you have created your Elixir program using
mix there is not much you need to do in order to create
a release with Exrm. You need to add Exrm as a dependency in
your mix.exs
file.
Imagine we start by creating an application with a supervisor which we let mix generate:
mix new --sup server02
We rewrite the server02 module to be a simple counter
We also rewrite application.ex
so it starts the new server:
Now all we need to do is to add a dependency on Exrm (that is {:exrm, []}
) in our mix.exs
file:
Now we can build the release tool and the release:
> mix do deps.get, deps.compile
...
> mix release
Now you should have a tarball with the release in
rel/server02/releases/0.1.0/server02.tar.gz
.
That's it, adding one line "{:exrm, []}
" to your mix.exs
is all
that is needed to make an Elixir release.
The main goal with all this release handling has been to make it easy to upgrade our running server without disturbing the service, in the next chapter we will look at how to achieve that.
In this chapter we learned how to start the runtime system without a shell and how to tell the runtime what code to run when it starts. You can either do this by giving the name of a module to start as an argument on the command line or you can provide a boot script which can start several applications in the correct order.
We also learned about how to monitor a node with Heart and how to specify how Heart restarts the system. Then we learned about how to bundle the applications that we need into a release and how to create a release more or less automatically.
We learned how the runtime uses boot scripts to load applications and
how to create them with systools
. Then we looked at relx
which automates
much of the tedious work you have to do with systools
. Finally we saw that
the Elixir release manager uses relx
and mix
to automate even more
of the release configuration process.
Your mission, should you choose to accept it, is now to try to activate the heartbeat monitor Heart for the Elixir release created in the last example.
A hint: there should be a file vm.args
somewhere which defines the
arguments given to the runtime at start up. If you can find out how it
is generated by Exrm you should be able to add the flag -heart
to it
and then Heart should be started when you start the system with
server02 start
.