Cleaning up documentation around ha.server setting
Making ha.server default to find a free port between 6001 and 6011
instead of 6361 and 6371 which overlaps with the backup port range
Updated HA tutorial to reflect those changes and general cleanup
Mark Needham committed Jul 17, 2013
1 parent d98af1b commit 9d4d3e2
Showing 3 changed files with 63 additions and 30 deletions.
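
The overlap described in the commit message can be checked with simple range arithmetic. The sketch below is only an illustration: `PortOverlapSketch` and `inRange` are hypothetical names, and the backup port 6362 is taken from the `online_backup_server=127.0.0.1:6362` line visible in the properties file of this commit.

```java
// Hypothetical helper for illustration only; not part of Neo4j.
final class PortOverlapSketch {
    // Returns true when the port lies inside the inclusive range [first, last].
    static boolean inRange(int port, int first, int last) {
        return first <= port && port <= last;
    }
}
```

Under these assumptions, `PortOverlapSketch.inRange(6362, 6361, 6371)` is true (the old default range could claim the backup port), while `PortOverlapSketch.inRange(6362, 6001, 6011)` is false for the new default range.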
87 changes: 59 additions & 28 deletions enterprise/ha/src/docs/dev/ha-setup-tutorial.asciidoc
@@ -2,59 +2,90 @@
High Availability setup tutorial
================================

This guide will help you understand the ways to configure and deploy a Neo4j High Availability cluster.
This guide will help you understand how to configure and deploy a Neo4j High Availability cluster.
Two scenarios will be considered:
The first will be configuring 3 instances to be deployed on 3 separate machines, in a setting similar to what might be encountered in a production environment.
The second will modify the former to make it possible to run a cluster of 3 instances on the same physical machine, a setup most useful during development of applications on top of Neo4j.
- Configuring 3 instances to be deployed on 3 separate machines, in a setting similar to what might be encountered in a production environment.
- Modifying the former to make it possible to run a cluster of 3 instances on the same physical machine, which is particularly useful during development.

== Background ==

A Neo4j HA cluster consists of a set of running Neo4j Enterprise instances.
Each instance must be assigned an integer ID, which serves as its identifier in the cluster.
Each instance in a Neo4j HA cluster must be assigned an integer ID, which serves as its unique identifier. At startup, a Neo4j
instance contacts the other instances specified in the +ha.initial_hosts+ configuration option.

At startup, a Neo4j instance contacts the other instances specified in the +ha.initial_hosts+ configuration option.
When it establishes a connection to any one of these, it determines the current state of the cluster and ensures that it is eligible to join the cluster.
To be eligible the Neo4j instance must be hosting the same database store as other members of the cluster (although it is allowed to be in an older state), or be a new deployment without a database store.
When an instance establishes a connection to any other, it determines the current state of the cluster and ensures that
it is eligible to join. To be eligible the Neo4j instance must host the same database store as other members of the
cluster (although it is allowed to be in an older state), or be a new deployment without a database store.

First, let's explain what settings exist and what values they accept.
[WARNING]
.Explicitly configure IP Addresses/Hostnames for a cluster
=========
Neo4j will attempt to configure IP addresses for itself in the absence of explicit configuration. However in
typical operational environments where machines have multiple network cards and support IPv4 and IPv6 it is *strongly*
recommended that the operator explicitly sets the IP address/hostname configuration for each machine in the cluster.
=========

Let's examine the available settings and the values they accept.

=== ha.server_id

+ha.server_id+ is the cluster identification for each instance. It must be a positive integer, unique among all
Neo4j instances in the cluster.
+ha.server_id+ is the cluster identifier for each instance. It must be a positive integer and must be unique among
all Neo4j instances in the cluster.

For example, +ha.server_id=1+.

=== ha.cluster_server

+ha.cluster_server+ is an address/port setting that specifies where the Neo4j instance will listen for cluster
communications. This is related with the +ha.initial_hosts+ setting. The default is +0.0.0.0:5001+, which is
suitable for most deployments.

=== ha.server
communications (like heartbeat messages). The default port is +5001+. In the absence of a specified IP address, Neo4j
will attempt to find a valid interface for binding. While this behavior typically results in a well-behaved server, it
is *strongly* recommended that users explicitly choose an IP address bound to the network interface of their choosing
to ensure a coherent cluster deployment.

+ha.server+ is an additional address/port setting that specifies where the Neo4j instance will listen for content. It
must be different from ha.cluster_server, typically with a different port. The default is +0.0.0.0:6361+, which is
suitable for most deployments.
For example, +ha.cluster_server=192.168.33.22:5001+ will listen for cluster communications on the network interface
bound to the 192.168.33.0 subnet on port 5001.

=== ha.initial_hosts

+ha.initial_hosts+ is a comma separated list of hostname/port pairs, which specify how to reach other Neo4j instances
+ha.initial_hosts+ is a comma separated list of address/port pairs, which specify how to reach other Neo4j instances
in the cluster (as configured via their +ha.cluster_server+ option). These hostname/ports will be used when the Neo4j
instances starts, to allow it up to find and join the cluster. Note the specifying the instances own address is
permitted.
instance starts, to allow it to find and join the cluster. Specifying an instance's own address is permitted.

[WARNING]
====
Do *not* use any whitespace in this configuration option.
====

For example, +ha.initial_hosts=192.168.33.22:5001,192.168.33.21:5001+ will attempt to reach Neo4j instances listening on
192.168.33.22 on port 5001 and 192.168.33.21 on port 5001 on the 192.168.33.0 subnet.
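
As a rough illustration of the format described above, the following sketch splits an +ha.initial_hosts+ value into address/port pairs and rejects any value containing whitespace. It is not Neo4j's own parser; `InitialHostsSketch` and `parse` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not Neo4j's implementation.
final class InitialHostsSketch {
    // Splits "host1:port1,host2:port2" into {host, port} string pairs.
    static List<String[]> parse(String value) {
        // The tutorial warns against any whitespace in this option.
        if (value.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("ha.initial_hosts must not contain whitespace");
        }
        List<String[]> hosts = new ArrayList<>();
        for (String entry : value.split(",")) {
            int colon = entry.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("expected <address>:<port>, got " + entry);
            }
            hosts.add(new String[] { entry.substring(0, colon), entry.substring(colon + 1) });
        }
        return hosts;
    }
}
```

With the example value above, `InitialHostsSketch.parse("192.168.33.22:5001,192.168.33.21:5001")` yields two pairs, whereas a value such as `"a:1, b:2"` is rejected because of the space after the comma.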

=== ha.server

+ha.server+ is an address/port setting that specifies where the Neo4j instance will listen for transactions
(changes to the graph data) from the cluster master. The default port is +6001+. In the absence of a specified IP address, Neo4j will attempt
to find a valid interface for binding. While this behavior typically results in a well-behaved server, it is *strongly* recommended that
users explicitly choose an IP address bound to the network interface of their choosing to ensure a coherent cluster topology.

+ha.server+ must use a different port from +ha.cluster_server+.

For example, +ha.server=192.168.33.22:6001+ will listen for transactions from the cluster master on the network interface
bound to the 192.168.33.0 subnet on port 6001.

[TIP]
.Address/port format
==================
The options +ha.cluster_server+ and +ha.server+ are specified as +<IP address>:<port>+. The IP address MUST be the
address assigned to one of the servers network interfaces, or the value +0.0.0.0+, which will cause Neo4j to listen
on every network interface.
The +ha.cluster_server+ and +ha.server+ configuration options are specified as +<IP address>:<port>+.
For +ha.server+ the IP address MUST be the address assigned to one of the host's network interfaces.
For +ha.cluster_server+ the IP address MUST be the address assigned to one of the host's network interfaces,
or the value +0.0.0.0+, which will cause Neo4j to listen on every network interface.
Either the address or the port can be omitted, in which case the default for that part will be used. If the address
is omitted, then the port must be preceeded with a colon (eg. +:5001+).
is omitted, then the port must be preceded with a colon (eg. +:5001+).
The port can also be configured as a range, like so: +<hostname>:<first port>[-<second port>]+. In this case, Neo4j
will test each port in sequence, and select the first that is unused. Note that this usage is not permitted when the
hostname is specified as +0.0.0.0+ (the "all interfaces" address).
The syntax for setting the port range is: +<hostname>:<first port>[-<second port>]+. In this case, Neo4j will test
each port in sequence, and select the first that is unused. Note that this usage is not permitted when the hostname is specified
as +0.0.0.0+ (the "all interfaces" address).
==================
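
The port range behavior described in the tip can be sketched as follows. This is a minimal illustration under stated assumptions, not Neo4j's implementation: `PortRangeSketch` and its methods are hypothetical names, the sketch assumes the port part is always present, and it treats a port as "unused" if a `ServerSocket` can be bound to it at the moment of the check.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; not Neo4j's implementation.
final class PortRangeSketch {
    final String host; // empty when the address part is omitted, e.g. ":6001-6011"
    final int first;
    final int last;

    PortRangeSketch(String host, int first, int last) {
        this.host = host;
        this.first = first;
        this.last = last;
    }

    // Parses "<hostname>:<first port>[-<second port>]".
    static PortRangeSketch parse(String value) {
        int colon = value.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected <hostname>:<port>, got " + value);
        }
        String host = value.substring(0, colon);
        String ports = value.substring(colon + 1);
        int dash = ports.indexOf('-');
        int first = Integer.parseInt(dash < 0 ? ports : ports.substring(0, dash));
        int last = dash < 0 ? first : Integer.parseInt(ports.substring(dash + 1));
        if (last < first) {
            throw new IllegalArgumentException("second port must not be below first port");
        }
        // The tip states that a range is not permitted with the "all interfaces" address.
        if (dash >= 0 && host.equals("0.0.0.0")) {
            throw new IllegalArgumentException("port range not permitted with 0.0.0.0");
        }
        return new PortRangeSketch(host, first, last);
    }

    // Tests each port in sequence and returns the first that can be bound, or -1 if none can.
    int firstFreePort() {
        for (int port = first; port <= last; port++) {
            try (ServerSocket socket = new ServerSocket()) {
                socket.bind(host.isEmpty()
                        ? new InetSocketAddress(port)
                        : new InetSocketAddress(host, port));
                return port;
            } catch (IOException e) {
                // Port is in use or cannot be bound on this address; try the next one.
            }
        }
        return -1;
    }
}
```

For instance, `PortRangeSketch.parse(":6001-6011")` produces an empty host with `first` 6001 and `last` 6011, and `firstFreePort()` then probes the ports in ascending order. The result of the probe depends on which ports are free on the machine at that moment.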

== Getting started: Setting up a production cluster ==
@@ -65,7 +65,7 @@ public class HaSettings
setting( "ha.max_concurrent_channels_per_slave", INTEGER, "20", min( 1 ) );

@Description( "Where to bind High Availability protocol server" )
public static final Setting<HostnamePort> ha_server = setting( "ha.server", HOSTNAME_PORT, ":6361-6371" );
public static final Setting<HostnamePort> ha_server = setting( "ha.server", HOSTNAME_PORT, ":6001-6011" );

@Description("Whether this instance should only participate as slave in cluster. If enabled it will never be elected as master")
public static final Setting<Boolean> slave_only = setting( "ha.slave_only", BOOLEAN, Settings.FALSE );
@@ -32,6 +32,8 @@ online_backup_server=127.0.0.1:6362


# Uncomment and specify these lines for running Neo4j in High Availability mode.
# See the High availability setup tutorial for more details on these settings
# http://docs.neo4j.org/chunked/stable/ha-setup-tutorial.html

# ha.server_id is a unique integer for each instance of the Neo4j database in
# the cluster (as opposed to the coordinator instance IDs).
@@ -50,7 +52,7 @@ online_backup_server=127.0.0.1:6362
# cluster members, so different members can have different communication ports.
# Avoid localhost due to IP resolution issues on some systems.
# Addresses w/o host (:<port>) will use the localhost IP.
#ha.server=0.0.0.0:6001
#ha.server=:6001

# IP and port for this instance to bind to, for communicating cluster
# information with the rest of the instances. This will be communicated to the