Cluster example out of date #111

Open
candlerb opened this issue Apr 3, 2018 · 9 comments
Comments

@candlerb
Contributor

candlerb commented Apr 3, 2018

The cluster example uses command line flags which are no longer valid:

  • Error: unknown flag: --debug
  • Error: unknown flag: --log-dir (should be --data-dir ?)
  • Error: unknown flag: --prometheus-addr
  • Error: unknown flag: --serf-members (should be --join or --join-wan ?)

So I tried running it like this:

./jocko broker --data-dir=/tmp/jocko0 --broker-addr=127.0.0.1:9001 --raft-addr=127.0.0.1:9002 --serf-addr=127.0.0.1:9003 --id=1 >broker0.out 2>&1 &
./jocko broker --data-dir=/tmp/jocko1 --broker-addr=127.0.0.1:9101 --raft-addr=127.0.0.1:9102 --serf-addr=127.0.0.1:9103 --join=127.0.0.1:9003 --id=2 >broker1.out 2>&1 &
./jocko broker --data-dir=/tmp/jocko2 --broker-addr=127.0.0.1:9201 --raft-addr=127.0.0.1:9202 --serf-addr=127.0.0.1:9203 --join=127.0.0.1:9003 --id=3 >broker2.out 2>&1 &

These options are accepted, but no broker is listening on ports 9001, 9101 or 9201, and serf is not listening on 9003, 9103 or 9203.

# netstat -natp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:9002          0.0.0.0:*               LISTEN      1413/jocko
tcp        0      0 127.0.0.1:9102          0.0.0.0:*               LISTEN      1422/jocko
tcp        0      0 127.0.0.1:9202          0.0.0.0:*               LISTEN      1430/jocko
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      436/sshd
tcp6       0      0 :::8301                 :::*                    LISTEN      1413/jocko
tcp6       0      0 :::36275                :::*                    LISTEN      1413/jocko
tcp6       0      0 :::22                   :::*                    LISTEN      436/sshd
# netstat -naup
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
udp        0      0 127.0.0.1:51041         127.0.0.1:6831          ESTABLISHED 1413/jocko
udp        0      0 0.0.0.0:68              0.0.0.0:*                           250/dhclient
udp        0      0 127.0.0.1:41643         127.0.0.1:6831          ESTABLISHED 1430/jocko
udp        0      0 127.0.0.1:43866         127.0.0.1:6831          ESTABLISHED 1422/jocko
udp6       0      0 :::8301                 :::*                                1413/jocko
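
The same check can be scripted instead of reading netstat output by eye. The following is a minimal sketch, not part of jocko: `isListening` is a hypothetical helper that simply tries a TCP connection to each address. It covers only the TCP side, so it says nothing about serf's UDP socket.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// isListening is a hypothetical helper: it reports whether a TCP connection
// to addr can be established within a short, fixed timeout.
func isListening(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, 500*time.Millisecond)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Broker, raft and serf addresses passed to the first broker above.
	for _, addr := range []string{"127.0.0.1:9001", "127.0.0.1:9002", "127.0.0.1:9003"} {
		fmt.Printf("%s listening: %v\n", addr, isListening(addr))
	}
}
```

With the netstat output above, only the raft address (9002) would be expected to report true.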

Captured output:

==> broker0.out <==
2018/04/03 20:58:05 Initializing logging reporter
2018-04-03T20:58:05.431Z	INFO	jocko/broker.go:107	hello	{"id": 1, "broker addr": "", "serf addr": "127.0.0.1:9003", "raft addr": "127.0.0.1:9002", "id": 0, "raft addr": "127.0.0.1:9002"}
2018/04/03 20:58:05 [INFO] raft: Initial configuration (index=0): []
2018/04/03 20:58:05 [INFO] raft: Node at 127.0.0.1:9002 [Follower] entering Follower state (Leader: "")
2018/04/03 20:58:05 [INFO] serf: EventMemberJoin: builder ::
2018/04/03 20:58:05 [WARN] serf: Failed to re-join any previously known node
2018-04-03T20:58:05.528Z	INFO	jocko/serf.go:66	adding LAN server	{"id": 1, "broker addr": "", "serf addr": "127.0.0.1:9003", "raft addr": "127.0.0.1:9002", "id": 0, "raft addr": "127.0.0.1:9002", "meta": {"ID":0,"Name":"","Bootstrap":false,"Expect":0,"NonVoter":false,"Status":1,"RaftAddr":"127.0.0.1:9002","SerfLANAddr":"%!b(string=127.0.0.1:9003):8301","BrokerAddr":"127.0.0.1:9001"}}
2018-04-03T20:58:05.529Z	INFO	jocko/server.go:71	hello	{"id": 1, "broker addr": "", "serf addr": "127.0.0.1:9003", "raft addr": "127.0.0.1:9002", "node id": 0, "addr": "127.0.0.1:9001"}
2018/04/03 20:58:06 [WARN] raft: no known peers, aborting election

==> broker1.out <==
2018/04/03 20:58:43 Initializing logging reporter
2018-04-03T20:58:43.325Z	INFO	jocko/broker.go:107	hello	{"id": 2, "broker addr": "", "serf addr": "127.0.0.1:9103", "raft addr": "127.0.0.1:9102", "id": 0, "raft addr": "127.0.0.1:9102"}

==> broker2.out <==
2018/04/03 20:59:13 Initializing logging reporter
2018-04-03T20:59:13.172Z	INFO	jocko/broker.go:107	hello	{"id": 3, "broker addr": "", "serf addr": "127.0.0.1:9203", "raft addr": "127.0.0.1:9202", "id": 0, "raft addr": "127.0.0.1:9202"}

I tried running the first process under strace. Here are all the lines matching htons:

connect(4, {sa_family=AF_INET, sin_port=htons(6831), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
getsockname(4, {sa_family=AF_INET, sin_port=htons(59060), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
getpeername(4, {sa_family=AF_INET, sin_port=htons(6831), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
bind(6, {sa_family=AF_INET, sin_port=htons(9002), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
getsockname(6, {sa_family=AF_INET, sin_port=htons(9002), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
bind(10, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
bind(11, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
bind(10, {sa_family=AF_INET6, sin6_port=htons(8301), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(10, {sa_family=AF_INET6, sin6_port=htons(8301), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
bind(11, {sa_family=AF_INET6, sin6_port=htons(8301), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(11, {sa_family=AF_INET6, sin6_port=htons(8301), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
bind(12, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(12, {sa_family=AF_INET6, sin6_port=htons(45202), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0

(I don't see any attempt to open ports 9001 or 9003?)

Here are the lines matching = -1:

access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
epoll_ctl(5, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2188013312, u64=140576467611392}}) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_DEL, 4, 0xc420055a8c) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2188013312, u64=140576467611392}}) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_DEL, 4, 0xc42005599c) = -1 EPERM (Operation not permitted)
newfstatat(AT_FDCWD, "/etc/mdns.allow", 0xc4200209f8, 0) = -1 ENOENT (No such file or directory)
epoll_ctl(5, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2188013312, u64=140576467611392}}) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_DEL, 4, 0xc420055654) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2188012896, u64=140576467610976}}) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_DEL, 8, 0xc42019b0a4) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2188012896, u64=140576467610976}}) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_DEL, 9, 0xc42019afec) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2188012896, u64=140576467610976}}) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_DEL, 9, 0xc42019abb4) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2188012896, u64=140576467610976}}) = -1 EPERM (Operation not permitted)
epoll_ctl(5, EPOLL_CTL_DEL, 9, 0xc42019afe4) = -1 EPERM (Operation not permitted)

The EPERM errors are a bit worrying. Maybe this is a symptom of running within an LXD container (but then again, running in a Docker container is supposed to work).

@candlerb
Contributor Author

Just tried it again. Unfortunately I cannot get either a single-node or multi-node setup running.

Single node

# ./jocko broker
2018/06/16 21:06:36 Initializing logging reporter
2018-06-16T21:06:36.095Z        INFO    jocko/broker.go:109     hello   {"id": 0, "broker addr": "0.0.0.0:9092", "serf addr": "0.0.0.0:9094", "raft addr": "127.0.0.1:9093", "id": 0, "raft addr": "127.0.0.1:9093"}
2018/06/16 21:06:36 [INFO] raft: Initial configuration (index=0): []
2018/06/16 21:06:36 [INFO] raft: Node at 127.0.0.1:9093 [Follower] entering Follower state (Leader: "")
2018/06/16 21:06:36 [INFO] serf: EventMemberJoin: jocko ::
2018-06-16T21:06:36.138Z        INFO    jocko/server.go:71      hello   {"id": 0, "broker addr": "0.0.0.0:9092", "serf addr": "0.0.0.0:9094", "raft addr": "127.0.0.1:9093", "server id": 0, "addr": "0.0.0.0:9092"}
2018-06-16T21:06:36.139Z        INFO    jocko/serf.go:74        adding LAN server       {"id": 0, "broker addr": "0.0.0.0:9092", "serf addr": "0.0.0.0:9094", "raft addr": "127.0.0.1:9093", "id": 0, "raft addr": "127.0.0.1:9093", "meta": {"ID":0,"Name":"","Bootstrap":false,"Expect":0,"NonVoter":false,"Status":1,"RaftAddr":"127.0.0.1:9093","SerfLANAddr":"0.0.0.0:9094:8301","BrokerAddr":"0.0.0.0:9092"}}
2018/06/16 21:06:37 [WARN] raft: no known peers, aborting election

I spy something dubious there: "SerfLANAddr":"0.0.0.0:9094:8301"

(8301 exists in the source code as DefaultLANSerfPort)

In another screen I try to create a topic:

# ./jocko topic create --topic test
error code: not controller

Back in the broker screen I see:

2018/06/16 21:06:56 Reporting span 58a4a97d0a5b8bfe:152681c6a74b2a88:58a4a97d0a5b8bfe:1
2018/06/16 21:06:56 Reporting span 58a4a97d0a5b8bfe:29f53d9eec662474:58a4a97d0a5b8bfe:1
2018/06/16 21:06:56 Reporting span 58a4a97d0a5b8bfe:3ab651b5a79db080:58a4a97d0a5b8bfe:1
2018/06/16 21:06:56 Reporting span 58a4a97d0a5b8bfe:6215d9d3872fa0c8:58a4a97d0a5b8bfe:1
2018/06/16 21:06:56 Reporting span 58a4a97d0a5b8bfe:6cd74c36fd162c6:58a4a97d0a5b8bfe:1
2018/06/16 21:06:56 Reporting span 58a4a97d0a5b8bfe:58a4a97d0a5b8bfe:0:1
2018/06/16 21:07:12 Reporting span 8304d1cf33b493f:48fb487a39f54e3:8304d1cf33b493f:1
2018/06/16 21:07:12 Reporting span 8304d1cf33b493f:3119779192068843:8304d1cf33b493f:1
2018/06/16 21:07:12 Reporting span 8304d1cf33b493f:14c0ac963bda4e32:8304d1cf33b493f:1
2018/06/16 21:07:12 Reporting span 8304d1cf33b493f:61ef62b06b1e53f3:8304d1cf33b493f:1
2018/06/16 21:07:12 Reporting span 8304d1cf33b493f:41950f95b4c41d52:8304d1cf33b493f:1
2018/06/16 21:07:12 Reporting span 8304d1cf33b493f:8304d1cf33b493f:0:1
2018/06/16 21:07:13 ERROR: error when flushing the buffer: write udp 127.0.0.1:60767->127.0.0.1:6831: write: connection refused

I don't know what's supposed to be listening on port 6831; this number doesn't appear in the Jocko source code anywhere. And indeed nothing is listening on this port, although jocko has a connected UDP socket to send to 6831:

# netstat -naup
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
udp        0      0 127.0.0.1:60767         127.0.0.1:6831          ESTABLISHED 15769/jocko
udp        0      0 0.0.0.0:68              0.0.0.0:*                           256/dhclient
udp6       0      0 :::8301                 :::*                                15769/jocko

Cluster

The instructions in _examples/cluster/README.md have invalid flags. I changed them to:

# ./jocko broker \
          --data-dir="/tmp/jocko0" \
          --broker-addr=127.0.0.1:9001 \
          --raft-addr=127.0.0.1:9002 \
          --serf-addr=127.0.0.1:9003 \
          --id=1
2018/06/16 21:23:41 Initializing logging reporter
2018-06-16T21:23:41.179Z        INFO    jocko/broker.go:109     hello   {"id": 1, "broker addr": "127.0.0.1:9001", "serf addr": "127.0.0.1:9003", "raft addr": "127.0.0.1:9002", "id": 1, "raft addr": "127.0.0.1:9002"}
2018/06/16 21:23:41 [INFO] raft: Initial configuration (index=0): []
2018/06/16 21:23:41 [INFO] raft: Node at 127.0.0.1:9002 [Follower] entering Follower state (Leader: "")
2018/06/16 21:23:41 [INFO] serf: EventMemberJoin: jocko ::
2018-06-16T21:23:41.223Z        INFO    jocko/server.go:71      hello   {"id": 1, "broker addr": "127.0.0.1:9001", "serf addr": "127.0.0.1:9003", "raft addr": "127.0.0.1:9002", "server id": 1, "addr": "127.0.0.1:9001"}
2018-06-16T21:23:41.226Z        INFO    jocko/serf.go:74        adding LAN server       {"id": 1, "broker addr": "127.0.0.1:9001", "serf addr": "127.0.0.1:9003", "raft addr": "127.0.0.1:9002", "id": 1, "raft addr": "127.0.0.1:9002", "meta": {"ID":1,"Name":"","Bootstrap":false,"Expect":0,"NonVoter":false,"Status":1,"RaftAddr":"127.0.0.1:9002","SerfLANAddr":"127.0.0.1:9003:8301","BrokerAddr":"127.0.0.1:9001"}}
2018/06/16 21:23:43 [WARN] raft: no known peers, aborting election

(Note: same problem with SerfLANAddr having two ports)

In another screen, trying to add a second broker:

# ./jocko broker \
          --data-dir="/tmp/jocko1" \
          --broker-addr=127.0.0.1:9101 \
          --raft-addr=127.0.0.1:9102 \
          --serf-addr=127.0.0.1:9103 \
          --join=127.0.0.1:9003 \
          --id=2
2018/06/16 21:24:31 Initializing logging reporter
2018-06-16T21:24:31.488Z        INFO    jocko/broker.go:109     hello   {"id": 2, "broker addr": "127.0.0.1:9101", "serf addr": "127.0.0.1:9103", "raft addr": "127.0.0.1:9102", "id": 2, "raft addr": "127.0.0.1:9102"}
2018/06/16 21:24:31 [INFO] raft: Initial configuration (index=0): []
2018/06/16 21:24:31 [INFO] raft: Node at 127.0.0.1:9102 [Follower] entering Follower state (Leader: "")
error starting broker: Failed to create memberlist: Could not set up network transport: Failed to start TCP listener on "127.0.0.1:9103" port 8301: listen tcp :8301: bind: address already in use
# 

This one exits because it tries to bind to 8301; that port is already in use by the first process.
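
The failure mode is ordinary socket behaviour rather than anything jocko-specific. As a minimal sketch (the function name `bindTwice` is made up for illustration, and the `syscall.EADDRINUSE` comparison assumes a Unix-like system such as the Linux host used here), binding the same TCP address twice within one program gives the same error:

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// bindTwice listens on a loopback port chosen by the OS, then tries to listen
// on the same address again while the first listener is still open.
// It reports whether the second attempt failed with "address already in use".
func bindTwice() (bool, error) {
	first, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer first.Close()

	second, err := net.Listen("tcp", first.Addr().String())
	if err == nil {
		// Unexpected: the second bind succeeded.
		second.Close()
		return false, nil
	}
	return errors.Is(err, syscall.EADDRINUSE), nil
}

func main() {
	inUse, err := bindTwice()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("second bind rejected as in use:", inUse)
}
```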

Analysis

It seems that meta.SerfLANAddr / serf_lan_addr is assembled from SerfLANConfig.MemberlistConfig.BindAddr and SerfLANConfig.MemberlistConfig.BindPort:

$ grep -R serf_lan_addr .
./jocko/leader.go:				"serf_lan_addr": meta.SerfLANAddr,
./jocko/metadata/metadata.go:		SerfLANAddr: m.Tags["serf_lan_addr"],
./jocko/serf.go:	config.Tags["serf_lan_addr"] = fmt.Sprintf("%s:%d", b.config.SerfLANConfig.MemberlistConfig.BindAddr, b.config.SerfLANConfig.MemberlistConfig.BindPort)

However, BindAddr defaults to both address and port:

./cmd/jocko/main.go:	brokerCmd.Flags().StringVar(&brokerCfg.SerfLANConfig.MemberlistConfig.BindAddr, "serf-addr", "0.0.0.0:9094", "Address for Serf to bind on") // TODO: can set addr alone or need to set bind port separately?

And BindPort defaults to 8301, and AFAICS cannot be overridden.

$ grep -R DefaultLANSerfPort .
./jocko/config/config.go:	DefaultLANSerfPort = 8301
./jocko/config/config.go:	conf.SerfLANConfig.MemberlistConfig.BindPort = DefaultLANSerfPort

... although in the test suite, it is set explicitly:

./jocko/testing.go:	config.SerfLANConfig.MemberlistConfig.BindPort = ports[2]
./jocko/testing.go:		s1.config.SerfLANConfig.MemberlistConfig.BindPort)
./testutil/testutil.go:	config.SerfLANConfig.MemberlistConfig.BindPort = ports[1]

I can't see how this can possibly work outside the test suite.
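
To make the double-port effect concrete, here is a small standalone sketch that applies the same `fmt.Sprintf("%s:%d", ...)` formatting to the values seen in the logs. The function name `serfLANAddrTag` is made up for illustration; only the format string comes from the grep output above.

```go
package main

import "fmt"

// serfLANAddrTag applies the same formatting as the line quoted from
// jocko/serf.go: the port is appended even if bindAddr already contains one.
func serfLANAddrTag(bindAddr string, bindPort int) string {
	return fmt.Sprintf("%s:%d", bindAddr, bindPort)
}

func main() {
	const defaultLANSerfPort = 8301 // value of DefaultLANSerfPort in config.go

	// Default value of --serf-addr.
	fmt.Println(serfLANAddrTag("0.0.0.0:9094", defaultLANSerfPort)) // 0.0.0.0:9094:8301
	// Value passed in the cluster example.
	fmt.Println(serfLANAddrTag("127.0.0.1:9003", defaultLANSerfPort)) // 127.0.0.1:9003:8301
}
```

Both results match the SerfLANAddr values in the broker logs.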

What I can do is force --serf-addr=127.0.0.1 at which point at least we don't have duplicate ports in SerfLANAddr:

# ./jocko broker --serf-addr=127.0.0.1
2018/06/16 21:31:42 Initializing logging reporter
2018-06-16T21:31:42.136Z        INFO    jocko/broker.go:109     hello   {"id": 0, "broker addr": "0.0.0.0:9092", "serf addr": "127.0.0.1", "raft addr": "127.0.0.1:9093", "id": 0, "raft addr": "127.0.0.1:9093"}
2018/06/16 21:31:42 [INFO] raft: Initial configuration (index=0): []
2018/06/16 21:31:42 [INFO] raft: Node at 127.0.0.1:9093 [Follower] entering Follower state (Leader: "")
2018/06/16 21:31:42 [INFO] serf: EventMemberJoin: jocko 127.0.0.1
2018-06-16T21:31:42.170Z        INFO    jocko/server.go:71      hello   {"id": 0, "broker addr": "0.0.0.0:9092", "serf addr": "127.0.0.1", "raft addr": "127.0.0.1:9093", "server id": 0, "addr": "0.0.0.0:9092"}
2018-06-16T21:31:42.175Z        INFO    jocko/serf.go:74        adding LAN server       {"id": 0, "broker addr": "0.0.0.0:9092", "serf addr": "127.0.0.1", "raft addr": "127.0.0.1:9093", "id": 0, "raft addr": "127.0.0.1:9093", "meta": {"ID":0,"Name":"","Bootstrap":false,"Expect":0,"NonVoter":false,"Status":1,"RaftAddr":"127.0.0.1:9093","SerfLANAddr":"127.0.0.1:8301","BrokerAddr":"0.0.0.0:9092"}}
2018/06/16 21:31:42 [WARN] serf: Failed to re-join any previously known node
2018/06/16 21:31:43 [WARN] raft: no known peers, aborting election

However, it still fails in the same way as the single-node setup (the client reports error code: not controller; the broker fails writing to UDP port 6831).

@candlerb
Contributor Author

P.S. Looking at the source code of serf itself, it uses a helper to split addr:port into the separate components of MemberlistConfig:

func (c *Command) setupAgent(config *Config, logOutput io.Writer) *Agent {
        bindIP, bindPort, err := config.AddrParts(config.BindAddr)
...
        serfConfig.MemberlistConfig.BindAddr = bindIP
        serfConfig.MemberlistConfig.BindPort = bindPort
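
A similar split could be done in jocko with the standard library alone. The following is only a sketch under assumptions: `addrParts` is a hypothetical stand-in, not serf's AddrParts, and the fallback to a default port for a bare host is a choice made for this example.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// addrParts is a hypothetical stand-in for the serf helper quoted above.
// It splits "host:port" into its parts; a bare host gets defaultPort.
func addrParts(addr string, defaultPort int) (string, int, error) {
	host, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		// Assumption: an address that does not split is a bare host or IP.
		return addr, defaultPort, nil
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return "", 0, fmt.Errorf("invalid port in %q: %w", addr, err)
	}
	return host, port, nil
}

func main() {
	for _, addr := range []string{"127.0.0.1:9003", "127.0.0.1"} {
		host, port, err := addrParts(addr, 8301)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// These two values would go into MemberlistConfig.BindAddr and BindPort.
		fmt.Printf("BindAddr=%s BindPort=%d\n", host, port)
	}
}
```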

I also found port 6831 in jaeger-client-go. Since this is for OpenTracing, the failure to send to this UDP port may not matter. It would of course be nice to be able to turn tracing off when it is not needed.

./vendor/github.com/uber/jaeger-client-go/transport_udp.go:const defaultUDPSpanServerHostPort = "localhost:6831"

@candlerb
Contributor Author

candlerb commented Jun 19, 2018

Working on fixes in #133 / #134

@justone

justone commented Jun 21, 2018

Thanks for working on this. I ran into the same issue with the ports conflicting.

Let me know if I can help with testing or code review.

@candlerb
Contributor Author

Current status: you can start a one-node cluster with jocko broker --bootstrap --bootstrap-expect=1, and create a topic with jocko topic create --topic <name>.

When I try to publish a message with confluent-kafka-python, it fails with the following error:

%3|1529616101.104|PROTOERR|rdkafka#producer-1| [thrd:main]: localhost:9092/bootstrap: Protocol parse failure at 31/70 (rd_kafka_parse_Metadata:306) (incorrect broker.version.fallback?)
%3|1529616101.104|PROTOERR|rdkafka#producer-1| [thrd:main]: localhost:9092/bootstrap: 65536 topics: tmpabuf memory shortage
%4|1529616101.104|METADATA|rdkafka#producer-1| [thrd:main]: localhost:9092/bootstrap: Metadata request failed: connected: Local: Bad message format (1ms): Permanent

You can start multiple nodes with e.g. --bootstrap-expect=3, but the cluster won't come up because the --join option currently does nothing. (I still haven't worked out why jocko needs both raft and serf. Maybe it's to allow a cluster where only a subset of nodes store the raft commit log?)

@candlerb
Contributor Author

Cluster startup is now kind of working: serf needs each node to have a unique name, so I added a JOCKONODENAME environment variable to override it. This is something which should rarely be used, so I didn't make it a command line flag.

There seems to be a problem with negative message transit times (!)

2018/06/22 08:04:21 [DEBUG] serf: messageJoinType: jocko1
2018/06/22 08:04:21 [DEBUG] serf: messageJoinType: jocko1
2018/06/22 08:04:21 [DEBUG] serf: messageJoinType: jocko1
2018/06/22 08:04:21 [ERR] serf: Rejected coordinate from jocko0: round trip time not in valid range, duration -7.035µs is not a positive value less than 10s
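
Judging only by the wording of that message, serf accepts a round trip time that is positive and below 10s. The sketch below reconstructs such a check for illustration; `validRTT` and `maxRTT` are hypothetical names, and the exact boundary handling inside serf may differ.

```go
package main

import (
	"fmt"
	"time"
)

// maxRTT is taken from the "less than 10s" wording in the log message.
const maxRTT = 10 * time.Second

// validRTT is a hypothetical reconstruction of the range check implied by
// the log message: the duration must be positive and below maxRTT.
func validRTT(rtt time.Duration) bool {
	return rtt > 0 && rtt < maxRTT
}

func main() {
	rtt := -7035 * time.Nanosecond // the value from the log line above
	fmt.Println(rtt, validRTT(rtt)) // -7.035µs false
}
```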

And the client still has to know which node to connect to:

# cmd/jocko/jocko topic create --topic weeble --broker-addr 127.0.0.1:9201
error code: not controller
# cmd/jocko/jocko topic create --topic weeble --broker-addr 127.0.0.1:9101
error code: not controller
# cmd/jocko/jocko topic create --topic weeble --broker-addr 127.0.0.1:9001
created topic: weeble
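
Until the client can find the controller by itself, the manual probing above could be automated on the client side. This is a rough sketch and not jocko code: `firstController`, `errNotController` and the stubbed `create` function are all hypothetical, and a real client would issue the create-topic request where the stub is.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotController is a hypothetical sentinel standing in for the
// "not controller" error code returned by the brokers.
var errNotController = errors.New("not controller")

// firstController calls create against each address in turn and returns the
// first address that accepts the request. Any error other than
// errNotController aborts the search.
func firstController(addrs []string, create func(addr string) error) (string, error) {
	for _, addr := range addrs {
		err := create(addr)
		if err == nil {
			return addr, nil
		}
		if !errors.Is(err, errNotController) {
			return "", fmt.Errorf("broker %s: %w", addr, err)
		}
	}
	return "", errNotController
}

func main() {
	addrs := []string{"127.0.0.1:9201", "127.0.0.1:9101", "127.0.0.1:9001"}
	// Stub: pretend only the broker on 9001 is the controller, as in the session above.
	create := func(addr string) error {
		if addr == "127.0.0.1:9001" {
			return nil
		}
		return errNotController
	}
	addr, err := firstController(addrs, create)
	fmt.Println(addr, err)
}
```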

@travisjeffery
Owner

@candlerb thanks for the PRs, merged them. you need both serf and raft cause they do different things, serf does discovery and raft does consensus. right

@candlerb
Contributor Author

candlerb commented Jun 22, 2018

By "discovery" do you mean discovery of which nodes are members of the raft cluster, to avoid having to statically configure peers? I wasn't sure that a gossip protocol was suitable for that.

UPDATE: I have moved this discussion to #140

@candlerb
Contributor Author

After the latest push on branch candlerb/serfaddr (pull request #136), the metadata response now works. The next problem occurs when publishing to a topic:

2018-06-23T21:17:07.518Z        ERROR   jocko/broker.go:427     produce to partition failed     {"id": 0, "broker addr": "0.0.0.0:9092", "serf addr": "0.0.0.0:9094", "raft addr": "127.0.0.1:9093", "id": 0, "raft addr": "127.0.0.1:9093", "error": "no replica for topic mytopic partition 0"}
github.com/travisjeffery/jocko/log.(*logger).Error
        /root/go/src/github.com/travisjeffery/jocko/log/logger.go:38
github.com/travisjeffery/jocko/jocko.(*Broker).handleProduce
        /root/go/src/github.com/travisjeffery/jocko/jocko/broker.go:427
github.com/travisjeffery/jocko/jocko.(*Broker).Run
        /root/go/src/github.com/travisjeffery/jocko/jocko/broker.go:146
