Skip to content

Commit

Permalink
[Issue-5994]: Start proxy pods when at least one broker pod is running (
Browse files Browse the repository at this point in the history
apache#6158)

### Motivation
Fixes apache#5994:
If the proxy service comes up before the brokers are up and reachable there will be HTTP 403 when running `bin/pulsar-admin` commands from inside the proxy pod.
 
The proxy will also not be able to connect to the brokers when data is pushed through binary port with the following error:
```bash
Caused by: org.apache.pulsar.broker.service.BrokerServiceException$PersistenceException: org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty bookies available
	... 14 more
Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty bookies available
22:11:07.633 [pulsar-web-32-6] INFO  org.eclipse.jetty.server.RequestLog - 172.17.0.6 - - [24/Jan/2020:22:11:07 +0000] "PUT /admin/v2/persistent/public/functions/assignments HTTP/1.1" 500 2528 "-" "Pulsar-Java-v2.5.0" 280
```

#### Workaround:
Restart the proxy pods once brokers pods are running

#### Proposed solution:
Hold off starting of the proxies until at least one broker is reachable in the cluster. 

### Modifications

Changes are inside `proxy-deployment.yaml` helm template file that defines a new init container before proxy is started. The init container waits until broker is reachable using the nslookup on the broker service with a sleep of 30 seconds between retries and up to number of brokers times.

Alternative solution that doesn't always work was `'until nslookup broker-service; sleep 2; done;', but 403 would still sometimes (could have been a fluke, but I saw it happening once).

### Verifying this change
1) Follow the instructions on how deploying helm and run:
`helm install pulsar --values pulsar/values-mini.yaml ./pulsar/`. 
2) Wait until all the services are up and running.  
3) Connect to proxy pod and run `bin/pulsar-admin broker-stats monitoring-metrics` - no 403 or permission errors should arise
4) Set up tenant, namespace
5) Push data into a topic - No errors in the proxy logs and client is able to push data into cluster through proxies
  • Loading branch information
roman-popenov authored Feb 1, 2020
1 parent b8760d5 commit b838c59
Showing 1 changed file with 16 additions and 1 deletion.
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ spec:
terminationGracePeriodSeconds: {{ .Values.proxy.gracePeriod }}
initContainers:
# This init container will wait for zookeeper to be ready before
# deploying the bookies
# deploying the proxies
- name: wait-zookeeper-ready
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
Expand All @@ -86,6 +86,21 @@ spec:
until bin/pulsar zookeeper-shell -server {{ template "pulsar.fullname" . }}-{{ .Values.zookeeper.component }} ls /admin/clusters/{{ template "pulsar.fullname" . }}; do
sleep 3;
done;
# This init container will wait for at least one broker to be ready before
# deploying the proxy
- name: wait-broker-ready
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
command: ["bash", "-c"]
args:
- >-
for i in {0..{{ .Values.broker.replicaCount }}}; do
brokerServiceNumber="$(nslookup -timeout=10 {{ template "pulsar.fullname" . }}-{{ .Values.broker.component }} | grep Name | wc -l)"
if [[ ${brokerServiceNumber} -ge 1 ]]; then
break
fi
sleep 30;
done;
containers:
- name: "{{ template "pulsar.fullname" . }}-{{ .Values.proxy.component }}"
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
Expand Down

0 comments on commit b838c59

Please sign in to comment.