Use consistent hashing in KeyShared distribution (apache#6791)
### Motivation

The current implementation of KeyShared subscriptions uses a mechanism to divide the hash space across the available consumers. This is based on dividing the currently assigned hash ranges when a new consumer joins or leaves.

There are a few problems with the current approach:

1. When adding a new consumer, the biggest range is split to make space for the new consumer. That means that when adding 3 consumers, 1 of them will "own" a hash range that is double the size of the other 2 consumers' ranges, and consequently it will receive twice the traffic. This is not terrible, but not ideal either.
2. When removing consumers, the range for the removed consumer is always assigned to the next consumer. The resulting hash distribution depends heavily on the sequence in which the consumers are removed. If one is unlucky, the traffic becomes very heavily skewed, with situations where 1 consumer receives >90% of the traffic.

This is an example of removing consumers in sequence, with the size of their respective hash ranges attached:

```
Removed consumer from rangeMap: {c1=8192, c10=4096, c3=4096, c4=8192, c5=4096, c6=8192, c7=16384, c8=8192, c9=4096}
Removed consumer from rangeMap: {c1=8192, c10=4096, c4=8192, c5=4096, c6=12288, c7=16384, c8=8192, c9=4096}
Removed consumer from rangeMap: {c1=8192, c10=4096, c5=4096, c6=12288, c7=16384, c8=16384, c9=4096}
Removed consumer from rangeMap: {c1=8192, c10=8192, c6=12288, c7=16384, c8=16384, c9=4096}
Removed consumer from rangeMap: {c1=24576, c10=8192, c6=12288, c7=16384, c9=4096}
Removed consumer from rangeMap: {c1=24576, c10=8192, c7=28672, c9=4096}
Removed consumer from rangeMap: {c1=53248, c10=8192, c9=4096}
```

As you can see, `c1` takes most of the traffic. Most likely it will not be able to process all the messages, and the backlog will build up.

### Modifications

* No functional difference from the user perspective.
* Use a consistent hashing mechanism to assign keys to consumers. This ensures an even distribution without the degradation seen in the corner cases above.
* The number of points in the ring is configurable, default=100.
* Refactored the current unit tests. The tests currently duplicate the logic of the implementation and check that a message is placed in the bucket for one consumer. Of course that works, since it's the same code executed on both sides. Instead, the tests should focus on the contract of the feature: messages should arrive in order, and there should be a "decent" sharing of load across consumers.
* @codelipenghui I've removed `selectByIndex()`. In my opinion there's absolutely no difference in efficiency/performance, as I've also explained on apache#6647 (comment). I'm happy to discuss it further.
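To make the motivation concrete, here is a minimal, self-contained sketch of the consistent-hashing idea behind this change. It is not the broker code shown in the diff below (the real implementation hashes with `Murmur3_32Hash` and maps ring points to `Consumer` instances); the `RingSketch` class name, the CRC32 hash, and the string "consumers" are illustrative assumptions. Each consumer contributes many points on the ring, so removing one consumer reassigns only the keys that mapped to that consumer's own points.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

public class RingSketch {
    // Hash ring: point -> consumer name. Many points per consumer smooth the distribution.
    private final NavigableMap<Long, String> ring = new TreeMap<>();
    private final int pointsPerConsumer;

    RingSketch(int pointsPerConsumer) {
        this.pointsPerConsumer = pointsPerConsumer;
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32(); // CRC32 only for the sketch; the broker uses Murmur3
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    void add(String consumer) {
        for (int i = 0; i < pointsPerConsumer; i++) {
            ring.put(hash(consumer + i), consumer);
        }
    }

    void remove(String consumer) {
        for (int i = 0; i < pointsPerConsumer; i++) {
            // Two-arg remove: only drop the point if it still belongs to this consumer
            ring.remove(hash(consumer + i), consumer);
        }
    }

    String select(String key) {
        // First point at or after the key's hash owns it, wrapping around the ring
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return e != null ? e.getValue() : ring.firstEntry().getValue();
    }

    public static void main(String[] args) {
        RingSketch ring = new RingSketch(100);
        for (int c = 1; c <= 10; c++) {
            ring.add("c" + c);
        }

        // Count how many of 100k keys land on each consumer.
        Map<String, Integer> counts = new TreeMap<>();
        for (int k = 0; k < 100_000; k++) {
            counts.merge(ring.select("key-" + k), 1, Integer::sum);
        }
        System.out.println("10 consumers: " + counts);

        // Removing consumers reassigns only the removed consumers' share of keys.
        ring.remove("c3");
        ring.remove("c7");
        counts.clear();
        for (int k = 0; k < 100_000; k++) {
            counts.merge(ring.select("key-" + k), 1, Integer::sum);
        }
        System.out.println("after removing c3, c7: " + counts);
    }
}
```

Running this should show a roughly even spread across consumers, and the removals do not skew the remaining consumers the way the range-splitting example above does.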
Showing 16 changed files with 451 additions and 840 deletions.
109 changes: 109 additions & 0 deletions
...ain/java/org/apache/pulsar/broker/service/ConsistentHashingStickyKeyConsumerSelector.java
@@ -0,0 +1,109 @@
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
package org.apache.pulsar.broker.service;

import java.util.Collections;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

import org.apache.pulsar.broker.service.BrokerServiceException.ConsumerAssignException;
import org.apache.pulsar.common.util.Murmur3_32Hash;

/**
 * This is a consumer selector based on a fixed hash range.
 *
 * The implementation uses consistent hashing to evenly split the
 * number of keys assigned to each consumer.
 */
public class ConsistentHashingStickyKeyConsumerSelector implements StickyKeyConsumerSelector {

    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();

    // Consistent-Hash ring
    private final NavigableMap<Integer, Consumer> hashRing;

    private final int numberOfPoints;

    public ConsistentHashingStickyKeyConsumerSelector(int numberOfPoints) {
        this.hashRing = new TreeMap<>();
        this.numberOfPoints = numberOfPoints;
    }

    @Override
    public void addConsumer(Consumer consumer) throws ConsumerAssignException {
        rwLock.writeLock().lock();
        try {
            // Insert multiple points on the hash ring for every consumer
            // The points are deterministically added based on the hash of the consumer name
            for (int i = 0; i < numberOfPoints; i++) {
                String key = consumer.consumerName() + i;
                int hash = Murmur3_32Hash.getInstance().makeHash(key.getBytes());
                hashRing.put(hash, consumer);
            }
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    @Override
    public void removeConsumer(Consumer consumer) {
        rwLock.writeLock().lock();
        try {
            // Remove all the points that were added for this consumer.
            // The two-arg remove only drops a point if it is still mapped to this consumer,
            // so a colliding point owned by another consumer is left untouched.
            for (int i = 0; i < numberOfPoints; i++) {
                String key = consumer.consumerName() + i;
                int hash = Murmur3_32Hash.getInstance().makeHash(key.getBytes());
                hashRing.remove(hash, consumer);
            }
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    @Override
    public Consumer select(byte[] stickyKey) {
        return select(Murmur3_32Hash.getInstance().makeHash(stickyKey));
    }

    @Override
    public Consumer select(int hash) {
        rwLock.readLock().lock();
        try {
            if (hashRing.isEmpty()) {
                return null;
            }

            // The first ring point at or after the hash owns the key,
            // wrapping around to the first entry past the end of the ring
            Map.Entry<Integer, Consumer> ceilingEntry = hashRing.ceilingEntry(hash);
            if (ceilingEntry != null) {
                return ceilingEntry.getValue();
            } else {
                return hashRing.firstEntry().getValue();
            }
        } finally {
            rwLock.readLock().unlock();
        }
    }

    Map<Integer, Consumer> getRangeConsumer() {
        return Collections.unmodifiableMap(hashRing);
    }
}
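As a quick illustration of the contract described in the commit message (the same key always maps to the same consumer while the ring membership is stable), here is a hedged, test-style sketch of exercising the selector. It assumes JUnit and Mockito are on the classpath and that the broker's `Consumer` class can be mocked; the test name, key string, and scaffolding are illustrative assumptions, not code taken from this commit's test changes.

```java
import static org.junit.Assert.assertSame;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.apache.pulsar.broker.service.ConsistentHashingStickyKeyConsumerSelector;
import org.apache.pulsar.broker.service.Consumer;
import org.junit.Test;

public class ConsistentHashingSelectorSketchTest {

    @Test
    public void sameKeySticksToSameConsumer() throws Exception {
        ConsistentHashingStickyKeyConsumerSelector selector =
                new ConsistentHashingStickyKeyConsumerSelector(100);

        // Mocked consumers: the selector only needs a stable consumerName()
        Consumer c1 = mock(Consumer.class);
        when(c1.consumerName()).thenReturn("c1");
        Consumer c2 = mock(Consumer.class);
        when(c2.consumerName()).thenReturn("c2");

        selector.addConsumer(c1);
        selector.addConsumer(c2);

        // While the ring is unchanged, a given sticky key always maps to the same consumer.
        Consumer first = selector.select("order-123".getBytes());
        assertSame(first, selector.select("order-123".getBytes()));

        // Removing c1 deletes only c1's points, so every key now falls through to c2.
        selector.removeConsumer(c1);
        assertSame(c2, selector.select("order-123".getBytes()));
    }
}
```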