Skip to content

Commit

Permalink
KUDU-2686 python: remove multiprocessing
Browse files Browse the repository at this point in the history
The python multiprocessing library doesn't play very nicely when
creating Pools after initializing complex state (e.g. the KuduClient).
Because of how it forks, the library may even copy lock states, and lead
to odd situations, like multiple threads waiting on a lock that isn't
held by any thread in the process.

This was the case in KUDU-2686, which resulted in a hang in
test_scantoken.py. Upon inspection[1], there appeared to be multiple
threads in a process waiting on the same sys_futex, but none of them
held it. [2] tips some pieces to make it more reproducible (tested on
Ubuntu 14.04, where the issue was first reported).

Starting with python3.4, there is supposedly a way around this, which is
to use a different process-startup pattern, though this isn't available
in python2.

This patch removes usage of multiprocessing entirely. While it can be
argued that multiprocessing provides extra test coverage, given that
multiprocessing is known to have such issues[3][4], that this isn't the
first time we've been bitten by this forking issue, that it's test-only,
and that scan tokens are tested in the C++ client as well, "dumbing"
down the test doesn't seem unreasonable.

[1] https://gist.github.com/andrwng/d2d21c551362ddd564926c2a4ec406ae
[2] https://gist.github.com/andrwng/cc6c211c62b1235cc58944d513ba6655
[3] pytest-dev/pytest#958
[4] https://codewithoutrules.com/2018/09/04/python-multiprocessing/

Change-Id: Ia9aa91191d54801731da27e5f132b3c96af0efa1
Reviewed-on: http://gerrit.cloudera.org:8080/12494
Tested-by: Kudu Jenkins
Reviewed-by: Jordan Birdsell <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
  • Loading branch information
andrwng committed Feb 15, 2019
1 parent 8f6d2f9 commit d76422c
Show file tree
Hide file tree
Showing 2 changed files with 4 additions and 18 deletions.
17 changes: 4 additions & 13 deletions python/kudu/tests/test_scantoken.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,6 @@
from kudu.tests.util import TestScanBase
from kudu.tests.common import KuduTestBase
import kudu
from multiprocessing import Pool
import datetime
import time

Expand All @@ -44,21 +43,13 @@ def setUp(self):

def _subtest_serialize_thread_and_verify(self, tokens, expected_tuples, count_only=False):
"""
Given the input serialized tokens, spawn new threads,
execute them and validate the results
Given the input serialized tokens, hydrate the scanners, execute the
scans, and validate the results
"""
input = [(token.serialize(), self.master_hosts, self.master_ports)
for token in tokens]

# Begin process pool
pool = Pool(len(input))
try:
results = pool.map(_get_scan_token_results, input)
finally:
pool.close()
pool.join()
input = [(token.serialize(), self.master_hosts, self.master_ports) for token in tokens]

# Validate results
results = [_get_scan_token_results(i) for i in input]
actual_tuples = []
for result in results:
actual_tuples += result
Expand Down
5 changes: 0 additions & 5 deletions python/setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,11 +30,6 @@
import re
import subprocess

# Workaround a Python bug in which multiprocessing's atexit handler doesn't
# play well with pytest. See http://bugs.python.org/issue15881 for details
# and this suggested workaround (comment msg170215 in the thread).
import multiprocessing

if Cython.__version__ < '0.21.0':
raise Exception('Please upgrade to Cython 0.21.0 or newer')

Expand Down

0 comments on commit d76422c

Please sign in to comment.