forked from KarthikTunga/impala
pull new version #3
Merged
HAVE_PIPE2 is a variable that tracks whether a platform has the system function pipe2() present. This value was not propagated to the appropriate file that uses it, causing its value to always be 0 and the wrong branch to be taken at compile time. This fixes it by propagating the value to the file. Change-Id: I6cdc343da35a34be8d95fbea3543d080dbc1ec29 Reviewed-on: http://gerrit.cloudera.org:8080/7705 Reviewed-by: Henry Robinson <[email protected]> Tested-by: Impala Public Jenkins
The original test case accessed the 'pos' field of nested collections. The query results could vary when reloading the data because the order of items within a nested collection is not necessarily the same across loads. This patch reformulates the test to avoid 'pos'. Change-Id: I32e47f0845da8b27652faaceae834e025ecff42a Reviewed-on: http://gerrit.cloudera.org:8080/7708 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Adds the following flag: -buffer_pool_clean_pages_limit ((Advanced) Limit on bytes of clean spilled pages that will be accumulated in the buffer pool. Specified as number of bytes ('<int>[bB]?'), megabytes ('<float>[mM]'), gigabytes ('<float>[gG]'), or percentage of the buffer pool limit ('<int>%'). Defaults to bytes if no unit is given.) type: string default: "10%" This helps prevent Impala from accumulating excessive amounts of clean pages, which can cause some problems in practice: * The OS buffer cache is squeezed, reducing I/O performance from HDFS and potentially reducing the ability of the OS to absorb writes from Impala without blocking. * Impala process memory consumption can expand more than users or admins might expect. E.g. if one query is running with a mem_limit of 1GB, many people will be surprised if the process inflates to the full process limit of 100GB. Impala doesn't provide any guarantees other than staying within the process mem_limit, but this could be a surprising divergence from past behaviour. Observability: A new metric buffer-pool.clean-pages-limit is added. Testing: Added a backend test to directly test that clean pages are evicted. Ran in a loop to flush out any flakiness. Ran exhaustive tests. Change-Id: Ib6b687ab4bdddf07d9d6c997ff814aa3976042f9 Reviewed-on: http://gerrit.cloudera.org:8080/7653 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
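A minimal sketch of parsing the "<int>[bB]?" / "<float>[mM]" / "<float>[gG]" / "<int>%" format described in the flag help text, assuming a hypothetical standalone helper; this is not Impala's actual mem-spec parser, and error handling for malformed numbers is elided:

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Parses a spec such as "10%", "64m", "2G" or "1048576" into bytes.
// 'buffer_pool_limit' is only used for the percentage form.
bool ParseMemSpec(const std::string& spec, int64_t buffer_pool_limit, int64_t* bytes) {
  if (spec.empty()) return false;
  std::string number = spec;
  char unit = number.back();
  double multiplier = 1.0;
  if (unit == '%') {
    number.pop_back();
    *bytes = static_cast<int64_t>(std::stod(number) / 100.0 * buffer_pool_limit);
    return *bytes >= 0;
  }
  switch (std::tolower(static_cast<unsigned char>(unit))) {
    case 'g': multiplier = 1024.0 * 1024.0 * 1024.0; number.pop_back(); break;
    case 'm': multiplier = 1024.0 * 1024.0; number.pop_back(); break;
    case 'b': number.pop_back(); break;
    default:
      if (!std::isdigit(static_cast<unsigned char>(unit))) return false;  // unknown unit
  }
  *bytes = static_cast<int64_t>(std::stod(number) * multiplier);  // std::stod may throw on bad input
  return *bytes >= 0;
}
```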
The change to address IMPALA-5769 added periodic cleaning for minidumps, which got in the way of the other minidump tests. This change sets max_minidumps to the default value (9) for all tests to keep the cleanup thread from interfering, and then sets a smaller limit where needed. Change-Id: I977930ae87b8d4671a89c1e07ba76b12eb92fa55 Reviewed-on: http://gerrit.cloudera.org:8080/7716 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Addresses some comments from the CR: https://gerrit.cloudera.org/#/c/7630/8 Notably changes the name of initial_reservation_total_bytes to initial_reservation_total_claims and attempts to clarify comments. Change-Id: I1391d99ca15d2ebcaabd564fbe1242806be09f72 Reviewed-on: http://gerrit.cloudera.org:8080/7681 Reviewed-by: Dan Hecht <[email protected]> Tested-by: Impala Public Jenkins
In the text scanner, we were writing the partial tuple variable length data to the data_buffer_pool_ mempool, which caused strange behavior such as incorrect results. If we are scanning compressed data, the pool gets attached to the row batch at the end of a GetNext() call and gets freed before the next GetNext() call. This is wrong because we expect the data in the partial tuple to survive between the GetNext() calls. If we are scanning non-compressed data, data_buffer_pool_ never gets cleared and grows over time until the scanner finishes reading the scan range. We fix the problem by writing the varlen partial tuple data to boundary_pool, which is where the constant length partial tuple data is written. We also make sure that boundary pool does not hold any tuple data of returned batches by always deep copying it to output batches. Testing: - Ran some tests locally on ASAN build. - Updated test_scanners_fuzz.py to make slightly more significant changes to the data files. This change was helpful for finding issues while developing this patch. Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 Reviewed-on: http://gerrit.cloudera.org:8080/7639 Reviewed-by: Taras Bobrovytsky <[email protected]> Tested-by: Impala Public Jenkins
There was a bug in the BE evaluation logic of the TupleIsNullPredicate which could lead to wrong results for certain plan shapes. A TupleIsNullPredicate should evaluate to true only if all specified tuples are NULL. This was always the intent of the FE and is also documented in the BE as the required behavior. Testing: - Added regression test - Core tests passed Change-Id: Id659f849a68d88cfe22c65dd1747dd6d6a916163 Reviewed-on: http://gerrit.cloudera.org:8080/7737 Reviewed-by: Matthew Jacobs <[email protected]> Tested-by: Impala Public Jenkins
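The required semantics are simple enough to state in a few lines. A minimal sketch, assuming a generic row layout of tuple pointers (not Impala's actual TupleIsNullPredicate implementation):

```cpp
#include <vector>

// True only if every referenced tuple is NULL; a single non-NULL tuple
// makes the predicate false.
bool TupleIsNull(const std::vector<const void*>& tuples) {
  for (const void* tuple : tuples) {
    if (tuple != nullptr) return false;
  }
  return true;
}
```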
Fix: The issue was that sessions could be closed some time after their timeout expired, because the session expiration thread only woke up every session-timeout/2 seconds. This fix changes the logic so that the session expiration thread wakes up every 1 second to check whether sessions can be expired. The test changes in session-expiry-test introduce variable names for the numeric constants and reduce max_idle_timeout_ms. Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 Reviewed-on: http://gerrit.cloudera.org:8080/7709 Tested-by: Impala Public Jenkins Reviewed-by: Henry Robinson <[email protected]>
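A minimal sketch of the new wake-up behaviour, assuming a generic session map and a hypothetical Expire() hook (not Impala's actual ImpalaServer code):

```cpp
#include <atomic>
#include <chrono>
#include <map>
#include <thread>

struct Session {
  std::chrono::steady_clock::time_point last_active;
  std::chrono::seconds idle_timeout{0};  // 0 means the session never expires
};

void ExpireSessionsLoop(std::map<int, Session>& sessions, const std::atomic<bool>& shutting_down) {
  while (!shutting_down) {
    // Wake every second instead of every session-timeout/2 seconds, so an idle
    // session is noticed at most ~1s after its timeout elapses.
    std::this_thread::sleep_for(std::chrono::seconds(1));
    auto now = std::chrono::steady_clock::now();
    for (auto& [id, session] : sessions) {
      if (session.idle_timeout.count() > 0 &&
          now - session.last_active > session.idle_timeout) {
        // Expire(id);  // hypothetical expiry hook; real code must also synchronize access
      }
    }
  }
}
```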
The previous commit for IMPALA-5800 imported a slightly old version of squeasel.c from the upstream project. In particular, the strings used to choose the TLS version were incompatible with Impala's. This patch corrects the error. Testing: Add tests to make sure that the expected flags are supported by Squeasel. Change-Id: I03b76f2d3b3e2e6133e5c57a4f33ac315c889e15 Reviewed-on: http://gerrit.cloudera.org:8080/7752 Reviewed-by: Bharath Vissapragada <[email protected]> Reviewed-by: Sailesh Mukil <[email protected]> Tested-by: Impala Public Jenkins
The DCHECK is only valid if the scanner successfully processes the entire scan range. If the scanner is cancelled or hits an error in-between buffers, boundary_column_ may be non-empty. The fix is to move the DCHECK to the end of FinishScanRange(), where it only executes if the scan range is fully processed. Testing: Ran TestCancellation in a loop locally for an hour. Change-Id: I3d63f0b0c1ac82103a60b68b244156b9fbd575e0 Reviewed-on: http://gerrit.cloudera.org:8080/7756 Reviewed-by: Alex Behm <[email protected]> Tested-by: Impala Public Jenkins
Change-Id: I1e86ce13d1fdc73487b8067f3670ee73b9269366 Reviewed-on: http://gerrit.cloudera.org:8080/7767 Reviewed-by: Thomas Tauber-Marshall <[email protected]> Tested-by: Impala Public Jenkins
We were missing accounting for this, since it is part of the expected difference between query and process memory consumption. The identity that applies is: buffers allocated from system = reservation + cached buffers - unused reservation, where "cached buffers" includes free buffers and buffers attached to clean pages. The reservation is accounted against queries and "buffers allocated from system" is accounted against the process MemTracker. Reporting this in a direct way required adding a MemTracker with a negative consumption, which fortunately did not require any major changes to the MemTracker code. Example output when applied to buffer pool branch:
Process: Limit=8.35 GB Total=579.18 MB Peak=590.41 MB
Buffer Pool: Free Buffers: Total=268.25 MB
Buffer Pool: Clean Pages: Total=172.25 MB
Buffer Pool: Unused Reservation: Total=-8.25 MB
Free Disk IO Buffers: Total=21.98 MB Peak=21.98 MB
RequestPool=default-pool: Total=12.07 MB Peak=71.58 MB
... <snip> ...
RequestPool=fe-eval-exprs: Total=0 Peak=4.00 KB
Untracked Memory: Total=112.88 MB
Testing: Added a basic test for MemTrackers with negative metrics. Change-Id: Idb1fa3110dc893321f9f4e8ced6b7ede12194dad Reviewed-on: http://gerrit.cloudera.org:8080/7380 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
This is implemented in the ResourceProfileBuilder to avoid duplicating the logic in every plan node. Testing: Updated planner tests. Change-Id: I1e2853300371e31b13d81a763dbafb21709b16c4 Reviewed-on: http://gerrit.cloudera.org:8080/7703 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Rejects queries during admission control if: * the largest (across all backends) min buffer reservation is greater than the query mem_limit or buffer_pool_limit * the sum of the min buffer reservations across the cluster is larger than the pool max mem resources There are some other interesting cases to consider later: * every per-backend min buffer reservation is less than the associated backend's process mem_limit; the current admission control code doesn't know about other backends' proc mem_limits. Also reduces minimum non-reservation memory (IMPALA-5810). See the JIRA for experimental results that show this slightly improves min memory requirements for small queries. One reason to tweak this is to compensate for the fact that BufferedBlockMgr didn't count small buffers against the BlockMgr limit, but BufferPool counts all buffers against it. Testing: * Adds new test cases in test_admission_controller.py * Adds BE tests in reservation-tracker-test for the reservation-util code. Change-Id: Iabe87ce8f460356cfe4d1be4d7092c5900f9d79b Reviewed-on: http://gerrit.cloudera.org:8080/7678 Reviewed-by: Matthew Jacobs <[email protected]> Tested-by: Impala Public Jenkins
Change-Id: I9b1698ee423c0db419b5de74c3664c88294fff49 Reviewed-on: http://gerrit.cloudera.org:8080/7768 Reviewed-by: Bharath Vissapragada <[email protected]> Tested-by: Impala Public Jenkins
The join inversion optimisation did not factor in the degree of parallelism that the join executed with after inversion. In some cases this led to bad decisions, e.g. executing a join on a single node instead of 20 nodes. This patch adds a more sophisticated cost model that factors degree of parallelism into the join inversion decision. The behaviour is unchanged if inversion does not change the degree of parallelism. Perf: Ran cluster TPC-H and TPC-DS benchmarks. Average changes were small: < 3%. Saw a mix of improvements and regressions. We were satisfied that the regressions were cases when the planner "got lucky" previously. E.g. on TPC-H Q2 a join was flipped to put lineitem on the left as a result of inaccurate cardinality estimates. Mostafa also ran a TPC-DS benchmark where the dimension tables were loaded with num_nodes=1 to minimise the number of files. We saw some huge speedups there on the unmodified queries, e.g. TPCDS-Q10 went from 291s to 32.25s. The worst percentage regression was Q50, which went from 1.61s to 2.4s and the worst absolute regression was Q72, which went from 694s to 874s (25%). Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e Reviewed-on: http://gerrit.cloudera.org:8080/7351 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Make the call to format() compatible with older versions of python (< 2.7), which expect explicit indices in the string being formatted, e.g. "{0} {1} {2}".format('foo', 'bar', 'baz'). Without the numbers, format() raises an exception. Tested by running this test suite using python 2.6.6. Before the patch, the tests failed. After the patch, they pass. Change-Id: I5384aaf83a6a1f3c7643ed9f15de2dba1a5913a5 Reviewed-on: http://gerrit.cloudera.org:8080/7761 Reviewed-by: David Knupp <[email protected]> Tested-by: Impala Public Jenkins
…eIPA FreeIPA is a piece of software that automates and simplifies management of MIT krb5, SSSD, some LDAP service, etc. FreeIPA configures a localauth plugin[1] in krb5.conf to map Kerberos principals to local usernames. In this configuration, Kudu daemons were failing to start up due to failure to map their own service principals back to a username. This is due to a number of issues: 1) FreeIPA distinguishes between service principals and user principals and doesn't store a 'uid' field in LDAP for service principals. Thus, when 'sssd' tries to map a service principal to a local unix user, it determines that there is no such user (ie getpwnam() fails). This is by design, best I can tell. 2) sssd's implementation of krb5_auth_to_localname[1] uses getpwnam to try to map the kerberos principal to the local username. Because of the above, it fails for service principals. 3) Prior to el7.3, sssd configures krb5 with 'enable_only = sssd' in the localauth plugin section. This means that if sssd fails to perform the mapping, it does not fall back to other mappings defined in krb5.conf (eg explicitly defined auth_to_local rules). See [2] 4) Even after 7.3, there is an additional bug in sssd which I just filed[3], which causes the fallback to still not work. Because of this, we're getting the KRB5_PLUGIN_NO_HANDLE error code back up at the Kudu layer. We already have our own fallback case for KRB5_LNAME_NO_TRANS, and it seems like we should just be handling PLUGIN_NO_HANDLE in the same way to work around the above behavior. I tested this patch on a FreeIPA-configured system on el6.7. I was able to successfully start a master with a FreeIPA-provided keytab and authentication required, and use 'kudu table list' to authenticate to it. [1] https://github.com/SSSD/sssd/blob/master/src/krb5_plugin/sssd_krb5_localauth_plugin.c [2] https://bugzilla.redhat.com/show_bug.cgi?id=1297462 [3] https://pagure.io/SSSD/sssd/issue/3459 Change-Id: I7bc13b33053a73784350c9d30a3796a96d318c04 Reviewed-on: http://gerrit.cloudera.org:8080/7551 Tested-by: Kudu Jenkins Reviewed-by: Alexey Serbin <[email protected]> Reviewed-on: http://gerrit.cloudera.org:8080/7745 Reviewed-by: Michael Ho <[email protected]> Tested-by: Impala Public Jenkins
This change enforces the rule that constants under the be/src/kudu directory should be named kFooBar and fixes the cases where we didn't adhere to this. This is needed to avoid conflicts when cherry-picking other changes. Change-Id: I7971659ef3152580d44d6ddfb18be7ebf41052c7 Reviewed-on: http://gerrit.cloudera.org:8080/7158 Reviewed-by: Adar Dembo <[email protected]> Tested-by: Kudu Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/7740 Reviewed-by: Henry Robinson <[email protected]> Tested-by: Impala Public Jenkins
In Coordinator::UpdateBackendExecStatus(), we check if the backend has already completed with BackendState::IsDone() and return without applying the update if so to avoid updating num_remaining_backends_ twice for the same completed backend. The problem is that the value of BackendState::IsDone() is updated by the call to BackendState::ApplyExecStatusReport() that comes after it, but these operations are not performed atomically, so if there are two simultaneous calls to UpdateBackendExecStatus(), they can both call IsDone(), both get 'false', and then proceed to erroneously both update num_remaining_backends_, hitting a DCHECK. This patch modifies ApplyExecStatusReport to return true iff this report transitioned the backend to a done status, and then only updates num_remaining_backends_ in this case, ensuring it is only updated once per backend. Testing: - Ran test_finst_cancel_when_query_complete 10,000 times without hitting the DCHECK (previously, it would hit about once per 300 runs). Change-Id: I1528661e5df6d9732ebfeb414576c82ec5c92241 Reviewed-on: http://gerrit.cloudera.org:8080/7577 Reviewed-by: Dan Hecht <[email protected]> Tested-by: Impala Public Jenkins
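A simplified illustration of the fix, assuming a generic per-backend state object rather than Impala's actual Coordinator/BackendState classes. The key point is that the "became done" transition is decided under the same lock that applies the report, so exactly one caller observes it:

```cpp
#include <mutex>

struct BackendState {
  std::mutex lock;
  bool is_done = false;

  // Returns true iff THIS report transitioned the backend to done.
  bool ApplyExecStatusReport(bool report_says_done) {
    std::lock_guard<std::mutex> l(lock);
    if (is_done) return false;  // a later, duplicate report: nothing to do
    is_done = report_says_done;
    return is_done;
  }
};

// The caller decrements the remaining-backend count only when the transition
// actually happened, so it is decremented exactly once per backend.
void UpdateBackendExecStatus(BackendState& state, bool report_says_done, int& num_remaining_backends) {
  if (state.ApplyExecStatusReport(report_says_done)) --num_remaining_backends;
}
```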
Change-Id: I1212de7ed4e1f93767e00d1f3245b02bb88fa015 Reviewed-on: http://gerrit.cloudera.org:8080/7774 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Adds support for a "max_row_size" query option that instructs Impala to reserve enough memory to process rows of the specified size. For spilling operators, the planner reserves enough memory to process rows of this size. The advantage of this compared to simply specifying larger values for min_spillable_buffer_size and default_spillable_buffer_size is that operators may be able to handle larger rows without increasing the size of all their buffers. The default value is 512KB. I picked that number because it doesn't increase minimum reservations *too* much even with smaller buffers like 64kb but should be large enough for almost all reasonable workloads. This is implemented in the aggs and joins using the variable page size support added to BufferedTupleStream in an earlier commit. The synopsis is that each stream requires reservation for one default-sized page per read and write iterator, and temporarily requires reservation for a max-sized page when reading or writing larger pages. The max-sized write reservation is released immediately after the row is appended and the max-size read reservation is released after advancing to the next row. The sorter and analytic simply use max-sized buffers for all pages in the stream. Testing: Updated existing planner tests to reflect default max_row_size. Added new planner tests to test the effect of the query option. Added "set" test to check validation of query option. Added end-to-end tests exercising spilling operators with large rows with and without spilling induced by SET_DENY_RESERVATION_PROBABILITY. Change-Id: Ic70f6dddbcef124bb4b329ffa2e42a74a1826570 Reviewed-on: http://gerrit.cloudera.org:8080/7629 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
This patch adds a new command "rerun" and a shortcut "@" to impala-shell. Users can rerun a certain query by its index as given by the history command. A valid index is an integer in [1, history_length] or [-history_length, -1]. Negative values index history in reverse order. For example, "@1;" or "rerun 1;" reruns the first query shown in history and "@-1;" reruns the last query. The rerun command itself won't appear in history. The history index is 1-based and increasing. Old entries might be truncated when impala-shell starts, and the indexes will be realigned to 1, so the same index may refer to different commands among multiple impala-shell instances. Testing: A test case test_rerun is added to shell/test_shell_interactive.py Change-Id: Ifc28e8ce07845343267224c3b9ccb71b29a524d2 Reviewed-on: http://gerrit.cloudera.org:8080/7674 Reviewed-by: Sailesh Mukil <[email protected]> Tested-by: Impala Public Jenkins
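The index arithmetic described above, shown as a small C++ sketch (impala-shell itself is Python, and the names here are hypothetical):

```cpp
#include <cstddef>
#include <optional>

// Maps a 1-based or negative rerun index to a 0-based position in the
// history list, or nullopt if it is outside [1, N] and [-N, -1].
std::optional<std::size_t> ResolveRerunIndex(long idx, std::size_t history_length) {
  if (idx >= 1 && static_cast<std::size_t>(idx) <= history_length) {
    return static_cast<std::size_t>(idx) - 1;  // "@1" -> first entry shown by history
  }
  if (idx <= -1 && static_cast<std::size_t>(-idx) <= history_length) {
    return history_length - static_cast<std::size_t>(-idx);  // "@-1" -> most recent entry
  }
  return std::nullopt;
}
```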
Fixed a bug where impala crashes during execution of an aggregation query using nondeterministic grouping expressions. This happens when it tries to rebuild a spilled partition that can fit in memory and rows get re-hashed to a partition other than the spilled one due to the use of nondeterministic expressions. Testing: Added a query test to verify successful execution. Change-Id: Ibdb09239577b3f0a19d710b0d148e882b0b73e23 Reviewed-on: http://gerrit.cloudera.org:8080/7714 Reviewed-by: Dan Hecht <[email protected]> Tested-by: Impala Public Jenkins
Currently, a file handle in the file handle cache will only be evicted if the cache reaches its capacity. This means that file handles can be retained for an indefinite amount of time. This is true even for files that have been deleted, replaced, or modified. Since a file handle maintains a file descriptor for local files, this can prevent the disk space from being freed. Additionally, unused file handles are wasted memory. This adds code to evict file handles that have been unused for longer than a specified threshold. A thread periodically checks the file handle cache to see if any file handle should be evicted. The threshold is specified by 'unused_file_handle_timeout_sec'; it defaults to 6 hours. This adds a test to custom_cluster/test_hdfs_fd_caching.py to verify the eviction behavior. Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e Reviewed-on: http://gerrit.cloudera.org:8080/7640 Reviewed-by: Matthew Jacobs <[email protected]> Tested-by: Impala Public Jenkins
Seems to have broken with some recent commits. Change-Id: I9c22e197662228158d7935ebfb12d9b3691eb499 Reviewed-on: http://gerrit.cloudera.org:8080/6151 Reviewed-by: Dan Hecht <[email protected]> Tested-by: Impala Public Jenkins
The bug is that OutputAllBuild() called BufferedTupleStream::GetNext() while 'out_batch' still referenced data from the current page of the stream. When iterating over an unpinned stream, GetNext() sets the 'needs_deep_copy' flag when it hits the end of a page, so that the caller has an opportunity to flush or deep copy the data. On the next call to GetNext(), that page may be deleted or unpinned. The fix is to check whether the batch is at capacity before calling BTS::GetNext(). This issue was masked by not using the 'delete_on_read' mode of the stream, which would have freed the stream's buffers earlier and increased the odds of ASAN detecting the problem. Testing: Running TestTPCHJoinQueries.test_outer_joins() reliably reproduced this for me locally under ASAN. After the fix the problem does not reoccur. Change-Id: Ia14148499ddaec41c2e70fef5d53e5d06ea0538d Reviewed-on: http://gerrit.cloudera.org:8080/7772 Reviewed-by: Dan Hecht <[email protected]> Tested-by: Impala Public Jenkins
This patch adds decimal type codegen support in text scanner. Currently codegen would be disabled if there is a decimal column. With this patch StringParser::StringToDecimal will be called in generated code. A new file util/string-parser.cc is created and linked into libUtil. This file contains proxy functions to StringToDecimal in order to keep StringToDecimal out of LLVM IR. In a benchmark query: > select l_quantity, l_extendedprice, l_discount, l_tax from biglineitem where l_quantity > 100.0; where biglineitem is tpch.lineitem repeated 6 times, the codegen version is 19% faster than the non-codegen version in scanning, and 8% faster in query time. Codegen time in this simple case is 69ms. Simple performance tests show that putting the parser in libUtil instead of impala-sse.bc would reduce codegen time by 2/3 in cases where only one decimal column is parsed while the scanning time is nearly the same. Change-Id: Ia65820e969d59094dc92d912a5279fa90f6b179d Reviewed-on: http://gerrit.cloudera.org:8080/7683 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Sometimes the client is not open when the debug action fires at the start of Open() or Prepare(). In that case we should set the probability when the client is opened later. This caused one of the large row tests to start failing with a "failed to repartition" error in the aggregation. The error is a false positive caused by two distinct keys hashing to the same partition. Removing the check allows the query to succeed because the keys hash to different partitions in the next round of repartitioning. If we repeatedly get unlucky and have collisions, the query will still fail when it reaches MAX_PARTITION_DEPTH. Testing: Ran TestSpilling in a loop for a couple of hours, including the exhaustive-only tests. Change-Id: Ib26b697544d6c2312a8e1fe91b0cf8c0917e5603 Reviewed-on: http://gerrit.cloudera.org:8080/7771 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
In ParquetLevelDecoder::Init() for RLE encoding, we read the metadata size and advance the data buffer past it. If the metadata size is corrupted, it can cause us to incorrectly read past the end of the buffer. This patch checks that the metadata size is less than the total size of the buffer, and returns an error if it isn't. Testing: - Ran test_scanners_fuzz.py under ASAN 500 times without hitting the use-after-poison (previously it would usually hit in < 100 runs). Change-Id: I3f3d0d998f7581c7c935d98fde886f145efd61a8 Reviewed-on: http://gerrit.cloudera.org:8080/7769 Reviewed-by: Alex Behm <[email protected]> Reviewed-by: Matthew Jacobs <[email protected]> Tested-by: Impala Public Jenkins
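A minimal sketch of the added bounds check, assuming a hypothetical helper that validates the RLE metadata length prefix before the buffer pointer is advanced past it (not Impala's actual ParquetLevelDecoder code):

```cpp
#include <cstdint>
#include <cstring>

// Returns false if the buffer cannot hold the 4-byte length prefix, or if the
// embedded metadata size is negative or points past the end of the buffer.
bool CheckRleMetadataSize(const uint8_t* data, int64_t total_size, int32_t* metadata_size) {
  if (total_size < static_cast<int64_t>(sizeof(int32_t))) return false;
  std::memcpy(metadata_size, data, sizeof(int32_t));  // little-endian length prefix assumed
  if (*metadata_size < 0) return false;
  return *metadata_size <= total_size - static_cast<int64_t>(sizeof(int32_t));
}
```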
Filter states indexed by FilterStats::ROW_GROUPS_KEY were only relevant if the runtime filter was bound by partitioning columns. This is no longer true, since an always-false filter can filter out row groups as well, which results in a crash in CheckForAlwaysFalse(). This patch registers row groups counters unconditionally to fix this bug. Change-Id: I3eda43845b78516c1e76e07e0d2dd9d862168e1d Reviewed-on: http://gerrit.cloudera.org:8080/8369 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Uses a thread pool to issue many compute stats commands in parallel to Impala, rather than doing it serially. Where it was obvious, I combined multiple stats commands into fewer, to reduce the number of "show databases" and serialized "show tables" commands. This speeds up the compute stats step in data loading significantly. My measurements for testdata/bin/compute-table-stats.sh running before and after this change, with the Impala daemons restarted (cold) or not restarted (warm) on an 8-core, 32GB RAM machine were:
old, cold: 7m44s
new, cold: 1m42s
old, warm: 1m23s
new, warm: 48s
The data load in the full test build behaves in a cold fashion. It's typical for https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/ to run this compute stats step for 9 or 10 minutes. With this change, this will come down to about 2 minutes. Change-Id: Ifb080f2552b9dbe304ecadd6e52429214094237d Reviewed-on: http://gerrit.cloudera.org:8080/8354 Reviewed-by: David Knupp <[email protected]> Tested-by: Impala Public Jenkins
This commit loads functional-query, TPC-H data, and TPC-DS data in parallel. In parallel, these take about 37 minutes, dominated by functional-query. Serially, these take about 30 minutes more, namely the 13 minutes of tpch and 16 minutes of tpcds. This works out nicely because CPU usage during data load is very low in aggregate. (We don't sustain more than 1 CPU of load, whereas build machines are likely to have many CPUs.) To do this, I added support to run-step.sh to have a notion of a backgroundable task, and support waiting for all tasks. I also increased the heapsize of our HiveServer2 server. When datasets were being loaded in parallel, we ran out of memory at 256MB of heap. The resulting log output is currently like so (but without the timestamps):
15:58:04 Started Loading functional-query data in background; pid 8105.
15:58:04 Started Loading TPC-H data in background; pid 8106.
15:58:04 Loading functional-query data (logging to /home/impdev/Impala/logs/data_loading/load-functional-query.log)...
15:58:04 Started Loading TPC-DS data in background; pid 8107.
15:58:04 Loading TPC-H data (logging to /home/impdev/Impala/logs/data_loading/load-tpch.log)...
15:58:04 Loading TPC-DS data (logging to /home/impdev/Impala/logs/data_loading/load-tpcds.log)...
16:11:31 Loading workload 'tpch' using exploration strategy 'core' OK (Took: 13 min 27 sec)
16:14:33 Loading workload 'tpcds' using exploration strategy 'core' OK (Took: 16 min 29 sec)
16:35:08 Loading workload 'functional-query' using exploration strategy 'exhaustive' OK (Took: 37 min 4 sec)
I tested dataloading with the following command on an 8-core, 32GB machine. I saw 19GB of available memory during my run: ./buildall.sh -testdata -build_shared_libs -start_minicluster -start_impala_cluster -format Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac Reviewed-on: http://gerrit.cloudera.org:8080/8320 Reviewed-by: Jim Apple <[email protected]> Reviewed-by: Michael Brown <[email protected]> Reviewed-by: Alex Behm <[email protected]> Tested-by: Impala Public Jenkins
Before this patch, decimal operations would never produce an error. Division by and modulo zero would result in a NULL. In this patch, we change this behavior so that we raise an error instead of returning a NULL. We also modify the format of the decimal expr tests to include an error field. Testing: - Added several expr and end to end tests. Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb Reviewed-on: http://gerrit.cloudera.org:8080/8344 Reviewed-by: Dan Hecht <[email protected]> Tested-by: Impala Public Jenkins
The bug is that the pointer was a uint8_t*, which C++ iostreams print as a C-style string. The intent was to instead print the memory address, which happens for void*. To avoid the subtlety, instead convert to uintptr_t and print as hex. Change-Id: I89250646bf683dd2d636dcb37a66ceb0428af8b2 Reviewed-on: http://gerrit.cloudera.org:8080/8371 Reviewed-by: anujphadke <[email protected]> Reviewed-by: Dan Hecht <[email protected]> Tested-by: Impala Public Jenkins
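The overload-resolution pitfall and the fix can be shown in a few lines of standard C++ (a standalone illustration, not the patched Impala code):

```cpp
#include <cstdint>
#include <iostream>

int main() {
  uint8_t buf[4] = {1, 2, 3, 0};
  const uint8_t* p = buf;
  // operator<< treats unsigned char* like a C string, so this would print the
  // buffer's bytes as text rather than the pointer value:
  // std::cout << p << "\n";
  // Casting to void* selects the pointer overload and prints the address:
  std::cout << static_cast<const void*>(p) << "\n";
  // The chosen fix is more explicit still: convert to uintptr_t and print as hex.
  std::cout << std::hex << reinterpret_cast<uintptr_t>(p) << "\n";
  return 0;
}
```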
Testing: Added test case to verify that CopyRows in select node is successfully codegened. Improved test coverage for select node with limit. Performance: Queries used (num_nodes set to 1): 500 Predicates: select * from (select * from tpch_parquet.lineitem limit 6001215) t1 where l_partkey > 10 and l_extendedprice > 10000 and l_linenumber > 1 and l_comment >'foo0' .... and l_comment >'foo500' order by l_orderkey limit 10; 1 Predicate: select * from (select * from tpch_parquet.lineitem limit 6001215) t1 where l_partkey > 10 order by l_orderkey limit 10;
+--------------+------------+-------------+------------+-------------+
|              |      500 Predicates      |       1 Predicate        |
|              +------------+-------------+------------+-------------+
|              | After      | Before      | After      | Before      |
+--------------+------------+-------------+------------+-------------+
| Select Node  | 12s385ms   | 1m1s        | 234ms      | 797ms       |
| Codegen time | 2s619ms    | 1s962ms     | 200ms      | 181ms       |
+--------------+------------+-------------+------------+-------------+
Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e Reviewed-on: http://gerrit.cloudera.org:8080/8196 Reviewed-by: Bikramjeet Vig <[email protected]> Tested-by: Impala Public Jenkins
Dataload typically follows a pattern of loading data into a text version of a table, and then using an insert overwrite from the text table to populate the table for other file formats. This insert is always done in Impala for Parquet and Kudu. Otherwise it runs in Hive. Since Impala doesn't support writing nested data, the population of complextypes_fileformat tries to hack the insert to run in Hive by including it in the ALTER part of the table definition. ALTER runs immediately after CREATE and always runs in Hive. The problem is that ALTER also runs before the base table (functional.complextypes_fileformat) is populated. The insert succeeds, but it is inserting zero rows. This code change introduces a way to force the Parquet load to run using Hive. This lets complextypes_fileformat specify that the insert should happen in Hive and fixes the ordering so that the table is populated correctly. This is also useful for loading custom Parquet files into Parquet tables. Hive supports the LOAD DATA LOCAL syntax, which can read a file from the local filesystem. This means that several locations that currently use the hdfs commandline can be modified to use this SQL. This change speeds up dataload by a few minutes, as it avoids the overhead of the hdfs commandline. Any other location that could use LOAD DATA LOCAL is also switched over to use it. This includes the testescape* tables which now print the appropriate LOAD DATA commands as a result of text_delims_table.py. Any location that already uses LOAD DATA LOCAL is also switched to indicate that it must run in Hive. Any location that was doing an HDFS command in the LOAD section is moved to the LOAD_DEPENDENT_HIVE section. Testing: Ran dataload and core tests. Also verified that functional_parquet.complextypes_fileformat has rows. Change-Id: I7152306b2907198204a6d8d282a0bad561129b82 Reviewed-on: http://gerrit.cloudera.org:8080/8350 Reviewed-by: Joe McDonnell <[email protected]> Tested-by: Impala Public Jenkins
'Test case 16' in test_row_filters has been failing occasionally on ASAN as the runtime filters are not generated within the specified RUNTIME_FILTER_WAIT_TIME_MS. The fix is to increase RUNTIME_FILTER_WAIT_TIME_MS. This patch updates all of the tests in test_row_filters to use the same timeout, which is set to a higher value for ASAN builds. Change-Id: Ia098735594b36a72f02bf7edd051171689618051 Reviewed-on: http://gerrit.cloudera.org:8080/8358 Reviewed-by: Alex Behm <[email protected]> Tested-by: Impala Public Jenkins
Change-Id: I2c261f32aa90aec67dbd1e5f172323e4c3feaa1d Reviewed-on: http://gerrit.cloudera.org:8080/8378 Reviewed-by: Jim Apple <[email protected]> Tested-by: Impala Public Jenkins
This adds a warning to impala-shell if -r/--refresh_after_connect is used, in anticipation of us removing the feature in a future version. Change-Id: Id297f80c0f596a69ef8ecde948812b82d2a5c0fa Reviewed-on: http://gerrit.cloudera.org:8080/8381 Reviewed-by: Philip Zeyliger <[email protected]> Reviewed-by: Bharath Vissapragada <[email protected]> Tested-by: Impala Public Jenkins
The tpcds-q22a.test test file has a comment before a "set" command. The regex used to match "set" commands does not handle preceding comments, which are part of the query statement. Testing: Ran the test with the below command and confirmed that DECIMAL_V2 was automatically set back to 0. impala-py.test tests/query_test/test_tpcds_queries.py -k 22a \ --capture=no Change-Id: Id549dd3369dd163f3b3c8fe5685a52e0e6b2d134 Reviewed-on: http://gerrit.cloudera.org:8080/8384 Reviewed-by: Michael Brown <[email protected]> Tested-by: Impala Public Jenkins
We may be seeing a race with errors like "java.io.FileNotFoundException: File /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist". This reverts commit e020c37. Change-Id: I46da93f4315a5a4bdaa96fa464cb51922bd6c419 Reviewed-on: http://gerrit.cloudera.org:8080/8386 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Currently, the FE generates a number of runtime filters and assigns them to the single node plan without taking the value of RUNTIME_FILTER_MODE and DISABLE_ROW_RUNTIME_FILTERING query options into account. The backend then removes filters from exec nodes, based on the following rules: 1. If DISABLE_ROW_RUNTIME_FILTERING is set, filters are removed from the exec nodes that are marked as targets not bound by partition columns. 2. If RUNTIME_FILTER_MODE is set to LOCAL, filters are removed from the exec nodes that are marked as remote targets. This may cause some confusion to users because they may see runtime filters in the output of explain that are not applied when the query is executed. This change moves the logic of runtime filter pruning to the planner in the FE. The runtime filter assignment is done on the distributed plan and the above constraints are enforced there directly. Change-Id: Id0f0b200e02442edcad8df3979f652d66c6e52eb Reviewed-on: http://gerrit.cloudera.org:8080/7564 Tested-by: Impala Public Jenkins Reviewed-by: Alex Behm <[email protected]>
…oter size This commit introduces the compile-time constant READ_SIZE_MIN_VALUE in the newly created common/global-flags.h. A static_assert checks if READ_SIZE_MIN_VALUE is greater than or equal to HdfsParquetScanner::FOOTER_SIZE. Also, moved the definition of the read_size flag to global-flags.cc, and declared it in global-flags.h. Runtime checks test if read_size is greater than or equal to READ_SIZE_MIN_VALUE. If not, the process is terminated. Change-Id: Ib7364147e7daf5380f11368871f8479646b448da Reviewed-on: http://gerrit.cloudera.org:8080/8366 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
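A sketch of the compile-time and runtime guards described above; the constant values below are placeholders for illustration, not Impala's actual numbers:

```cpp
#include <cstdint>
#include <cstdlib>
#include <iostream>

constexpr int64_t FOOTER_SIZE = 100 * 1024;          // placeholder value
constexpr int64_t READ_SIZE_MIN_VALUE = 128 * 1024;  // placeholder value
static_assert(READ_SIZE_MIN_VALUE >= FOOTER_SIZE,
              "minimum read size must be able to cover a Parquet footer");

// Runtime check on the user-settable read_size flag: terminate the process if
// the configured value is below the compile-time minimum.
void ValidateReadSize(int64_t read_size) {
  if (read_size < READ_SIZE_MIN_VALUE) {
    std::cerr << "read_size must be >= " << READ_SIZE_MIN_VALUE << std::endl;
    std::exit(1);
  }
}
```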
Implements multi-threaded block metadata loading on the Catalog server where we fetch block metadata for multiple partitions of a single table in parallel. The number of threads used to load the metadata is controlled by the following two parameters (set on Catalog server startup and applied for each table load): -max_hdfs_partitions_parallel_load (default=5) -max_nonhdfs_partitions_parallel_load (default=20) We use different thread pool sizes for HDFS and non-HDFS tables since non-HDFS supports much higher throughput of RPC calls for listStatus/listFiles. Based on our experiments, S3 showed a linear speed up (up to ~113x) with increasing number of loading threads whereas the HDFS throughput was limited to ~5x in unsecured clusters and up to ~3.7x in secure clusters. We narrowed it down to scalability bottlenecks in the HDFS RPC implementation (HADOOP-14558) on both the server and the client side. One thing to note here is that the thread pool based metadata fetching is implemented only for loading HDFS block metadata and not for loading HMS partition information. Our experiments showed that while loading large partitioned tables, ~90% of the time is spent in connecting to the NN and loading the HDFS block information, and optimizing the remaining ~10% makes the code unnecessarily complex without much gain. Additional notes: - The multithreading approach is implemented for * INVALIDATE (loading from scratch), * REFRESH (reusing existing md) code paths, * ALTER TABLE ADD/RECOVER PARTITIONS. - This patch makes the implementation of ListMap thread-safe since we use that data structure as a shared state between multiple partition metadata loading threads. Testing and Results: - This patch doesn't add any new tests since there is enough test coverage already. Passed core/exhaustive runs with HDFS/S3. - We noticed up to ~113x speedup on S3 tables (thread_pool_size=160) and up to ~5x speed up in unsecured HDFS clusters and ~3.7x in secure HDFS clusters. - Synthesized the following two large tables on HDFS and S3 and noticed significant reduction in my test DDL queries. (1) 100K partitions + 1 million files (2) 80 partitions + 250K files
100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
80-PARTITIONS-250K-FILES-11-REFRESH-PARTITION I -23.57%
80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
80-PARTITIONS-250K-FILES-09-INVALIDATE I -24.88%
80-PARTITIONS-250K-FILES-03-RECOVER I -35.90%
80-PARTITIONS-250K-FILES-07-REFRESH I -43.03%
100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS I -43.93%
100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV I -46.59%
80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION I -48.71%
100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH I -49.02%
80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV I -49.05%
100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
80-PARTITIONS-250K-FILES-S3-03-RECOVER I -67.17%
80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV I -76.45%
80-PARTITIONS-250K-FILES-S3-07-REFRESH I -87.04%
80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PART I -88.57%
Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481 Reviewed-on: http://gerrit.cloudera.org:8080/8235 Reviewed-by: Bharath Vissapragada <[email protected]> Tested-by: Impala Public Jenkins
Impala currently kinits by forking off a child process. This has proved to be expensive in many cases since the subprocess tries to reserve as much memory as Impala is currently using, which can be quite a lot. This patch adds a flag called 'use_kudu_kinit' that defaults to true. When it's true, it uses the Kudu security library's kinit code that programmatically uses the krb5 library to kinit. When it's false, we run our current path which kicks off the kinit-thread and forks off a kinit process periodically to reacquire tickets based on FLAGS_kerberos_reinit_interval. Converted existing tests in thrift-server-test to run with and without kerberos. We now run this BE test with kerberos by using Kudu's MiniKdc utility. This introduces a new dependency on some kerberos binaries that are checked through FindKerberosPrograms.cmake. Note that this is only a test dependency and not a dependency for the impalad binaries and friends. Compilation will still succeed if the kerberos binaries for the MiniKdc are not found, however, the thrift-server-test will fail. We run with and without the 'use_kudu_kinit' flag. TODO: Since the setting up and tearing down of our security code isn't idempotent, we can run only one test in a process with Kerberos now (IMPALA-6085). Updated bin/bootstrap_system.sh to install new sasl-gssapi modules and the kerberos binaries required for the MiniKdc. Also fixed a bug that didn't transfer the environment into 'sudo' in bin/bootstrap_system.sh. Testing: Verified with thrift-server-test and also manually on a live kerberized cluster. Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7 Reviewed-on: http://gerrit.cloudera.org:8080/7938 Reviewed-by: Sailesh Mukil <[email protected]> Tested-by: Impala Public Jenkins
This patch removes BufferedTupleStream::GetRows. This function pins a stream and reads all the rows into a single batch. It is not a good API since it creates an arbitrarily large row batch. In this patch the call sites pin the stream and then directly use GetNext to retrieve a single batch at a time. Testing: It passes existing tests. A test case for GetRows is removed. Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Reviewed-on: http://gerrit.cloudera.org:8080/8226 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
We don't know the root cause yet but try to improve things: * Eliminate one possible cause of flakiness - unfinished fragments left from previous queries. * Print a profile if an assertion fails so we can see why it failed. Testing: Ran core tests. Change-Id: Ic332dddd96931db807abb960db43b99e5fd0f256 Reviewed-on: http://gerrit.cloudera.org:8080/8403 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Release of backend resources for query execution on the coordinator daemon should occur eagerly as soon as query execution is finished or cancelled. Before this patch some resources managed by QueryState, like scratch files, were only released when the query was closed and all of the query's control structures were torn down. These resources are referred to as "ExecResources" in various places to distinguish them from resources associated with the client request (like the result cache) that are still required after the query finishes executing. This first change does not solve the admission control problem for two reasons: * We don't release the "admitted" memory on the coordinator until the query is unregistered. * Admission control still considers the memory reserved until the query memtracker is unregistered, which happens only when the QueryState is destroyed: see MemTracker::GetPoolMemReserved(). The flow is mostly similar to initial_reservation_refcnt_, except the coordinator also holds onto a reference count, which is released when either the final row is returned or cancellation is initiated. After the coordinator releases its refcount, the resources can be freed as soon as local fragments also release their refcounts. Also clean up Coordinator slightly by preventing runtime_state() from leaking out to ClientRequestState - instead it's possible to log the query MemTracker by following parent links in the MemTracker. This patch is partially based on Joe McDonnell's IMPALA-1575 patch. Testing: Ran core tests. Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Reviewed-on: http://gerrit.cloudera.org:8080/8303 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
HdfsAvroScanner::CodegenMaterializeTuple generates a function linear in size to the number of columns. On 1000 column tables, codegen time is significant. This commit roughly halves it for wide tables. (Note that this had been much worse in recent history (<= Impala 2.9).) It does so by breaking up MaterializeTuple() into multiple smaller functions, and then calls them in order. When breaking up into 200-column chunks, there is a noticeable speed-up. I've made the helper code for generating LLVM function prototypes have a mutable function name, so that the builder can be re-used multiple times. I've checked by inspecting optimized LLVM that in the case where there's only 1 helper function, code gets inlined so that there doesn't seem to be an extra function. I measured codegen time for various "step sizes." The case where there are no helper functions is about 2.7s. The best case was about a step size of 200, with timings of 1.3s. For the query "select count(int_col16) from functional_avro.widetable_1000_cols", codegen times as a function of step size are roughly as follows. This is averaged across 5 executions, and rounded to 0.1s.
step  time
10    2.4
50    2.5
75    2.9
100   3.0
125   3.0
150   1.4
175   1.3
200   1.3  <-- chosen step size
225   1.5
250   1.4
300   1.6
400   1.6
500   1.8
1000  2.7
The raw data was generated like so, with some code that let me change the step size at runtime: $(for step in 10 50 75 100 125 150 175 200 225 250 300 400 500 1000; do for try in $(seq 5); do echo $step > /tmp/step_size.txt; echo -n "$step "; impala-shell.sh -q "select count(int_col16) from functional_avro.widetable_1000_cols; profile;" 2> /dev/null | grep -A9 'CodeGen:(Total: [0-9]*s' -m 1 | sed -e 's/ - / /' | sed -e 's/([0-9]*)//' | tr -d '\n' | tr -s ' ' ' '; echo; done; done) | tee out.txt
... 200 CodeGen:(Total: 1s333ms, non-child: 1s333ms, % non-child: 100.00%) CodegenTime: 613.562us CompileTime: 605.320ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 38 NumInstructions: 8.44K OptimizationTime: 701.276ms PeakMemoryUsage: 4.12 MB PrepareTime: 10.014ms
... 1000 CodeGen:(Total: 2s659ms, non-child: 2s659ms, % non-child: 100.00%) CodegenTime: 558.860us CompileTime: 1s267ms LoadTime: 0.000ns ModuleBitcodeSize: 1.95 MB NumFunctions: 34 NumInstructions: 8.41K OptimizationTime: 1s362ms PeakMemoryUsage: 4.11 MB PrepareTime: 10.574ms
I have run the core tests with this change. Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476 Reviewed-on: http://gerrit.cloudera.org:8080/8211 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Currently a "select *" query in test_inline_view_limit fails during exhaustive testing because Impala returns columns from HBase tables in a different order (IMPALA-886) than the one expected. This fix ensures the column order is consistent by specifying the output columns in the right order in the select query. Testing: Tested locally, with and without exhaustive exploration strategy. Change-Id: I11667872b8788a8b0040bf9252bf07b987b5d330 Reviewed-on: http://gerrit.cloudera.org:8080/8409 Reviewed-by: Lars Volker <[email protected]> Tested-by: Impala Public Jenkins
Impala would previously update the ddl time of a table when dropping a partition but not when adding one. This change removes updates to the ddl time when partitions are added or removed to be consistent with Hive. Additionally the check in the ddl update test would fail if some operations took longer than 20 seconds. Instead, this change makes sure that the ddl time increases as intended. To test this change I ran test_last_ddl_time_update in exhaustive mode and also ran a private S3 build. Change-Id: I3126252e7709304d3e1fa4bb06a0b847180bd6cf Reviewed-on: http://gerrit.cloudera.org:8080/8411 Reviewed-by: Bharath Vissapragada <[email protected]> Tested-by: Impala Public Jenkins
Coordinator::UpdateFilter wrongly assumed std::string::capacity() to be 0 after std::string::clear(), resulting in a DCHECK failure in memory tracking. This patch tracks size() instead of capacity() to fix the bug. Change-Id: Ia28f9eb8d0b3e795d798949a1cfcfeaaec7040ed Reviewed-on: http://gerrit.cloudera.org:8080/8410 Reviewed-by: Tim Armstrong <[email protected]> Reviewed-by: Alex Behm <[email protected]> Tested-by: Impala Public Jenkins
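A few lines of standard C++ show why that assumption does not hold: clear() only guarantees size() == 0, while capacity() is implementation-defined and typically keeps the previously allocated storage (a standalone illustration, not the patched Impala code):

```cpp
#include <iostream>
#include <string>

int main() {
  std::string s(1024, 'x');
  s.clear();
  std::cout << "size=" << s.size()           // always 0 after clear()
            << " capacity=" << s.capacity()  // usually still >= 1024
            << "\n";
  return 0;
}
```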
A recent patch for IMPALA-5129 introduced a use-after-free bug in thrift-server-test. It is fixed in this patch. Change-Id: I2cd434757de2cd384def5b360a479e51812cccca Reviewed-on: http://gerrit.cloudera.org:8080/8412 Reviewed-by: Sailesh Mukil <[email protected]> Tested-by: Impala Public Jenkins
Change-Id: I93e99aec2fcbc12f94678e60ebb9d150e72fc77d Reviewed-on: http://gerrit.cloudera.org:8080/8421 Reviewed-by: Bharath Vissapragada <[email protected]> Tested-by: Impala Public Jenkins
test_wait_time has been flaky recently on ASAN due to hitting a timeout. The fix is to increase the timeout for ASAN builds. Change-Id: Iee005bee8e0a535ce59d2e23e56be6004f2eb9de Reviewed-on: http://gerrit.cloudera.org:8080/8427 Reviewed-by: Alex Behm <[email protected]> Tested-by: Impala Public Jenkins
The approach is to re-materialize strings in those tuples that survive conjunct evaluation and may reference disk I/O buffers directly. This means that perf should not regress for the following cases: * Compressed Avro files. * Non-string columns. * Selective scans where the majority of tuples are filtered out. This approach will also work for the Sequence and Text scanners. Includes some improvements to Avro codegen to replace more constants to help win back some performance (with limited success): replaced InitTuple() with an optimised version and substituted tuple_byte_size() with a constant. Removes dead code for handling CHAR(n) - CHAR(n) is now always fixed length. Perf: Did microbenchmarks on uncompressed Avro files, one with all columns from lineitem and one with only l_comment. Tests were run with: set num_scanner_threads=1; I ran the query 5 times and extracted MaterializeTupleTime from the profile to measure CPU cost of materialization. Overall string materialization got significantly slower, mainly because of the extra memcpy() calls required. Selecting one string from a table with multiple columns: select min(l_comment) from biglineitem_avro 1.814 -> 2.096 Selecting one string from a table with one column: select min(l_comment) from biglineitem_comment; profile; 1.708 -> 3.7 Selecting one string from a table with one column with predicate: select min(l_comment) from biglineitem_comment where length(l_comment) > 10000; 1.691 -> 1.449 Selecting all columns: select min(l_orderkey), min(l_partkey), min(l_suppkey), min(l_linenumber), min(l_quantity), min(l_extendedprice), min(l_discount), min(l_tax), min(l_returnflag), min(l_linestatus), min(l_shipdate), min(l_commitdate), min(l_receiptdate), min(l_shipinstruct), min(l_shipmode), min(l_comment) from biglineitem_avro; profile; 2.335 -> 3.711 Selecting an int column (no strings): select min(l_linenumber) from biglineitem_avro 1.806 -> 1.819 Testing: Ran exhaustive tests. Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233 Reviewed-on: http://gerrit.cloudera.org:8080/8146 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
This is a revert of a revert, re-enabling parallel data load. It avoids the race condition by explicitly configuring the temporary directory in question in load-data.py. When the parallel data load change went in, we discovered a race with a signature of: java.io.FileNotFoundException: File /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist The number in this path is milliseconds since the epoch, and the race occurs when two queries submitted to HiveServer2, running with the local runner, hit the same millisecond time stamp. The upstream bug is https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which is now marked as a dupe). I've tested this by running data load 5 times on the same machines where it failed before. I also ran data load manually and inspected the system to make sure that the temporary directories are getting created as expected in /tmp/impala-data-load-*. Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10 Reviewed-on: http://gerrit.cloudera.org:8080/8405 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins
Setting a very large value could cause strange behaviour like crashing, hanging, excessive memory usage, spinning etc. This patch rejects values out of the range [0,65536]. Change-Id: Idd5a2490a73b6915224160d7604b4badc72c1d97 Reviewed-on: http://gerrit.cloudera.org:8080/8419 Reviewed-by: Tim Armstrong <[email protected]> Tested-by: Impala Public Jenkins