The Protocol Buffers files changed in the following 70 commits:
| Commit: | 31c6e2a | |
|---|---|---|
| Author: | Andrew Sherman | |
| Committer: | Jenkins | |
IMPALA-7985: Port RemoteShutdown() to KRPC. The :shutdown command is used to shut down a remote server. The common case is that a user specifies the impalad to shut down by specifying a host, e.g. :shutdown('host100'). If a user has more than one impalad on a remote host, then the form :shutdown('<host>:<port>') can be used to specify the port by which the impalad can be contacted. Prior to IMPALA-7985 this port was the backend port, e.g. :shutdown('host100:22000'). With IMPALA-7985 the port to use is the KRPC port, e.g. :shutdown('host100:27000'). Shutdown is implemented by making an RPC call to the target impalad. This changes the implementation of this call to use KRPC. To aid the user in finding the KRPC port, the KRPC address is added to the /backends section of the debug web page. We attempt to detect the case where :shutdown is pointed at a thrift port (like the backend port) and print an informative message. Documentation of this change will be done in IMPALA-8098. Further improvements to DoRpcWithRetry() will be done in IMPALA-8143. For discussion of why it was chosen to implement this change in an incompatible way, see comments in https://issues.apache.org/jira/browse/IMPALA-7985. TESTING Ran all end-to-end tests. Enhance the test for /backends in test_web_pages.py. In test_restart_services.py add a call to the old backend port to the test. Some expected error messages were changed in line with what KRPC returns. Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e Reviewed-on: http://gerrit.cloudera.org:8080/12260 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
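The two argument forms accepted by :shutdown, `<host>` and `<host>:<port>`, can be illustrated with a small parser. This is only a sketch and not Impala's code: `ShutdownTarget` and `parse_shutdown_target` are hypothetical names, a missing port is returned as `None` so the caller can substitute the server's KRPC port, and IPv6 literals are not handled.

```python
from typing import NamedTuple, Optional


class ShutdownTarget(NamedTuple):
    """Hypothetical container for a parsed :shutdown argument."""
    host: str
    port: Optional[int]  # None means "caller should use the default KRPC port"


def parse_shutdown_target(arg: str) -> ShutdownTarget:
    """Split '<host>' or '<host>:<port>' into host and optional port."""
    host, sep, port_text = arg.partition(":")
    if not host:
        raise ValueError(f"missing host in {arg!r}")
    if not sep:
        # Host-only form, e.g. 'host100'.
        return ShutdownTarget(host, None)
    if not port_text.isdigit():
        raise ValueError(f"invalid port in {arg!r}")
    port = int(port_text)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range in {arg!r}")
    return ShutdownTarget(host, port)
```

For example, `parse_shutdown_target("host100:27000")` yields the host `host100` with port `27000`, matching the KRPC form quoted in the commit message.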
The documentation is generated from this commit.
| Commit: | 5825509 | |
|---|---|---|
| Author: | Thomas Tauber-Marshall | |
| Committer: | Jenkins | |
IMPALA-4555: Make QueryState's status reporting more robust

QueryState periodically collects runtime profiles from all of its fragment instances and sends them to the coordinator. Previously, each time this happens, if the rpc fails, QueryState will retry twice after a configurable timeout and then cancel the fragment instances under the assumption that the coordinator no longer exists.

We've found in real clusters that this logic is too sensitive to failed rpcs and can result in fragment instances being cancelled even in cases where the coordinator is still running.

This patch makes a few improvements to this logic:
- When a report fails to send, instead of retrying the same report quickly (after waiting report_status_retry_interval_ms), we wait the regular reporting interval (status_report_interval_ms), regenerate any stale portions of the report, and then retry.
- A new flag, --status_report_max_retries, is introduced, which controls the number of failed reports that are allowed before the query is cancelled. --report_status_retry_interval_ms is removed.
- Backoff is used for repeated failed attempts, such that for a period between retries of 't', on try 'n' the actual timeout will be t * n.

Testing:
- Added a test which results in a large number of failed intermediate status reports but still succeeds.

Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971 Reviewed-on: http://gerrit.cloudera.org:8080/12049 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
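The linear backoff described in this commit (a wait of t * n on try n, with a cap on failed reports) can be sketched as follows. This is an illustration only, not the Impala implementation: `retry_wait_ms`, `report_with_retries`, and the `send_report` callback are hypothetical names, and the exact point at which the retry limit triggers cancellation is an assumption.

```python
import time
from typing import Callable


def retry_wait_ms(interval_ms: int, attempt: int) -> int:
    """Wait before retry number `attempt` (1-based): t * n."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return interval_ms * attempt


def report_with_retries(
    send_report: Callable[[], bool],
    interval_ms: int,
    max_retries: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send_report until it succeeds or too many reports have failed.

    send_report is expected to build a fresh report on every call, which
    stands in for "regenerate any stale portions of the report".
    Returns True on success, False if the caller should cancel the query.
    """
    failures = 0
    while True:
        if send_report():
            return True
        failures += 1
        if failures > max_retries:  # assumed meaning of the retry limit
            return False
        sleep(retry_wait_ms(interval_ms, failures) / 1000.0)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays: a test can record the requested waits instead of sleeping.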
| Commit: | 382b4de | |
|---|---|---|
| Author: | Andrew Sherman | |
| Committer: | Jenkins | |
IMPALA-7468: Port CancelQueryFInstances() to KRPC. When the Coordinator needs to cancel a query (for example because a user has hit Control-C), it does this by sending a CancelQueryFInstances message to each fragment instance. This change switches this code to use KRPC. Add new protobuf definitions for the messages, and remove the old thrift definitions. Move the server-side implementation of Cancel() from ImpalaInternalService to ControlService. Rework the scheduler so that the FInstanceExecParams always contains the KRPC address of the fragment executors, this address can then be used if a query is to be cancelled. For now keep the KRPC calls to CancelQueryFInstances() as synchronous. While moving the client-side code, remove the fault injection code that was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with --fault_injection_rpc_exception_type=1) as this tickles code in client-cache.h which is now not used. TESTING: Ran all end-to-end tests. No new tests as test_cancellation.py provides good coverage. Checked in debugger that DebugAction style fault injection (triggered from test_cancellation.py) was working correctly. Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 Reviewed-on: http://gerrit.cloudera.org:8080/12142 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
| Commit: | 8e84bf2 | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Laszlo Gaal | |
Add missing authorization in KRPC In 2.12.0, Impala adopted the Kudu RPC library for certain backend services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamService. The authorization method will only let an authenticated principal that matches FLAGS_principal / FLAGS_be_principal access the service. Also added a new startup flag --krb5_ccname to allow users to customize the locations of the Kerberos credentials cache. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. ==C5_APPROVED_BUGFIX== Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Reviewed-on: http://gerrit.cloudera.org:8080/11331 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> (cherry picked from commit dcb53473dd2c1e4ecf375af81ca3f9e2f61ead9f) Reviewed-on: https://gerrit.sjc.cloudera.com/36787 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Michael Ho <kwho@cloudera.com> Reviewed-on: https://gerrit.sjc.cloudera.com/39276 Tested-by: Laszlo Gaal <laszlo.gaal@cloudera.com> Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
| Commit: | 2afb2f0 | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Jenkins | |
IMPALA-4063: Merge report of query fragment instances per executor Previously, each fragment instance executing on an executor will independently report its status to the coordinator periodically. This creates a huge amount of RPCs to the coordinator under highly concurrent workloads, causing lock contention in the coordinator's backend states when multiple fragment instances send them at the same time. In addition, due to the lack of coordination between query fragment instances, a query may end without collecting the profiles from all fragment instances when one of them hits an error before another fragment instance manages to finish Prepare(), leading to missing profiles for certain fragment instances. This change fixes the problem above by making a thread per QueryState (started by QueryExecMgr) to be responsible for periodically reporting the status and profiles of all fragment instances of a query running on a backend. As part of this refactoring, each query fragment instance will not report their errors individually. Instead, there is a cumulative status maintained per QueryState. It's set to the error status of the first fragment instance which hits an error or any general error (e.g. failure to start a thread) when starting fragment instances. With this change, the status reporting threads are also removed. Testing done: exhaustive tests This patch is based on a patch by Sailesh Mukil Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Reviewed-on: http://gerrit.cloudera.org:8080/11615 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
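The "cumulative status" idea, where the first error reported for a query wins and later errors do not overwrite it, can be sketched like this. It is a simplified illustration under assumptions, not the QueryState code: `QueryStatus` and its methods are hypothetical names, and errors are plain strings.

```python
import threading
from typing import Optional


class QueryStatus:
    """Hypothetical per-query status that keeps only the first error."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._error: Optional[str] = None

    def record_error(self, message: str) -> bool:
        """Store `message` if no error is recorded yet.

        Returns True if this call set the status, False if an earlier
        error was already recorded and is kept unchanged.
        """
        with self._lock:
            if self._error is None:
                self._error = message
                return True
            return False

    def error(self) -> Optional[str]:
        """Return the first recorded error, or None if the query is OK."""
        with self._lock:
            return self._error
```

A single reporting thread can then read `error()` once per interval and send one status for the whole query, rather than each fragment instance reporting its own error.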
| Commit: | 56fb619 | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Jenkins | |
IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done as part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connections to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. To avoid unnecessary delays due to sharing the network connections between DataStream service and Control service, this change added the service name as part of the user credentials for the ConnectionId so each service will use a separate connection. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix adds a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe Reviewed-on: http://gerrit.cloudera.org:8080/10855 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
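The duplicate-report fix for IMPALA-7241 relies on a monotonically increasing version number per fragment instance: a report is applied only if its version is greater than the last one applied. A minimal sketch of that check, assuming string instance ids and integer versions (`ReportTracker` and `should_apply` are hypothetical names, not Impala's API):

```python
from typing import Dict


class ReportTracker:
    """Hypothetical coordinator-side filter for retried or stale reports."""

    def __init__(self) -> None:
        # Fragment instance id -> highest report version applied so far.
        self._last_version: Dict[str, int] = {}

    def should_apply(self, instance_id: str, version: int) -> bool:
        """Return True and remember `version` if the report is new.

        Reports whose version is smaller than or equal to the last
        applied version for the same instance are ignored, so a retried
        RPC cannot apply the same DML stats twice.
        """
        last = self._last_version.get(instance_id)
        if last is not None and version <= last:
            return False
        self._last_version[instance_id] = version
        return True
```

Because the check is keyed by instance, a retry of version 1 from one instance is dropped while version 1 from a different instance is still accepted.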
| Commit: | 6390769 | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Michael Ho | |
Add missing authorization in KRPC In 2.12.0, Impala adopted the Kudu RPC library for certain backend services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamService. The authorization method will only let an authenticated principal that matches FLAGS_principal / FLAGS_be_principal access the service. Also added a new startup flag --krb5_ccname to allow users to customize the locations of the Kerberos credentials cache. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. ==C5_APPROVED_BUGFIX== Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Reviewed-on: http://gerrit.cloudera.org:8080/11331 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> (cherry picked from commit dcb53473dd2c1e4ecf375af81ca3f9e2f61ead9f) Reviewed-on: https://gerrit.sjc.cloudera.com/36787 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Michael Ho <kwho@cloudera.com>
| Commit: | dcb5347 | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Jenkins | |
Add missing authorization in KRPC In 2.12.0, Impala adopted the Kudu RPC library for certain backend services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamService. The authorization method will only let an authenticated principal that matches FLAGS_principal / FLAGS_be_principal access the service. Also added a new startup flag --krb5_ccname to allow users to customize the locations of the Kerberos credentials cache. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Reviewed-on: http://gerrit.cloudera.org:8080/11331 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
| Commit: | 9de081a | |
|---|---|---|
| Author: | Lars Volker | |
| Committer: | Jenkins | |
IMPALA-7006: Remove KRPC folders Change-Id: Ic677484c27ed18b105da0a6b0901df4eb9f248e6 Reviewed-on: http://gerrit.cloudera.org:8080/10756 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Lars Volker <lv@cloudera.com>
| Commit: | 2a6d08d | |
|---|---|---|
| Author: | Lars Volker | |
| Committer: | Jenkins | |
IMPALA-7006: Add KRPC folders from kudu@334ecafd cp -a ~/checkout/kudu/src/kudu/{rpc,util,security} be/src/kudu/ Change-Id: I232db2b4ccf5df9aca87b21dea31bfb2735d1ab7 Reviewed-on: http://gerrit.cloudera.org:8080/10757 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Lars Volker <lv@cloudera.com>
| Commit: | d17722e | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Agnes Tevesz | |
IMPALA-6685: Improve profiles in KrpcDataStreamRecvr and KrpcDataStreamSender This change implements a couple of improvements to the profiles of KrpcDataStreamRecvr and KrpcDataStreamSender: - track pending number of deferred row batches over time in KrpcDataStreamRecvr - track the number of bytes dequeued over time in KrpcDataStreamRecvr - track the total time deferred RPCs queues are not empty - track the number of bytes sent from KrpcDataStreamSender over time - track the total amount of time spent in KrpcDataStreamSender, including time spent waiting for RPC completion. Sample profile of an Exchange node instance: EXCHANGE_NODE (id=21):(Total: 2s284ms, non-child: 64.926ms, % non-child: 2.84%) - ConvertRowBatchTime: 44.380ms - PeakMemoryUsage: 124.04 KB (127021) - RowsReturned: 287.51K (287514) - RowsReturnedRate: 125.88 K/sec Buffer pool: - AllocTime: 1.109ms - CumulativeAllocationBytes: 10.96 MB (11493376) - CumulativeAllocations: 562 (562) - PeakReservation: 112.00 KB (114688) - PeakUnpinnedBytes: 0 - PeakUsedReservation: 112.00 KB (114688) - ReadIoBytes: 0 - ReadIoOps: 0 (0) - ReadIoWaitTime: 0.000ns - WriteIoBytes: 0 - WriteIoOps: 0 (0) - WriteIoWaitTime: 0.000ns Dequeue: BytesDequeued(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 700.00 KB, 2.00 MB, 3.49 MB, 4.39 MB, 5.86 MB, 6.85 MB - FirstBatchWaitTime: 0.000ns - TotalBytesDequeued: 6.85 MB (7187850) - TotalGetBatchTime: 2s237ms - DataWaitTime: 2s219ms Enqueue: BytesReceived(500.000ms): 0, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 328.73 KB, 963.79 KB, 1.64 MB, 2.09 MB, 2.76 MB, 3.23 MB DeferredQueueSize(500.000ms): 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0 - DispatchTime: (Avg: 108.593us ; Min: 30.525us ; Max: 1.524ms ; Number of samples: 281) - DeserializeRowBatchTime: 8.395ms - TotalBatchesEnqueued: 281 (281) - TotalBatchesReceived: 281 (281) - TotalBytesReceived: 3.23 MB (3387144) - TotalEarlySenders: 0 (0) - TotalEosReceived: 1 (1) - TotalHasDeferredRPCsTime: 15s446ms - TotalRPCsDeferred: 38 (38) Sample sender's profile: KrpcDataStreamSender (dst_id=21):(Total: 17s923ms, non-child: 604.494ms, % non-child: 3.37%) BytesSent(500.000ms): 0, 0, 0, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 46.54 KB, 46.54 KB, 46.54 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 974.44 KB, 2.82 MB, 4.93 MB, 6.27 MB, 8.28 MB, 9.69 MB - EosSent: 3 (3) - NetworkThroughput: 4.61 MB/sec - PeakMemoryUsage: 22.57 KB (23112) - RowsSent: 287.51K (287514) - RpcFailure: 0 (0) - RpcRetry: 0 (0) - SerializeBatchTime: 329.162ms - TotalBytesSent: 9.69 MB (10161432) - UncompressedRowBatchSize: 20.56 MB (21563550) Change-Id: I8ba405921b3df920c1e85b940ce9c8d02fc647cd Reviewed-on: http://gerrit.cloudera.org:8080/9690 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/31545 Tested-by: Michael Ho <kwho@cloudera.com>
| Commit: | 37eb97d | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Laszlo Gaal | |
IMPALA-6685: Improve profiles in KrpcDataStreamRecvr and KrpcDataStreamSender This change implements a couple of improvements to the profiles of KrpcDataStreamRecvr and KrpcDataStreamSender: - track pending number of deferred row batches over time in KrpcDataStreamRecvr - track the number of bytes dequeued over time in KrpcDataStreamRecvr - track the total time deferred RPCs queues are not empty - track the number of bytes sent from KrpcDataStreamSender over time - track the total amount of time spent in KrpcDataStreamSender, including time spent waiting for RPC completion. Sample profile of an Exchange node instance: EXCHANGE_NODE (id=21):(Total: 2s284ms, non-child: 64.926ms, % non-child: 2.84%) - ConvertRowBatchTime: 44.380ms - PeakMemoryUsage: 124.04 KB (127021) - RowsReturned: 287.51K (287514) - RowsReturnedRate: 125.88 K/sec Buffer pool: - AllocTime: 1.109ms - CumulativeAllocationBytes: 10.96 MB (11493376) - CumulativeAllocations: 562 (562) - PeakReservation: 112.00 KB (114688) - PeakUnpinnedBytes: 0 - PeakUsedReservation: 112.00 KB (114688) - ReadIoBytes: 0 - ReadIoOps: 0 (0) - ReadIoWaitTime: 0.000ns - WriteIoBytes: 0 - WriteIoOps: 0 (0) - WriteIoWaitTime: 0.000ns Dequeue: BytesDequeued(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 700.00 KB, 2.00 MB, 3.49 MB, 4.39 MB, 5.86 MB, 6.85 MB - FirstBatchWaitTime: 0.000ns - TotalBytesDequeued: 6.85 MB (7187850) - TotalGetBatchTime: 2s237ms - DataWaitTime: 2s219ms Enqueue: BytesReceived(500.000ms): 0, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 328.73 KB, 963.79 KB, 1.64 MB, 2.09 MB, 2.76 MB, 3.23 MB DeferredQueueSize(500.000ms): 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0 - DispatchTime: (Avg: 108.593us ; Min: 30.525us ; Max: 1.524ms ; Number of samples: 281) - DeserializeRowBatchTime: 8.395ms - TotalBatchesEnqueued: 281 (281) - TotalBatchesReceived: 281 (281) - TotalBytesReceived: 3.23 MB (3387144) - TotalEarlySenders: 0 (0) - TotalEosReceived: 1 (1) - TotalHasDeferredRPCsTime: 15s446ms - TotalRPCsDeferred: 38 (38) Sample sender's profile: KrpcDataStreamSender (dst_id=21):(Total: 17s923ms, non-child: 604.494ms, % non-child: 3.37%) BytesSent(500.000ms): 0, 0, 0, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 46.54 KB, 46.54 KB, 46.54 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 974.44 KB, 2.82 MB, 4.93 MB, 6.27 MB, 8.28 MB, 9.69 MB - EosSent: 3 (3) - NetworkThroughput: 4.61 MB/sec - PeakMemoryUsage: 22.57 KB (23112) - RowsSent: 287.51K (287514) - RpcFailure: 0 (0) - RpcRetry: 0 (0) - SerializeBatchTime: 329.162ms - TotalBytesSent: 9.69 MB (10161432) - UncompressedRowBatchSize: 20.56 MB (21563550) Change-Id: I8ba405921b3df920c1e85b940ce9c8d02fc647cd Reviewed-on: http://gerrit.cloudera.org:8080/9690 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/31545 Tested-by: Michael Ho <kwho@cloudera.com>
| Commit: | 344b39a | |
|---|---|---|
| Author: | Philip Zeyliger | |
| Committer: | Philip Zeyliger | |
Revert "IMPALA-6193: Track memory of incoming data streams" This reverts commit 3bfda3348740e0951cbf8f60cde70cc4d1391c5e.
| Commit: | 4a6518f | |
|---|---|---|
| Author: | Philip Zeyliger | |
| Committer: | Philip Zeyliger | |
Revert "KUDU-2301: (Part-1) Add instrumentation on a per connection level" This reverts commit cf5ef7f70983747103ca2d5052a9a61f7eb4b349.
| Commit: | 0773946 | |
|---|---|---|
| Author: | Philip Zeyliger | |
| Committer: | Philip Zeyliger | |
Revert "IMPALA-6685: Improve profiles in KrpcDataStreamRecvr and KrpcDataStreamSender" This reverts commit 421af4e40a862e6fb9184520ee64b4c86b77f7e2. Conflicts: be/src/runtime/krpc-data-stream-recvr.cc Change-Id: I669485ef0b2252af4fa2d3f7eb1260bae9767c00
| Commit: | ac3bffb | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Michael Ho | |
IMPALA-6685: Improve profiles in KrpcDataStreamRecvr and KrpcDataStreamSender This change implements a couple of improvements to the profiles of KrpcDataStreamRecvr and KrpcDataStreamSender: - track pending number of deferred row batches over time in KrpcDataStreamRecvr - track the number of bytes dequeued over time in KrpcDataStreamRecvr - track the total time deferred RPCs queues are not empty - track the number of bytes sent from KrpcDataStreamSender over time - track the total amount of time spent in KrpcDataStreamSender, including time spent waiting for RPC completion. Sample profile of an Exchange node instance: EXCHANGE_NODE (id=21):(Total: 2s284ms, non-child: 64.926ms, % non-child: 2.84%) - ConvertRowBatchTime: 44.380ms - PeakMemoryUsage: 124.04 KB (127021) - RowsReturned: 287.51K (287514) - RowsReturnedRate: 125.88 K/sec Buffer pool: - AllocTime: 1.109ms - CumulativeAllocationBytes: 10.96 MB (11493376) - CumulativeAllocations: 562 (562) - PeakReservation: 112.00 KB (114688) - PeakUnpinnedBytes: 0 - PeakUsedReservation: 112.00 KB (114688) - ReadIoBytes: 0 - ReadIoOps: 0 (0) - ReadIoWaitTime: 0.000ns - WriteIoBytes: 0 - WriteIoOps: 0 (0) - WriteIoWaitTime: 0.000ns Dequeue: BytesDequeued(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 700.00 KB, 2.00 MB, 3.49 MB, 4.39 MB, 5.86 MB, 6.85 MB - FirstBatchWaitTime: 0.000ns - TotalBytesDequeued: 6.85 MB (7187850) - TotalGetBatchTime: 2s237ms - DataWaitTime: 2s219ms Enqueue: BytesReceived(500.000ms): 0, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 328.73 KB, 963.79 KB, 1.64 MB, 2.09 MB, 2.76 MB, 3.23 MB DeferredQueueSize(500.000ms): 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0 - DispatchTime: (Avg: 108.593us ; Min: 30.525us ; Max: 1.524ms ; Number of samples: 281) - DeserializeRowBatchTime: 8.395ms - TotalBatchesEnqueued: 281 (281) - TotalBatchesReceived: 281 (281) - TotalBytesReceived: 3.23 MB (3387144) - TotalEarlySenders: 0 (0) - TotalEosReceived: 1 (1) - TotalHasDeferredRPCsTime: 15s446ms - TotalRPCsDeferred: 38 (38) Sample sender's profile: KrpcDataStreamSender (dst_id=21):(Total: 17s923ms, non-child: 604.494ms, % non-child: 3.37%) BytesSent(500.000ms): 0, 0, 0, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 46.54 KB, 46.54 KB, 46.54 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 974.44 KB, 2.82 MB, 4.93 MB, 6.27 MB, 8.28 MB, 9.69 MB - EosSent: 3 (3) - NetworkThroughput: 4.61 MB/sec - PeakMemoryUsage: 22.57 KB (23112) - RowsSent: 287.51K (287514) - RpcFailure: 0 (0) - RpcRetry: 0 (0) - SerializeBatchTime: 329.162ms - TotalBytesSent: 9.69 MB (10161432) - UncompressedRowBatchSize: 20.56 MB (21563550) Change-Id: I8ba405921b3df920c1e85b940ce9c8d02fc647cd Reviewed-on: http://gerrit.cloudera.org:8080/9690 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/31545 Tested-by: Michael Ho <kwho@cloudera.com>
| Commit: | 421af4e | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Impala Public Jenkins | |
IMPALA-6685: Improve profiles in KrpcDataStreamRecvr and KrpcDataStreamSender This change implements a couple of improvements to the profiles of KrpcDataStreamRecvr and KrpcDataStreamSender: - track pending number of deferred row batches over time in KrpcDataStreamRecvr - track the number of bytes dequeued over time in KrpcDataStreamRecvr - track the total time deferred RPCs queues are not empty - track the number of bytes sent from KrpcDataStreamSender over time - track the total amount of time spent in KrpcDataStreamSender, including time spent waiting for RPC completion. Sample profile of an Exchange node instance: EXCHANGE_NODE (id=21):(Total: 2s284ms, non-child: 64.926ms, % non-child: 2.84%) - ConvertRowBatchTime: 44.380ms - PeakMemoryUsage: 124.04 KB (127021) - RowsReturned: 287.51K (287514) - RowsReturnedRate: 125.88 K/sec Buffer pool: - AllocTime: 1.109ms - CumulativeAllocationBytes: 10.96 MB (11493376) - CumulativeAllocations: 562 (562) - PeakReservation: 112.00 KB (114688) - PeakUnpinnedBytes: 0 - PeakUsedReservation: 112.00 KB (114688) - ReadIoBytes: 0 - ReadIoOps: 0 (0) - ReadIoWaitTime: 0.000ns - WriteIoBytes: 0 - WriteIoOps: 0 (0) - WriteIoWaitTime: 0.000ns Dequeue: BytesDequeued(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 700.00 KB, 2.00 MB, 3.49 MB, 4.39 MB, 5.86 MB, 6.85 MB - FirstBatchWaitTime: 0.000ns - TotalBytesDequeued: 6.85 MB (7187850) - TotalGetBatchTime: 2s237ms - DataWaitTime: 2s219ms Enqueue: BytesReceived(500.000ms): 0, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 23.36 KB, 328.73 KB, 963.79 KB, 1.64 MB, 2.09 MB, 2.76 MB, 3.23 MB DeferredQueueSize(500.000ms): 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0 - DispatchTime: (Avg: 108.593us ; Min: 30.525us ; Max: 1.524ms ; Number of samples: 281) - DeserializeRowBatchTime: 8.395ms - TotalBatchesEnqueued: 281 (281) - TotalBatchesReceived: 281 (281) - TotalBytesReceived: 3.23 MB (3387144) - TotalEarlySenders: 0 (0) - TotalEosReceived: 1 (1) - TotalHasDeferredRPCsTime: 15s446ms - TotalRPCsDeferred: 38 (38) Sample sender's profile: KrpcDataStreamSender (dst_id=21):(Total: 17s923ms, non-child: 604.494ms, % non-child: 3.37%) BytesSent(500.000ms): 0, 0, 0, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 34.78 KB, 46.54 KB, 46.54 KB, 46.54 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 58.31 KB, 974.44 KB, 2.82 MB, 4.93 MB, 6.27 MB, 8.28 MB, 9.69 MB - EosSent: 3 (3) - NetworkThroughput: 4.61 MB/sec - PeakMemoryUsage: 22.57 KB (23112) - RowsSent: 287.51K (287514) - RpcFailure: 0 (0) - RpcRetry: 0 (0) - SerializeBatchTime: 329.162ms - TotalBytesSent: 9.69 MB (10161432) - UncompressedRowBatchSize: 20.56 MB (21563550) Change-Id: I8ba405921b3df920c1e85b940ce9c8d02fc647cd Reviewed-on: http://gerrit.cloudera.org:8080/9690 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | bb96230 | |
|---|---|---|
| Author: | Sailesh Mukil | |
| Committer: | Jenkins | |
KUDU-2301: (Part-1) Add instrumentation on a per connection level This patch returns the OutboundTransfer queue size on a per connection level and makes them accessible via the DumpRunningRpcs() call. A test is added in rpc-test to ensure that this metric works as expected. A future patch will add more metrics. Change-Id: Iae1a5fe0066adf644a9cac41ad6696e1bbf00465 Reviewed-on: http://gerrit.cloudera.org:8080/9343 Tested-by: Kudu Jenkins Reviewed-by: Todd Lipcon <todd@apache.org> Reviewed-on: http://gerrit.cloudera.org:8080/9383 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | cf5ef7f | |
|---|---|---|
| Author: | Sailesh Mukil | |
| Committer: | Impala Public Jenkins | |
KUDU-2301: (Part-1) Add instrumentation on a per connection level This patch returns the OutboundTransfer queue size on a per connection level and makes them accessible via the DumpRunningRpcs() call. A test is added in rpc-test to ensure that this metric works as expected. A future patch will add more metrics. Change-Id: Iae1a5fe0066adf644a9cac41ad6696e1bbf00465 Reviewed-on: http://gerrit.cloudera.org:8080/9343 Tested-by: Kudu Jenkins Reviewed-by: Todd Lipcon <todd@apache.org> Reviewed-on: http://gerrit.cloudera.org:8080/9383 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | 5960485 | |
|---|---|---|
| Author: | Lars Volker | |
| Committer: | Jenkins | |
IMPALA-6193: Track memory of incoming data streams This change adds memory tracking to incoming transmit data RPCs when using KRPC. We track memory against a global tracker called "Data Stream Service" until it is handed over to the stream manager. There we track it in a global tracker called "Data Stream Queued RPC Calls" until a receiver registers and takes over the early sender RPCs. Inside the receiver, memory for deferred RPCs is tracked against the fragment instance's memtracker until we unpack the batches and add them to the row batch queue. The DCHECK in MemTracker::Close() checks that all memory consumed by a tracker gets released eventually. In addition to that, this change adds a custom cluster test that makes sure that queued memory gets tracked by inspecting the peak consumption of the new memtrackers. Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4 Reviewed-on: http://gerrit.cloudera.org:8080/8914 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | 3bfda33 | |
|---|---|---|
| Author: | Lars Volker | |
| Committer: | Impala Public Jenkins | |
IMPALA-6193: Track memory of incoming data streams This change adds memory tracking to incoming transmit data RPCs when using KRPC. We track memory against a global tracker called "Data Stream Service" until it is handed over to the stream manager. There we track it in a global tracker called "Data Stream Queued RPC Calls" until a receiver registers and takes over the early sender RPCs. Inside the receiver, memory for deferred RPCs is tracked against the fragment instance's memtracker until we unpack the batches and add them to the row batch queue. The DCHECK in MemTracker::Close() ensures that all memory consumed by a tracker gets released eventually. In addition to that, this change adds a custom cluster test that makes sure that queued memory gets tracked by inspecting the peak consumption of the new memtrackers. Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4 Reviewed-on: http://gerrit.cloudera.org:8080/8914 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | 88f2bc1 | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Jenkins | |
IMPALA-4856: Port data stream service to KRPC This patch implements a new data stream service which utilizes KRPC. Similar to the thrift RPC implementation, there are 3 major components to the data stream services: KrpcDataStreamSender serializes and sends row batches materialized by a fragment instance to a KrpcDataStreamRecvr. KrpcDataStreamMgr is responsible for routing an incoming row batch to the appropriate receiver. The data stream service runs on the port FLAGS_krpc_port which is 29000 by default. Unlike the implementation with thrift RPC, KRPC provides an asynchronous interface for invoking remote methods. As a result, KrpcDataStreamSender doesn't need to create a thread per connection. There is one connection between two Impalad nodes for each direction (i.e. client and server). Multiple queries can multiplex on the same connection for transmitting row batches between two Impalad nodes. The asynchronous interface also avoids the possibility that a thread is stuck in the RPC code for an extended amount of time without checking for cancellation. A TransmitData() call with KRPC is in essence a trio of RpcController, a serialized protobuf request buffer and a protobuf response buffer. The call is invoked via a DataStreamService proxy object. The serialized tuple offsets and row batches are sent via "sidecars" in KRPC to avoid an extra copy into the serialized request buffer. Each impalad node creates a singleton DataStreamService object at start-up time. All incoming calls are served by a service thread pool created as part of DataStreamService. By default, the number of service threads equals the number of logical cores. The service threads are shared across all queries so the RPC handler should avoid blocking as much as possible. In the thrift RPC implementation, we make a thrift thread handling a TransmitData() RPC block for an extended period of time when the receiver is not yet created when the call arrives.
In the KRPC implementation, we store TransmitData() or EndDataStream() requests which arrive before the receiver is ready in a per-receiver early sender list stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to when the receiver is created or when a timeout occurs. Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr. If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit to be exceeded, the request will be stashed in a queue for deferred processing. The stashed RPC requests will not be responded to until they are processed so as to exert back pressure on the senders. An alternative would be to reply with an error, in which case the request / row batches need to be sent again. This may end up consuming more network bandwidth than the thrift RPC implementation. This change adopts the behavior of allowing one stashed request per sender. All rpc requests and responses are serialized using protobuf. The equivalent of TRowBatch would be ProtoRowBatch which contains a serialized header about the meta-data of the row batch and two Kudu Slice objects which contain pointers to the actual data (i.e. tuple offsets and tuple data). This patch is based on an abandoned patch by Henry Robinson. TESTING ------- * Builds {exhaustive/debug, core/release, asan} passed with FLAGS_use_krpc=true. TO DO ----- * Port some BE tests to KRPC services. Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Reviewed-on: http://gerrit.cloudera.org:8080/8023 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
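The deferred-processing behaviour described in this commit message can be illustrated with a minimal sketch. Everything below is hypothetical and written for illustration: the `ReceiverSketch` class, its method names, and the byte sizes are assumptions, not the actual KrpcDataStreamRecvr code. The sketch also assumes each sender waits for a response before sending again, which is what would limit it to one stashed request per sender, and it does not handle a single batch larger than the whole limit.

```python
from collections import deque


class ReceiverSketch:
    """Hypothetical receiver queue with a byte limit and deferred requests."""

    def __init__(self, buffer_limit):
        self.buffer_limit = buffer_limit
        self.buffered_bytes = 0
        self.batches = deque()    # (sender_id, size) ready for the consumer
        self.deferred = deque()   # stashed requests, response withheld
        self.responses = []       # sender ids that have been replied to, in order

    def transmit_data(self, sender_id, size):
        """Handle one request; return True if it was responded to immediately."""
        if self.buffered_bytes + size > self.buffer_limit:
            # Over the limit: stash the request and do not respond yet,
            # which is what exerts back pressure on the sender.
            self.deferred.append((sender_id, size))
            return False
        self._enqueue(sender_id, size)
        return True

    def _enqueue(self, sender_id, size):
        self.batches.append((sender_id, size))
        self.buffered_bytes += size
        self.responses.append(sender_id)

    def get_batch(self):
        """Consumer removes one batch, then admits deferred requests that now fit."""
        sender_id, size = self.batches.popleft()
        self.buffered_bytes -= size
        while (self.deferred
               and self.buffered_bytes + self.deferred[0][1] <= self.buffer_limit):
            self._enqueue(*self.deferred.popleft())
        return sender_id, size


recvr = ReceiverSketch(buffer_limit=100)
first = recvr.transmit_data("sender-1", 60)    # fits, answered at once
second = recvr.transmit_data("sender-2", 60)   # would exceed the limit, stashed
taken = recvr.get_batch()                      # frees space, sender-2 is admitted
```

Withholding the response, rather than replying with an error, means the row batch is transmitted only once; that is the bandwidth trade-off the commit message mentions.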
| Commit: | b4ea57a | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Impala Public Jenkins | |
IMPALA-4856: Port data stream service to KRPC This patch implements a new data stream service which utilizes KRPC. Similar to the thrift RPC implementation, there are 3 major components to the data stream services: KrpcDataStreamSender serializes and sends row batches materialized by a fragment instance to a KrpcDataStreamRecvr. KrpcDataStreamMgr is responsible for routing an incoming row batch to the appropriate receiver. The data stream service runs on the port FLAGS_krpc_port which is 29000 by default. Unlike the implementation with thrift RPC, KRPC provides an asynchronous interface for invoking remote methods. As a result, KrpcDataStreamSender doesn't need to create a thread per connection. There is one connection between two Impalad nodes for each direction (i.e. client and server). Multiple queries can multiplex on the same connection for transmitting row batches between two Impalad nodes. The asynchronous interface also avoids the possibility that a thread is stuck in the RPC code for an extended amount of time without checking for cancellation. A TransmitData() call with KRPC is in essence a trio of RpcController, a serialized protobuf request buffer and a protobuf response buffer. The call is invoked via a DataStreamService proxy object. The serialized tuple offsets and row batches are sent via "sidecars" in KRPC to avoid an extra copy into the serialized request buffer. Each impalad node creates a singleton DataStreamService object at start-up time. All incoming calls are served by a service thread pool created as part of DataStreamService. By default, the number of service threads equals the number of logical cores. The service threads are shared across all queries so the RPC handler should avoid blocking as much as possible. In the thrift RPC implementation, we make a thrift thread handling a TransmitData() RPC block for an extended period of time when the receiver is not yet created when the call arrives.
In the KRPC implementation, we store TransmitData() or EndDataStream() requests which arrive before the receiver is ready in a per-receiver early sender list stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to when the receiver is created or when a timeout occurs. Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr. If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit to be exceeded, the request will be stashed in a queue for deferred processing. The stashed RPC requests will not be responded to until they are processed so as to exert back pressure on the senders. An alternative would be to reply with an error, in which case the request / row batches need to be sent again. This may end up consuming more network bandwidth than the thrift RPC implementation. This change adopts the behavior of allowing one stashed request per sender. All rpc requests and responses are serialized using protobuf. The equivalent of TRowBatch would be ProtoRowBatch which contains a serialized header about the meta-data of the row batch and two Kudu Slice objects which contain pointers to the actual data (i.e. tuple offsets and tuple data). This patch is based on an abandoned patch by Henry Robinson. TESTING ------- * Builds {exhaustive/debug, core/release, asan} passed with FLAGS_use_krpc=true. TO DO ----- * Port some BE tests to KRPC services. Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Reviewed-on: http://gerrit.cloudera.org:8080/8023 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | ff0068b | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Jenkins | |
IMPALA-4670: Introduces RpcMgr class This patch introduces a new class, RpcMgr which is the abstraction layer around KRPC core mechanics. It provides an interface RegisterService() for various services to register themselves. Kudu RPC is invoked via an auto-generated interface called proxy. This change implements an inline wrapper for KRPC client to obtain a proxy for a particular service exported by remote server. Last but not least, the RpcMgr will start all registered services if FLAGS_use_krpc is true. This patch hasn't yet added any service except for some test services in rpc-mgr-test. This patch is based on an abandoned patch by Henry Robinson. Testing done: a new backend test is added to exercise the code and demonstrate the way to interact with KRPC framework. Change-Id: I8adb10ae375d7bf945394c38a520f12d29cf7b46 Reviewed-on: http://gerrit.cloudera.org:8080/7901 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | dd4c6be | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Impala Public Jenkins | |
IMPALA-4670: Introduces RpcMgr class This patch introduces a new class, RpcMgr which is the abstraction layer around KRPC core mechanics. It provides an interface RegisterService() for various services to register themselves. Kudu RPC is invoked via an auto-generated interface called proxy. This change implements an inline wrapper for KRPC client to obtain a proxy for a particular service exported by remote server. Last but not least, the RpcMgr will start all registered services if FLAGS_use_krpc is true. This patch hasn't yet added any service except for some test services in rpc-mgr-test. This patch is based on an abandoned patch by Henry Robinson. Testing done: a new backend test is added to exercise the code and demonstrate the way to interact with KRPC framework. Change-Id: I8adb10ae375d7bf945394c38a520f12d29cf7b46 Reviewed-on: http://gerrit.cloudera.org:8080/7901 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | ebd0b24 | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | jenkins | |
KUDU-2065: Support cancellation for outbound RPC call This change implements a new interface RpcController::Cancel() which takes a RpcController as argument and cancels any pending OutboundCall associated with it. RpcController::Cancel() queues a cancellation task scheduled on the reactor thread for that outbound call. Once the task is run, it will cancel the outbound call right away if the RPC hasn't started sending yet or if it has already sent the request and is waiting for a response. If cancellation happens while the RPC request is being sent, the RPC will be cancelled only after the RPC has finished sending the request. If the RPC is finished, the cancellation will be a no-op. Change-Id: Iaf53c5b113de10d573bd32fb9b2293572e806fbf Reviewed-on: http://gerrit.cloudera.org:8080/7455 Tested-by: Kudu Jenkins Reviewed-by: Todd Lipcon <todd@apache.org> Reviewed-on: http://gerrit.cloudera.org:8080/7743 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Tested-by: Impala Public Jenkins
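The cancellation rules in this commit message amount to a small state machine. The sketch below is a hypothetical model written for illustration: the state names, the `OutboundCallSketch` class, and its methods are assumptions and do not mirror the actual Kudu `OutboundCall` implementation.

```python
from enum import Enum, auto


class CallState(Enum):
    READY = auto()      # request has not started sending
    SENDING = auto()    # request is partially written to the connection
    SENT = auto()       # request fully sent, waiting for a response
    FINISHED = auto()   # response (or error) already delivered
    CANCELLED = auto()


class OutboundCallSketch:
    """Hypothetical model of one outbound call and its cancellation task."""

    def __init__(self):
        self.state = CallState.READY
        self.cancel_requested = False

    def cancel(self):
        """Stands in for the cancellation task run on the reactor thread."""
        if self.state in (CallState.READY, CallState.SENT):
            self.state = CallState.CANCELLED   # cancel right away
        elif self.state is CallState.SENDING:
            self.cancel_requested = True       # wait until the send completes
        # FINISHED or CANCELLED: cancellation is a no-op

    def start_sending(self):
        if self.state is CallState.READY:
            self.state = CallState.SENDING

    def finish_sending(self):
        if self.state is CallState.SENDING:
            self.state = (CallState.CANCELLED if self.cancel_requested
                          else CallState.SENT)

    def complete(self):
        if self.state is CallState.SENT:
            self.state = CallState.FINISHED


mid_send = OutboundCallSketch()
mid_send.start_sending()
mid_send.cancel()            # request is mid-send: cancellation is deferred
state_during_send = mid_send.state
mid_send.finish_sending()    # send completes, deferred cancellation takes effect

done = OutboundCallSketch()
done.start_sending()
done.finish_sending()
done.complete()
done.cancel()                # already finished: no-op
```

Deferring the cancellation until the send completes avoids abandoning a request that is only partially written to a connection shared with other calls; that rationale is an inference from the message rather than something it states.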
| Commit: | c1c4815 | |
|---|---|---|
| Author: | Michael Ho | |
| Committer: | Impala Public Jenkins | |
KUDU-2065: Support cancellation for outbound RPC call This change implements a new interface RpcController::Cancel() which takes a RpcController as argument and cancels any pending OutboundCall associated with it. RpcController::Cancel() queues a cancellation task scheduled on the reactor thread for that outbound call. Once the task is run, it will cancel the outbound call right away if the RPC hasn't started sending yet or if it has already sent the request and is waiting for a response. If cancellation happens while the RPC request is being sent, the RPC will be cancelled only after the RPC has finished sending the request. If the RPC is finished, the cancellation will be a no-op. Change-Id: Iaf53c5b113de10d573bd32fb9b2293572e806fbf Reviewed-on: http://gerrit.cloudera.org:8080/7455 Tested-by: Kudu Jenkins Reviewed-by: Todd Lipcon <todd@apache.org> Reviewed-on: http://gerrit.cloudera.org:8080/7743 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | 9c03500 | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | jenkins | |
IMPALA-4669: [KRPC] Import RPC library from kudu@314c9d8 Change-Id: I06ab5b56312e482a27fa484414c338438ad6972c Reviewed-on: http://gerrit.cloudera.org:8080/5718 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | c7db60a | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | Impala Public Jenkins | |
IMPALA-4669: [KRPC] Import RPC library from kudu@314c9d8 Change-Id: I06ab5b56312e482a27fa484414c338438ad6972c Reviewed-on: http://gerrit.cloudera.org:8080/5718 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | b87fbe6 | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | jenkins | |
IMPALA-4669: [SECURITY] Import Kudu security library from kudu@314c9d8 The security library provides Kerberos and TLS facilities to the rpc library. Change-Id: I76daeead00f672aa468f5ab6de4d70eac2078cb2 Reviewed-on: http://gerrit.cloudera.org:8080/5716 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>
| Commit: | 84b8155 | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | Henry Robinson | |
IMPALA-4669: [SECURITY] Import Kudu security library from kudu@314c9d8 The security library provides Kerberos and TLS facilities to the rpc library. Change-Id: I76daeead00f672aa468f5ab6de4d70eac2078cb2 Reviewed-on: http://gerrit.cloudera.org:8080/5716 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>
| Commit: | 31afd80 | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | jenkins | |
IMPALA-4669: [KUTIL] Import kudu_util library from kudu@314c9d8 Update LICENSE.txt and rat_exclude_files.txt Change-Id: I6d89384730b60354b5fae2b1472183d2a561d170 Reviewed-on: http://gerrit.cloudera.org:8080/5714 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | d6abb29 | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | Impala Public Jenkins | |
IMPALA-4669: [KUTIL] Import kudu_util library from kudu@314c9d8 Update LICENSE.txt and rat_exclude_files.txt Change-Id: I6d89384730b60354b5fae2b1472183d2a561d170 Reviewed-on: http://gerrit.cloudera.org:8080/5714 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Impala Public Jenkins
| Commit: | ac1564a | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | jenkins | |
IMPALA-4758: (1/2) Update gutil/ from Kudu@a1bfd7b * Copy gutil from Kudu * Minimal changes to gutil/CMakeLists.txt Change-Id: Ic708a9c4e76ede17af9b06e0a0a8e9ae7d357960 Reviewed-on: http://gerrit.cloudera.org:8080/5687 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>
This commit does not contain any .proto files.
| Commit: | 02f3e3f | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | Henry Robinson | |
IMPALA-4758: (1/2) Update gutil/ from Kudu@a1bfd7b * Copy gutil from Kudu * Minimal changes to gutil/CMakeLists.txt Change-Id: Ic708a9c4e76ede17af9b06e0a0a8e9ae7d357960 Reviewed-on: http://gerrit.cloudera.org:8080/5687 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>
This commit does not contain any .proto files.
| Commit: | e8fe220 | |
|---|---|---|
| Author: | David Alves | |
| Committer: | David Alves | |
Merge thirdparty from cdh5-trunk This is the first step towards merging impala-kudu with trunk. These are basically just mechanical changes, pulling from trunk thirdparty and just enough other changes to cmake build scripts or impala-config.sh to make it compile. NOTE: This patch is basically half-way between the impala-kudu build, which doesn't yet use the toolchain, and the impala trunk build, which does. As such this patch doesn't actually build stand-alone and serves merely the purpose of omitting +/- 650K loc from the merge patch itself. Change-Id: Ic794988dcadee16e687a82745b417605772ff325
| Commit: | 113f88d | |
|---|---|---|
| Author: | David Alves | |
| Committer: | David Alves | |
Merge thirdparty from cdh5-trunk This is the first step towards merging impala-kudu with trunk. These are basically just mechanical changes, pulling from trunk thirdparty and just enough other changes to cmake build scripts or impala-config.sh to make it compile. NOTE: This patch is basically half-way between the impala-kudu build, which doesn't yet use the toolchain, and the impala trunk build, which does. As such this patch doesn't actually build stand-alone and serves merely the purpose of omitting +/- 650K loc from the merge patch itself. Change-Id: Ic794988dcadee16e687a82745b417605772ff325
| Commit: | 706b757 | |
|---|---|---|
| Author: | Martin Grund | |
| Committer: | Martin Grund | |
Optional Impala Toolchain This patch makes it possible to optionally enable the new Impala binary toolchain. For now there are no major version differences between the toolchain dependencies and what is currently kept in thirdparty. To enable the toolchain, export the variable IMPALA_TOOLCHAIN to the folder where the binaries are available. In addition this patch moves gutil from the thirdparty directory into the source tree of be/src to allow easy propagation of compiler and linker flags. Furthermore, the thrift-cpp target was added as a dependency to all targets that require the generated thrift sources to be available before the build is started. What is the new toolchain: The goal of the toolchain is to homogenize the build environment and to make sure that Impala is built nearly identically on every platform. To achieve this, we limit the flexibility of using the system's host libraries and rather rely on a set of custom produced binaries including the necessary compiler. Change-Id: If2dac920520e4a18be2a9a75b3184a5bd97a065b Reviewed-on: http://gerrit.cloudera.org:8080/427 Reviewed-by: Adar Dembo <adar@cloudera.com> Tested-by: Internal Jenkins Reviewed-by: Martin Grund <mgrund@cloudera.com>
| Commit: | 81f247b | |
|---|---|---|
| Author: | Martin Grund | |
| Committer: | Martin Grund | |
Optional Impala Toolchain This patch makes it possible to optionally enable the new Impala binary toolchain. For now there are no major version differences between the toolchain dependencies and what is currently kept in thirdparty. To enable the toolchain, export the variable IMPALA_TOOLCHAIN to the folder where the binaries are available. In addition this patch moves gutil from the thirdparty directory into the source tree of be/src to allow easy propagation of compiler and linker flags. Furthermore, the thrift-cpp target was added as a dependency to all targets that require the generated thrift sources to be available before the build is started. What is the new toolchain: The goal of the toolchain is to homogenize the build environment and to make sure that Impala is built nearly identically on every platform. To achieve this, we limit the flexibility of using the system's host libraries and rather rely on a set of custom produced binaries including the necessary compiler. Change-Id: If2dac920520e4a18be2a9a75b3184a5bd97a065b Reviewed-on: http://gerrit.cloudera.org:8080/427 Reviewed-by: Adar Dembo <adar@cloudera.com> Tested-by: Internal Jenkins Reviewed-by: Martin Grund <mgrund@cloudera.com>
| Commit: | f1fff5d | |
|---|---|---|
| Author: | David Alves | |
| Committer: | Martin Grund | |
Update/Trim thirdparty/gutil Working on the integration of Kudu and Impala we're starting to hit problems due to each project keeping slightly different versions of gutil. Simply copy/pasting Kudu's gutil is problematic as it refers to Kudu often (both in comments and namespace). Ideally we'd like to pull this to a common place such as the toolchain. Until we do so, however, the least problematic approach is to trim the dependencies to the ones that Impala actually uses and update the ones that are not trimmed to the latest version, from Kudu, which is what this patch does. Change-Id: I935c50344622ff12cde78ef809e7961378ec7774 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/6343 Tested-by: jenkins Reviewed-by: David Alves <david.alves@cloudera.com>
This commit does not contain any .proto files.
| Commit: | 69dc9b1 | |
|---|---|---|
| Author: | Srinath Shankar | |
| Committer: | Srinath Shankar | |
[CDH5] Remove cdh5.0.0 from thirdparty Change-Id: Ie55e301fe791a7baba512ac0fea291267ff3017e
| Commit: | 2e001db | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Lenni Kuff | |
[CDH5] Remove old /thirdparty Hadoop/Hbase/Hive dependencies Change-Id: I1fd3609fa533991da712204bbb422228e6b5d46f
| Commit: | 5ec24e6 | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Lenni Kuff | |
[CDH5] Remove -SNAPSHOT suffix from /thirdparty dependencies Change-Id: Id8d69aebf404ac7471fb3456e85c2c21ddd1bd55
| Commit: | 6f06b8f | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Henry Robinson | |
Adding hive-0.10.0-cdh4.5.0 w/ PATCH-219 to /thirdparty PATCH-219 HIVE for DISTRO-557 Hive revision: 208131090a7888bd7038404e5a3003a906c16b36 Fixes 1s sleep before opening each metastore connection. Change-Id: Ic0cb7c4a7349b63d6cb97a2434a47f7ab11ddd38 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1311 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>
| Commit: | 29e65a5 | |
|---|---|---|
| Author: | Matthew Jacobs | |
| Committer: | ishaan | |
[cdh5] Add latest cdh5 hadoop, hbase, and hive snapshots to thirdparty Change-Id: I60c93b259a26e86aca60f2b3b5b6226eabc0b5eb
| Commit: | 16fabb9 | |
|---|---|---|
| Author: | Alex Behm | |
Remove unused CDH4.5 dependencies from thirdparty. Change-Id: Ibe3745c50f748088d833e6d906a0d4f7f116987a
| Commit: | 0bcfe7e | |
|---|---|---|
| Author: | Nong Li | |
| Committer: | jenkins | |
[CDH4] Remove docs and srcs from hbase thirdparty. Change-Id: I7da94152f625937a3abb3ac8cdb0c8372e520081 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1339 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins
| Commit: | 4925827 | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | Henry Robinson | |
Add gutil to thirdparty Change-Id: Ic7bf4c0faef6b11ef3a0cfff68af845127ff21e9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1025 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com>
| Commit: | d2ccfba | |
|---|---|---|
| Author: | ishaan | |
| Committer: | Henry Robinson | |
Upgrade thirdparty to use CDH4.5 bits. The following changes have been made: -- Update hbase -- Update hive -- Update hadoop -- Update the parquet version to 1.2.5 Change-Id: Id6ceaef0e9eebab27ffd408160116fa84ed300fb
| Commit: | 35247fb | |
|---|---|---|
| Author: | ishaan | |
| Committer: | Henry Robinson | |
Remove CDH4.3 thirdparty libraries. Change-Id: Ic9ce07d46054bcf03ea98eb90b2972ed768aed3e
This commit does not contain any .proto files.
| Commit: | 4629aa3 | |
|---|---|---|
| Author: | Skye Wanderman-Milne | |
| Committer: | Henry Robinson | |
Upgrade Avro library to 1.7.4
| Commit: | 71278c9 | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Henry Robinson | |
Update Hive to CDH4.3.0
| Commit: | 3a596b9 | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Henry Robinson | |
Remove CDH4.2.0 Hive
| Commit: | dadb634 | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Henry Robinson | |
Add CDH 4.3.0 HBase to thirdparty
| Commit: | 55154d4 | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Henry Robinson | |
Remove CDH 4.2.0 HBase from thirdparty
| Commit: | 16482a5 | |
|---|---|---|
| Author: | Skye Wanderman-Milne | |
| Committer: | Henry Robinson | |
thirdparty/avro
| Commit: | 2639483 | |
|---|---|---|
| Author: | Nong Li | |
| Committer: | Henry Robinson | |
Moved HBase to CDH4.2
| Commit: | de224a4 | |
|---|---|---|
| Author: | Nong Li | |
| Committer: | Henry Robinson | |
Remove hive 0.9.
| Commit: | 43be278 | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Henry Robinson | |
Add hive-0.10.0-cdh4.2.0 snapshot (for Hive JDBC driver) This change adds the hive-0.10.0-cdh4.2.0 snapshot. It is currently only used for JDBC testing but eventually we will move from hive-0.9.0 to hive-0.10.0. This snapshot may need to be refreshed at that time. From: CDH/hive - b57b6b7375edcbf54713c4900a3d48e1a76bd00a
| Commit: | fb46888 | |
|---|---|---|
| Author: | Nong Li | |
| Committer: | Henry Robinson | |
Updated hive and hbase to rc3.
| Commit: | 08b65c1 | |
|---|---|---|
| Author: | ishaan | |
| Committer: | Lenni Kuff | |
Upgrade thirdparty to use CDH4.5 bits. The following changes have been made: -- Update hbase -- Update hive -- Update hadoop -- Update the parquet version to 1.2.5 Change-Id: Id6ceaef0e9eebab27ffd408160116fa84ed300fb
| Commit: | c89b054 | |
|---|---|---|
| Author: | ishaan | |
| Committer: | Lenni Kuff | |
Remove CDH4.3 thirdparty libraries. Change-Id: Ic9ce07d46054bcf03ea98eb90b2972ed768aed3e
This commit does not contain any .proto files.
| Commit: | 084fc74 | |
|---|---|---|
| Author: | Skye Wanderman-Milne | |
| Committer: | Lenni Kuff | |
Upgrade Avro library to 1.7.4
| Commit: | 30b28a6 | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Lenni Kuff | |
Update Hive to CDH4.3.0
| Commit: | f0130c6 | |
|---|---|---|
| Author: | Lenni Kuff | |
| Committer: | Lenni Kuff | |
Add CDH 4.3.0 HBase to thirdparty
| Commit: | 89d855d | |
|---|---|---|
| Author: | Skye Wanderman-Milne | |
| Committer: | Lenni Kuff | |
thirdparty/avro
| Commit: | 34d45fd | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | Henry Robinson | |
Upgrade to post-CDH4u0: Hadoop 2.0.0, Hive 0.8.1 and HBase 0.92.1
| Commit: | 1671993 | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | Henry Robinson | |
Upgrade to hbase-0.92.0-cdh4b1
| Commit: | 57a6a0a | |
|---|---|---|
| Author: | Henry Robinson | |
| Committer: | Henry Robinson | |
Upgrade Hive to hive-0.8.0-cdh4b1
| Commit: | 1f779e8 | |
|---|---|---|
| Author: | Alexander Behm | |
| Committer: | Alexander Behm | |
Added HBase test files, and loading of HBase metadata into Impala.