Proto commits in apache/drill

These are the commits in which the Protocol Buffers files changed (only the last 100 relevant commits are shown):

Commit:02ce64e
Author:James Turton
Committer:GitHub

DRILL-8314: Add support for automatically retrying and disabling broken storage plugins (#2655)

The documentation is generated from this commit.

Commit:284dafe
Author:James Turton
Committer:GitHub

DRILL-8322: Add a list of scanned plugin names to the query profile (#2661)

Commit:ab7f9e9
Author:James Turton
Committer:GitHub

DRILL-8136: Overhaul implicit type casting logic (#2638)

Commit:3d1bc2c
Author:Volodymyr Vysotskyi
Committer:Charles S. Givre

DRILL-5405: Add missing operator types without dependency on protobuf enum

Commit:8f892b3
Author:Charles Givre
Committer:Charles S. Givre

DRILL-7823 - Add XML Format Plugin

Commit:3d6b67c
Author:akkapur
Committer:Charles S. Givre

DRILL-5956: Add Storage Plugin for Apache Druid

Commit:5a067da
Author:Charles Givre
Committer:Charles S. Givre

DRILL-7716: Create Format Plugin for SPSS Files

Commit:4d47a61
Author:Charles Givre
Committer:Paul Rogers

DRILL-7437: Storage Plugin for Generic HTTP REST API

Commit:7d5f5f6
Author:weijie.tong
Committer:Volodymyr Vysotskyi

DRILL-7607: support dynamic credit based flow control closes #2000

Commit:ab60b3d
Author:Volodymyr Vysotskyi
Committer:Arina Ielchiieva

DRILL-7592: Add missing licenses and update plugins exclusion list and fix licenses closes #1989

Commit:7453166
Author:Charles Givre
Committer:Arina Ielchiieva

DRILL-7233: Format Plugin for HDF5 closes #1778

Commit:7ab4c37
Author:Volodymyr Vysotskyi
Committer:Volodymyr Vysotskyi

DRILL-7273: Introduce operators for handling metadata closes #1886

Commit:8f40dc9
Author:Charles Givre
Committer:Volodymyr Vysotskyi

DRILL-4303: ESRI Shapefile (shp) Format Plugin

Commit:f2654ee
Author:Charles Givre
Committer:Arina Ielchiieva

DRILL-7177: Format Plugin for Excel Files closes #1749

Commit:b30830a
Author:Bohdan Kazydub
Committer:Bohdan Kazydub

DRILL-7096: Develop vector for canonical Map<K,V>
- Added new type DICT;
- Created value vectors for the type for single and repeated modes;
- Implemented corresponding FieldReaders and FieldWriters;
- Made changes in EvaluationVisitor to be able to read values from the map by key;
- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;
- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;
- Updated AvroRecordReader to use new DICT type for Avro's MAP;
- Added support of the new type to ParquetRecordWriter.

Commit:ac11a6b
Author:Kunal Khatua
Committer:Sorabh Hamirwasia

DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option
This introduces support for JDBC's Statement.setMaxRows(int) API, which can help Drill execute a query much faster if it knows that not ALL the records in the resultset will be consumed upfront. This Commit introduces the core changes to support the feature within Drill's execution engine.
Protobuf Changes
1. RunQuery: Added "autolimit_rowcount"
2. QueryProfile: Added "autoLimit"
3. Regenerated Java and C++ client files
REST API support
1. Support for REST server to interpret a submitted query and also for rendering this information for an executed query
2. Updates to the Freemarker templates (for WebUI)
3. Safety check within Javascript (for WebUI)
JDBC API support
1. Introduces backend execution of 'ALTER SESSION' to apply the auto-limiting of resultset size
2. Added Unit Tests for PreparedStatement and Statement objects
3. Added getter setter methods to be skipped in testing for org.apache.drill.jdbc.test.Drill2489CallsAfterCloseThrowExceptionsTest.testclosedPreparedStmtOfOpenConnMethodsThrowRight()
Updates based on review comments
Additional Updates
Test Cleanup
1. Revert Drill2489 hack
2. Formatting in *StatementTest
3. Removal of redundant `statement.close()`
4. Manage new Exception thrown when setting invalid maxRow values
Final updates
1. Test changes
2. Trim trailing spaces in auto-limit value (Javascript)
3. Before & After annotations to synchronize changes to system values for MaxRows(auto-limit)
Reorganized tests due to synchronized locking
Removed conflicting JsonCreator in QueryWrapper
Additional test cleanup
closes #1714

Commit:2364b02
Author:shimamoto
Committer:Sorabh Hamirwasia

DRILL-7014: Format plugin for LTSV files closes #1627

Commit:469be17
Author:Gautam Parai
Committer:Gautam Parai

DRILL-1328: Support table statistics - Part 2
Add support for avg row-width and major type statistics. Parallelize the ANALYZE implementation and stats UDF implementation to improve stats collection performance. Update/fix rowcount, selectivity and ndv computations to improve plan costing. Add options for configuring collection/usage of statistics. Add new APIs and implementation for stats writer (as a precursor to Drill Metastore APIs). Fix several stats/costing related issues identified while running TPC-H and TPC-DS queries. Add support for CPU sampling and nested scalar columns. Add more testcases for collection and usage of statistics and fix remaining unit/functional test failures.
Thanks to Venki Korukanti (@vkorukanti) for the description below (modified to account for new changes). He graciously agreed to rebase the patch to latest master, fixed few issues and added few tests.
FUNCS: Statistics functions as UDFs: Separate Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get confused. All stats columns should be Nullable, so that stats functions can return NULL when N/A.
* custom versions of "count" that always return BigInt
* HyperLogLog based NDV that returns BigInt that works only on VarChars
* HyperLogLog with binary output that only works on VarChars
OPS: Updated protobufs for new ops
OPS: Implemented StatisticsMerge
OPS: Implemented StatisticsUnpivot
ANALYZE: AnalyzeTable functionality
* JavaCC syntax more-or-less copied from LucidDB.
* (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel StatsMergePrel FilterPrel(for sampling) StatsAggPrel ScanPrel
ANALYZE: Add getMetadataTable() to AbstractSchema
USAGE: Change field access in QueryWrapper
USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel
* since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable to the constructor
* This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable associated with Logical/Physical scans.
USAGE: Attach DrillStatsTable to DrillTable.
* DrillStatsTable represents the data scanned from a corresponding ".stats.drill" table
* In order to avoid doing query execution right after the ".stats.drill" table is found, metadata is not actually collected until the MaterializationVisitor is used.
** Currently, the metadata source must be a string (so that a SQL query can be created). Doing this with a table is probably more complicated.
** Query is set up to extract only the most recent statistics results for each column.
closes #729

Commit:3233d8a
Author:Cliff Buchanan
Committer:Gautam Parai

DRILL-1328: Support table statistics

Commit:a43839e
Author:Charles S. Givre
Committer:Gautam Parai

DRILL-6582: SYSLOG (RFC-5424) Format Plugin closes #1530

Commit:814e9f0
Author:Arina Ielchiieva
Committer:Hanumath Maduri

DRILL-6946: Implement java.sql.Connection setSchema and getSchema methods in DrillConnectionImpl closes #1596

Commit:9667e92
Author:weijie.tong
Committer:Vitalii Diravka

DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf reference count bugs & tune the execution flow & support left deep tree closes #1504

Commit:cd4d68b
Author:Bohdan Kazydub
Committer:Volodymyr Vysotskyi

DRILL-6834: Introduce option to disable result set for DDL queries for JDBC connection
- Added session-scoped option `drill.exec.fetch_resultset_for_ddl` to control whether update count or result set should be returned for JDBC connection session. By default the option is set to `true` which ensures that result set is returned;
- Updated Drill JDBC: `DrillCursor` and `DrillStatement` to achieve desired behaviour.
closes #1549

Commit:0abcbe3
Author:rebase
Committer:Aman Sinha

DRILL-6381: (Part 1) Secondary Index framework
1. Secondary Index planning interfaces and abstract classes like DBGroupScan, DbSubScan, IndexDecriptor etc.
2. Statistics and Cost model interfaces/classes: PluginCost, Statistics, StatisticsPayload, AbstractIndexStatistics
3. ScanBatch and RecordReader to support repeatable scan
4. Secondary Index execution related interfaces: RangePartitionSender, RowKeyJoin, PartitionFunction
5. MD-3979: Query using cast index plan fails with NPE
Co-authored-by: Aman Sinha <asinha@maprtech.com>
Co-authored-by: chunhui-shi <cshi@maprtech.com>
Co-authored-by: Gautam Parai <gparai@maprtech.com>
Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com>
Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com>
Conflicts:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillTable.java
protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java
protocol/src/main/java/org/apache/drill/exec/proto/beans/CoreOperatorType.java
protocol/src/main/protobuf/UserBitShared.proto

Commit:216b123
Author:weijie.tong
Committer:Sorabh Hamirwasia

DRILL-6731: Move the BFs aggregating work from the Foreman to the RuntimeFilter

Commit:b895b28
Author:weijie.tong
Committer:Arina Ielchiieva

DRILL-6385: Support JPPD feature

Commit:2dfd0da
Author:Vlad Storona
Committer:Arina Ielchiieva

DRILL-6179: Added pcapng-format support

Commit:f7ae370
Author:Sorabh Hamirwasia
Committer:Sorabh Hamirwasia

DRILL-6635: PartitionLimit for Lateral/Unnest
Protobuf changes to add new operator PartitionLimit

Commit:20ecab0
Author:Vitalii Diravka
Committer:Timothy Farkas

DRILL-6639: Exception happens while displaying operator profiles for some queries closes #1404

Commit:e76e389
Author:Vitalii Diravka
Committer:Ben-Zvi

DRILL-6639: Exception happens while displaying operator profiles for some queries

Commit:2162986
Author:Vitalii Diravka
Committer:Vitalii Diravka

DRILL-6627: Adding REGEX_SUB_SCAN operator to protobuf file
- The operator is added to Java based UserBitShared.proto and C++ based UserBitShared.pb.h
- Java and C++ protobuf files are regenerated

Commit:4baf769
Author:Kunal Khatua
Committer:Timothy Farkas

DRILL-6455: Add missing JDBC Scan Operator for profiles
The operator is missing in the profile protobuf. This commit introduces that.
1. Added protobuf files (incl generated C++ and Java)
2. Updated JdbcSubScan's getOperatorType API
closes #1297

Commit:1820d46
Author:Kunal Khatua
Committer:Sorabh Hamirwasia

DRILL-6459: Unable to view profile of a running query
Fixes the missing text component of the QueryId that causes lookups to fail in `WorkManager.queries` map. This got introduced with the fix (#1265) for DRILL-5305
Reverting change to QueryIdHelper and DRILL-5305
Removing the changes done, based on inputs from @vrozov and @sohami . The correct approach would be to have this as part of the profile to avoid serialization of the queryIdText for each RPC making use of the QueryId
UX Changes
Set the query ID string and display in WebUI
closes #1301

Commit:0029097
Author:Kunal Khatua
Committer:Arina Ielchiieva

DRILL-5305: Query Profile must display Query ID
Introduced change to the Protobuf to inject the text-equivalent of the QueryID into the profile. This way, the profile's file name can be changed, but restored back based on this new field. The Profile UI also shows the Query ID, though this is not inferred from this new field, for sake of backward compatibility with older profiles.
closes #1265

Commit:79e27ea
Author:Dave Oshinsky
Committer:Volodymyr Vysotskyi

DRILL-4184: Support variable length decimal fields in parquet

Commit:c6549e5
Author:Arina Ielchiieva
Committer:Arina Ielchiieva

DRILL-6331: Revisit Hive Drill native parquet implementation to be exposed to Drill optimizations (filter / limit push down, count to direct scan)
1. Factored out common logic for Drill parquet reader and Hive Drill native parquet readers: AbstractParquetGroupScan, AbstractParquetRowGroupScan, AbstractParquetScanBatchCreator.
2. Rules that worked previously only with ParquetGroupScan, now can be applied for any class that extends AbstractParquetGroupScan: DrillFilterItemStarReWriterRule, ParquetPruneScanRule, PruneScanRule.
3. Hive populated partition values based on information returned from Hive metastore. Drill populates partition values based on path difference between selection root and actual file path. Before ColumnExplorer populated partition values based on Drill approach. Since now ColumnExplorer populates values for parquet files from Hive tables, `populateImplicitColumns` method logic was changed to populate partition columns only based on given partition values.
4. Refactored ParquetPartitionDescriptor to be responsible for populating partition values rather than storing this logic in parquet group scan class.
5. Metadata class was moved to separate metadata package (org.apache.drill.exec.store.parquet.metadata). Factored out several inner classes to improve code readability.
6. Collected all Drill native parquet reader unit tests into one class TestHiveDrillNativeParquetReader, also added new tests to cover new functionality.
7. Reduced excessive logging when parquet files metadata is read
closes #1214

Commit:ae1f838
Author:Kunal Khatua
Committer:Parth Chandra

DRILL-6289: Cluster view should show more relevant information
Protobuf change to carry HTTP port info
Allow CORS for access to remote Drillbit metrics
Cross-origin resource sharing (CORS) is required to ensure that the WebServer is able to serve REST calls for status pages.
Materialize relevant metrics
1. Heap memory (incl usage)
2. Heap memory (incl usage)
3. Average System Load (last 1 min)
4. Option to view from other nodes (pop out)
5. Added Glyphicons
Update System Table and related tests
1. Updated System Table to show HTTP port
2. Updated unit tests
Skip updating remote bit info when HTTPS (SSL) or Authentication is enabled.
Default CpuGaugeSet is public; Added Gauges
* CPU Utilization by Drill
* Uptime
Show ALL Buttons, but do HTTPS Check
Reduce power button to icon
Allowing CORS for /status/metrics only
Accounting for situations when JVM does not report Process CPU Load i.e. returned value is negative. See https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html#getProcessCpuLoad()
Addressed shutdown security conditions
Added C++ Client Protobuf
Added steps for Protobuf generation to protocol/readme.txt
This closes #1203

Commit:1bb2920
Author:Sorabh Hamirwasia
Committer:Parth Chandra

DRILL-6322: Lateral Join: Common changes - Add new iterOutcome, Operatortypes, MockRecordBatch for testing
Added new Iterator State EMIT, added operators LATERAL_JOIN & UNNEST in CoreOperatorType and added LateralContract interface
Implementation of MockRecordBatch to test operator behavior for different IterOutcomes.
a) Creates new output container for schema change cases.
b) Doesn't create new container for each next() call without schema change, since the operator in test expects the ValueVector object in its incoming batch to be same unless a OK_NEW_SCHEMA case is hit. Since setup() method of operator in test will store the reference to value vector received in first batch
This closes #1211

Commit:77f5e90
Author:Padma Penumarthy
Committer:Arina Ielchiieva

DRILL-6284: Add operator metrics for batch sizing for flatten

Commit:58e4cec
Author:Arina Ielchiieva
Committer:Vitalii Diravka

DRILL-6130: Fix NPE during physical plan submission for various storage plugins
1. Fixed ser / de issues for Hive, Kafka, Hbase plugins.
2. Added physical plan submission unit test for all storage plugins in contrib module.
3. Refactoring.
closes #1108

Commit:e791ed6
Author:Paul Rogers
Committer:Aman Sinha

DRILL-6049: Misc. hygiene and code cleanup changes close apache/drill#1085

Commit:0343518
Author:Arina Ielchiieva
Committer:Arina Ielchiieva

DRILL-5963: Query state process improvements
1. Added two new query states: PREPARING (when foreman is initialized) and PLANNING (includes logical and / or physical planning).
2. Ability to cancel query during planning and enqueued states was added.
3. Logic for submitting fragments was moved from Foreman to new class FragmentsRunner.
4. Logic for moving query to new state and incrementing / decrementing query counters was moved into QueryStateProcessor class.
5. Major type in DrillFuncHolderExpr was cached for better performance.
closes #1051

Commit:5f044f2
Author:dvjyothsna
Committer:Arina Ielchiieva

DRILL-4286: Graceful shutdown of drillbit closes #921

Commit:d3f8da2
Author:Anil Kumar Batchu
Committer:Arina Ielchiieva

DRILL-4779: Kafka storage plugin (Kamesh Bhallamudi & Anil Kumar Batchu) closes #1052

Commit:bbc4224
Author:Paul Rogers
Committer:Paul Rogers

DRILL-5716: Queue-driven memory allocation
* Creates new core resource management and query queue abstractions.
* Adds queue information to the Protobuf layer.
* Foreman and Planner changes
  - Abstracts memory management out to the new resource management layer. This means deferring generating the physical plan JSON to later in the process after memory planning.
* Web UI changes
* Adds queue information to the main page and the profile page to each query.
* Also sorts the list of options displayed in the Web UI.
  - Added memory reserve
    A new config parameter, exec.queue.memory_reserve_ratio, sets aside a slice of total memory for operators that do not participate in the memory assignment process. The default is 20%; testing will tell us if that value should be larger or smaller.
* Additional minor fixes
  - Code cleanup.
  - Added mechanism to abandon lease release during shutdown.
  - Log queue configuration only when the config changes, rather than on every query.
  - Apply Boaz’ option to enforce a minimum memory allocation per operator.
  - Additional logging to help testers see what is happening.
closes #928

Commit:7873988
Author:Paul Rogers
Committer:Jinfeng Ni

DRILL-5512: Standardize error handling in ScanBatch
Standardizes error handling to throw a UserException. Prior code threw various exceptions, called the fail() method, or returned a variety of status codes.
closes #838

Commit:ce8bbc0
Author:Sorabh Hamirwasia
Committer:Aman Sinha

DRILL-4335: Apache Drill should support network encryption.
NOTE: This pull request provides support for on-wire encryption using SASL framework. The communication channels that are covered are:
1) Between Drill JDBC client and Drillbit.
2) Between Drillbit to Drillbit i.e. control/data channels.
3) It has UI change to view encryption is enabled on which network channel and number of encrypted/unencrypted connections for user/control/data connections.
close apache/drill#773

Commit:6741e68
Author:Arina Ielchiieva
Committer:Parth Chandra

DRILL-5419: Calculate return string length for literals & some string functions
1. Revisited calculation logic for string literals and some string functions (cast, upper, lower, initcap, reverse, concat, concat operator, rpad, lpad, case statement, coalesce, first_value, last_value, lag, lead). Synchronized return type length calculation logic between limit 0 and regular queries.
2. Deprecated width and changed it to precision for string types in MajorType.
3. Revisited FunctionScope and split it into FunctionScope and ReturnType. FunctionScope will indicate only function usage in term of number of in / out rows, (n -> 1, 1 -> 1, 1->n). New annotation in UDFs ReturnType will indicate which return type strategy should be used.
4. Changed MAX_VARCHAR_LENGTH from 65536 to 65535.
5. Updated calculation of precision and display size for INTERVALYEAR & INTERVALDAY.
6. Refactored part of function code-gen logic (ValueReference, WorkspaceReference, FunctionAttributes, DrillFuncHolder).
This closes #819

Commit:d2e0f41
Author:Laurent Goujon
Committer:Jinfeng Ni

DRILL-5301: Server metadata API
Add a Server metadata API to the User protocol, to query server support of various SQL features. Add support to the client (DrillClient) to query this information. Add support to the JDBC driver to query this information, if the server supports the new API, or fallback to the previous behaviour (rely on Avatica defaults) otherwise.
close #764

Commit:16aa081
Author:Laurent Goujon
Committer:Jinfeng Ni

DRILL-4994: Add back JDBC prepared statement for older servers
When the JDBC client is connected to an older Drill server, it always attempted to use server-side prepared statement with no fallback. With this change, client will check server version and will fallback to the previous client-side prepared statement (which is still limited to only execute queries and does not provide metadata).
close #613

Commit:e17baa8
Author:Sudheesh Katkam
Committer:Sudheesh Katkam

DRILL-4280: CORE (Java protocol)
+ Define SaslStatus and SaslMessage messages in protocol
+ Add "authenticationMechanisms" field to all handshakes
+ Add "saslSupport" field to UserToBitHandshake

Commit:6892164
Author:Kunal Khatua
Committer:Sudheesh Katkam

DRILL-5190: Display planning and queued time for a query's profile page
Modified UserSharedBit protobuf for marking planning and wait-in-queue end times. This will allow for accurately reporting the planning, queued and actual execution times of a query.
Planning Time: In the absence of the planning time's end, for older profiles, the root fragment's (i.e. SCREEN operator) start time is taken as the estimated end of planning time, and as the estimated start time of the execution phase.
QueueWait Time: We do not estimate the queue time if the planning end time is not available.
Execution Time: We calculate the execution time based on the availability of these 2 planning time. The computation is done the following way, and reflects a decreasing level of accuracy
1. Execution time = [end(QueueWait) - endTime(Query)]
2. Execution time = [end(Planning) - endTime(Query)]
3. Execution time = [start(rootFragment) - endTime(Query)] - {Estimated}
closes #738

Commit:31b5282
Author:Nagarajan Chinnasamy
Committer:Parth Chandra

DRILL-5043: Function that returns a unique id per session/connection similar to MySQL's CONNECTION_ID() #685

Commit:63ffeff
Author:Arina Ielchiieva
Committer:Aman Sinha

DRILL-4604: Generate warning on Web UI if drillbits version mismatch is detected close apache/drill#482

Commit:6782f0a
Author:Arina Ielchiieva
Committer:Sudheesh Katkam

DRILL-4792: Include session options used for a query as part of the profile closes #551

Commit:166c4ce
Author:Laurent Goujon
Committer:Parth Chandra

DRILL-4420: C++ API for metadata access and prepared statements
Add support to the C++ client for metadata querying and prepared statement requests. Part of the metadata API, add methods to query for server capabilities. As of now, this interface is not backed up by any RPC exchange so the information is pretty much static, and matches Drill 1.8.0 current capabilities.

Commit:c6dbe6a
Author:Laurent Goujon
Committer:adeneche

DRILL-4968: Add column size to ColumnMetadata
Add a column size to ColumnMetadata so that JDBC and ODBC clients share the implementation and don't have to recompute it client side.
this closes #631

Commit:13f21e1
Author:Laurent Goujon
Committer:adeneche

DRILL-4369: Exchange name and version infos during handshake
There's no name and version exchanged between client and server over the User RPC channel. On client side, having access to the server name and version is useful to expose it to the user (through JDBC or ODBC api like DatabaseMetadata#getDatabaseProductVersion()), or to implement fallback strategy when some recent API are not available (like metadata API). On the server side, having access to the client version might be useful for audit purposes and eventually to implement fallback strategy if it doesn't require a RPC version change.
this closes #622

Commit:89f2633
Author:Arina Ielchiieva
Committer:Parth Chandra

DRILL-4726: Dynamic UDF Support
1) Configuration / parsing / options / protos
2) Zookeeper integration
3) Registration / unregistration / lazy-init
4) Unit tests
This closes #574

Commit:d0464ab
Author:Laurent Goujon
Committer:vkorukanti

DRILL-4925: Add tableType filter to GetTables metadata query
- Adding tableType filter to GetTablesReq query (needed for JDBC and ODBC drivers).
- Fix table type returned by sys and INFORMATION_SCHEMA tables
- Also fixes some protobuf typos to related classes.
this closes #612
Change-Id: If95246a312f6c6d64a88872936f516308874c2d2

Commit:14f6ec7
Author:vkorukanti
Committer:vkorukanti

DRILL-4729: Add support for prepared statement implementation on server side
+ Add following APIs for Drill Java client
  - DrillRpcFuture<CreatePreparedStatementResp> createPreparedStatement(final String query)
  - void executePreparedStatement(final PreparedStatement preparedStatement, UserResultsListener resultsListener)
  - List<QueryDataBatch> executePreparedStatement(final PreparedStatement preparedStatement) (for testing purpose)
+ Separated out the interface from UserClientConnection. It makes it easy to have wrappers which need to tap the messages and data going to the actual client.
+ Implement CREATE_PREPARED_STATEMENT and handle RunQuery with PreparedStatement
+ Test changes to support prepared statement as query type
+ Add tests in TestPreparedStatementProvider
this closes #530

Commit:ef6e522
Author:vkorukanti
Committer:vkorukanti

DRILL-4728: Add support for new metadata fetch APIs
+ Protobuf messages
  - GetCatalogsReq -> GetCatalogsResp
  - GetSchemasReq -> GetSchemasResp
  - GetTablesReq -> GetTablesResp
  - GetColumnsReq -> GetColumnsResp
+ Java Drill client changes
+ Server side changes to handle the metadata API calls
  - Provide a self contained `Runnable` implementation for each metadata API that process the requests and sends the response to client
  - In `UserWorker` override the `handle` method that takes the `ResponseSender` and send the response from the `handle` method instead of returning it.
  - Add a method for each new API to UserWorker to submit the metadata work.
  - Add a method `addNewWork(Runnable runnable)` to `WorkerBee` to submit a generic `Runnable` to `ExecutorService`.
  - Move out couple of methods from `QueryContext` into a separate interface `SchemaConfigInfoProvider` to enable instantiating Schema trees without the full `QueryContext`
+ New protobuf messages increased the `jdbc-all.jar` size. Up the limit to 21MB.
this closes #527

Commit:6bba69d
Author:Yuliya Feldman
Committer:Parth Chandra

DRILL-4132 Ability to submit simple type of physical plan directly to EndPoint DrillBit for execution.
There are multiple changes to achieve this:
1. During physical planning split single plan into multiple based on the number of minor fragments of the Leaf Major fragment.
  a. Removing exchange operators during planning
  b. Producing just root fragments (that will be also leaf fragments)
2. Each fragment can be executed against Drillbit it is assigned to, so to keep locality
Design document can be found in the JIRA: DRILL-4132

Commit:67d5cc6
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-4238: Add a custom RPC interface on the Control channel for extensible communication between bits. This closes #313.

Commit:de00881
Author:Hanifi Gunes
Committer:Hanifi Gunes

DRILL-4187: introduce a new query state ENQUEUED and rename the state PENDING to STARTING

Commit:809f462
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-4134: Allocator Improvements
- make Allocator mostly lockless
- change BaseAllocator maps to direct references
- add documentation around memory management model
- move transfer and ownership methods to DrillBuf
- Improve debug messaging.
- Fix/revert sort changes
- Remove unused fragment limit flag
- Add time to HistoricalLog events
- Remove reservation amount from RootAllocator constructor (since not allowed)
- Fix concurrency issue where allocator is closing at same moment as incoming batch transfer, causing leaked memory and/or query failure.
- Add new AutoCloseables.close(Iterable<AutoCloseable>)
- Remove extraneous DataResponseHandler and Impl (and update TestBitRpc to use smarter mock of FragmentManager)
- Remove the concept of poison pill record batches, using instead FragmentContext.isOverMemoryLimit()
- Update incoming data batches so that they are transferred under protection of a close lock
- Improve field names in IncomingBuffers and move synchronization to collectors as opposed to IncomingBuffers (also change decrementing to decrementToZero rather than two part check).
This closes #238.

Commit:eeb05fb
Author:Steven Phillips
Committer:Steven Phillips

DRILL-3233: Expression handling for Union types

Commit:eb6325d
Author:Steven Phillips
Committer:Steven Phillips

DRILL-3229: Implement Union type vector

Commit:fab061e
Author:Sudheesh Katkam
Committer:adeneche

DRILL-3340: Part 2: Reverting 1a589ab and committing latest patch
Add operator metrics registry for metric definitions
+ Display metrics as a table within an operator profile panel
+ Rename FragmentStats#getOperatorStats to newOperatorStats

Commit:1a589ab
Author:Sudheesh Katkam
Committer:adeneche

DRILL-3340: Added operator names and metric names to query profile before writing it to store
+ Rename: FragmentStats#getOperatorStats => newOperatorStats
+ Documentation
this closes #216

Commit:68c933c
Author:Hanifi Gunes
Committer:Hanifi Gunes

DRILL-2997: Remove references to groupCount from SerializedField

Commit:4ad4261
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-3079: Move execution fragment json parsing from RPC message to fragment start.

Commit:6cc89e9
Author:vkorukanti
Committer:vkorukanti

DRILL-3010: Convert bad command error messages into UserExceptions in SqlHandlers

Commit:f8e5e61
Author:Sudheesh Katkam
Committer:vkorukanti

DRILL-2697: Pause sites wait indefinitely for a resume signal
DrillClient sends a resume signal to UserServer. UserServer triggers a resume call in the correct Foreman. Foreman resumes all pauses related to the query through the Control layer.
+ Better error messages and more tests in TestDrillbitResilience and TestPauseInjection
+ Added execution controls to operator context
+ Removed ControlMessageHandler interface, renamed ControlHandlerImpl to ControlMessageHandler
+ Added CountDownLatchInjection, useful in cases like ParititionedSender that spawns multiple threads

Commit:42d5f81
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-2981: Add queries log. Update profile to store normal and verbose exception as well as node and errorid.

Commit:960f876
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-2971, DRILL-2886, DRILL-2778, DRILL-2545: Improve RPC connection detection failure. Add RPC timeout.

Commit:703314b
Author:vkorukanti
Committer:vkorukanti

DRILL-2902: Add support for context functions: user (synonyms session_user and system_user) and current_schema

Commit:c0d5a69
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-2762: Update Fragment state reporting and error collection
DeferredException
- Add new throwAndClear operation on to allow checking for exceptions preClose in FragmentContext
- Add new getAndClear operation
BufferManager
- Ensure close() can be called multiple times by clearing managed buffer list on close().
FragmentContext/FragmentExecutor
- Update FragmentContext to have a preClose so that we can check closure state before doing final close.
- Update so that there is only a single state maintained between FragmentContext and FragmentExecutor
- Clean up FragmentExecutor run() method to better manage error states and have only single terminal point (avoiding multiple messages to Foreman).
- Add new CANCELLATION_REQUESTED state for FragmentState.
- Move all users of isCancelled or isFailed in main code to use shouldContinue()
- Update receivingFragmentFinished message to not cancel fragment (only inform root operator of cancellation)
WorkManager Updates
- Add new afterExecute command to the WorkManager ExecutorService so that we get log entries if a thread leaks an exception. (Otherwise logs don't show these exceptions and they only go to standard out.)
Profile Page
- Update profile page to show last update and last progress.
- Change durations to non-time presentation
Foreman/QueryManager
- Extract listenable interfaces into anonymous inner classes from body of Foreman
QueryManager
- Update QueryManager to track completed nodes rather than completed fragments using NodeTracker
- Update DrillbitStatusListener to decrement expected completion messages on Nodes that have died to avoid query hang when a node dies
FragmentData/MinorFragmentProfile
- Add ability to track last status update as well as last time fragment made progress
AbstractRecordBatch
- Update awareness of current cancellation state to avoid cancellation delays
Misc. Other changes
- Move ByteCode optimization code to only record assembly and code as trace messages
- Update SimpleRootExec to create fake ExecutorState to make existing tests work.
- Update sort to exit prematurely in the case that the fragment was asked to cancel.
- Add finals to all edited files.
- Modify control handler and FragmentManager to directly support receivingFragmentFinished
- Update receiver propagation message to avoid premature removal of fragment manager
- Update UserException.Builder to log a message if we're creating a new UserException (ERROR for System, INFO otherwise).
- Update Profile pages to use min and max instead of sorts.

Commit:238399d
Author:adeneche
Committer:Parth Chandra

DRILL-2675 (PART-2): Implement a subset of User Exceptions to improve how errors are reported to the user
Added missing changes from committed patch

Commit:55a9a59
Author:Andrew Selden
Committer:Steven Phillips

DRILL-1512: Avro record reader
Reader for Avro data files.
Supports:
- All primitive types
- Arrays
- Nested records
- Enums
Unimplemented:
- Endpoint affinity
- Recursive data types
- Complex types: Maps, Fixed, Unions

Commit:99b6d0e
Author:adeneche
Committer:Jacques Nadeau

DRILL-2675: Implement a subset of User Exceptions to improve how errors are reported to the user

Commit:931ed64
Author:Mehant Baid
Committer:Mehant Baid

DRILL-2715: Implement nested loop join operator

Commit:a218ee3
Author:vkorukanti
Committer:Parth Chandra

DRILL-2673: Update UserServer <==> UserClient RPC to better handle handshake response

Commit:1d9d82b
Author:adeneche
Committer:Parth Chandra

DRILL-2498: Separate QueryResult into two messages QueryResult and QueryData

Commit:2da618c
Author:Chris Westin
Committer:Jacques Nadeau

DRILL-2245: Clean up query setup and execution kickoff in Foreman/WorkManager in order to ensure consistent handling, and avoid hangs and races, with the goal of improving Drillbit robustness.
I did my best to keep these clean when I split them up, but this core commit may depend on some minor changes in the hygiene commit that is also associated with this bug, so either both should be applied, or neither. The core commit should be applied first.
protocol/pom.xml
- updated protocol buffer compiler version to 2.6
- this made slight modifications to the formats of a few committed protobuf files
AutoCloseables
- created org.apache.drill.common.AutoCloseables to handle closing these quietly
BaseTestQuery, and derivatives
- factored out pieces into QueryTestUtil so they can be reused
DeferredException:
- created this so we can collect exceptions during the shutdown process
Drillbit
- uses AutoCloseables for the WorkManager and for the storeProvider
- allow start() to take a RemoteServiceSet
- private, final, formatting
Foreman
- added new state CANCELLATION_REQUESTED (via UserBitShared.proto) to represent the time between request of a cancellation, and acknowledgement from all remote endpoints running fragments on a query's behalf
- created ForemanResult to manage interleaving cleanup effects/failure with query result state
- does not need to implement Comparable
- does not need to implement Closeable
- thread blocking fixes
- add resultSent flag
- add code to log plan fragments with endpoint assignments
- added finals, cleaned up formatting
- do queue management in acquireQuerySemaphore; local tests pass
- rename getContext() to getQueryContext()
- retain DrillbitContext
- a couple of exception injections for testing
- minor formatting
- TODOs
FragmentContext
- added a DeferredException to collect errors during startup/shutdown sequences
FragmentExecutor
- eliminated CancelableQuery
- use the FragmentContext's DeferredException for errors
- common subexpression elimination
- cleaned up
QueryContext
- removed unnecessary functions (with some outside classes tweaked for this)
- finals, formatting
QueryManager
- merge in QueryStatus
- affects Foreman, ../batch/ControlHandlerImpl, and ../../server/rest/ProfileResources
- made some methods private
- removed unused imports
- add finals and formatting
- variable renaming to improve readability
- formatting
- comments
- TODOs
QueryStatus
- getAsInfo() private
- member renaming
- member access changes
- formatting
- TODOs
QueryTestUtil, BaseTestQuery, TestDrillbitResilience
- make maxWidth a parameter to server startup
SelfCleaningRunnable
- created org.apache.drill.common.SelfCleaningRunnable
SingleRowListener
- created org.apache.drill.SingleRowListener results listener
- use in TestDrillbitResilience
TestComparisonFunctions
- fix not to close the FragmentContext multiple times
TestDrillbitResilience
- created org.apache.drill.exec.server.TestDrillbitResilience to test drillbit resilience in the face of exceptions and failures during queries
TestWithZookeeper
- factor out work into ZookeeperHelper so that it can be reused by TestDrillbitResilience
UserBitShared
- get rid of unused UNKNOWN_QUERY
WorkEventBus
- rename methods, affects Foreman and ControlHandlerImpl
- remove unused WorkerBee reference
- most members final
- formatting
WorkManager
- Closeable to AutoCloseable
- removed unused incomingFragments Set
- eliminated unnecessary eventThread and pendingTasks by posting Runnables directly to executor
- use SelfCleaningRunnable for Foreman management
- FragmentExecutor management uses SelfCleaningRunnable
- runningFragments to be a ConcurrentHashMap; TestTpchDistributed passes
- other improvements due to bee no longer needed in various places
- most members final
- minor formatting
- comments
- TODOs
(*) Created exception injection classes to simulate exceptions for testing
- ExceptionInjection
- ExceptionInjector
- ExceptionInjectionUtil
- TestExceptionInjection
DRILL-2245-hygiene: General code cleanup encountered while working on the rest of this commit.
This includes
- making members final whenever possible
- making members private whenever possible
- making loggers private
- removing unused imports
- removing unused private functions
- removing unused public functions
- removing unused local variables
- removing unused private members
- deleting unused files
- cleaning up formatting
- adding spaces before braces in conditionals and loop bodies
- breaking up overly long lines
- removing extra blank lines
While I tried to keep this clean, this commit may have minor dependencies on DRILL-2245-core that I missed. The intention is just to break this up for review purposes. Either both commits should be applied, or neither.

Commit:9fd1430
Author:Jacques Nadeau
Committer:vkorukanti

DRILL-2187: Single Broadcast Sender
Also includes:
1. Fix merge join planning issue (1c5c810 by jinfengni)
2. ExternalSort: Check the memory available for in-memory sorting or not in making decision to spill or not (36f9dd1)
3. Cleanup in ExternalSortBatch and its helper classes (36f9dd1)
4. MergeJoinBatch: Limit the outgoing record batch size to 2^15 (37dfeb8)
5. StreamingAggBatch: Limit outgoing record batch size to 2^15 (7d8a2e4)

Commit:8cdab2e
Author:vkorukanti
Committer:vkorukanti

DRILL-1990: Add peak memory allocation in an operator to OperatorStats.

Commit:2eb72a7
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-1684, DRILL-1517, DRILL-1350: Profile and cancellation updates
- Remove any storage of persisted profiles.
- Store a separate query info object for active queries.
- Update cancellation and running profile loading to query foreman server.
- Make file store support HDFS APIs
- Update PStoreProvider to use configuration to decide if you want PERSISTENT, EPHEMERAL, or BLOB storage rather than separate interfaces.
- Update ZkPStore's persistent mode to leverage a cache and respond to changes rather than actively probing values.
- Update ZkPStore's cache to be effectively write-through.
- Automatically delete deprecated or default value options from PStore.

Commit:451dd60
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-1436: Remove use of UDP based cache for purposes of intermediate PlanFragment distribution
Includes:
- Remove dependency on Infinispan
- Update initialize fragments to send in batches.
- Update RPC layer to capture UserRpcExceptions and propagate back.
- Send full stack trace in DrillPBError and let foreman node decide on formatting.
- Increment control rpc version
- Update systables to report current drillbit and version

Commit:2eb04e7
Author:Steven Phillips
Committer:Steven Phillips

DRILL-1425: Handle unknown operators in web ui

Commit:8def6e9
Author:Timothy Chen
Committer:Steven Phillips

Patch for DRILL-705
Currently only supports partitioning/ordering, not yet preceding or after offsets

Commit:c331aed
Author:Steven Phillips
Committer:Steven Phillips

DRILL-991: Limit should terminate upstream fragments immediately upon completion

Commit:669bd04
Author:Mehant Baid
Committer:Aditya Kishore

DRILL-1126: Support generic objects as workspace variables in UDAF

Commit:208d511
Author:Steven Phillips
Committer:Jacques Nadeau

DRILL-1055: Add ProducerConsumer operator to scans
This can be disabled. The queue size is configurable

Commit:16808f4
Author:Jacques Nadeau
Committer:Jacques Nadeau

DRILL-1069: Rename RandomReceiver to UnorderedReceiver.

Commit:2e07b0b
Author:Aditya Kishore
Committer:Aditya Kishore

DRILL-836: [addendum] Drill needs to return complex types (e.g., map and array) as a JSON string
* This contains additional changes to the original patch which was merged.
+ Renamed "flatten" to "complex-to-json"
+ With the new patch, we return VARCHAR instead of VARBINARY.
+ Added test case.
+ Minor code re-factoring.

Commit:fc1a777
Author:Steven Phillips
Committer:Jacques Nadeau

Fix and improve runtime stats profiles
- Stop stats processing while waiting for next.
- Fix stats collection in PartitionSender and ScanBatch
- Add stats to all senders
- Add wait time to operator profile.

Commit:fc00bc4
Author:Aditya Kishore
Committer:Aditya Kishore

DRILL-836: Drill needs to return complex types (e.g., map and array) as a JSON string

Commit:e46f1be
Author:Jacques Nadeau
Committer:Jacques Nadeau

Enable editing of storage plugin via http

Commit:2dec152
Author:Steven Phillips
Committer:Jacques Nadeau

Use PStore for profiles, partial profiles on running queries
Full profiles through rpc layer