Apache Drill 1.13.0 Release Notes

Release date: March 18, 2018

Today, we’re happy to announce the availability of Drill 1.13.0. You can download it here.

New Features and Improvements

This release of Drill provides the following new features and improvements:

  • JDK 8 support. (DRILL-1491)
  • Upgrade to Calcite version 1.15. (DRILL-3993)
  • JDBC Statement.setQueryTimeout(int) support to cancel queries if they do not complete within the specified time. (DRILL-3640)
  • Batch processing improvements that enable you to limit the amount of memory that the Flatten, Merge Join, and External Sort operators allocate to outgoing batches. (DRILL-6123)
  • Enhanced DESCRIBE command. (DRILL-4559)
  • Support for SPNEGO to extend Kerberos to Web applications through HTTP. (DRILL-5425)
  • Ability to run Drill under YARN. (DRILL-1170)
  • Parquet filter pushdown support for IS [NOT] NULL, TRUE, and FALSE operators and implicit and explicit casts for timestamp, date, and time data types. (DRILL-6174)
  • Performance improvements with support for project push down, filter push down, and partition pruning on dynamically expanded columns when represented as a star in the ITEM operator. (DRILL-6118)
  • The Hive client for Drill is updated to version 2.3.2. With the update, Drill supports queries on transactional (ACID) and non-transactional Hive bucketed ORC tables. The updated libraries are backward compatible with earlier versions of the Hive server and metastore. (DRILL-5978)
  • Ability to automatically manage memory allocations during Drill startup. (DRILL-5741)
  • Ability to query an empty directory and use it for queries with any JOIN and UNION (UNION ALL) operators. (DRILL-4185)
  • Non-numeric support for JSON processing. (DRILL-5919)
  • New options that enable you to configure the number of Jetty acceptors and selectors. (DRILL-5994)
  • Support for SQL syntax highlighting of queries, auto-complete in SQL editors, and snippets. (DRILL-5868)
  • Improved performance of the Single Merge Exchange operator. (DRILL-6115)
  • LIKE operator optimization. (DRILL-5879)
  • User/distribution-specific configuration checks during startup. (DRILL-6068)
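
As an illustration of the JDBC query timeout listed above (DRILL-3640), the following is a minimal sketch rather than an official example. The class name QueryTimeoutExample, the helper countRowsWithTimeout, the 30-second limit, the connection URL jdbc:drill:zk=local, and the sample query against cp.`employee.json` are all assumptions chosen for illustration. The sketch also assumes that the driver reports an expired limit with java.sql.SQLTimeoutException, as the JDBC API describes for Statement.setQueryTimeout(int).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.SQLTimeoutException;
import java.sql.Statement;

public class QueryTimeoutExample {

    // Hypothetical helper: runs a query with a time limit and returns the
    // number of rows read, or -1 if the query was cancelled because the
    // limit elapsed (assumed to surface as SQLTimeoutException).
    static long countRowsWithTimeout(Connection conn, String sql, int timeoutSeconds)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            // Standard JDBC call; a value of 0 means no limit.
            stmt.setQueryTimeout(timeoutSeconds);
            long rows = 0;
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    rows++;
                }
            }
            return rows;
        } catch (SQLTimeoutException e) {
            return -1;
        }
    }

    public static void main(String[] args) throws SQLException {
        // Assumed URL for a local Drillbit; pass your own URL as the first argument.
        String url = args.length > 0 ? args[0] : "jdbc:drill:zk=local";
        try (Connection conn = DriverManager.getConnection(url)) {
            long rows = countRowsWithTimeout(conn, "SELECT * FROM cp.`employee.json`", 30);
            System.out.println(rows < 0 ? "query timed out" : rows + " rows");
        }
    }
}
```

Running main requires the Drill JDBC driver on the classpath and a reachable Drillbit. The helper itself depends only on the java.sql interfaces.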

The following sections list all fixes and improvements:

Sub-task

  • [DRILL-4333] - tests in Drill2489CallsAfterCloseThrowExceptionsTest fail in Java 8
  • [DRILL-5068] - Create a system table for completed profiles - sys.profiles
  • [DRILL-6036] - Create a sys.connections table

Bug

  • [DRILL-4120] - dir0 does not work when the directory structure contains Avro files
  • [DRILL-4329] - 13 Unit tests are failing with JDK 8
  • [DRILL-4469] - SUM window query returns incorrect results over integer data
  • [DRILL-4708] - connection closed unexpectedly
  • [DRILL-4923] - Use of CASE WHEN inside a sub-query results in AssertionError
  • [DRILL-4942] - incorrect result - case when (not null is null) then true else false end
  • [DRILL-5170] - JMockit-based unit tests fail when run under Java 8
  • [DRILL-5286] - When rel and target candidate set is the same, planner should not need to do convert for the relNode since it must have been done
  • [DRILL-5377] - Five-digit year dates are displayed incorrectly via jdbc
  • [DRILL-5690] - RepeatedDecimal18Vector does not pass scale, precision to data vector
  • [DRILL-5730] - Fix Unit Test failures on JDK 8 And Some JDK 7 versions
  • [DRILL-5768] - Drill planner should not allow select * with group by clause
  • [DRILL-5833] - Parquet reader fails with assertion error for Decimal9, Decimal18 types
  • [DRILL-5851] - Empty table during a join operation with a non empty table produces cast exception
  • [DRILL-5880] - java.sql.SQLException: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join
  • [DRILL-5902] - Queries encounter random failure due to RPC connection timed out
  • [DRILL-5926] - TestValueVector tests fail sporadically
  • [DRILL-5961] - For long running queries (> 10 min) Drill may raise FragmentSetupException for completed/cancelled fragments
  • [DRILL-5963] - Canceling a query hung in planning state, leaves the query in ENQUEUED state for ever.
  • [DRILL-5967] - Memory leak by HashPartitionSender
  • [DRILL-5971] - Fix INT64, INT32 logical types in complex parquet reader
  • [DRILL-5996] - Unable to re-run queries from Profiles tab with impersonation and without authentication
  • [DRILL-6020] - NullPointerException with Union setting on when querying JSON untyped path
  • [DRILL-6021] - Show shutdown button when authentication is not enabled
  • [DRILL-6040] - Need to add usage for graceful_stop to drillbit.sh
  • [DRILL-6044] - Shutdown button does not work from WebUI
  • [DRILL-6054] - Issues in FindPartitionConditions
  • [DRILL-6070] - Hash join with empty tables should not do casting of data types to INT<OPTIONAL>
  • [DRILL-6079] - Memory leak caused by ParquetRowGroupScan
  • [DRILL-6080] - Sort incorrectly limits batch size to 65535 records rather than 65536
  • [DRILL-6081] - Duration of completed queries is continuously increasing
  • [DRILL-6083] - RestClientFixture does not connect to the correct webserver port
  • [DRILL-6085] - Travis build sometimes fails because VM suddenly exits.
  • [DRILL-6088] - MainLoginPageModel errors out when http.auth.mechanisms is not configured
  • [DRILL-6090] - While connecting to drill-bits using JDBC Driver through Zookeeper, a lot of "Curator-Framework-0" threads are created if connection to drill-bit is not successful (no drill-bits are up/reachable)
  • [DRILL-6093] - Unneeded columns in Drill logical project
  • [DRILL-6099] - Drill does not push limit past project (flatten) if it cannot be pushed into scan
  • [DRILL-6100] - Intermittent failure while reading Parquet file footer during planning phase
  • [DRILL-6111] - NullPointerException with Kafka Storage Plugin
  • [DRILL-6117] - Query End time is not being set
  • [DRILL-6119] - The OpenTSDB storage plugin is not included in the Drill distribution
  • [DRILL-6124] - testCountDownLatch can be null in PartitionerDecorator depending on user's injection controls config
  • [DRILL-6127] - NullPointerException happens when submitting physical plan to the Hive storage plugin
  • [DRILL-6128] - Wrong Result with Nested Loop Join
  • [DRILL-6129] - Query fails on nested data type schema change
  • [DRILL-6137] - Join Failure When Some Json File Partitions Empty
  • [DRILL-6140] - Operators listed in Profiles Page doesn't always correspond with operator specified in Physical Plan
  • [DRILL-6143] - Make Fragment Runner's RPC Timeout a SystemOption
  • [DRILL-6144] - Make directMemory amount configurable in tests
  • [DRILL-6148] - TestSortSpillWithException is sometimes failing.
  • [DRILL-6151] - Fragment executors may terminate without sending final batch to a downstream causing query to hang
  • [DRILL-6154] - NaN, Infinity issues
  • [DRILL-6164] - Heap memory leak during parquet scan and OOM
  • [DRILL-6172] - setValueCount of VariableLengthVectors throws IOB exception when called with 0 value after clearing vectors
  • [DRILL-6185] - Error is displaying while accessing query profiles via the Web-UI
  • [DRILL-6187] - Exception in RPC communication between DataClient/ControlClient and respective servers when bit-to-bit security is on
  • [DRILL-6188] - Fix C++ client build on Centos 7 and OSX
  • [DRILL-6189] - Security: passwords logging and file permissions
  • [DRILL-6190] - Packets can be bigger than strictly legal
  • [DRILL-6191] - Need more information on TCP flags
  • [DRILL-6192] - Drill is vulnerable to CVE-2017-12197
  • [DRILL-6195] - Querying Hive non-partitioned transactional tables via Drill
  • [DRILL-6197] - Duplicate entries in inputProfiles of minor fragments for specific operators
  • [DRILL-6198] - OpenTSDB unit tests fail when Lilith client is run
  • [DRILL-6204] - Pass tables columns without partition columns to empty Hive reader
  • [DRILL-6205] - Reduce memory consumption of testFlattenUpperLimit test
  • [DRILL-6216] - Metadata mismatch when connecting to a Drill 1.12.0 with a Drill-1.13.0-SNAPSHOT driver
  • [DRILL-6217] - NaN/Inf: NestedLoopJoin processes NaN values incorrectly
  • [DRILL-6226] - Projection pushdown does not occur with Calcite upgrade

New Feature

  • [DRILL-1170] - YARN support for Drill
  • [DRILL-5425] - Support HTTP Kerberos auth using SPNEGO
  • [DRILL-5868] - Support SQL syntax highlighting of queries
  • [DRILL-6068] - Drill should support user/distribution specific configuration checks during startup

Improvement

  • [DRILL-3640] - Drill JDBC driver support Statement.setQueryTimeout(int)
  • [DRILL-3958] - Improve error message when JDBC driver not found
  • [DRILL-4185] - UNION ALL involving empty directory on any side of union all results in Failed query
  • [DRILL-4559] - Re-work DESCRIBE command after moving to Calcite master branch
  • [DRILL-5688] - Add repeated map support to column accessors
  • [DRILL-5741] - Automatically manage memory allocations during startup
  • [DRILL-5879] - Optimize "Like" operator
  • [DRILL-5919] - Add non-numeric support for JSON processing
  • [DRILL-5966] - Upgrade DRILL to Calcite 1.13.0
  • [DRILL-5973] - Support injection of time-bound pauses in server
  • [DRILL-5974] - Read JSON non-relational fields using text mode
  • [DRILL-5978] - Updating of Apache and MapR Hive libraries to 2.3.2 and 2.1.1-mapr-1710 versions respectively
  • [DRILL-5993] - Improve Performance of Copiers used by SV Remover, Top N etc.
  • [DRILL-5994] - enable configuring number of Jetty acceptors and selectors
  • [DRILL-6002] - Avoid memory copy from direct buffer to heap while spilling to local disk
  • [DRILL-6025] - Execution time of a running query shown as 'NOT AVAILABLE'
  • [DRILL-6028] - Allow splitting generated code in ChainedHashTable into blocks to avoid "code too large" error
  • [DRILL-6049] - Rollup of hygiene changes from "batch size" project
  • [DRILL-6063] - Set correct ThreadContext ClassLoader before using Hadoop Configuration class in DrillClient
  • [DRILL-6071] - Limit batch size for flatten operator
  • [DRILL-6089] - Validate That Planner Does Not Assume HashJoin Preserves Ordering for FS, MaprDB, or Hive
  • [DRILL-6102] - ConcurrentModificationException in BaseAllocator when debugging
  • [DRILL-6106] - Use valueOf method instead of constructor since valueOf has a higher performance by caching frequently requested values.
  • [DRILL-6114] - Complete internal metadata layer for improved batch handling
  • [DRILL-6115] - SingleMergeExchange is not scaling up when many minor fragments are allocated for a query.
  • [DRILL-6118] - Handle item star columns during project / filter push down and directory pruning
  • [DRILL-6123] - Limit batch size for Merge Join based on memory
  • [DRILL-6126] - Allocate memory for value vectors upfront in flatten operator
  • [DRILL-6153] - Revised operator framework
  • [DRILL-6163] - Switch Travis To Java 8
  • [DRILL-6174] - Parquet pushdown planning improvements
  • [DRILL-6177] - Merge Join - Allocate memory for outgoing value vectors based on sizes of incoming batches.
  • [DRILL-6180] - Use System Option "output_batch_size" for External Sort
  • [DRILL-6210] - Enhance the test schema builder for remaining types

Task

  • [DRILL-1491] - Support for JDK 8
  • [DRILL-3993] - Rebase Drill on Calcite master branch
  • [DRILL-6130] - Fix NPE during physical plan submission for various storage plugins
  • [DRILL-6138] - Move RecordBatchSizer to org.apache.drill.exec.record package
  • [DRILL-6208] - Fix FunctionInitializerTest#testConcurrentFunctionBodyLoad to use Mockito instead of JMockit
  • [DRILL-6213] - During restart drillbit should be killed forcefully if exceeds allowed timeout
  • [DRILL-6218] - Update release profile to not generate MD5 checksum