Apache Drill 1.9.0 Release Notes

Release date: November 29, 2016

Today, we’re happy to announce the availability of Drill 1.9.0. You can download it here.

New Features

This release of Drill provides the following new features:

  • Asynchronous Parquet reader
  • Parquet filter pushdown
  • Dynamic UDF support
  • HTTPD format plugin

The following sections list additional bug fixes and improvements:

Sub-task

  • [DRILL-4420] - C client and ODBC driver should move to using the new metadata methods provided by DRILL-4385
  • [DRILL-4452] - Update avatica version for Drill jdbc
  • [DRILL-4560] - ZKClusterCoordinator does not call DrillbitStatusListener.drillbitRegistered for new bits
  • [DRILL-4968] - Add column size information to ColumnMetadata
  • [DRILL-4969] - Basic implementation for displaySize

Bug

  • [DRILL-1996] - C++ Client: Make Cancel API Public
  • [DRILL-3898] - No space error during external sort does not cancel the query
  • [DRILL-4203] - Parquet File : Date is stored wrongly
  • [DRILL-4369] - Database driver fails to report any major or minor version information
  • [DRILL-4370] - DatabaseMetadata returning <Properties resource apache-drill-jdbc.properties not loaded>
  • [DRILL-4525] - Query with BETWEEN clause on Date and Timestamp values fails with Validation Error
  • [DRILL-4542] - if external sort fails to spill to disk, memory is leaked and wrong error message is displayed
  • [DRILL-4618] - random numbers generator function broken
  • [DRILL-4763] - Parquet file with DATE logical type produces wrong results for simple SELECT
  • [DRILL-4767] - Parquet reader throws IllegalArgumentException for int32 type with GZIP compression
  • [DRILL-4769] - Foreman spins when querying int32 data with snappy compression
  • [DRILL-4770] - ParquetRecordReader throws NPE querying a single int64 column file
  • [DRILL-4823] - Fix OOM while trying to prune partitions with reasonable data size
  • [DRILL-4824] - JSON with complex nested data produces incorrect output with missing fields
  • [DRILL-4826] - Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases
  • [DRILL-4862] - wrong results - use of convert_from(binary_string(key),'UTF8') in filter results in wrong results
  • [DRILL-4870] - drill-config.sh sets JAVA_HOME incorrectly for the Mac
  • [DRILL-4874] - "No UserGroupInformation while generating ORC splits" - hive known issue in 1.2.0-mapr-1607 release.
  • [DRILL-4877] - max(dir0), max(dir1) query against parquet data slower by 2X
  • [DRILL-4880] - Support JDBC driver registration using ServiceLoader
  • [DRILL-4884] - Fix IOB exception in limit n query when n is beyond 65535.
  • [DRILL-4888] - putIfAbsent for ZK stores is not atomic
  • [DRILL-4894] - Fix unit test failure in 'storage-hive/core' module
  • [DRILL-4905] - Push down the LIMIT to the parquet reader scan to limit the number of records read
  • [DRILL-4906] - CASE Expression with constant generates class exception
  • [DRILL-4911] - SimpleParallelizer should avoid plan serialization for logging purposes when debug logging is not enabled
  • [DRILL-4921] - Scripts drill_config.sh, drillbit.sh, and drill-embedded fail when accessed via a symbolic link
  • [DRILL-4925] - Add types filter to getTables metadata API
  • [DRILL-4930] - Metadata results are not sorted
  • [DRILL-4934] - ServiceEngine does not use property useIP for DrillbitStartup
  • [DRILL-4941] - UnsupportedOperationException : CASE WHEN true or null then 1 else 0 end
  • [DRILL-4945] - Missing subtype information in metadata returned by prepared statement
  • [DRILL-4950] - Consume Spurious Empty Batches in JDBC
  • [DRILL-4954] - allTextMode in the MapRDB plugin always returns nulls
  • [DRILL-4964] - Drill fails to connect to hive metastore after hive metastore is restarted unless drillbits are restarted also
  • [DRILL-4972] - Drillbit shuts down immediately after starting if embedded web server is disabled
  • [DRILL-4974] - NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
  • [DRILL-4989] - Fix TestParquetWriter.testImpalaParquetBinaryAsTimeStamp_DictChange
  • [DRILL-4990] - Use new HDFS API access instead of listStatus to check if users have permissions to access workspace.
  • [DRILL-4993] - Documentation: Wrong output displayed for convert_from() with a map
  • [DRILL-5004] - Parquet date correction gives null pointer exception if there is no createdBy entry in the metadata
  • [DRILL-5007] - Dynamic UDF lazy-init does not work correctly in multi-node cluster
  • [DRILL-5009] - Query with a simple join fails on Hive generated parquet
  • [DRILL-5047] - When session option is string, query profile is displayed incorrectly on Web UI

Improvement

  • [DRILL-1950] - Implement filter pushdown for Parquet
  • [DRILL-3178] - csv reader should allow newlines inside quotes
  • [DRILL-4653] - Malformed JSON should not stop the entire query from progressing
  • [DRILL-4674] - Allow casting to boolean the same literals as in PostgreSQL
  • [DRILL-4752] - Remove submit_plan script from Drill distribution
  • [DRILL-4771] - Drill should avoid doing the same join twice if count(distinct) exists
  • [DRILL-4792] - Include session options used for a query as part of the profile
  • [DRILL-4800] - Improve parquet reader performance
  • [DRILL-4927] - Add support for Null Equality Joins
  • [DRILL-4967] - Adding template_name to source code generated using freemarker template
  • [DRILL-4986] - Allow users to customize the Drill log file name
  • [DRILL-4987] - Use ImpersonationUtil in RemoteFunctionRegistry
  • [DRILL-5031] - Documentation for HTTPD Parser

New Feature

  • [DRILL-1268] - C++ Client. Write Unit Test for Drill Client
  • [DRILL-3423] - Add New HTTPD format plugin
  • [DRILL-4714] - Add metadata and prepared statement APIs to DrillClient<->Drillbit interface
  • [DRILL-4726] - Dynamic UDFs support

Task

  • [DRILL-4853] - Update C++ protobuf source files
  • [DRILL-4886] - Merge maprdb format plugin source code