Apache Drill 1.12.0 Release Notes

Release date: December 15, 2017

Today, we're happy to announce the availability of Drill 1.12.0. You can download it here.

New Features and Improvements

This release of Drill provides the following new features and improvements:

  • Kafka and OpenTSDB storage plugins (DRILL-4779, DRILL-5337)
  • SSL/TLS support (DRILL-5431)
  • Network encryption support (DRILL-5682)
  • Queue-based memory assignment for buffering operators (DRILL-5716)
  • A collection of networking functions that facilitate network analysis using Drill (DRILL-5834)
  • Support for the libpam4j PAM authenticator (DRILL-5820)
  • Filter pushdown for Parquet can handle files with multiple rowgroups (DRILL-5795)
  • UTF-8 is enabled in the query string by default (DRILL-5772)
  • IF NOT EXISTS support for CREATE TABLE and CREATE VIEW (DRILL-5952)
  • Geometry functions, ST_AsGeoJSON and ST_AsJSON, that return GeoJSON and JSON representations (DRILL-5960, DRILL-5962)
  • JMX metrics for failed and canceled queries (DRILL-5909)
  • Syntax highlighting and error checking for storage plugin configurations (DRILL-5981)
  • System options improvements, including a new internal system options table (DRILL-5723)
  • Ability to prevent users from accessing a path outside the current workspace (DRILL-5964)
  • Ability to put the server in quiescent mode for a graceful shutdown (DRILL-4286)
  • The Drill Web Console reports the state of successfully completed queries as "successful" (DRILL-5923)
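
As an illustration of the IF NOT EXISTS support listed above, the following Python sketch builds the JSON body for a query submitted through the Drill REST API (a POST to /query.json). The workspace, table name, and source path are hypothetical, and the helper build_query_payload is our own name, not part of Drill. The sketch only constructs the request body and does not contact a Drillbit.

```python
import json


def build_query_payload(sql: str, query_type: str = "SQL") -> str:
    """Serialize a Drill REST query request body as a JSON string."""
    return json.dumps({"queryType": query_type, "query": sql})


# Hypothetical workspace, table, and source file names (assumptions).
ctas = (
    "CREATE TABLE IF NOT EXISTS dfs.tmp.`orders_copy` AS "
    "SELECT * FROM dfs.`/data/orders.json`"
)

payload = build_query_payload(ctas)
print(payload)
```

With IF NOT EXISTS, rerunning the same statement is expected to leave an existing table in place rather than fail, which is convenient for scripts that may run more than once.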

The following sections list additional bug fixes and improvements:

Sub-task

  • [DRILL-5253] - External sort fails with OOM error (Fails to allocate sv2)
  • [DRILL-5801] - Label the Gantt chart that indicates the lifespan details of the various minor fragments
  • [DRILL-5802] - Provide a sortable table for tables within a query profile
  • [DRILL-5803] - Show the hostname for each minor fragment in operator table
  • [DRILL-5867] - List profiles in pages rather than a long verbose listing
  • [DRILL-5881] - Java Client: [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit
  • [DRILL-5882] - C++ Client: [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit
  • [DRILL-6031] - Document support for dots in field names

Bug

  • [DRILL-1051] - Casting timestamp as date gives wrong result for dates earlier than 1883
  • [DRILL-1499] - Different column order could appear in the result set for a schema-less select * query, even when there are no schema changes.
  • [DRILL-3241] - Query with window function runs out of direct memory and does not report back to client that it did
  • [DRILL-3407] - CTAS Auto Partition : The plan for count(*) should show the list of files scanned
  • [DRILL-3449] - When Foreman node dies, the FragmentExecutor still tries to send status updates to Foreman
  • [DRILL-3827] - Empty metadata file causes queries on the table to fail
  • [DRILL-3829] - Metadata Caching : Drill should ignore a corrupted cache file
  • [DRILL-4139] - Fix parquet partition pruning for BIT, INTERVAL and DECIMAL types
  • [DRILL-4255] - SELECT DISTINCT query over JSON data returns UNSUPPORTED OPERATION
  • [DRILL-4640] - Unable to submit physical plan from Web UI on Windows
  • [DRILL-4686] - Aggregation query over HBase table results in IllegalStateException: Failure while reading vector
  • [DRILL-4734] - Query against HBase table on a 5 node cluster fails with SchemaChangeException
  • [DRILL-4735] - Count(dir0) on parquet returns 0 result
  • [DRILL-5002] - Using hive's date functions on top of date column gives wrong results for local time-zone
  • [DRILL-5146] - Unnecessary spilling to disk by sort when we only have 5000 rows with one column
  • [DRILL-5166] - Select with options returns NPE
  • [DRILL-5185] - Union all not passing type info when the output contains 0 rows
  • [DRILL-5268] - SYSTEM ERROR: UnsupportedOperationException: Unable to get size for minor type [MAP] and mode [REQUIRED]
  • [DRILL-5269] - SYSTEM ERROR: JsonMappingException: No suitable constructor found for type [simple type, class org.apache.drill.exec.store.direct.DirectSubScan]
  • [DRILL-5327] - Hash aggregate can return empty batch which can cause schema change exception
  • [DRILL-5339] - Web UI: flaws in query profile display on error
  • [DRILL-5341] - Web UI: remove duplicating link to documentation in Options page
  • [DRILL-5346] - Web UI: remove link on user in query profile list
  • [DRILL-5357] - Partition pruning information not available in query plan for COUNT aggregate query
  • [DRILL-5416] - Vectors read from disk report incorrect memory sizes
  • [DRILL-5442] - Managed Sort: IndexOutOfBounds with a join over an inlist
  • [DRILL-5443] - Managed External Sort fails with OOM while spilling to disk
  • [DRILL-5445] - Assertion Error in Managed External Sort when dealing with repeated maps
  • [DRILL-5447] - Managed External Sort : Unable to allocate sv2 vector
  • [DRILL-5464] - Fix JSON reader when it deals with empty file
  • [DRILL-5465] - Managed external sort results in an OOM
  • [DRILL-5478] - Spill file size parameter is not honored by the managed external sort
  • [DRILL-5480] - Empty batch returning from HBase may cause SchemaChangeException even when data does not have different schema
  • [DRILL-5507] - Millions of "Failure finding Drillbit running on host" info messages in foreman logs
  • [DRILL-5513] - Managed External Sort : OOM error during the merge phase
  • [DRILL-5519] - Sort fails to spill and results in an OOM
  • [DRILL-5525] - Inconsistent, unhelpful semantics for batch, field schema comparison
  • [DRILL-5546] - Schema change problems caused by empty batch
  • [DRILL-5547] - Drill config options and session options do not work as intended
  • [DRILL-5564] - IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: buffer space (16674816) + prealloc space (0) + child space (0) != allocated (16740352)
  • [DRILL-5582] - [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit
  • [DRILL-5594] - Excessive buffer reallocations during merge phase of external sort
  • [DRILL-5597] - Incorrect "bits" vector allocation in nullable vectors allocateNew()
  • [DRILL-5602] - Repeated List Vector fails to initialize the offset vector
  • [DRILL-5617] - Spill file name collisions when spill file is on a shared file system
  • [DRILL-5645] - negation of expression causes null pointer exception
  • [DRILL-5660] - Drill 1.10 queries fail due to Parquet Metadata "corruption" from DRILL-3867
  • [DRILL-5663] - Drillbit fails to start when only keystore path is provided without keystore password.
  • [DRILL-5675] - Drill C++ Client Date Time Literals Support Metadata Mapping is Incorrect
  • [DRILL-5686] - Warning for sasl.max_wrapped_size contains incorrect syntax
  • [DRILL-5687] - Disable TestMergeJoinWithSchemaChanges#testMissingAndNewColumns
  • [DRILL-5694] - hash agg spill to disk, second phase OOM
  • [DRILL-5698] - Drill should start in embedded mode using java 1.8.0_144
  • [DRILL-5699] - Drill Web UI Page Source Has Links To External Sites
  • [DRILL-5701] - drill.connections.rpc.<user/control/data>.<encrypted/unencrypted> metric not behaving correctly
  • [DRILL-5702] - Jdbc Driver Class not found
  • [DRILL-5710] - drill-config.sh incorrectly exits with Java 1.7 or later is required to run Apache Drill
  • [DRILL-5714] - Fix NPE when mapr-db plugin is used in table function
  • [DRILL-5715] - Performance of refactored HashAgg operator regressed
  • [DRILL-5717] - change some date time unit cases with specific timezone or Local
  • [DRILL-5721] - Query with only root fragment and no non-root fragment hangs when Drillbit to Drillbit Control Connection has network issues
  • [DRILL-5725] - Update Jackson version to 2.7.8
  • [DRILL-5727] - Update release profile to generate SHA-512 checksum.
  • [DRILL-5729] - Fix Travis Checks
  • [DRILL-5737] - Hash Agg uses more than the allocated memory under certain low memory conditions
  • [DRILL-5740] - Hash agg fails to read spill file
  • [DRILL-5743] - Using order by clause in a select * query on hbase table returns only the row_key and order by field(s)
  • [DRILL-5744] - External sort fails with OOM error
  • [DRILL-5745] - Invalid "location" information in Drill web server
  • [DRILL-5746] - Pcap PR manually edited Protobuf files, values lost on next build
  • [DRILL-5749] - Foreman and Netty threads deadlock
  • [DRILL-5751] - Fix unit tests to use local file system even if it is not set by default
  • [DRILL-5753] - Managed External Sort: One or more nodes ran out of memory while executing the query.
  • [DRILL-5755] - TOP_N_SORT operator does not free memory while running
  • [DRILL-5757] - CONVERT_TO_JSON function fails when a non-existent field is used as a parameter.
  • [DRILL-5761] - Disable Lilith ClassicMultiplexSocketAppender by default
  • [DRILL-5763] - Fix NPE during MapRDBSubScan deserialization
  • [DRILL-5765] - Json query profile is not shown on Web UI
  • [DRILL-5766] - Stored XSS in APACHE DRILL
  • [DRILL-5771] - Fix serDe errors for format plugins
  • [DRILL-5775] - Select * query on a maprdb binary table fails
  • [DRILL-5790] - PCAP format explicitly opens local file
  • [DRILL-5792] - CONVERT_FROM_JSON on an empty file throws runtime exception
  • [DRILL-5804] - External Sort times out, may be infinite loop
  • [DRILL-5811] - Large number of "Failure finding Drillbit" messages when using MFS
  • [DRILL-5812] - Restore the minor fragment Gantt chart Canvas
  • [DRILL-5816] - Hash function produces skewed results on String values with same leading prefix
  • [DRILL-5819] - Default value of security.admin.user_groups and security.admin.users is "true"
  • [DRILL-5822] - The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order
  • [DRILL-5824] - 1st phase Hash Aggregate allocates more memory than the limit
  • [DRILL-5830] - Resolve regressions to MapR DB from DRILL-5546
  • [DRILL-5838] - Fix MaprDB filter pushdown for the case of nested field (reg. of DRILL-4264)
  • [DRILL-5839] - Handle Empty Batches in Merge Receiver
  • [DRILL-5840] - A query that includes sort completes, and then loses Drill connection. Drill becomes unresponsive, and cannot restart because it cannot communicate with Zookeeper
  • [DRILL-5845] - Columns returned by select with "ORDER BY" and "LIMIT" clauses are not in correct order.
  • [DRILL-5849] - Add freemarker lib to dependencyManagement to ensure proper version is used when resolving dependency version conflicts
  • [DRILL-5853] - Sort removal based on NULL direction
  • [DRILL-5854] - IllegalStateException when empty batch is received.
  • [DRILL-5857] - Fix NumberFormatException in Hive unit tests
  • [DRILL-5859] - Time for query queuing timeout is not displayed correctly in Web UI
  • [DRILL-5863] - Sortable table incorrectly sorts minor fragments and time elements lexically instead of sorting by implicit value
  • [DRILL-5864] - Selecting a non-existing field from a MapR-DB JSON table fails with NPE
  • [DRILL-5872] - Deserialization of profile JSON fails due to totalCost being reported as "NaN"
  • [DRILL-5874] - NPE in AnonWebUserConnection.cleanupSession()
  • [DRILL-5878] - TableNotFound exception is being reported for a wrong storage plugin.
  • [DRILL-5887] - Display process user/ groups in Drill UI
  • [DRILL-5888] - jdbc-all-jar unit tests broken because of dependency on hadoop.security
  • [DRILL-5890] - Tests Leak Many Open File Descriptors
  • [DRILL-5895] - Fix unit tests for mongo storage plugin
  • [DRILL-5896] - Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
  • [DRILL-5898] - Query returns columns in the wrong order
  • [DRILL-5906] - java.lang.NullPointerException while querying Hive ORC tables on MapR cluster.
  • [DRILL-5923] - State of a successfully completed query shown as "COMPLETED"
  • [DRILL-5941] - Skip header / footer logic works incorrectly for Hive tables when file has several input splits
  • [DRILL-5972] - Slow performance for query on INFORMATION_SCHEMA.TABLE
  • [DRILL-5979] - Unable to access temporary table from Drill web interface
  • [DRILL-5986] - Update jackson-databind version to 2.7.9.1
  • [DRILL-5987] - Two versions of javassist on the classpath
  • [DRILL-6006] - Label current is missing on Web UI since the same drillbits with different status are considered to be different
  • [DRILL-6007] - web ui index page is refreshed at high pace
  • [DRILL-6017] - Fix for SHUTDOWN button being visible for non Admin users
  • [DRILL-6019] - Only admin should be able to access shutdown resources
  • Order by doesn't sort columns when window function is involved in the query
  • MIN MAX DIR tests fail and return incorrect results
  • header.line.count does not work for Hive table with data file size > chunk size
  • lang.NoClassDefFoundError exception not handled in org.apache.drill.exec.rpc.security.ClientAuthenticatorProvider
  • Investigate jackson-databind vulnerabilities CVE-2017-15095 & CVE-2017-7525
  • Null Pointer Exception with query using table function
  • Fix XSS vulnerabilities in Drill
  • Some queries in concurrent execution get stuck in STARTING phase
  • hashagg.num_partitions cannot be set by "alter session" or "alter system"
  • Query with 2-way JOIN fails with "Hash join does not support schema changes"
  • RowKeyJoin is generated instead of IndexScan for queries with derived table/ table functions
  • ConvertCountToDirectScan rule enhancements
  • Drill displays old date-time values incorrectly
  • Warning message "Exception while trying to prune partition" when query on view
  • Drill not able to read MapR-DB when FQDN is longer than 64 characters
  • Pam authentication with Centrify doesn't work by JPam restriction
  • Hash aggregate does not support schema changes
  • Simple query with only one join condition failed with "Hash join does not support schema changes"
  • UnsupportedOperationException: Unable to get holder type for minor type [LATE] and mode [OPTIONAL].
  • Query on TPC-H SF100 dataset fails with "Hash join does not support schema changes" [MapR-DB JSON Tables]

Improvement

  • [DRILL-1691] - ConvertCountToDirectScan rule should be applicable for 2 or more COUNT aggregates
  • [DRILL-4264] - Allow field names to include dots
  • [DRILL-5089] - Skip initializing all enabled storage plugins for every query
  • [DRILL-5106] - Refactor SkipRecordsInspector to exclude check for predefined file formats
  • [DRILL-5259] - Allow listing a user-defined number of profiles
  • [DRILL-5338] - Web UI: add better visual indication for PLANNING, QUEUED, EXECUTION in Query Profile that they are part of total time duration
  • [DRILL-5704] - Improve error message on client side when queries fail with "Failed to create schema tree." when Impersonation is enabled and logins are anonymous
  • [DRILL-5709] - Provide a value vector method to convert a vector to nullable
  • [DRILL-5716] - Queue-based memory assignment for buffering operators
  • [DRILL-5726] - Support Impersonation without authentication for REST API
  • [DRILL-5795] - Filter pushdown for parquet handles multi rowgroup file
  • [DRILL-5808] - Reduce memory allocator strictness for "managed" operators
  • [DRILL-5815] - Provide option to set query memory as percent of total
  • [DRILL-5832] - Migrate OperatorFixture to use SystemOptionManager rather than mock
  • [DRILL-5834] - Add Networking Functions
  • [DRILL-5842] - Refactor and simplify the fragment, operator contexts for testing
  • [DRILL-5862] - Update project parent pom xml to the latest ASF version
  • [DRILL-5893] - Maven forkCount property is too aggressive causing builds to fail on some machines.
  • [DRILL-5899] - Simple pattern matchers can work with DrillBuf directly
  • [DRILL-5909] - need new JMX metrics for (FAILED and CANCELED) queries
  • [DRILL-5910] - Logging exception when custom AuthenticatorFactory not found
  • [DRILL-5921] - Counters metrics should be listed in table
  • [DRILL-5943] - Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
  • [DRILL-5952] - Implement "CREATE TABLE IF NOT EXISTS"
  • [DRILL-5960] - Add function ST_AsGeoJSON to extend GIS support
  • [DRILL-5962] - Add function ST_AsJSON to extend GIS support
  • [DRILL-5964] - Do not allow queries to access paths outside the current workspace root
  • [DRILL-5980] - Make queryType param for REST API case insensitive
  • [DRILL-5981] - Add Syntax Highlighting and Error Checking to Storage Plugin Config Page
  • [DRILL-5989] - Run Smoke Tests On Travis
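
To make the idea behind DRILL-5964 concrete, here is a minimal Python sketch of a path containment check: a requested path is resolved against a workspace root and rejected if it ends up outside that root. This is not Drill's implementation. The function name, the example paths, and the choice to treat a leading "/" as an absolute path are all assumptions made for illustration.

```python
import posixpath


def is_inside_workspace(workspace_root: str, requested: str) -> bool:
    """Return True if `requested`, resolved against the root, stays inside it."""
    root = posixpath.normpath(workspace_root)
    # normpath collapses "." and ".." segments; join keeps an absolute
    # `requested` as-is, so absolute paths are checked directly.
    resolved = posixpath.normpath(posixpath.join(root, requested))
    return resolved == root or resolved.startswith(root.rstrip("/") + "/")


# Hypothetical workspace root and request paths.
print(is_inside_workspace("/data/ws", "sub/file.json"))
print(is_inside_workspace("/data/ws", "../other/file.json"))
```

Comparing against the root with a trailing separator avoids accepting sibling directories that merely share a prefix, such as /data/ws2 for the root /data/ws.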

New Feature

  • [DRILL-4286] - Have an ability to put server in quiescent mode of operation
  • [DRILL-4779] - Kafka storage plugin support
  • [DRILL-5337] - OpenTSDB storage plugin
  • [DRILL-5431] - Support SSL
  • [DRILL-5682] - Apache Drill should support network encryption
  • [DRILL-5723] - Support System/Session Internal Options And Additional Option System Fixes

Task

  • [DRILL-5601] - Rollup of External Sort memory management fixes
  • [DRILL-5685] - Provide a way to set common environment variable between sqlline and Drillbit differently.
  • [DRILL-5758] - Rollup of external sort fixes to issues found by QA
  • [DRILL-5772] - Enable UTF-8 support in query string by default
  • [DRILL-5781] - Fix unit test failures to use tests config even if default config is available
  • [DRILL-5820] - Add support for libpam4j Pam Authenticator
  • [DRILL-5911] - Upgrade esri-geometry-api version to 2.0.0 to avoid dependency on org.json library
  • [DRILL-5917] - Ban org.json:json library in Drill
  • [DRILL-6000] - Graceful Shutdown Unit Tests Should Not Be Run On Travis
  • [DRILL-6011] - Update log info message in DrillClient