Apache Drill 1.3.0 Release Notes

Release date: November 22, 2015

Today, we’re happy to announce the availability of Drill 1.3.0, providing bug fixes and enhancements.

Enhancements and Bug Fixes

Sub-task

[DRILL-1721] - Configure fmpp-maven-plugin for incremental build
[DRILL-3313] - Eliminate redundant #load methods and unit-test loading & exporting of vectors

Bug

[DRILL-1752] - Drill cluster returns error when querying Mongo shards on an unsharded collection
[DRILL-2161] - Flatten on a list within a list on a large data set results in an IOB Exception
[DRILL-2583] - Querying a non-existent table from hbase should throw a proper error message
[DRILL-2626] - org.apache.drill.common.StackTrace seems to have duplicate code; should we re-use Throwable's code?
[DRILL-2967] - Incompatible types error reported in a "not in" query with compatible data types
[DRILL-3336] - to_date(to_timestamp) with group-by in hbase/maprdb table fails with "java.lang.UnsupportedOperationException"
[DRILL-3428] - Errors during text file reading should provide the file name in the error message
[DRILL-3429] - DrillAvgVarianceConvertlet may produce wrong results while rewriting stddev, variance
[DRILL-3485] - Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
[DRILL-3486] - Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available
[DRILL-3505] - MongoDB _id is returned as null when t.*, t._id is used in the projection
[DRILL-3538] - We do not prune partitions when we count over partitioning key and filter over partitioning key
[DRILL-3578] - UnsupportedOperationException: Unable to get value vector class for minor type [FIXEDBINARY] and mode [OPTIONAL]
[DRILL-3634] - Hive Scan : Add fileCount (no of files scanned) or no of partitions scanned to the text plan
[DRILL-3770] - Query with window function having just ORDER BY clause runs out of memory on large datasets
[DRILL-3871] - Off by one error while reading binary fields with one terminal null in parquet
[DRILL-3921] - Hive LIMIT 1 queries take too long
[DRILL-3937] - We are not pruning when we have a metadata cache and auto partitioned data in some cases
[DRILL-3941] - Add timing instrumentation around Partition Pruning
[DRILL-3943] - CannotPlanException caused by ExpressionReductionRule
[DRILL-3947] - IndexOutOfBoundsException for pruning on date column (at large scale)
[DRILL-3952] - Improve Window Functions performance when not all batches are required to process the current batch
[DRILL-3956] - TEXT MySQL type unsupported
[DRILL-3975] - Partition Planning rule causes query failure due to IndexOutOfBoundsException on HDFS
[DRILL-3980] - Build failure in -Pmapr profile (due to DRILL-3749)
[DRILL-3992] - Unable to query Oracle DB using JDBC Storage Plug-In
[DRILL-3994] - Build Fails on Windows after DRILL-3742
[DRILL-4000] - In all non-root fragments, Drill recreates storage plugin instances for every minor fragment
[DRILL-4025] - Reduce getFileStatus() invocation for Parquet by 1
[DRILL-4028] - Merge Drill parquet modifications back into the mainline project
[DRILL-4040] - Build failure on master
[DRILL-4042] - Unable to run sqlline in embedded mode on Windows
[DRILL-4046] - Performance regression in some tpch queries with 1.3rc0 build
[DRILL-4056] - Avro deserialization corrupts data
[DRILL-4065] - ImpersonationUtil always creates new UserGroupInformation (thus new FileSystem objects), causing excessive number of threads
[DRILL-4070] - Files written with versions of Drill before v1.3 record metadata that is indistinguishable from bad metadata from other Parquet creators
[DRILL-4071] - Partition pruning fails when a Coalesce() function appears with partition filter
[DRILL-4080] - doc file deleted from gh-pages appears when obsolete url is used
[DRILL-4085] - Disable RPC Offload until concurrency bugs are tracked down

Improvement

[DRILL-1065] - Provide a reset command to reset an option to its default value
[DRILL-2726] - Display Drill version in sys.version
[DRILL-3242] - Enhance RPC layer to offload all request work onto a separate thread.
[DRILL-3340] - Add named metrics and named operators in OperatorProfile
[DRILL-3742] - Improve classpath scanning to reduce the time it takes
[DRILL-3793] - Rewrite MergeJoinBatch using record batch iterator.
[DRILL-3810] - Filesystem plugin's support for file format's schema
[DRILL-3911] - Upgrade Hadoop from 2.4.1 to latest stable
[DRILL-3912] - Common subexpression elimination in code generation
[DRILL-3914] - Support geospatial queries
[DRILL-4031] - JDBC Plugin Queries fail if columns return JDBC OTHER type
[DRILL-4103] - Add additional metadata to Parquet files generated by Drill

New Feature

[DRILL-951] - CSV header row should be parsed
[DRILL-3749] - Upgrade Hadoop dependency to latest version (2.7.1)
[DRILL-3802] - Throw unsupported error for ROLLUP/GROUPING
[DRILL-3963] - Read raw key value bytes from sequence files

Test

[DRILL-3983] - Small test improvements
