Apache Drill 1.3.0 Release Notes
Release date: November 22, 2015
Today, we’re happy to announce the availability of Drill 1.3.0, which provides the bug fixes and enhancements listed below.
Enhancements and Bug Fixes
Sub-task
- [DRILL-1721] - Configure fmpp-maven-plugin for incremental build
- [DRILL-3313] - Eliminate redundant #load methods and unit-test loading & exporting of vectors
Bug
- [DRILL-1752] - Drill cluster returns error when querying Mongo shards on an unsharded collection
- [DRILL-2161] - Flatten on a list within a list on a large data set results in an IndexOutOfBoundsException
- [DRILL-2583] - Querying a non-existent table from HBase should throw a proper error message
- [DRILL-2626] - org.apache.drill.common.StackTrace seems to have duplicate code; should we re-use Throwable's code?
- [DRILL-2967] - Incompatible types error reported in a "not in" query with compatible data types
- [DRILL-3336] - to_date(to_timestamp) with group-by in HBase/MapR-DB table fails with "java.lang.UnsupportedOperationException"
- [DRILL-3428] - Errors during text file reading should provide the file name in the error message
- [DRILL-3429] - DrillAvgVarianceConvertlet may produce wrong results while rewriting stddev, variance
- [DRILL-3485] - Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
- [DRILL-3486] - Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available
- [DRILL-3505] - MongoDB _id is returned as null when t.*, t._id is used in the projection
- [DRILL-3538] - We do not prune partitions when we count over partitioning key and filter over partitioning key
- [DRILL-3578] - UnsupportedOperationException: Unable to get value vector class for minor type [FIXEDBINARY] and mode [OPTIONAL]
- [DRILL-3634] - Hive Scan: Add fileCount (number of files scanned) or number of partitions scanned to the text plan
- [DRILL-3770] - Query with window function having just ORDER BY clause runs out of memory on large datasets
- [DRILL-3871] - Off-by-one error while reading binary fields with one terminal null in Parquet
- [DRILL-3921] - Hive LIMIT 1 queries take too long
- [DRILL-3937] - We are not pruning when we have a metadata cache and auto partitioned data in some cases
- [DRILL-3941] - Add timing instrumentation around Partition Pruning
- [DRILL-3943] - CannotPlanException caused by ExpressionReductionRule
- [DRILL-3947] - IndexOutOfBoundsException for pruning on date column (at large scale)
- [DRILL-3952] - Improve Window Functions performance when not all batches are required to process the current batch
- [DRILL-3956] - TEXT MySQL type unsupported
- [DRILL-3975] - Partition Planning rule causes query failure due to IndexOutOfBoundsException on HDFS
- [DRILL-3980] - Build failure in -Pmapr profile (due to DRILL-3749)
- [DRILL-3992] - Unable to query Oracle DB using JDBC Storage Plug-In
- [DRILL-3994] - Build Fails on Windows after DRILL-3742
- [DRILL-4000] - In all non-root fragments, Drill recreates storage plugin instances for every minor fragment
- [DRILL-4025] - Reduce getFileStatus() invocation for Parquet by 1
- [DRILL-4028] - Merge Drill parquet modifications back into the mainline project
- [DRILL-4040] - Build failure on master
- [DRILL-4042] - Unable to run sqlline in embedded mode on Windows
- [DRILL-4046] - Performance regression in some TPC-H queries with 1.3rc0 build
- [DRILL-4056] - Avro deserialization corrupts data
- [DRILL-4065] - ImpersonationUtil always creates new UserGroupInformation (thus new FileSystem objects), causing excessive number of threads
- [DRILL-4070] - Files written with versions of Drill before v1.3 record metadata that is indistinguishable from bad metadata from other Parquet creators
- [DRILL-4071] - Partition pruning fails when a Coalesce() function appears with partition filter
- [DRILL-4080] - Doc file deleted from gh-pages appears when obsolete URL is used
- [DRILL-4085] - Disable RPC Offload until concurrency bugs are tracked down
Improvement
- [DRILL-1065] - Provide a reset command to reset an option to its default value
- [DRILL-2726] - Display Drill version in sys.version
- [DRILL-3242] - Enhance RPC layer to offload all request work onto a separate thread
- [DRILL-3340] - Add named metrics and named operators in OperatorProfile
- [DRILL-3742] - Improve classpath scanning to reduce the time it takes
- [DRILL-3793] - Rewrite MergeJoinBatch using record batch iterator
- [DRILL-3810] - Filesystem plugin's support for file format's schema
- [DRILL-3911] - Upgrade Hadoop from 2.4.1 to latest stable
- [DRILL-3912] - Common subexpression elimination in code generation
- [DRILL-3914] - Support geospatial queries
- [DRILL-4031] - JDBC Plugin Queries fail if columns return JDBC OTHER type
- [DRILL-4103] - Add additional metadata to Parquet files generated by Drill
New Feature
- [DRILL-951] - CSV header row should be parsed
- [DRILL-3749] - Upgrade Hadoop dependency to latest version (2.7.1)
- [DRILL-3802] - Throw unsupported error for ROLLUP/GROUPING
- [DRILL-3963] - Read raw key value bytes from sequence files
Test
- [DRILL-3983] - Small test improvements