Drill 1.4 Released - Apache Drill

Apache Drill 1.4 (available here) includes bug fixes and enhancements from 32 JIRAs.

Here’s a list of highlights from this newest version of Drill:

Select With Options

Queries can now set storage plugin configuration options inline. For instance, to query the file CO.dat as a text file, use the following:

SELECT * FROM TABLE(dfs.`/path/to/CO.dat`(type => 'text'));

If a version of CO.dat with a header is available, the first line of the file can be parsed as column names by passing an extractHeader => true argument. A pipe symbol, ‘|’, can also be used as the field delimiter by passing fieldDelimiter:

SELECT * FROM TABLE(dfs.`/path/to/CO.dat`(type => 'text', fieldDelimiter => '|', extractHeader => true));

Additionally, lineDelimiter can be used to specify the delimiter between lines, such as the double pipe symbol, ‘||’, in this example:

SELECT * FROM TABLE(dfs.`/path/to/CO.dat`(type => 'text', lineDelimiter => '||', fieldDelimiter => '|'));

Improved Behavior For CSV Header Parsing

When header parsing is enabled, queries against CSV files no longer raise an exception if a referenced column does not exist. Instead, Drill now returns null values for that column.
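
As a minimal sketch of this behavior, assume a hypothetical file /path/to/people.csv whose header line contains only the columns name and age (the file and all column names here are illustrative, not taken from the release). A query that also references a column missing from the header would then be expected to return null for that column rather than fail:

```sql
-- Hypothetical file: /path/to/people.csv with the header line "name,age".
-- extractHeader => true tells Drill to read column names from the first line.
SELECT name,
       age,
       nickname   -- not present in the header; expected to come back as null
FROM TABLE(dfs.`/path/to/people.csv`(type => 'text', fieldDelimiter => ',', extractHeader => true));
```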

JSON Formatting

For more compact results, Drill’s default behavior of pretty-printing JSON output can now be turned off by setting the option store.json.writer.uglify to true, for example:

ALTER SESSION SET store.json.writer.uglify = true;
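
The option applies when Drill writes JSON, so one way to see its effect is a CREATE TABLE AS statement with JSON as the output format. The sketch below assumes a writable dfs.tmp workspace and uses a hypothetical table name, co_compact. The option names are backtick-quoted, as is common in Drill's documentation:

```sql
-- Write CTAS results as JSON instead of the default output format.
ALTER SESSION SET `store.format` = 'json';

-- Turn off pretty-printing so the written JSON is compact.
ALTER SESSION SET `store.json.writer.uglify` = true;

-- Assumption: dfs.tmp is a writable workspace; co_compact is an illustrative name.
CREATE TABLE dfs.tmp.`co_compact` AS
SELECT *
FROM TABLE(dfs.`/path/to/CO.dat`(type => 'text', fieldDelimiter => '|', extractHeader => true));
```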

Better Logging

SQL query text is now logged to the drillbit.log file.

Other Improvements

This version also features sorting that is compatible with schema changes, better Apache Hive support, and more efficient caching of Parquet file metadata.