What Is Drill

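Apache Drill is an open-source data exploration tool and a distributed SQL query and analytics engine. It incorporates many purpose-built designs for high-performance analysis, and it supports semi-structured data sources (such as JSON, XML, and logs) as well as the data formats that applications keep introducing. On top of this, Drill supports industry-standard ANSI SQL, so it works out of the box and is quick to learn. It also integrates with the big data ecosystem, including storage systems such as Apache Hive and Apache HBase, and can be deployed in a plug-and-play fashion.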

Key Features of Apache Drill

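  • Low-latency SQL queries.
  • Dynamic queries directly on self-describing data, such as JSON, Parquet, text, and HBase, without defining a schema in advance.
  • Industry-standard query syntax: ANSI SQL.
  • Support for nested data structures.
  • Integration with Hive: Drill can query Hive tables and views and supports all Hive data formats and UDFs.
  • Standard JDBC and ODBC drivers for connecting BI tools.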
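
The schema-free querying listed above can be illustrated through Drill's REST API. The following is a minimal sketch, not a reference client. It assumes a Drillbit whose web server listens on localhost:8047 without authentication, and it queries the `employee.json` sample file on Drill's classpath. The helper names `build_query_request` and `run_query` are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Assumption: a local Drillbit with its web server on port 8047, no authentication.
DRILL_URL = "http://localhost:8047"

# Query a raw JSON file directly; no schema or table definition is created first.
SQL = "SELECT full_name, position_title FROM cp.`employee.json` LIMIT 3"


def build_query_request(sql, base_url=DRILL_URL):
    """Build (but do not send) a POST request for Drill's /query.json endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, base_url=DRILL_URL):
    """Send the query and return the decoded JSON response (needs a running Drillbit)."""
    with urllib.request.urlopen(build_query_request(sql, base_url)) as resp:
        return json.load(resp)
```

Against a running Drillbit, `run_query(SQL)` should return a JSON object describing the result columns and rows. Building the request separately from sending it keeps the payload easy to inspect without a cluster.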

Quick Links

If you have never used Drill, we recommend starting with the following resources:

What’s New in Apache Drill 1.19

What’s New in Apache Drill 1.18

What’s New in Apache Drill 1.17

  • DRILL-6540 - Upgrade to HADOOP-3.0 libraries. The hadoop-winutils version that worked for previous releases does not work with Drill 1.17 and later. Use the hadoop-winutils version provided with Drill 1.17 or use custom hadoop-winutils built for Hadoop 3.2.0.
  • DRILL-6739 - Update Kafka libs to 2.0.0+ version
  • DRILL-7401 - Upgrade to Sqlline 1.9
  • DRILL-7200 - Update Calcite to 1.19.0 / 1.20.0
  • DRILL-5674 - Support for .zip compression
  • DRILL-6835 - Schema provision using File / Table Function
  • DRILL-7337 - Support for vararg UDFs
  • DRILL-7096 - Develop vector for canonical Map<K,V>
  • DRILL-7343 - User-Agent UDFs added

Hive complex types support:

New format plugins support:

  • DRILL-4303 - ESRI Shapefile (shp) format plugin
  • DRILL-7177 - Format Plugin for Excel Files
  • DRILL-6096 - Provide mechanisms to specify field delimiters and quoted text for TextRecordWriter
  • Parquet format improvements, including runtime row group pruning (DRILL-7062), support for creating (DRILL-7156) and reading (DRILL-4517) empty Parquet files, and more.

Metastore support:

  • DRILL-7272 - Implement Drill Iceberg Metastore plugin
  • DRILL-7273 - Create operator for handling metadata
  • DRILL-7357 - Expose Drill Metastore data through INFORMATION_SCHEMA

What’s New in Apache Drill 1.16

What’s New in Apache Drill 1.15

What’s New in Apache Drill 1.14

What’s New in Apache Drill 1.13

  • JDK 8 support. (DRILL-1491)
  • Upgrade to Calcite version 1.15. (DRILL-3993)
  • JDBC Statement.setQueryTimeout(int) support to cancel queries if they do not complete within the specified time. (DRILL-3640)
  • Batch processing improvements that enable you to limit the amount of memory that the Flatten, Merge Join, and External Sort operators allocate to outgoing batches. (DRILL-6123)
  • Enhanced DESCRIBE command. (DRILL-4559)
  • Support for SPNEGO to extend Kerberos to Web applications through HTTP. (DRILL-5425)
  • Ability to run Drill under YARN. (DRILL-1170)
  • Parquet filter pushdown support for IS [NOT] NULL, TRUE, and FALSE operators and implicit and explicit casts for timestamp, date, and time data types. (DRILL-6174)
  • Performance improvements with support for project push down, filter push down, and partition pruning on dynamically expanded columns when represented as a star in the ITEM operator. (DRILL-6118)
  • Updated Hive libraries and the Drill Hive client updated to 2.3.2 with support for querying Hive transactional ORC bucketed tables. (DRILL-5978)
  • Ability to automatically manage memory allocations during Drill startup. (DRILL-5741)
  • Ability to query an empty directory and use it for queries with any JOIN and UNION (UNION ALL) operators. (DRILL-4185)
  • Non-numeric support for JSON processing. (DRILL-5919)
  • New options that enable you to configure the number of Jetty acceptors and selectors. (DRILL-5994)
  • SQL syntax highlighting of queries, auto-complete support in SQL editors, and snippets. (DRILL-5868)
  • Improved performance of the Single Merge Exchange operator. (DRILL-6115)
  • LIKE operator optimization. (DRILL-5879)
  • User/Distribution-specific configuration checks during startup. (DRILL-5741)

What’s New in Apache Drill 1.12

Drill 1.12 provides the following new features and improvements:

  • Kafka and OpenTSDB storage plugins (DRILL-4779, DRILL-5337)
  • SSL/TLS support (DRILL-5431)
  • Network encryption support (DRILL-5682)
  • Queue-based memory assignment for buffering operators (DRILL-5716)
  • A collection of networking functions that facilitate network analysis using Drill (DRILL-5834)
  • Support for the libpam4j PAM authenticator (DRILL-5820)
  • Filter pushdown for Parquet can handle files with multiple rowgroups (DRILL-5795)
  • UTF-8 is enabled in the query string by default (DRILL-5772)
  • IF NOT EXISTS support for CREATE TABLE and CREATE VIEW (DRILL-5952)
  • Geometry functions, ST_AsGeoJSON and ST_AsJSON, that return GeoJSON and JSON representations (DRILL-5962, DRILL-5960)
  • JMX metrics for failed and canceled queries (DRILL-5909)
  • Syntax highlighting and error checking for storage plugin configurations (DRILL-5981)
  • System options improvements, including a new internal system options table (DRILL-5723)
  • Ability to prevent users from accessing a path outside the current workspace (DRILL-5964)
  • Ability to put the server in quiescent mode for a graceful shutdown (DRILL-4286)
  • The Drill Web UI lists the completion of successfully completed queries as “successful” (DRILL-5923)

What’s New in Apache Drill 1.11

Drill 1.11 provides the following new features and improvements:

  • Cryptography-related functions. (DRILL-5634)
  • Spill to disk for the hash aggregate operator. (DRILL-5457)
  • Format plugin support for PCAP files. (DRILL-5432)
  • Ability to change the HDFS block size for Parquet files. (DRILL-5379)
  • Ability to store query profiles in memory. (DRILL-5481)
  • Configurable CTAS directory and file permissions option. (DRILL-5391)
  • Support for network encryption. (DRILL-4335)
  • Relative paths stored in the metadata file. (DRILL-3867)
  • Support for ANSI_QUOTES. (DRILL-3510)
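
As a rough illustration of the ANSI_QUOTES item above: by default Drill quotes identifiers with backticks, and the session option `planner.parsing.quoting_identifiers` switches the quote character. The sketch below only builds SQL strings. The helper names are hypothetical, and the set of accepted quote characters (backtick, double quote, opening square bracket) is stated here as an assumption to verify against the configuration reference for your Drill version.

```python
# Assumed identifier quote styles: opening character -> closing character.
# Backtick is Drill's default; double quote corresponds to ANSI_QUOTES mode.
CLOSING = {"`": "`", '"': '"', "[": "]"}


def set_quoting_statement(quote_char):
    """Return an ALTER SESSION statement that selects the identifier quote character."""
    if quote_char not in CLOSING:
        raise ValueError(f"unsupported quote character: {quote_char!r}")
    # The option name is backtick-quoted, so this form assumes the session
    # is still using the default quoting when the statement runs.
    return f"ALTER SESSION SET `planner.parsing.quoting_identifiers` = '{quote_char}'"


def quote_identifier(name, quote_char='"'):
    """Wrap an identifier in the chosen quotes (embedded quotes are not escaped)."""
    return f"{quote_char}{name}{CLOSING[quote_char]}"


print(set_quoting_statement('"'))
print(f"SELECT {quote_identifier('full_name')} FROM cp.{quote_identifier('employee.json')}")
```

The second statement is only valid after the first has taken effect in the same session.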

What’s New in Apache Drill 1.10

Drill 1.10 provides the following new features and improvements:

  • Support for the CREATE TEMPORARY TABLE AS (CTTAS) command.
  • A JDBC connection option that improves fault tolerance when connecting directly to a Drill node from a client.
  • The Web UI displays the Drill version and additional query profile statistics.
  • Drill implicitly interprets the INT96 timestamp data type in Parquet files.
  • Support for Kerberos authentication between the client and drillbit.

What’s New in Apache Drill 1.9

Drill 1.9 provides the following new features:

  • Asynchronous Parquet reader
  • Parquet filter pushdown
  • Dynamic UDF support
  • HTTPD format plugin

What’s New in Apache Drill 1.8

Drill 1.8 provides the following new features and changes:

  • Metadata cache pruning
  • IF EXISTS parameter with the DROP TABLE and DROP VIEW commands
  • DESCRIBE SCHEMA command
  • Multi-byte delimiter support
  • New parameters for filter selectivity estimates
  • Changes to the configuration and launch scripts - See Configuration and Launch Script Changes

What’s New in Apache Drill 1.7

Drill 1.7 provides the following new features:

  • Monitoring via JMX
  • Hive CHAR data type support
  • HBase 1.x support

What’s New in Apache Drill 1.6

Drill 1.6 provides the following new features:

  • Inbound impersonation
  • Additional custom window frames

What’s New in Apache Drill 1.5

Drill 1.5 provides the following new features:

  • Authentication and security for the Web interface and REST API
  • Experimental query support for Apache Kudu (incubating)
  • An improved memory allocator
  • Configurable caching for Hive metadata

What’s New in Apache Drill 1.4

Drill 1.4 introduces the following improvements:

  • SELECT with options that you use in queries to change storage plugin settings
  • Improved behavior when parsing CSV file header names
  • A variable to set non-pretty, such as compact, printing of JSON
  • Better drillbit.log files that include query text
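
The first item in the list above, SELECT with options, can be sketched as follows. Drill lets a query override format settings for a single file through a table function, using `name => value` arguments. The helper below just renders such a statement from Python values. The function names and the `/tmp/sample.csv` path are hypothetical, and `type`, `fieldDelimiter`, and `extractHeader` are assumed here to be the relevant text-format attributes.

```python
def format_option(value):
    """Render a Python value as a literal for a Drill table function argument."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings are single-quoted; embedded single quotes are doubled.
    return "'" + str(value).replace("'", "''") + "'"


def select_with_options(path, columns="*", **options):
    """Build a SELECT that overrides format settings for this one query."""
    args = ", ".join(f"{key} => {format_option(val)}" for key, val in options.items())
    return f"SELECT {columns} FROM table(dfs.`{path}`({args}))"


sql = select_with_options(
    "/tmp/sample.csv", type="text", fieldDelimiter=",", extractHeader=True
)
print(sql)
```

Because the options travel with the query, the stored plugin configuration stays untouched and other queries keep their existing settings.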

Drill 1.4 fixes an error that occurred when you query a Hive table using the HBaseStorageHandler (DRILL-3739). To successfully query a Hive table using the HBaseStorageHandler, you need to configure the Hive storage plugin as described in the Hive storage plugin documentation.

What’s New in Apache Drill 1.3

This release fixes issues and adds a number of enhancements, including the following ones:

What’s New in Apache Drill 1.2

This release of Drill fixes many issues and introduces a number of enhancements, including the following ones:

What’s New in Apache Drill 1.1

Apache Drill 1.1 includes many enhancements, among them the following key features:

What’s New in Apache Drill 1.0

Apache Drill 1.0 offers the following new features:

  • Many performance planning and execution improvements.
  • Updated Drill shell now formats query results.
  • Query audit logging for getting the query history on a Drillbit.
  • Improved connection handling.
  • New Errors tab in the Query Profiles UI that facilitates troubleshooting and distributed storing of profiles.
  • Support for a new storage plugin input format: Avro

In this release, Drill disables the DECIMAL data type, including casting to DECIMAL and reading DECIMAL types from Parquet and Hive. You can enable the DECIMAL type, but this is not recommended.