Using Apache Drill with Tableau 10.2
Author: Andries Engelbrecht, Partner Systems Engineer, MapR Technologies
The powerful combination of Drill and Tableau enables you to easily and directly work with various data formats and sources, elevating data self-service and discovery to a new level.
Drill 1.10 fully supports Tableau Level of Detail (LoD) calculations and Tableau Sets for an enhanced user experience. Drill can also be used as a data source for Tableau Desktop on Mac, in addition to Tableau Desktop and Tableau Server on Windows.
This document describes how to connect Tableau 10.2 to Apache Drill and instantly explore multiple data formats from various data sources.
Prerequisites
Before you connect Tableau 10.2 to Apache Drill, make sure that your system meets the following prerequisites:
- Tableau 10.2 or later
- Apache Drill 1.10 or later
- MapR Drill ODBC Driver v1.3.0 or later
Required Steps
Complete the following steps to use Apache Drill with Tableau 10.2:
- Install and Configure the MapR Drill ODBC Driver.
- Connect Tableau to Drill (using the Apache Drill Data Connector).
- Query and Analyze the Data (various data formats with Tableau and Drill).
Step 1: Install and Configure the MapR Drill ODBC Driver
Drill uses standard ODBC connectivity to provide you with easy data exploration capabilities on complex, schema-less data sets.
To install and configure the ODBC driver, complete the following steps:
- Download the latest 64-bit MapR Drill ODBC Driver for Mac or Windows at: http://package.mapr.com/tools/MapR-ODBC/MapR_Drill/
- Install the ODBC driver by following the instructions for your system (Windows or Mac).
Important: Verify that the Tableau client system can correctly resolve the hostnames of the Drill and ZooKeeper nodes. See the System Requirements section of the ODBC Mac or Windows installation page for instructions.
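The hostname check described above can be scripted. The following is a minimal sketch, not part of the driver or of Tableau: the function name unresolved_hosts is hypothetical, and the resolver argument exists only so that the lookup can be swapped out.

```python
import socket


def unresolved_hosts(hostnames, resolver=socket.gethostbyname):
    """Return the hostnames that this machine cannot resolve.

    `resolver` defaults to the system lookup; pass a different callable
    to check against another source or to test without a network.
    """
    failed = []
    for name in hostnames:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

Run it on the Tableau client system with the names of your own nodes, for example unresolved_hosts(["drill1.example.com", "zk1.example.com"]), where both names are placeholders. An empty list means that every name resolved.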
Step 2: Connect Tableau to Drill
To connect Tableau to Drill, complete the following steps:
Note: The Tableau documentation provides additional details, if needed.
- In a Tableau workbook, click Data > New Data Source.
- Select Apache Drill.
- Choose whether to connect directly to a Drill server or through ZooKeeper. For production environments, connecting through ZooKeeper is recommended because it adds resiliency and distributes connection management across the Drill cluster.
- Enter the ZooKeeper Quorum/Drill Cluster ID or Drill Direct Server and Port information.
- Enter the authentication information.
- Click Sign In. Tableau connects to Drill and lets you select tables and views.
- Click the Schema drop-down list to display all available Drill schemas. When you select a schema, Tableau displays the tables and views in that schema. You can select tables and views to build a Tableau visualization. You can also use custom SQL by clicking the New Custom SQL option.
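The same two connection modes apply when you test the ODBC driver outside Tableau. The sketch below assembles a connection string for either mode. The keyword names (ConnectionType, ZKQuorum, ZKClusterID, Host, Port, AuthenticationType, UID, PWD), the default port 31010, and the driver name used in the usage note are assumptions based on common Drill ODBC configurations; verify them against the configuration reference for your driver version. The function name is hypothetical.

```python
def drill_odbc_connection_string(driver, zk_quorum=None, cluster_id=None,
                                 host=None, port=31010,
                                 auth_type=None, user=None, password=None):
    """Build an ODBC connection string for a ZooKeeper or direct connection.

    Keyword names are assumptions; check them against your driver's
    documentation before use.
    """
    parts = [("DRIVER", "{" + driver + "}")]
    if zk_quorum:
        # ZooKeeper mode: the quorum is a comma-separated list of host:port pairs.
        parts += [("ConnectionType", "ZooKeeper"),
                  ("ZKQuorum", ",".join(zk_quorum)),
                  ("ZKClusterID", cluster_id)]
    elif host:
        # Direct mode: connect to a single Drill server.
        parts += [("ConnectionType", "Direct"),
                  ("Host", host),
                  ("Port", port)]
    else:
        raise ValueError("provide either zk_quorum or host")
    if auth_type is not None:
        parts += [("AuthenticationType", auth_type),
                  ("UID", user),
                  ("PWD", password)]
    return ";".join(f"{key}={value}" for key, value in parts)
```

For example, drill_odbc_connection_string("MapR Drill ODBC Driver", zk_quorum=["zk1:2181", "zk2:2181"], cluster_id="drillbits1") produces a ZooKeeper-mode string. The host names and cluster ID here are placeholders.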
Note: Tableau can natively work with Hive tables and Drill views. You can use custom SQL or create a view in Drill to represent the complex data in Drill data sources, such as data in files or HBase/MapR-DB tables, to Tableau. For more information, see Tableau Examples.
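As an illustration of the view approach, the sketch below generates the Drill SQL for a view over a headerless CSV file. Drill exposes each CSV row as an array named columns, so the view casts each position to a named, typed column that Tableau can use. The function name, the view name, the file path, and the column list are hypothetical, and the sketch assumes a storage plugin named dfs with a writable tmp workspace.

```python
def csv_view_sql(view_name, file_path, columns):
    """Return Drill SQL for a view that names and types CSV columns.

    `columns` is a list of (name, sql_type) pairs in file order.
    Assumes the file is reachable through a storage plugin named `dfs`.
    """
    select_list = ",\n".join(
        f"  CAST(columns[{index}] AS {sql_type}) AS `{name}`"
        for index, (name, sql_type) in enumerate(columns)
    )
    return (f"CREATE OR REPLACE VIEW {view_name} AS\n"
            f"SELECT\n{select_list}\n"
            f"FROM dfs.`{file_path}`")


# Hypothetical example: order data stored as CSV files under /data/orders.
print(csv_view_sql("dfs.tmp.orders_view", "/data/orders",
                   [("order_id", "INT"),
                    ("state", "VARCHAR(2)"),
                    ("order_total", "DOUBLE")]))
```

You can run the generated statement in Drill, or paste only the SELECT portion into the New Custom SQL option in Tableau.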
Step 3: Query and Analyze the Data
Tableau can now use Drill to query various data sources and visualize the information, as shown in the following example.
Example
A retailer has order data in CSV files on the distributed file system, product data in HBase, and customer data in Hive. The retailer wants to see the average order total by customer for each state (Tableau LoD), as well as the total number of orders and average revenue by customer for the top 5 states by revenue (Tableau Set).
To find this information, the retail business analyst completes the following steps:
- Creates an LoD calculation for ordertotal by customer ID.
- Displays the states and the average revenue by customer for each state.
- Creates a graph with the total revenue by state, ordered by highest revenue.
- Drags and selects the top 5 states, right-clicks, and selects Create Set.
- Enters a name for the set.
- Creates a table that shows the total number of orders and the average revenue by customer for the top 5 states by revenue, along with the same figures for the states that are not in the top 5.
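To make the arithmetic behind these steps concrete, the sketch below reproduces it in plain Python. It is an illustration of the calculations, not of how Tableau or Drill executes them. It assumes that the LoD calculation sums ordertotal for each customer before averaging by state, and that each customer belongs to one state. All function names are hypothetical.

```python
from collections import defaultdict


def avg_customer_revenue_by_state(orders):
    """orders: iterable of (customer_id, state, order_total) tuples.

    Mirrors the LoD step: total per customer first, then average per state.
    """
    per_customer = defaultdict(float)
    for customer_id, state, total in orders:
        per_customer[(state, customer_id)] += total
    by_state = defaultdict(list)
    for (state, _customer), revenue in per_customer.items():
        by_state[state].append(revenue)
    return {state: sum(values) / len(values) for state, values in by_state.items()}


def top_states(orders, n=5):
    """Return the set of the n states with the highest total revenue."""
    revenue = defaultdict(float)
    for _customer, state, total in orders:
        revenue[state] += total
    ranked = sorted(revenue, key=lambda s: (-revenue[s], s))  # ties break by name
    return set(ranked[:n])


def summarize_by_set(orders, members):
    """Return {"IN": (order_count, avg_revenue_per_customer), "OUT": (...)}."""
    order_count = defaultdict(int)
    per_customer = defaultdict(float)
    for customer_id, state, total in orders:
        label = "IN" if state in members else "OUT"
        order_count[label] += 1
        per_customer[(label, customer_id)] += total
    summary = {}
    for label, count in order_count.items():
        revenues = [r for (l, _c), r in per_customer.items() if l == label]
        summary[label] = (count, sum(revenues) / len(revenues))
    return summary
```

Passing the result of top_states(orders) as the members argument of summarize_by_set yields the IN and OUT rows of the final table.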
You have completed the tutorial for configuring Tableau 10.2 to work with Apache Drill. For additional support, see https://www.tableau.com/support/drivers.