MongoDB Storage Plugin
Overview
Drill supports MongoDB 3.0, providing a mongodb storage plugin to connect to MongoDB using MongoDB’s latest Java driver. You can run queries to read, but not write, Mongo data using Drill. Attempting to write data back to Mongo results in an error. You do not need any upfront schema definitions.
Note
In the following examples, you use a local instance of Drill for simplicity.
You can also run Drill and MongoDB together in distributed mode.
Before You Begin
To query MongoDB with Drill, you install Drill and MongoDB, and then you import zip code aggregation data into MongoDB.
- Install Drill, if you do not already have it installed.
- Install MongoDB, if you do not already have it installed.
- Import the MongoDB zip code sample data set. * Copy the
zips.json
content into a file and save it.- Create
/data/db
if it doesn’t already exist. - Make sure you have permissions to access the directories.
- Use Mongo Import to import
zips.json
.
- Create
Configuring MongoDB
Drill must be running in order to access the Web UI to configure a storage plugin configuration. Start Drill and view and enable the MongoDB storage plugin configuration as described in the following procedure:
-
The Drill shell needs to be running to access the Drill Web UI.
- In the Drill Web UI, select the Storage tab.
- Under Disabled Storage Plugins, select Update to choose the
mongo
storage plugin configuration. - In the Configuration window, take a look at the default configuration:
{ "type": "mongo", "connection": "mongodb://localhost:27017/", "enabled": false }
Note
27017 is the default port for `mongodb` instances.
Note
In some cases you will need an authentication to perform certain `mongodb` queries. You can add login and password directly to connection URL: 'mongodb://root:password@localhost:27017/'.
Note
See more info [Connection String URI Format](https://docs.mongodb.com/v3.0/reference/connection-string/)
- Click Enable to enable the storage plugin.
Querying MongoDB
In the Drill shell, set up Drill to use the zips collection you imported into MongoDB.
-
Get a list of schemas from all Drill data sources, including MongoDB.
SHOW DATABASES; |---------------------| | SCHEMA_NAME | |---------------------| | INFORMATION_SCHEMA | | cp.default | | dfs.default | | dfs.root | | dfs.tmp | | mongo.local | | mongo.test | | sys | |---------------------| 8 rows selected (1.385 seconds)
-
Change the schema to mongo.text.
USE mongo.test; |-------|-----------------------------------------| | ok | summary | |-------|-----------------------------------------| | true | Default schema changed to [mongo.test] | |-------|-----------------------------------------|
-
List the tables and verify that the
zips
collection appears:SHOW TABLES; |---------------|-----------------| | TABLE_SCHEMA | TABLE_NAME | |---------------|-----------------| | mongo.test | system.indexes | | mongo.test | zips | |---------------|-----------------| 2 rows selected (0.187 seconds)
-
Set the option to read numbers as doubles instead of as text;
ALTER SYSTEM SET `store.mongo.read_numbers_as_double` = true; |-------|----------------------------------------------| | ok | summary | |-------|----------------------------------------------| | true | store.mongo.read_numbers_as_double updated. | |-------|----------------------------------------------| 1 row selected (0.078 seconds)
Example Queries
Example 1: View the zips Collection
SELECT * FROM zips LIMIT 10;
|---------------|-------------------------|--------|--------|
| city | loc | pop | state |
|---------------|-------------------------|--------|--------|
| AGAWAM | [-72.622739,42.070206] | 15338 | MA |
| CUSHMAN | [-72.51565,42.377017] | 36963 | MA |
| BELCHERTOWN | [-72.410953,42.275103] | 10579 | MA |
| BLANDFORD | [-72.936114,42.182949] | 1240 | MA |
| BRIMFIELD | [-72.188455,42.116543] | 3706 | MA |
| CHESTERFIELD | [-72.833309,42.38167] | 177 | MA |
| BARRE | [-72.108354,42.409698] | 4546 | MA |
| CHICOPEE | [-72.607962,42.162046] | 23396 | MA |
| CHICOPEE | [-72.576142,42.176443] | 31495 | MA |
| CHESTER | [-72.988761,42.279421] | 1688 | MA |
|---------------|-------------------------|--------|--------|
10 rows selected (0.444 seconds)
Example 2: Aggregation
SELECT city, avg(pop) FROM zips GROUP BY city LIMIT 10;
|---------------|---------------------|
| city | EXPR$1 |
|---------------|---------------------|
| AGAWAM | 15338.0 |
| CUSHMAN | 18649.5 |
| BELCHERTOWN | 10579.0 |
| BLANDFORD | 1240.0 |
| BRIMFIELD | 2441.5 |
| CHESTERFIELD | 9988.857142857143 |
| BARRE | 9770.0 |
| CHICOPEE | 27445.5 |
| CHESTER | 7285.0952380952385 |
| WESTOVER AFB | 1764.0 |
|---------------|---------------------|
10 rows selected (1.664 seconds)
Example 3: Nested Data Column Array
0: jdbc:drill:zk=local> SELECT loc FROM zips LIMIT 10;
|------------------------|
| loc |
|------------------------|
| [-72.622739,42.070206] |
| [-72.51565,42.377017] |
| [-72.108354,42.409698] |
| [-72.410953,42.275103] |
| [-72.936114,42.182949] |
| [-72.188455,42.116543] |
| [-72.988761,42.279421] |
| [-72.833309,42.38167] |
| [-72.607962,42.162046] |
| [-72.576142,42.176443] |
|------------------------|
0: jdbc:drill:zk=local> SELECT loc[0] FROM zips LIMIT 10;
|------------|
| EXPR$0 |
|------------|
| -72.622739 |
| -72.51565 |
| -72.108354 |
| -72.410953 |
| -72.936114 |
| -72.188455 |
| -72.988761 |
| -72.833309 |
| -72.607962 |
| -72.576142 |
|------------|
Using ODBC/JDBC Drivers
You can query MongoDB through standard BI tools, such as Tableau and SQuirreL. For information about Drill ODBC and JDBC drivers, refer to Drill Interfaces.