Workspaces
You can define one or more workspaces in a storage plugin configuration. The workspace defines the location of files in subdirectories of a local or distributed file system. Drill searches the workspace to locate data when
you run a query. A hidden default workspace, dfs.default
, points to the root of the file system.
The following dfs
storage plugin configuration shows some examples of defined workspaces:
{
"type": "file",
"enabled": true,
"connection": "file:///",
"workspaces": {
"root": {
"location": "/",
"writable": false,
"defaultInputFormat": null
},
"tmp": {
"location": "/tmp",
"writable": true,
"defaultInputFormat": null
},
"emp": {
"location": "/Users/user1/emp",
"writable": true,
"defaultInputFormat": null
},
"donuts": {
"location": "/Users/user1/donuts",
"writable": true,
"defaultInputFormat": null
},
"sales": {
"location": "/Users/user1/sales",
"writable": true,
"defaultInputFormat": null
}
}, Configuring workspaces to include a subdirectory simplifies the query, which is important when querying the same files repeatedly. After you configure a long path name in the workspace `location` property, instead of using the full path name to the data source, you use dot notation in the FROM clause.
<workspace name>.`<location>
`
Where <location>
is the path name of a subdirectory, such as /users/max/drill/json
enclosed in double quotation marks as shown in the “Querying Donuts Example.”
To query the data source when you have not set the default schema name to the storage plugin configuration, include the plugin name. This syntax assumes you did not issue a USE statement to connect to a storage plugin that defines the location of the data:
<plugin>.<workspace name>.`<location>
`
Overriding dfs.default
You may want to override the hidden default workspace in scenarios where users do not have permissions to access the root directory.
Add the following workspace entry to the dfs
storage plugin configuration to override the default workspace:
"default": {
"location": "</directory/path>",
"writable": true,
"defaultInputFormat": null
}
No Workspaces for Hive and HBase
You cannot include workspaces in the configurations of the
hive
and hbase
plugins installed with Apache Drill, though Hive databases show up as workspaces in
Drill. Each hive
storage plugin configuration includes a default
workspace that points to the Hive metastore. When you query
files and tables in the hive default
workspaces, you can omit the
workspace name from the query.
For example, you can issue a query on a Hive table in the default workspace
using either of the following queries and get the same results:
Example
SELECT * FROM hive.customers LIMIT 10;
SELECT * FROM hive.`default`.customers LIMIT 10;
Note
Default is a reserved word. You must enclose reserved words when used as identifiers in back ticks.
Because the HBase storage plugin does not accommodate a workspace, you can use the following query:
SELECT * FROM hbase.customers LIMIT 10;