Mongo Metastore
Introduced in release: 1.20.
The Mongo Metastore implementation allows you store Drill Metastore metadata in a configured MongoDB.
Configuration
Currently, the Mongo Metastore is not the default implementation.
To enable the Mongo Metastore, create the drill-metastore-override.conf
file
in your config directory and specify the Mongo Metastore class:
drill.metastore: {
implementation.class: "org.apache.drill.metastore.mongo.MongoMetastore"
}
Connection properties
Use the connection properties to specify how Drill should connect to your Metastore database.
drill.metastore.mongo.connection
- connection url to your MongoDB. Required.
drill.metastore.mongo.database
- database used. Optional, default is “meta”.
drill.metastore.mongo.table_collection
- collection used to store metadata for tables. Optional, default is “tables”.
Custom configuration
drill-metastore-override.conf
is used to customize connection details to the Drill Metastore database.
See drill-metastore-override-example.conf
for more details.
Example of configuration
drill.metastore: {
implementation.class: "org.apache.drill.metastore.mongo.MongoMetastore",
mongo: {
connection: "mongodb://localhost:27017/",
database: "meta",
table_collection: "tables"
}
}
Note: If your MongoDB enabled access control, make sure the user can read and write the collection used.
If you are using a sharded MongoDB cluster, make sure the database used is enabled sharding,
and the collection used is only sharded by _id
in hash mode.
Tables structure
The Drill Metastore stores several types of metadata, called components. Currently, only the tables
component is implemented.
The tables
component provides metadata about Drill tables, including their segments, files, row groups and partitions.
In Drill tables
component unit is represented by TableMetadataUnit
class which is applicable to any metadata type.
The TableMetadataUnit
class holds fields for all five metadata types within the tables
component.
Any fields not applicable to a particular metadata type are simply ignored and remain unset.
In the Mongo implementation of the Drill Metastore, all metadata of the tables component stored in one collection. The database and collection will be auto created when write data firstly into it if not exist, but you need to configure the database and collection before used as the note specified above if you are using a sharded cluster.