Package org.apache.drill.exec.store.hive
Class HiveMetadataProvider
java.lang.Object
org.apache.drill.exec.store.hive.HiveMetadataProvider
Class which provides methods to get the metadata of a given Hive table selection. It tries to use the stats stored in
the MetaStore whenever they are available and delays the costly operation of loading InputSplits until needed. Once
loaded, InputSplits are cached to speed up subsequent access.
-
Nested Class Summary
Modifier and Type | Class | Description
static class | HiveMetadataProvider.HiveStats | Contains stats.
static class | HiveMetadataProvider.LogicalInputSplit | Contains a group of input splits along with the partition.
-
Field Summary
-
Constructor Summary
Constructor | Description
HiveMetadataProvider(String userName, HiveReadEntry hiveReadEntry, org.apache.hadoop.hive.conf.HiveConf hiveConf)
-
Method Summary
Method | Description
getInputDirectories(HiveReadEntry hiveReadEntry) | Get the list of directories which contain the input files.
getInputSplits(HiveReadEntry hiveReadEntry) | Return HiveMetadataProvider.LogicalInputSplits for the given HiveReadEntry.
getStats(HiveReadEntry hiveReadEntry) | Return stats for the table/partitions in the given HiveReadEntry.
-
Field Details
-
RECORD_SIZE
public static final int RECORD_SIZE
-
Constructor Details
-
HiveMetadataProvider
public HiveMetadataProvider(String userName, HiveReadEntry hiveReadEntry, org.apache.hadoop.hive.conf.HiveConf hiveConf)
-
-
Method Details
-
getStats
Return stats for the table/partitions in the given HiveReadEntry. If valid stats are available in the MetaStore, return them. Otherwise, estimate the stats using the size of the input data.
Parameters:
hiveReadEntry - Subset of the HiveReadEntry used when creating this cache object.
Returns:
hive statistics holder
Throws:
IOException - if unable to retrieve table statistics
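The fallback between stored and estimated stats can be sketched in isolation. This is a minimal, self-contained illustration and not Drill's actual implementation: the class StatsFallbackSketch, its Stats holder, the resolve helper, the validity rule (both values positive), and the assumed record size of 1024 bytes are all hypothetical. This page names a RECORD_SIZE constant but does not show its value.

```java
/** Hypothetical sketch: prefer stored stats when valid, otherwise estimate from data size. */
final class StatsFallbackSketch {

  // Assumed average record size in bytes; a placeholder chosen for this example only.
  static final long ASSUMED_RECORD_SIZE = 1024;

  /** Hypothetical holder for a row count and a data size. */
  static final class Stats {
    final long numRows;
    final long sizeInBytes;

    Stats(long numRows, long sizeInBytes) {
      this.numRows = numRows;
      this.sizeInBytes = sizeInBytes;
    }

    // Assumption: stats count as valid only when both values are positive.
    boolean isValid() {
      return numRows > 0 && sizeInBytes > 0;
    }
  }

  /**
   * Returns the stored stats when they are valid; otherwise derives a row
   * count by dividing the input size by the assumed record size.
   */
  static Stats resolve(Stats stored, long inputSizeInBytes) {
    if (stored != null && stored.isValid()) {
      return stored;
    }
    long estimatedRows = inputSizeInBytes / ASSUMED_RECORD_SIZE;
    return new Stats(estimatedRows, inputSizeInBytes);
  }

  public static void main(String[] args) {
    Stats stored = new Stats(500, 40_960);
    Stats fromStore = resolve(stored, 1_000_000);   // valid stored stats win
    Stats estimated = resolve(null, 10_240);        // nothing stored: estimate
    System.out.println(fromStore.numRows + " rows (stored)");
    System.out.println(estimated.numRows + " rows (estimated)");
  }
}
```

Integer division keeps the estimate simple. A real estimator might round up or enforce a minimum of one row for non-empty input.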
-
getInputSplits
Return HiveMetadataProvider.LogicalInputSplits for the given HiveReadEntry. Splits are first looked up in the cache; if they are not found there, InputFormat.getSplits(JobConf, int) is used to find them.
Parameters:
hiveReadEntry - Subset of the HiveReadEntry used when creating this object.
Returns:
list of logically grouped input splits
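The cache-then-compute lookup described for getInputSplits can be sketched generically. The following is a minimal illustration under stated assumptions: the class SplitCacheSketch, its key and split type parameters, and the loader function (standing in for the costly InputFormat.getSplits call) are hypothetical and do not reflect how Drill keys or stores its cache.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: look up splits in a cache, compute and store them on a miss. */
final class SplitCacheSketch<K, S> {

  private final Map<K, List<S>> cache = new HashMap<>();
  private final Function<K, List<S>> loader; // stands in for the expensive split computation
  private int loads = 0;

  SplitCacheSketch(Function<K, List<S>> loader) {
    this.loader = loader;
  }

  /** Returns cached splits for the key, invoking the loader only on the first request. */
  List<S> getSplits(K key) {
    List<S> cached = cache.get(key);
    if (cached != null) {
      return cached;
    }
    loads++;
    List<S> loaded = loader.apply(key);
    cache.put(key, loaded);
    return loaded;
  }

  /** Number of times the loader was actually invoked. */
  int loadCount() {
    return loads;
  }

  public static void main(String[] args) {
    SplitCacheSketch<String, String> splits =
        new SplitCacheSketch<>(p -> Arrays.asList(p + "/part-0", p + "/part-1"));
    splits.getSplits("dt=2024-01-01"); // miss: loader runs
    splits.getSplits("dt=2024-01-01"); // hit: served from the cache
    System.out.println("loader invocations: " + splits.loadCount());
  }
}
```

This sketch is not thread-safe. Concurrent callers would need synchronization or a concurrent map.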
-
getInputDirectories
Get the list of directories which contain the input files. This list is useful for explain plan purposes.
Parameters:
hiveReadEntry - HiveReadEntry containing the input table and/or partitions.
-