Class ParquetPartitionDescriptor

java.lang.Object
org.apache.drill.exec.planner.AbstractPartitionDescriptor
org.apache.drill.exec.planner.ParquetPartitionDescriptor
All Implemented Interfaces:
Iterable<List<PartitionLocation>>, PartitionDescriptor

public class ParquetPartitionDescriptor extends AbstractPartitionDescriptor
PartitionDescriptor that describes partitions based on column names instead of directory structure.
  • Constructor Details

  • Method Details

    • getPartitionHierarchyIndex

      public int getPartitionHierarchyIndex(String partitionName)
      Description copied from interface: PartitionDescriptor
      Gets the hierarchy index of the given partition. For example, if the partitions are laid out as 1997/q1/jan, then getPartitionHierarchyIndex("jan") returns 2.
      Parameters:
      partitionName - Partition name
      Returns:
      the index of specified partition name in the hierarchy
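The directory-style example in the description above can be sketched in plain Java. This is an illustration of the zero-based indexing only, not Drill's implementation; the class name HierarchyIndexSketch and the method hierarchyIndex are hypothetical, and how ParquetPartitionDescriptor maps column names to indices is not stated on this page.

```java
import java.util.Arrays;
import java.util.List;

public class HierarchyIndexSketch {
    // Hypothetical helper: returns the zero-based depth of partitionName
    // within a slash-separated partition path, or -1 if it is absent.
    static int hierarchyIndex(String partitionPath, String partitionName) {
        List<String> levels = Arrays.asList(partitionPath.split("/"));
        return levels.indexOf(partitionName);
    }

    public static void main(String[] args) {
        // Levels are 1997 (0), q1 (1), jan (2).
        System.out.println(hierarchyIndex("1997/q1/jan", "jan")); // prints 2
    }
}
```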
    • isPartitionName

      public boolean isPartitionName(String name)
      Description copied from interface: PartitionDescriptor
      Given a column name, returns a boolean indicating whether it is a partition column.
      Parameters:
      name - name of the column to check
      Returns:
      true if the name is a partition column, false otherwise
    • getIdIfValid

      public Integer getIdIfValid(String name)
      Description copied from interface: PartitionDescriptor
      Check to see if the name is a partition name.
      Parameters:
      name - The field name you want to compare to partition names.
      Returns:
      the index if the name is a valid partition name, otherwise null
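Because getIdIfValid returns a boxed Integer that may be null, callers should test for null before unboxing. The following is a minimal sketch of that calling pattern against a stand-in class; IdIfValidSketch, its constructor, and describe are hypothetical and do not reflect how Drill stores partition column names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IdIfValidSketch {
    // Hypothetical stand-in: partition column name -> zero-based index.
    private final Map<String, Integer> partitionColumns = new LinkedHashMap<>();

    IdIfValidSketch(String... names) {
        for (String n : names) {
            partitionColumns.put(n, partitionColumns.size());
        }
    }

    // Mirrors the documented contract: the index if valid, otherwise null.
    Integer getIdIfValid(String name) {
        return partitionColumns.get(name);
    }

    // Checks for null first; assigning the result straight to an int
    // would throw NullPointerException for a non-partition column.
    String describe(String name) {
        Integer id = getIdIfValid(name);
        return id == null
                ? name + " is not a partition column"
                : name + " is partition column " + id;
    }

    public static void main(String[] args) {
        IdIfValidSketch d = new IdIfValidSketch("year", "quarter");
        System.out.println(d.describe("quarter")); // prints "quarter is partition column 1"
        System.out.println(d.describe("month"));   // prints "month is not a partition column"
    }
}
```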
    • getMaxHierarchyLevel

      public int getMaxHierarchyLevel()
      Description copied from interface: PartitionDescriptor
      Maximum level of partition nesting/hierarchy supported.
      Returns:
      maximum supported level number of partition hierarchy
    • populatePartitionVectors

      public void populatePartitionVectors(ValueVector[] vectors, List<PartitionLocation> partitions, BitSet partitionColumnBitSet, Map<Integer,String> fieldNameMap)
      Description copied from interface: PartitionDescriptor
      Creates an in-memory representation of all the partitions. For each level of partitioning, a value vector is created, and this method populates it with the partitioning key values of all the partitions.
      Parameters:
      vectors - Array of vectors in the container that need to be populated
      partitions - List of all the partitions that exist in the table
      partitionColumnBitSet - Partition columns selected in the query
      fieldNameMap - Maps field ordinal to the field name
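The relationship between the parameters above can be sketched with plain collections. This is a simplified model, not Drill code: each partition is assumed to be a map from column name to value, and each "vector" is a list of strings; the class PartitionVectorSketch and the method populate are hypothetical. The set bits select field ordinals, the ordinal-to-name map resolves each to a column, and one column of values is filled with an entry per partition.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionVectorSketch {
    // Builds one list of values per selected partition column.
    static Map<String, List<String>> populate(List<Map<String, String>> partitions,
                                              BitSet partitionColumnBitSet,
                                              Map<Integer, String> fieldNameMap) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        // Visit only the ordinals whose bit is set (columns used in the query).
        for (int i = partitionColumnBitSet.nextSetBit(0); i >= 0;
                 i = partitionColumnBitSet.nextSetBit(i + 1)) {
            String field = fieldNameMap.get(i);
            List<String> values = new ArrayList<>();
            for (Map<String, String> partition : partitions) {
                values.add(partition.get(field)); // one entry per partition
            }
            columns.put(field, values);
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Map<String, String>> partitions = List.of(
                Map.of("year", "1997", "quarter", "q1"),
                Map.of("year", "1998", "quarter", "q2"));
        BitSet selected = new BitSet();
        selected.set(0); // only ordinal 0 ("year") is selected
        Map<Integer, String> names = Map.of(0, "year", 1, "quarter");
        System.out.println(populate(partitions, selected, names)); // prints {year=[1997, 1998]}
    }
}
```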
    • getVectorType

      public TypeProtos.MajorType getVectorType(SchemaPath column, PlannerSettings plannerSettings)
      Description copied from interface: PartitionDescriptor
      Returns the major type associated with the given column.
      Parameters:
      column - column whose type should be determined
    • getBaseTableLocation

      public org.apache.hadoop.fs.Path getBaseTableLocation()
    • createTableScan

      public org.apache.calcite.rel.core.TableScan createTableScan(List<PartitionLocation> newPartitionLocation, org.apache.hadoop.fs.Path cacheFileRoot, boolean wasAllPartitionsPruned, MetadataContext metaContext) throws Exception
      Description copied from interface: PartitionDescriptor
      Create a new TableScan rel node, given the lists of new partitions or new files to scan and a path to a metadata cache file
      Specified by:
      createTableScan in interface PartitionDescriptor
      Overrides:
      createTableScan in class AbstractPartitionDescriptor
      Throws:
      Exception
    • createTableScan

      public org.apache.calcite.rel.core.TableScan createTableScan(List<PartitionLocation> newPartitionLocation, boolean wasAllPartitionsPruned) throws Exception
      Description copied from interface: PartitionDescriptor
      Create a new TableScan rel node, given the lists of new partitions or new files to scan.
      Throws:
      Exception
    • createPartitionSublists

      protected void createPartitionSublists()
      Description copied from class: AbstractPartitionDescriptor
      Create sublists of the partition locations, each sublist of size at most PartitionDescriptor.PARTITION_BATCH_SIZE
      Specified by:
      createPartitionSublists in class AbstractPartitionDescriptor
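The batching described for createPartitionSublists can be sketched as follows. This is a generic illustration, not the Drill implementation: the class PartitionSublistSketch and the method partition are hypothetical, and the batch size of 2 stands in for PartitionDescriptor.PARTITION_BATCH_SIZE, whose value is not given on this page.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSublistSketch {
    // Splits items into consecutive sublists, each holding at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> sublists = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each sublist is independent of the source list.
            sublists.add(new ArrayList<>(items.subList(start, end)));
        }
        return sublists;
    }

    public static void main(String[] args) {
        List<String> locations = List.of("a", "b", "c", "d", "e");
        System.out.println(partition(locations, 2)); // prints [[a, b], [c, d], [e]]
    }
}
```

Only the last sublist can be shorter than the batch size, which matches the "at most" wording in the description.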