public class MetadataDirectGroupScan extends DirectGroupScan
Represents a direct scan based on metadata information. For example, for Parquet files this information can be obtained from the Parquet footer (total row count) or from Parquet metadata files (column counts). Contains the reader, the statistics, and the list of scanned files, if present.
    • MetadataDirectGroupScan

      public MetadataDirectGroupScan(RecordReader reader, org.apache.hadoop.fs.Path selectionRoot, int numFiles, ScanStats stats, boolean usedMetadataSummaryFile, boolean usedMetastore)
    • getSelectionRoot

      public org.apache.hadoop.fs.Path getSelectionRoot()
      Returns the path to the selection root. If this GroupScan cannot provide a selection root, it returns null.
      Returns: path to the selection root
    • getNewWithChildren

      public PhysicalOperator getNewWithChildren(List<PhysicalOperator> children)
      Regenerate this node with a new set of children. This is used in the case of materialization or optimization.
    • clone

      public GroupScan clone(List<SchemaPath> columns)
      Returns a clone of this GroupScan instance, except that the new GroupScan will use the provided list of columns.
    • getDigest

      public String getDigest()

      Returns a string representation of the group scan data. Includes the selection root, the number of files, whether a metadata summary file was used, and whether the Metastore was used, where such data is present.

      Example: [selectionRoot = [/tmp/users], numFiles = 1, usedMetadataSummaryFile = false, usedMetastore = true]

      Returns: string representation of group scan data
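
To illustrate the digest format shown in the example for getDigest, the following is a minimal, self-contained sketch. It is not the actual implementation: the class `DigestSketch` and the method `formatDigest` are hypothetical names introduced here, the selection root is passed as a plain `String` rather than an `org.apache.hadoop.fs.Path`, and skipping the selection root when it is null is an assumption about what "where such data is present" means.

```java
// Hypothetical sketch of the documented digest format.
// This is NOT the library's getDigest() implementation.
final class DigestSketch {

    private DigestSketch() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Builds a digest string in the format of the documented example.
     *
     * @param selectionRoot path to the selection root, or null if unavailable
     *                      (assumption: a null root is omitted from the digest)
     * @param numFiles number of scanned files
     * @param usedMetadataSummaryFile whether a metadata summary file was used
     * @param usedMetastore whether the Metastore was used
     * @return digest string enclosed in square brackets
     */
    static String formatDigest(String selectionRoot, int numFiles,
                               boolean usedMetadataSummaryFile, boolean usedMetastore) {
        StringBuilder builder = new StringBuilder("[");
        if (selectionRoot != null) {
            // The documented example wraps the path in its own brackets.
            builder.append("selectionRoot = [").append(selectionRoot).append("], ");
        }
        builder.append("numFiles = ").append(numFiles).append(", ");
        builder.append("usedMetadataSummaryFile = ").append(usedMetadataSummaryFile).append(", ");
        builder.append("usedMetastore = ").append(usedMetastore).append("]");
        return builder.toString();
    }
}
```

Calling `DigestSketch.formatDigest("/tmp/users", 1, false, true)` yields the same string as the documented example. Passing null for the selection root drops that field and leaves the remaining three.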