Class FileSelection

java.lang.Object
org.apache.drill.exec.store.dfs.FileSelection
All Implemented Interfaces:
DrillTableSelection
Direct Known Subclasses:
IcebergMetadataFileSelection

public class FileSelection extends Object implements DrillTableSelection
Jackson-serializable description of a file selection.
  • Field Details

    • files

      public List<org.apache.hadoop.fs.Path> files
    • selectionRoot

      public final org.apache.hadoop.fs.Path selectionRoot
      root path for the selections
    • cacheFileRoot

      public final org.apache.hadoop.fs.Path cacheFileRoot
      root path for the metadata cache file (if any)
  • Constructor Details

    • FileSelection

      public FileSelection(List<org.apache.hadoop.fs.FileStatus> statuses, List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.fs.Path selectionRoot)
      Creates a selection out of given file statuses/files and selection root.
      Parameters:
      statuses - list of file statuses
      files - list of files
      selectionRoot - root path for selections
    • FileSelection

      public FileSelection(List<org.apache.hadoop.fs.FileStatus> statuses, List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.fs.Path selectionRoot, org.apache.hadoop.fs.Path cacheFileRoot, boolean wasAllPartitionsPruned)
    • FileSelection

      public FileSelection(List<org.apache.hadoop.fs.FileStatus> statuses, List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.fs.Path selectionRoot, org.apache.hadoop.fs.Path cacheFileRoot, boolean wasAllPartitionsPruned, org.apache.drill.exec.store.dfs.FileSelection.StatusType dirStatus)
    • FileSelection

      protected FileSelection(FileSelection selection)
      Copy constructor for convenience.
  • Method Details

    • getSelectionRoot

      public org.apache.hadoop.fs.Path getSelectionRoot()
    • getStatuses

      public List<org.apache.hadoop.fs.FileStatus> getStatuses(DrillFileSystem fs) throws IOException
      Throws:
      IOException
    • getFiles

      public List<org.apache.hadoop.fs.Path> getFiles()
    • containsDirectories

      public boolean containsDirectories(DrillFileSystem fs) throws IOException
      Throws:
      IOException
    • minusDirectories

      public FileSelection minusDirectories(DrillFileSystem fs) throws IOException
      Throws:
      IOException
    • selectAnyFile

      public FileSelection selectAnyFile(DrillFileSystem fs) throws IOException
      Throws:
      IOException
    • getFirstPath

      public org.apache.hadoop.fs.FileStatus getFirstPath(DrillFileSystem fs) throws IOException
      Throws:
      IOException
    • setExpandedFully

      public void setExpandedFully()
    • isExpandedFully

      public boolean isExpandedFully()
    • setExpandedPartial

      public void setExpandedPartial()
    • isExpandedPartial

      public boolean isExpandedPartial()
    • getDirStatus

      public org.apache.drill.exec.store.dfs.FileSelection.StatusType getDirStatus()
    • wasAllPartitionsPruned

      public boolean wasAllPartitionsPruned()
    • create

      public static FileSelection create(DrillFileSystem fs, String parent, String path, boolean allowAccessOutsideWorkspace) throws IOException
      Throws:
      IOException
    • create

      public static FileSelection create(List<org.apache.hadoop.fs.FileStatus> statuses, List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.fs.Path root, org.apache.hadoop.fs.Path cacheFileRoot, boolean wasAllPartitionsPruned)
      Creates a selection with the given file statuses/files and selection root.
      Parameters:
      statuses - list of file statuses
      files - list of files
      root - root path for selections
      cacheFileRoot - root path for metadata cache (null for no metadata cache)
      Returns:
      a new selection, or null if creating the FileSelection fails with an IllegalArgumentException.
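      Because this factory reports failure by returning null rather than by throwing, callers need a null check. The following is a minimal, self-contained sketch of that contract only. It does not use Drill or Hadoop classes: NullOnFailureSketch, its nested Selection class, and the validation rules (non-empty file list, non-empty root) are hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a factory that returns null on IllegalArgumentException. */
final class NullOnFailureSketch {

  /** Stand-in for a selection; the validation rules here are assumptions. */
  static final class Selection {
    final List<String> files;
    final String root;

    Selection(List<String> files, String root) {
      if (files == null || files.isEmpty()) {
        throw new IllegalArgumentException("at least one file is required");
      }
      if (root == null || root.isEmpty()) {
        throw new IllegalArgumentException("selection root is required");
      }
      this.files = new ArrayList<>(files);
      this.root = root;
    }
  }

  /** Returns a new selection, or null if the constructor rejects the arguments. */
  static Selection create(List<String> files, String root) {
    try {
      return new Selection(files, root);
    } catch (IllegalArgumentException e) {
      return null;
    }
  }
}
```

      A caller following this pattern would test the result against null before using it, instead of wrapping the call in a try/catch block.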
    • create

      public static FileSelection create(List<org.apache.hadoop.fs.FileStatus> statuses, List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.fs.Path root)
    • createFromDirectories

      public static FileSelection createFromDirectories(List<org.apache.hadoop.fs.Path> dirPaths, FileSelection selection, org.apache.hadoop.fs.Path cacheFileRoot)
    • checkBackPaths

      public static void checkBackPaths(String parent, String combinedPath, String subpath)
      Checks whether the path is a valid subpath under the parent after back paths are removed, and throws an exception if it is not. The subpath parameter is used only in the error message.
      Parameters:
      parent - The parent path (the workspace directory).
      combinedPath - The workspace directory and the (relative) subpath combined.
      subpath - The subpath; used only in the error message.
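      The kind of check described above can be sketched with the JDK alone. This is not Drill's implementation: BackPathSketch and checkUnderParent are hypothetical names, the use of java.nio.file.Path normalization to resolve ".." segments is an assumed technique, and the IllegalArgumentException type is an assumption (the description above says only that an exception is thrown).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical stand-in for a back-path check; not Drill's code. */
final class BackPathSketch {

  /**
   * Throws IllegalArgumentException if combinedPath, once its ".." segments
   * are resolved, no longer lies under parent. The subpath argument is used
   * only to build the error message.
   */
  static void checkUnderParent(String parent, String combinedPath, String subpath) {
    Path normalizedParent = Paths.get(parent).normalize();
    Path normalizedCombined = Paths.get(combinedPath).normalize();
    // Path.startsWith compares whole name elements, so "/wsx" is not under "/ws".
    if (!normalizedCombined.startsWith(normalizedParent)) {
      throw new IllegalArgumentException(
          "Path '" + subpath + "' points outside the parent directory '" + parent + "'");
    }
  }
}
```

      With this sketch, a combined path such as /ws/data/../logs is accepted for the parent /ws, while /ws/data/../../etc is rejected because it normalizes to a location outside the parent.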
    • getFileStatuses

      public List<org.apache.hadoop.fs.FileStatus> getFileStatuses()
    • supportsDirPruning

      public boolean supportsDirPruning()
    • setHadWildcard

      public void setHadWildcard(boolean wc)
    • hadWildcard

      public boolean hadWildcard()
    • getCacheFileRoot

      public org.apache.hadoop.fs.Path getCacheFileRoot()
    • setMetaContext

      public void setMetaContext(MetadataContext context)
    • getMetaContext

      public MetadataContext getMetaContext()
    • isEmptyDirectory

      public boolean isEmptyDirectory()
      Returns:
      true if this selectionRoot points to an empty directory, false otherwise
    • setEmptyDirectoryStatus

      public void setEmptyDirectoryStatus()
      Sets emptyDirectory to true, which marks this selectionRoot as an empty directory.
    • digest

      public String digest()
      Description copied from interface: DrillTableSelection
      The digest of the selection represented by the implementation. The selections that accompany Tables can modify the contained dataset: for example, a file selection can restrict the scan to a subset of the available data, and a format selection can include options that affect the behaviour of the underlying reader. Two scans are considered identical during logical planning if their digests are the same, so selection implementations should override this method so that exactly those scans that really are identical (in terms of the data they produce) have matching digests.
      Specified by:
      digest in interface DrillTableSelection
      Returns:
      this selection's digest, normally a string built from its properties.
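      The digest contract can be illustrated with a small self-contained sketch. SelectionDigestSketch and its two properties (a root and a list of files, both plain strings) are hypothetical; the actual properties and string format that FileSelection uses for its digest are not specified here. The sketch only shows the idea of building the digest from every property that affects the data produced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical selection whose digest is built from all of its properties. */
final class SelectionDigestSketch {
  private final String root;
  private final List<String> files;

  SelectionDigestSketch(String root, List<String> files) {
    this.root = Objects.requireNonNull(root);
    this.files = new ArrayList<>(files);
  }

  /**
   * Builds the digest from every property that changes the scanned data,
   * so two selections share a digest only when they select the same files
   * under the same root.
   */
  String digest() {
    return "root=" + root + ", files=" + files;
  }
}
```

      Two instances built from the same root and file list return equal digests, so a planner comparing digests would treat their scans as identical; changing either property changes the digest.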
    • toString

      public String toString()
      Overrides:
      toString in class Object