Class FileUploader

java.lang.Object
org.apache.drill.yarn.client.FileUploader
Direct Known Subclasses:
FileUploader.NonLocalized, FileUploader.ReuseFiles, FileUploader.UploadFiles

public abstract class FileUploader extends Object
Performs the file upload portion of the operation by uploading an archive to the target DFS and directory. Records the uploaded archive so that it can be used to localize Drill in the launch step.

Some of the code is ordered in a roundabout way so that information is available early enough to display in status messages.

This class handles four cases:

  1. Non-localized, config in $DRILL_HOME/conf.
  2. Non-localized, config in a site directory.
  3. Localized, config in $DRILL_HOME/conf.
  4. Localized, config in a site directory.

The non-localized case adds complexity, but it is very handy during development because it avoids the wait for the archives to upload and download. The non-localized mode is not advertised to users because it defeats one of the main benefits of YARN.

In the localized case, YARN's support is incomplete: there is no API to inform the AM of the set of localized files, so we pass the information along in environment variables. In addition, tar includes the root directory name when unpacking, so the drill.tar.gz archive unpacks to, say, apache-drill.x.y.z. We must therefore pass along that directory name as well.

All of this is further complicated by two things: YARN needs detailed information to localize resources, and YARN uses a "key" to identify each localized resource, which becomes the directory name in the task's working folder. Thus, the Drill installation ends up at, say,
$PWD/drill/apache-drill.x.y.z/bin, conf, ...
YARN provides PWD. The Drillbit launch script needs to know the next two directory names (the key and the archive's root directory).
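
As a rough illustration of the layout described above, the sketch below composes the localized Drill home from its three pieces: the working directory supplied by YARN, the resource key, and the root directory created when the archive is unpacked. The class name, the method names, and the environment variable names are hypothetical and chosen only for illustration; they are not taken from the Drill code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: all names here are hypothetical, not Drill's API.
final class LocalizedPathSketch {

  // Joins the YARN working directory, the localization key, and the
  // directory that tar creates when the archive is unpacked.
  static String drillHome(String pwd, String resourceKey, String archiveRootDir) {
    return String.join("/", pwd, resourceKey, archiveRootDir);
  }

  // Packs the two names the launch script cannot discover on its own into
  // environment variables. The variable names are assumptions.
  static Map<String, String> launchEnv(String resourceKey, String archiveRootDir) {
    Map<String, String> env = new LinkedHashMap<>();
    env.put("EXAMPLE_DRILL_RESOURCE_KEY", resourceKey);
    env.put("EXAMPLE_DRILL_ARCHIVE_ROOT", archiveRootDir);
    return env;
  }
}
```

For example, with a working directory of /yarn/container_01, the key drill, and the root directory apache-drill.x.y.z, drillHome returns /yarn/container_01/drill/apache-drill.x.y.z.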

For efficiency, we skip uploading the Drill archive if one already exists in DFS and is the same size as the one on the client. We always upload the config archive (if needed) because config changes are likely to be one reason that someone (re)starts the Drill cluster.
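
The size check described above can be sketched as follows. This is a minimal illustration, assuming the remote archive's size is looked up elsewhere and passed in as an OptionalLong that is empty when no archive exists in DFS. The class and method names are hypothetical.

```java
import java.util.OptionalLong;

// Illustrative sketch only: names are hypothetical, not Drill's API.
final class UploadDecisionSketch {

  // Returns true when the Drill archive should be uploaded: either no
  // remote copy exists, or the remote copy differs in size from the
  // local one.
  static boolean needsDrillArchiveUpload(long localSize, OptionalLong remoteSize) {
    if (!remoteSize.isPresent()) {
      return true; // nothing in DFS yet
    }
    return remoteSize.getAsLong() != localSize;
  }
}
```

Comparing sizes is a cheap heuristic: it will not notice an archive whose contents changed while its size stayed the same.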

  • Field Details

    • doyConfig

      protected DrillOnYarnConfig doyConfig
    • config

      protected com.typesafe.config.Config config
    • dfs

      protected DfsFacade dfs
    • dryRun

      protected boolean dryRun
    • verbose

      protected boolean verbose
    • localDrillHome

      protected File localDrillHome
    • localSiteDir

      protected File localSiteDir
    • localDrillArchivePath

      protected File localDrillArchivePath
    • resources

      public Map<String,org.apache.hadoop.yarn.api.records.LocalResource> resources
    • drillArchivePath

      public String drillArchivePath
    • siteArchivePath

      public String siteArchivePath
    • remoteDrillHome

      public String remoteDrillHome
    • remoteSiteDir

      public String remoteSiteDir
  • Constructor Details

    • FileUploader

      public FileUploader(boolean dryRun, boolean verbose)
  • Method Details