Class HiveUtilities

java.lang.Object
org.apache.drill.exec.store.hive.HiveUtilities

public class HiveUtilities extends Object
  • Constructor Details

    • HiveUtilities

      public HiveUtilities()
  • Method Details

    • convertPartitionType

      public static Object convertPartitionType(org.apache.hadoop.hive.serde2.typeinfo.TypeInfo typeInfo, String value, String defaultPartitionValue)
A partition value is received in string format. This method converts it into an appropriate object based on the type.
      Parameters:
      typeInfo - type info
value - partition value
      defaultPartitionValue - default partition value
      Returns:
      converted object
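The conversion can be sketched with standard-library code only. This is an illustrative stand-in, not Drill's implementation: the type is represented here by its primitive name rather than a TypeInfo, and the treatment of the default partition value as null is an assumption.

```java
import java.util.Objects;

public class PartitionValueSketch {
    // Hypothetical stand-in for convertPartitionType: the Hive type is passed
    // as its primitive name instead of a TypeInfo object.
    public static Object convertPartitionValue(String typeName, String value,
                                               String defaultPartitionValue) {
        // Assumption: a value equal to the default partition value maps to null.
        if (Objects.equals(value, defaultPartitionValue)) {
            return null;
        }
        switch (typeName) {
            case "int":     return Integer.parseInt(value);
            case "bigint":  return Long.parseLong(value);
            case "double":  return Double.parseDouble(value);
            case "boolean": return Boolean.parseBoolean(value);
            case "string":  return value;
            default:
                throw new UnsupportedOperationException("Unsupported partition type: " + typeName);
        }
    }
}
```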
    • populateVector

      public static void populateVector(ValueVector vector, DrillBuf managedBuffer, Object val, int start, int end)
      Populates vector with given value based on its type.
      Parameters:
      vector - vector instance
managedBuffer - Drill buffer
      val - value
      start - start position
      end - end position
    • getMajorTypeFromHiveTypeInfo

      public static TypeProtos.MajorType getMajorTypeFromHiveTypeInfo(org.apache.hadoop.hive.serde2.typeinfo.TypeInfo typeInfo, OptionSet options)
      Obtains major type from given type info holder.
      Parameters:
      typeInfo - type info holder
      options - session options
      Returns:
appropriate major type, or null if none applies. For some types an unsupported-type exception may be thrown.
    • getMinorTypeFromHivePrimitiveTypeInfo

      public static TypeProtos.MinorType getMinorTypeFromHivePrimitiveTypeInfo(org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo primitiveTypeInfo, OptionSet options)
      Obtains minor type from given primitive type info holder.
      Parameters:
      primitiveTypeInfo - primitive type info holder
      options - session options
      Returns:
appropriate minor type; otherwise throws an unsupported-type exception
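The shape of this lookup can be illustrated with a small, self-contained sketch. The mapping below is illustrative and deliberately incomplete (the real method works on PrimitiveTypeInfo and also consults session options); the type names are represented as plain strings.

```java
import java.util.Map;

public class HiveTypeSketch {
    // Illustrative (not exhaustive) mapping from Hive primitive type names
    // to Drill minor type names.
    private static final Map<String, String> HIVE_TO_DRILL = Map.of(
        "int", "INT",
        "bigint", "BIGINT",
        "string", "VARCHAR",
        "double", "FLOAT8",
        "boolean", "BIT");

    // Returns the Drill minor type name for a Hive primitive type, or throws
    // for unsupported types, mirroring the documented contract.
    public static String minorTypeFor(String hiveType) {
        String drillType = HIVE_TO_DRILL.get(hiveType);
        if (drillType == null) {
            throw new UnsupportedOperationException("Unsupported Hive type: " + hiveType);
        }
        return drillType;
    }
}
```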
    • getInputFormatClass

      public static Class<? extends org.apache.hadoop.mapred.InputFormat<?,?>> getInputFormatClass(org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.hive.metastore.api.StorageDescriptor sd, org.apache.hadoop.hive.metastore.api.Table table) throws Exception
Utility method that gets the table or partition InputFormat class. It first tries to get the class name from the given StorageDescriptor object; if that does not contain one, it tries to get it from the StorageHandler class set in the table properties. If not found, an exception is thrown.
      Parameters:
job - JobConf instance, needed in case the table is a StorageHandler-based table.
sd - StorageDescriptor instance of the partition currently being read, or of the table itself (for non-partitioned tables).
      table - Table object
      Throws:
      Exception
    • addConfToJob

      public static void addConfToJob(org.apache.hadoop.mapred.JobConf job, Properties properties)
Utility method which adds the given configs to a JobConf object.
      Parameters:
      job - JobConf instance.
      properties - New config properties
    • getPartitionMetadata

      public static Properties getPartitionMetadata(HivePartition partition, HiveTableWithColumnCache table)
      Wrapper around MetaStoreUtils#getPartitionMetadata(org.apache.hadoop.hive.metastore.api.Partition, Table) which also adds parameters from table to properties returned by that method.
      Parameters:
      partition - the source of partition level parameters
      table - the source of table level parameters
      Returns:
      properties
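The merge this wrapper performs can be sketched with java.util.Properties alone. The class and method names here are illustrative, and the assumption that table parameters are simply overlaid on the partition properties is a simplification of the real method.

```java
import java.util.Map;
import java.util.Properties;

public class PartitionMetadataSketch {
    // Merge table-level parameters into partition-level properties.
    // Assumption: table parameters are applied on top of partition properties;
    // the real method's precedence rules may differ.
    public static Properties mergeTableParams(Properties partitionProps,
                                              Map<String, String> tableParams) {
        Properties merged = new Properties();
        merged.putAll(partitionProps);
        tableParams.forEach(merged::setProperty);
        return merged;
    }
}
```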
    • restoreColumns

      public static void restoreColumns(HiveTableWithColumnCache table, HivePartition partition)
      Sets columns from table cache to table and partition.
      Parameters:
      table - the source of column lists cache
partition - partition whose column list will be set
    • getTableMetadata

      public static Properties getTableMetadata(HiveTableWithColumnCache table)
Wrapper around MetaStoreUtils#getSchema(StorageDescriptor, StorageDescriptor, Map, String, String, List) which also sets columns from the table cache on the table and returns the properties produced by that method.
      Parameters:
      table - Hive table with cached columns
      Returns:
      Hive table metadata
    • throwUnsupportedHiveDataTypeError

      public static void throwUnsupportedHiveDataTypeError(String unsupportedType)
Generates an unsupported-type exception message that lists the supported types, and throws it as a user exception.
      Parameters:
      unsupportedType - unsupported type
    • retrieveIntProperty

      public static int retrieveIntProperty(Properties tableProperties, String propertyName, int defaultValue)
Returns the property value. If the property is absent, returns the given default value. If the property value is non-numeric, a NumberFormatException is thrown.
      Parameters:
      tableProperties - table properties
      propertyName - property name
defaultValue - default value used if the property is absent
      Returns:
      property value
      Throws:
      NumberFormatException - if property value is not numeric
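The documented behavior amounts to the following stdlib-only sketch (class name here is illustrative, not Drill's):

```java
import java.util.Properties;

public class IntPropertySketch {
    // Equivalent logic using only java.util.Properties: return the default
    // when the property is absent; let Integer.parseInt throw
    // NumberFormatException when the value is non-numeric.
    public static int retrieveIntProperty(Properties tableProperties,
                                          String propertyName, int defaultValue) {
        String value = tableProperties.getProperty(propertyName);
        if (value == null) {
            return defaultValue;
        }
        return Integer.parseInt(value);
    }
}
```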
    • hasHeaderOrFooter

      public static boolean hasHeaderOrFooter(HiveTableWithColumnCache table)
Checks if the given table has a header or footer. If at least one of them has a value greater than zero, the method returns true.
      Parameters:
      table - table with column cache instance
      Returns:
      true if table contains header or footer, false otherwise
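In Hive, header and footer sizes are carried in the table properties skip.header.line.count and skip.footer.line.count, so the check reduces to the sketch below (a stdlib-only illustration, not Drill's code):

```java
import java.util.Properties;

public class HeaderFooterSketch {
    // True if either the header or footer skip count is greater than zero.
    // Assumes the counts live in the standard Hive serde properties.
    public static boolean hasHeaderOrFooter(Properties tableProps) {
        int header = Integer.parseInt(tableProps.getProperty("skip.header.line.count", "0"));
        int footer = Integer.parseInt(tableProps.getProperty("skip.footer.line.count", "0"));
        return header > 0 || footer > 0;
    }
}
```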
    • verifyAndAddTransactionalProperties

      public static void verifyAndAddTransactionalProperties(org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.hive.metastore.api.StorageDescriptor sd)
This method checks whether the table is transactional and sets the necessary properties in the JobConf.
      If schema evolution properties aren't set in job conf for the input format, method sets the column names and types from table/partition properties or storage descriptor.
      Parameters:
      job - the job to update
      sd - storage descriptor
    • nativeReadersRuleMatches

      public static boolean nativeReadersRuleMatches(org.apache.calcite.plan.RelOptRuleCall call, Class tableInputFormatClass)
      Rule is matched when all of the following match:
• The GroupScan in the given DrillScanRel is a HiveScan
      • HiveScan is not already rewritten using Drill's native readers
• The InputFormat in the table metadata and in all partition metadata is the same
• No error occurred while checking the above conditions (any error is logged as a warning)
      Parameters:
      call - rule call
      Returns:
      True if the rule can be applied. False otherwise
    • generateHiveConf

      public static org.apache.hadoop.hive.conf.HiveConf generateHiveConf(Map<String,String> properties)
      Creates HiveConf based on given list of configuration properties.
      Parameters:
      properties - config properties
      Returns:
      instance of HiveConf
    • generateHiveConf

      public static org.apache.hadoop.hive.conf.HiveConf generateHiveConf(org.apache.hadoop.hive.conf.HiveConf hiveConf, Map<String,String> properties)
      Creates HiveConf based on properties in given HiveConf and configuration properties.
      Parameters:
      hiveConf - hive conf
      properties - config properties
      Returns:
      instance of HiveConf
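The difference between the two overloads is layering: the second starts from an existing HiveConf and applies the extra properties on top. Without the Hive dependency, the layering itself can be sketched with plain maps (names here are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HiveConfLayeringSketch {
    // Later properties override earlier ones, mirroring how the two-argument
    // generateHiveConf overload applies extra properties over a base HiveConf.
    public static Map<String, String> layerConfigs(Map<String, String> base,
                                                   Map<String, String> overrides) {
        Map<String, String> result = new LinkedHashMap<>(base);
        result.putAll(overrides);
        return result;
    }
}
```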
    • createPartitionWithSpecColumns

      public static HiveTableWrapper.HivePartitionWrapper createPartitionWithSpecColumns(HiveTableWithColumnCache table, org.apache.hadoop.hive.metastore.api.Partition partition)
Helper method that stores partition columns in the table's columnListCache. If the cache already contains exactly the same column list as the partition's, the partition stores the index of that identical list; otherwise the column list is added to the cache and the partition stores the index of the newly added list.
      Parameters:
      table - hive table instance
      partition - partition instance
      Returns:
      hive partition wrapper
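The caching scheme (store each distinct column list once, have partitions reference it by index) can be sketched in isolation; the class below is illustrative, not Drill's columnListCache:

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnListCacheSketch {
    private final List<List<String>> cache = new ArrayList<>();

    // Returns the index of the given column list in the cache, adding the
    // list first if no identical one is present. Partitions then need to
    // store only this index instead of a full column list.
    public int addOrGet(List<String> columns) {
        int index = cache.indexOf(columns);
        if (index >= 0) {
            return index;
        }
        cache.add(columns);
        return cache.size() - 1;
    }
}
```

Deduplicating by index keeps the serialized plan small when many partitions share one schema.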
    • getColumnMetadata

      public static ColumnMetadata getColumnMetadata(HiveToRelDataTypeConverter dataTypeConverter, org.apache.hadoop.hive.metastore.api.FieldSchema column)
Converts the specified FieldSchema column into ColumnMetadata. When the corresponding relDataType is a struct, a map with recursively converted children is created.
      Parameters:
      dataTypeConverter - converter to obtain Calcite's types from Hive's ones
      column - column to convert
      Returns:
      ColumnMetadata which corresponds to specified FieldSchema column