Interface VariantMetadata

All Known Implementing Classes:
VariantSchema

public interface VariantMetadata
Describes the contents of a list or union field. Such fields are, in effect, a map from minor type to vector, represented here as a map from minor type to column metadata. The child columns used here are a useful fiction: each column's name is made up to match the name of its type.

In Drill, a union and a list are related, but distinct. In metadata, a union is an optional variant while a list is a variant array. This makes the representation simpler and should be a good-enough approximation of reality.

Variants can contain three kinds of children:

  • Nullable (optional) scalar vectors.
  • Non-nullable (required) maps.
  • Nullable (optional) lists.

A union cannot contain a repeated vector; instead, the union can contain a list. Note also that maps can never be optional, so they are required in the union, even though the map is, in effect, optional (the map is effectively null if it is not used for a given row). Yes, this is confusing, but it is how the vectors are implemented (for now).
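The member rules above can be summarized as a small validity check. The following is only a sketch: `VariantMemberRules`, `Kind`, and `Mode` are hypothetical stand-ins invented for illustration, not Drill classes, and Drill's own type and mode enums are not used here.

```java
/** Hypothetical stand-ins for illustration; these are not Drill classes. */
class VariantMemberRules {
  enum Kind { SCALAR, MAP, LIST }
  enum Mode { OPTIONAL, REQUIRED, REPEATED }

  /** Mode that a variant member of the given kind is expected to have. */
  static Mode expectedMode(Kind kind) {
    // Maps can never be optional, so they appear as required members.
    // Scalars and lists are nullable (optional) members.
    return kind == Kind.MAP ? Mode.REQUIRED : Mode.OPTIONAL;
  }

  /** A repeated member is never valid: the union holds a list instead. */
  static boolean isValidMember(Kind kind, Mode mode) {
    return mode == expectedMode(kind);
  }

  public static void main(String[] args) {
    System.out.println(isValidMember(Kind.SCALAR, Mode.OPTIONAL));
    System.out.println(isValidMember(Kind.MAP, Mode.REQUIRED));
    System.out.println(isValidMember(Kind.SCALAR, Mode.REPEATED));
  }
}
```

In this sketch a repeated member of any kind is rejected, mirroring the rule that arrays inside a union are expressed as a list member.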

A list type is modeled here as a repeated union type. This is not entirely accurate, but it is another useful fiction. (In the actual implementation, a list is either a single type or an array of unions. This detail is abstracted away here.)

In the vector implementation, unions declare their member types, but lists don't. Here, both types declare their member types. (Another useful fiction.)

A union or list can contain a map. Maps have structure. To support this, the metadata allows adding a map column that contains the map structure. Such metadata exists only in this system; it is not easily accessible in the vector implementation.

A union or list can contain a list (though not a union). As described here, lists can have structure, and so, like maps, can be built using a column that provides that structure.

Note that the Drill UNION and LIST implementations are considered experimental and are not generally enabled. As a result, this metadata schema must also be considered experimental and subject to change.

  • Method Details

    • addType

      ColumnMetadata addType(TypeProtos.MinorType type)
      Add any supported type to the variant.

      At present, the union vector does not support the decimal types. This class does not reject such types, but they will cause a runtime exception when code asks the union vector for these types.

      Parameters:
      type - type to add
      Returns:
      the "virtual" column for that type
      Throws:
      IllegalArgumentException - if the type has already been added
    • addType

      void addType(ColumnMetadata col)
      Add a column for any supported type to the variant. Use this to add structure to a list or map member.
      Parameters:
      col - column to add. The column must have the correct mode. The column's type is used as the type key
      Throws:
      IllegalArgumentException - if the type has already been added, or if the mode is wrong
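The contract of the two addType overloads can be illustrated with a small in-memory model. This is a minimal sketch under stated assumptions: `VariantSketch`, its `Type` and `Mode` enums, and its `Column` class are hypothetical stand-ins, not Drill's VariantMetadata, TypeProtos.MinorType, or ColumnMetadata. It only mirrors the documented behavior: one column per type key, a made-up column name equal to the type name, and an IllegalArgumentException for a duplicate type or a wrong mode.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical model of a variant's type-to-column map; not Drill code. */
class VariantSketch {
  enum Type { INT, VARCHAR, MAP, LIST }
  enum Mode { OPTIONAL, REQUIRED, REPEATED }

  /** Stand-in for column metadata: a name, a type key, and a mode. */
  static final class Column {
    final String name;
    final Type type;
    final Mode mode;

    Column(String name, Type type, Mode mode) {
      this.name = name;
      this.type = type;
      this.mode = mode;
    }
  }

  private final Map<Type, Column> members = new EnumMap<>(Type.class);

  /** Maps are required members; every other member is optional. */
  private static Mode expectedMode(Type type) {
    return type == Type.MAP ? Mode.REQUIRED : Mode.OPTIONAL;
  }

  /** Adds a type and returns its "virtual" column, named after the type. */
  Column addType(Type type) {
    Column col = new Column(type.name(), type, expectedMode(type));
    addType(col);
    return col;
  }

  /** Adds a caller-built column; the column's type is used as the key. */
  void addType(Column col) {
    if (col.mode != expectedMode(col.type)) {
      throw new IllegalArgumentException("Wrong mode for type " + col.type);
    }
    if (members.containsKey(col.type)) {
      throw new IllegalArgumentException("Type already added: " + col.type);
    }
    members.put(col.type, col);
  }

  int size() { return members.size(); }

  boolean hasType(Type type) { return members.containsKey(type); }

  /** Returns the column for the type, or null if it is not a member. */
  Column member(Type type) { return members.get(type); }

  public static void main(String[] args) {
    VariantSketch variant = new VariantSketch();
    variant.addType(Type.INT);
    variant.addType(new Column("MAP", Type.MAP, Mode.REQUIRED));
    System.out.println("members: " + variant.size());
    try {
      variant.addType(Type.INT); // second INT is a duplicate
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The column-based overload is the one a caller would use to attach structure, for example a map column that already describes its own members; the sketch leaves that nested structure out.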
    • size

      int size()
      Returns the number of types in the variant.
      Returns:
      the number of types in the variant
    • hasType

      boolean hasType(TypeProtos.MinorType type)
      Determine if the given type is a member of the variant.
      Parameters:
      type - type to check
      Returns:
      true if the type is a member, false if not
    • types

      Returns the list of types which are members of this variant.
      Returns:
      the list of types
    • members

    • member

      ColumnMetadata member(TypeProtos.MinorType type)
      Retrieve the virtual column for a given type.
      Parameters:
      type - the type key
      Returns:
      the virtual column, or null if the type is not a member of the variant
    • parent

      ColumnMetadata parent()
      Returns the column that defines this variant structure.
      Returns:
      the column that returns this variant structure from its variantSchema() method
    • isSingleType

      boolean isSingleType()
      A list is defined as a list of variants at the metadata layer. In the implementation, however, a list does special processing if the variant (union) contains only one type.
      Returns:
      true if this variant contains exactly one type, false if the variant contains zero types or two or more types
    • listSubtype

      ColumnMetadata listSubtype()
      Lists are odd creatures: they contain a union if they have more than one subtype, but are like a nullable repeated type if they contain only one type. This method returns the type of the array: either the single type (if isSingleType() is true) or a reference to the synthetic union column nested inside the list.
      Returns:
      the metadata for the implicit column within the list
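The relationship between isSingleType() and listSubtype() can be sketched as follows. This is an illustrative model only: `ListSubtypeSketch` is hypothetical, member types are plain strings, and the name "UNION" for the synthetic union column is an assumption made for the example rather than Drill's actual naming.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of how a list picks its element type; not Drill code. */
class ListSubtypeSketch {
  private final List<String> memberTypes = new ArrayList<>();

  void addType(String typeName) {
    memberTypes.add(typeName);
  }

  /** True only for exactly one member type (not zero, not two or more). */
  boolean isSingleType() {
    return memberTypes.size() == 1;
  }

  /** The element column: the sole member type, or a synthetic union. */
  String listSubtype() {
    // "UNION" is an assumed placeholder name for the synthetic column.
    return isSingleType() ? memberTypes.get(0) : "UNION";
  }

  public static void main(String[] args) {
    ListSubtypeSketch list = new ListSubtypeSketch();
    list.addType("INT");
    System.out.println(list.isSingleType() + " " + list.listSubtype());
    list.addType("VARCHAR");
    System.out.println(list.isSingleType() + " " + list.listSubtype());
  }
}
```

Note that in this sketch an empty list is not single-typed, so it falls through to the union case, matching the documented return value of isSingleType() for zero types.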
    • becomeSimple

      void becomeSimple()
    • isSimple

      boolean isSimple()