java.lang.Object
org.apache.drill.exec.physical.resultSet.impl.ContainerState
org.apache.drill.exec.physical.resultSet.impl.ListState
All Implemented Interfaces:
VariantWriter.VariantWriterListener

public class ListState extends ContainerState implements VariantWriter.VariantWriterListener
Represents the contents of a list vector. A list vector is an odd creature. It starts as a list of nothing, evolves to be a nullable array of a single type, then becomes a nullable array of nullable unions holding nullable types.

At the writer level, the list consists of two parts: an array writer and a union writer. The union writer is needed because, unless the client tells us otherwise, we must be prepared for the list to become a union.

Holds the column states for the "columns" that make up the type members of the union, and implements the writer callbacks to add members to a list (disguised as a union), creating the actual union when the number of member types becomes two or more.

This class is similar to the UnionState, except that this version must handle the list transitions from no members to single member to union, and so this class is a bit more complex than the simple union case.

This implementation is based on a desired invariant: once a client obtains a writer for the list, that writer never becomes invalid. This means we must carefully consider the list lifecycle. The list is represented as an array writer. When the list has no members, there would be no child for the array writer, so a call to listArray.entry() would have to return null, which would be awkward and unlike any other writer use case. Once the list has a single type, the call to listArray.entry() might return a writer for that type. But once the list becomes a repeated union, listArray.entry() would have to return a union writer. This is the kind of muddy semantics we wish to avoid.

Instead, we model the list as a repeated union at all times. When the list has no type, then the list is a repeated union with no members. Once the list has a member, we have a repeated union of one member type. Finally, when adding another type, we have a repeated union of two types. The key is, in all cases, listArray.entry() returns a UnionWriter, so the client gets a consistent view.
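
The invariant described above can be illustrated with a minimal sketch. This is a toy model, not Drill code: ToyListWriter and ToyUnionWriter are hypothetical stand-ins for the array writer and the union writer, and type names are plain strings. It only shows that the object returned by entry() keeps the same identity while the member count goes from zero to one to two.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class StableListViewSketch {

  /** Hypothetical stand-in for the union writer the client sees. */
  static class ToyUnionWriter {
    private final Set<String> memberTypes = new LinkedHashSet<>();

    void addType(String type) { memberTypes.add(type); }

    int memberCount() { return memberTypes.size(); }
  }

  /** Hypothetical stand-in for the array writer that represents the list. */
  static class ToyListWriter {
    private final ToyUnionWriter element = new ToyUnionWriter();

    // Always returns the same union-style writer, whatever the member count.
    ToyUnionWriter entry() { return element; }
  }

  public static void main(String[] args) {
    ToyListWriter listArray = new ToyListWriter();
    ToyUnionWriter writer = listArray.entry();  // obtained with zero members

    writer.addType("INT");      // list now has a single member type
    writer.addType("VARCHAR");  // list now has two member types

    // The writer obtained at the start is still the one entry() returns.
    System.out.println(writer == listArray.entry());  // prints "true"
    System.out.println(writer.memberCount());         // prints "2"
  }
}
```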

Since the list itself changes form (no type, single type, then union), we hide that lifecycle inside the writer and this list state. The result is that the client need not care about the odd list lifecycle. On the flip side, this class and the union writer must go out of their way to hide these details.

At the writer level, the union writer uses "shims" to map from the union view to the actual list representation (no type, single type, or union).
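
One way to picture the shim idea is as a delegate that the union writer swaps out as the list changes form. The sketch below is an assumption about the general shape only: the Shim interface, the three shim classes, and the layout() strings are all hypothetical and do not mirror the actual Drill shim classes or their methods.

```java
import java.util.ArrayList;
import java.util.List;

public class UnionShimSketch {

  /** Hypothetical shim: maps the union view onto one physical layout. */
  interface Shim {
    String layout();

    // Returns the shim to use after the type has been added.
    Shim addType(String type);
  }

  /** List with no member type yet. */
  static class EmptyShim implements Shim {
    public String layout() { return "list of nothing"; }

    public Shim addType(String type) { return new SingleTypeShim(type); }
  }

  /** List holding exactly one member type directly. */
  static class SingleTypeShim implements Shim {
    private final String type;

    SingleTypeShim(String type) { this.type = type; }

    public String layout() { return "list of " + type; }

    public Shim addType(String newType) {
      // Second type: carry the existing type into a new union.
      UnionShim union = new UnionShim();
      union.types.add(type);
      union.types.add(newType);
      return union;
    }
  }

  /** List holding a union of two or more member types. */
  static class UnionShim implements Shim {
    final List<String> types = new ArrayList<>();

    public String layout() { return "list of union" + types; }

    public Shim addType(String type) {
      types.add(type);
      return this;
    }
  }

  /** Client-facing writer: one identity, with the shim swapped internally. */
  static class ToyUnionWriter {
    private Shim shim = new EmptyShim();

    void addType(String type) { shim = shim.addType(type); }

    String layout() { return shim.layout(); }
  }

  public static void main(String[] args) {
    ToyUnionWriter writer = new ToyUnionWriter();
    System.out.println(writer.layout());  // prints "list of nothing"
    writer.addType("INT");
    System.out.println(writer.layout());  // prints "list of INT"
    writer.addType("VARCHAR");
    System.out.println(writer.layout());  // prints "list of union[INT, VARCHAR]"
  }
}
```

The client only ever holds the ToyUnionWriter; which shim sits behind it is an internal detail.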

At this level, this class must handle those cases as well, creating the union (by promoting the list) when needed. The result is a bit complex (for the code here), but simple for the client.

  • Constructor Details

    • ListState

      public ListState(org.apache.drill.exec.physical.resultSet.impl.LoaderInternals loader, ResultVectorCache vectorCache)
  • Method Details

    • variantSchema

      public VariantMetadata variantSchema()
    • listWriter

      public ListWriterImpl listWriter()
    • addType

      public ObjectWriter addType(TypeProtos.MinorType type)
      Specified by:
      addType in interface VariantWriter.VariantWriterListener
    • addMember

      public ObjectWriter addMember(ColumnMetadata member)
      Specified by:
      addMember in interface VariantWriter.VariantWriterListener
    • addColumn

      protected void addColumn(ColumnState colState)
      Add a new column representing a type within the list. This is where the list strangeness occurs. The list starts with no type, then evolves to have a single type (held by the list vector). Upon the second type, the list vector is modified to hold a union vector, which then holds the existing type and the new type. After that, the third and later types are simply added to the union. Very, very ugly, but it is how the list vector works until we improve it.

      We must make three parallel changes:

      • Modify the list vector structure.
      • Modify the union writer structure. (If a list type can evolve, then the writer structure is an array of unions. But, since the union itself does not exist in the 0 and 1 type cases, we use "shims" to model these odd cases.)
      • Modify the vector state for the list. If the list is "promoted" to a union, then add the union to the list vector's state for management in vector events.
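
The transitions and the three parallel changes listed above can be sketched as a small state machine. This is a simplified illustration under stated assumptions: ToyListVector, ToyListState, the writerShim string, and the managedVectors list are hypothetical names, and real vectors, writers, and vector states are reduced to strings.

```java
import java.util.ArrayList;
import java.util.List;

public class ListPromotionSketch {

  /** Hypothetical toy list vector: child is nothing, one type, or a union. */
  static class ToyListVector {
    String singleChild;         // set only in the single-type case
    List<String> unionMembers;  // non-null only after promotion to a union
  }

  /** Hypothetical toy list state that keeps the three structures in step. */
  static class ToyListState {
    final ToyListVector vector = new ToyListVector();            // vector structure
    String writerShim = "empty";                                 // writer structure
    final List<String> managedVectors = new ArrayList<>();       // vector state

    void addColumn(String type) {
      if (vector.unionMembers != null) {
        // Third and later types: simply add to the existing union.
        vector.unionMembers.add(type);
      } else if (vector.singleChild == null) {
        // First type: the list vector holds it directly.
        vector.singleChild = type;  // change 1: vector structure
        writerShim = "single";      // change 2: writer structure
      } else {
        // Second type: promote the list so that it holds a union.
        vector.unionMembers = new ArrayList<>();
        vector.unionMembers.add(vector.singleChild);
        vector.unionMembers.add(type);
        vector.singleChild = null;    // change 1: vector structure
        writerShim = "union";         // change 2: writer structure
        managedVectors.add("union");  // change 3: union joins the vector state
      }
    }
  }

  public static void main(String[] args) {
    ToyListState state = new ToyListState();
    state.addColumn("INT");
    state.addColumn("VARCHAR");
    state.addColumn("FLOAT8");
    System.out.println(state.writerShim);           // prints "union"
    System.out.println(state.vector.unionMembers);  // prints "[INT, VARCHAR, FLOAT8]"
  }
}
```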
      Specified by:
      addColumn in class ContainerState
    • setSubColumn

      public void setSubColumn(ColumnState memberState)
      Set the one and only type when building a single-type list.
      Parameters:
      memberState - the column state for the list elements
    • columnStates

      protected Collection<ColumnState> columnStates()
      Specified by:
      columnStates in class ContainerState
    • innerCardinality

      public int innerCardinality()
      Specified by:
      innerCardinality in class ContainerState
    • isVersioned

      protected boolean isVersioned()
      Description copied from class: ContainerState
      Reports whether this container is subject to version management. Version management adds columns to the output container at harvest time based on whether they should appear in the output batch.
      Specified by:
      isVersioned in class ContainerState
      Returns:
      true if versioned