Class ListState
- All Implemented Interfaces:
VariantWriter.VariantWriterListener
At the writer level, the list consists of two parts: an array writer and a union writer. The union writer is needed because, unless the client tells us otherwise, we must be prepared for the list to become a union.
Holds the column states for the "columns" that make up the type members of the union, and implements the writer callbacks to add members to a list (disguised as a union), creating the actual union when the number of member types becomes two or more.
This class is similar to the UnionState, except that this version must handle the list transitions from no members to single member to union, and so this class is a bit more complex than the simple union case.
This implementation is based on a desired invariant: that once a client obtains a writer for the list, that writer never becomes invalid. This means we must carefully consider the list lifecycle. The list is represented as an array writer. When the list has no members, there would be no child for the array writer, so a call to listArray.entry() would have to return null, which would be awkward and unlike any other writer use case. Once the list has a single type, the call to listArray.entry() might return a writer for that type. But once the list becomes a repeated union, listArray.entry() would have to return a union writer. This is the kind of muddy semantics we wish to avoid.
Instead, we model the list as a repeated union at all times. When the
list has no type, then the list is a repeated union with no members.
Once the list has a member, we have a repeated union of one member type.
Finally, when adding another type, we have a repeated union of two
types. The key is that, in all cases, listArray.entry() returns a UnionWriter, so the client gets a consistent view.
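The "repeated union at all times" model can be illustrated with a minimal, self-contained sketch. The names ListLifecycleSketch, SketchListArray, and SketchUnionWriter are hypothetical stand-ins invented for this illustration; they are not Drill classes, and the real writers do considerably more. The sketch only shows the invariant: a writer obtained while the list has no type stays valid as member types are added.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the invariant only; these are NOT Drill's classes.
class ListLifecycleSketch {

  /** Stand-in for a union writer: one buffer per member type. */
  static class SketchUnionWriter {
    private final Map<String, StringBuilder> members = new LinkedHashMap<>();

    /** Returns the buffer for a member type, adding the type on first use. */
    StringBuilder member(String type) {
      return members.computeIfAbsent(type, t -> new StringBuilder());
    }

    int memberCount() {
      return members.size();
    }
  }

  /** Stand-in for the list's array writer. */
  static class SketchListArray {
    private final SketchUnionWriter union = new SketchUnionWriter();

    /** Always the same union view, whether the list has zero, one, or many types. */
    SketchUnionWriter entry() {
      return union;
    }
  }

  public static void main(String[] args) {
    SketchListArray list = new SketchListArray();
    SketchUnionWriter early = list.entry();       // obtained while the list has no type
    early.member("INT").append("1");              // list now has one member type
    early.member("VARCHAR").append("a");          // list is now a two-type union
    System.out.println(early == list.entry());    // prints "true"
    System.out.println(early.memberCount());      // prints "2"
  }
}
```

The design choice is that the client never sees the list change form, at the cost of extra bookkeeping hidden behind the union view.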
Since the list itself changes form (no type, single type, then union), we hide that lifecycle inside the writer and this list state. The result is that the client need not care about the odd list lifecycle. But, on the flip side, this class and the union writer must go out of their way to hide these details.
At the writer level, the union writer uses "shims" to map from the union view to the actual list representation (no type, single type, or union).
At this level, this class must handle those cases as well, creating the union (by promoting the list) when needed. The result is a bit complex (for the code here), but simple for the client.
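One way to picture the shim idea is a small state machine behind a fixed union view. The sketch below is an assumption-laden illustration: Shim, EmptyShim, SingleShim, UnionShim, and UnionView are hypothetical names chosen here and do not reflect Drill's actual shim classes or signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "shim" idea; NOT Drill's actual shim classes.
class ShimSketch {

  /** Maps the fixed union view onto the current list representation. */
  interface Shim {
    List<String> types();
    Shim addType(String type);   // returns the shim to use after the add
  }

  /** List with no member type yet. */
  static class EmptyShim implements Shim {
    public List<String> types() { return new ArrayList<>(); }
    public Shim addType(String type) { return new SingleShim(type); }
  }

  /** List with exactly one member type. */
  static class SingleShim implements Shim {
    private final String type;
    SingleShim(String type) { this.type = type; }
    public List<String> types() {
      List<String> result = new ArrayList<>();
      result.add(type);
      return result;
    }
    public Shim addType(String newType) {
      // Promotion: the existing type and the new type both move into a union.
      UnionShim union = new UnionShim();
      union.addType(type);
      union.addType(newType);
      return union;
    }
  }

  /** List promoted to a true union of two or more types. */
  static class UnionShim implements Shim {
    private final List<String> types = new ArrayList<>();
    public List<String> types() { return new ArrayList<>(types); }
    public Shim addType(String type) { types.add(type); return this; }
  }

  /** The client-facing view: its identity never changes, only its shim does. */
  static class UnionView {
    private Shim shim = new EmptyShim();
    void addType(String type) { shim = shim.addType(type); }
    List<String> types() { return shim.types(); }
    String form() { return shim.getClass().getSimpleName(); }
  }

  public static void main(String[] args) {
    UnionView view = new UnionView();
    System.out.println(view.form() + " " + view.types());   // EmptyShim []
    view.addType("INT");
    System.out.println(view.form() + " " + view.types());   // SingleShim [INT]
    view.addType("VARCHAR");
    System.out.println(view.form() + " " + view.types());   // UnionShim [INT, VARCHAR]
  }
}
```

The sketch does not guard against adding the same type twice; a real implementation would need to decide how to handle that case.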
-
Nested Class Summary
Modifier and Type, Class, Description
protected static class: Wrapper around the list vector (and its optional contained union).
-
Field Summary
Fields inherited from class org.apache.drill.exec.physical.resultSet.impl.ContainerState
loader, parentColumn, projectionSet, vectorCache
-
Constructor Summary
Constructor, Description
ListState(org.apache.drill.exec.physical.resultSet.impl.LoaderInternals loader, ResultVectorCache vectorCache)
-
Method Summary
Modifier and Type, Method, Description
protected void addColumn(ColumnState colState): Add a new column representing a type within the list.
addMember(ColumnMetadata member)
addType(TypeProtos.MinorType type)
protected Collection<ColumnState> columnStates()
int innerCardinality()
protected boolean isVersioned(): Reports whether this container is subject to version management.
void setSubColumn(ColumnState memberState): Set the one and only type when building a single-type list.
Methods inherited from class org.apache.drill.exec.physical.resultSet.impl.ContainerState
addColumn, bindColumnState, close, harvestWithLookAhead, loader, projection, rollover, startBatch, updateCardinality, vectorCache
-
Constructor Details
-
ListState
public ListState(org.apache.drill.exec.physical.resultSet.impl.LoaderInternals loader, ResultVectorCache vectorCache)
-
-
Method Details
-
variantSchema
-
listWriter
-
addType
- Specified by:
addType
in interface VariantWriter.VariantWriterListener
-
addMember
- Specified by:
addMember
in interface VariantWriter.VariantWriterListener
-
addColumn
Add a new column representing a type within the list. This is where the list strangeness occurs. The list starts with no type, then evolves to have a single type (held by the list vector). Upon the second type, the list vector is modified to hold a union vector, which then holds the existing type and the new type. After that, the third and later types are simply added to the union. Very, very ugly, but it is how the list vector works until we improve it.
We must make three parallel changes:
- Modify the list vector structure.
- Modify the union writer structure. (If a list type can evolve, then the writer structure is an array of unions. But, since the union itself does not exist in the 0 and 1 type cases, we use "shims" to model these odd cases.)
- Modify the vector state for the list. If the list is "promoted" to a union, then add the union to the list vector's state for management in vector events.
- Specified by:
addColumn
in class ContainerState
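The three parallel changes described for addColumn can be sketched as a single method that branches on how many member types the list already has. This is a hypothetical illustration only: PromotionSketch, SketchListState, and all of its fields are invented here, types are modeled as plain strings, and none of this mirrors Drill's actual vector or writer classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the promotion logic; NOT Drill's implementation.
class PromotionSketch {

  static class SketchListState {
    // Change 1: the "vector structure". Either a single child or a union of children.
    String singleChild;                  // set only while the list has exactly one type
    List<String> unionChildren;          // non-null only after promotion to a union

    // Change 2: the "writer structure", modeled as the name of the active shim.
    String writerForm = "empty";

    // Change 3: the "vector state". True once the union is managed for vector events.
    boolean unionManaged = false;

    void addColumn(String type) {
      if (singleChild == null && unionChildren == null) {
        // First type: the list vector holds the type directly.
        singleChild = type;
        writerForm = "single";
      } else if (unionChildren == null) {
        // Second type: promote. The union holds the existing type and the new one.
        unionChildren = new ArrayList<>();
        unionChildren.add(singleChild);
        unionChildren.add(type);
        singleChild = null;
        writerForm = "union";
        unionManaged = true;
      } else {
        // Third and later types: simply add to the existing union.
        unionChildren.add(type);
      }
    }
  }

  public static void main(String[] args) {
    SketchListState state = new SketchListState();
    state.addColumn("INT");
    System.out.println(state.writerForm + " " + state.unionManaged);   // single false
    state.addColumn("VARCHAR");
    System.out.println(state.writerForm + " " + state.unionChildren);  // union [INT, VARCHAR]
    state.addColumn("FLOAT8");
    System.out.println(state.unionChildren.size());                    // 3
  }
}
```

Keeping all three updates inside one branch per case is what makes them "parallel": the vector form, the writer form, and the managed state never disagree about which stage the list is in.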
-
setSubColumn
Set the one and only type when building a single-type list.
- Parameters:
memberState
- the column state for the list elements
-
columnStates
- Specified by:
columnStates
in class ContainerState
-
innerCardinality
public int innerCardinality()
- Specified by:
innerCardinality
in class ContainerState
-
isVersioned
protected boolean isVersioned()
Description copied from class: ContainerState
Reports whether this container is subject to version management. Version management adds columns to the output container at harvest time based on whether they should appear in the output batch.
- Specified by:
isVersioned
in class ContainerState
- Returns:
true
if versioned
-