Class UnionWriterImpl
java.lang.Object
org.apache.drill.exec.vector.accessor.writer.UnionWriterImpl
- All Implemented Interfaces: ColumnWriter, VariantWriter, WriterEvents, WriterPosition
Writer to a union vector.
A union vector has three attributes: null flag, type, and value. The union vector itself holds the type; a bundle of other vectors holds the values. The type says which of the other vectors to consult to write the value. If a column value is null, then we consult no other vectors. If all values (thus far) are null, then there are no associated data vectors.
The protocol is to first set the type. Doing so creates the associated data vector if it does not yet exist. This highlights the poor design of this vector: if we have even one value of a given type, we must have a vector that holds values for all rows, and then we ignore the unwanted values.
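The layout described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyUnionVector`, `Kind`, `slotCount` and the list-backed "vectors" are hypothetical names invented for illustration and are not Drill classes. It shows that a data vector appears only when its type is first set, and that every materialized data vector then carries a slot for every row.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not the Drill UnionVector.
class ToyUnionVector {
    enum Kind { NULL, INT, VARCHAR }

    // One type entry per row (the "type" attribute of the union).
    private final List<Kind> types = new ArrayList<>();
    // One data vector per materialized type (the "value" attribute).
    private final Map<Kind, List<Object>> data = new EnumMap<>(Kind.class);

    // A null row consults no data vector and creates none.
    void writeNull() {
        appendRow(Kind.NULL, null);
    }

    // Setting the type materializes the data vector on first use.
    void write(Kind kind, Object value) {
        data.computeIfAbsent(kind, k -> new ArrayList<>());
        appendRow(kind, value);
    }

    private void appendRow(Kind kind, Object value) {
        int row = types.size();
        types.add(kind);
        for (Map.Entry<Kind, List<Object>> e : data.entrySet()) {
            List<Object> vec = e.getValue();
            // Back-fill rows written before this vector existed.
            while (vec.size() < row) {
                vec.add(null);
            }
            // Only the vector named by the type gets a real value;
            // every other vector gets an ignored placeholder slot.
            vec.add(e.getKey() == kind ? value : null);
        }
    }

    boolean hasType(Kind kind) {
        return data.containsKey(kind);
    }

    // Number of slots held by the data vector for a type (0 if not materialized).
    int slotCount(Kind kind) {
        List<Object> vec = data.get(kind);
        return vec == null ? 0 : vec.size();
    }
}
```

In this toy, writing one null, one INT and one VARCHAR leaves both data vectors with three slots each, although each holds only one meaningful value. That is the cost the description above calls out.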
-
Nested Class Summary
Nested Classes
Nested classes/interfaces inherited from interface org.apache.drill.exec.vector.accessor.VariantWriter: VariantWriter.VariantWriterListener
Nested classes/interfaces inherited from interface org.apache.drill.exec.vector.accessor.writer.WriterEvents: WriterEvents.ColumnWriterListener, WriterEvents.State
-
Constructor Summary
Constructors
UnionWriterImpl(ColumnMetadata schema)
UnionWriterImpl(ColumnMetadata schema, UnionVector vector, AbstractObjectWriter[] variants)
-
Method Summary
addMember(ColumnMetadata colSchema)
protected void addMember(AbstractObjectWriter writer): Add a column writer to an existing union writer.
array()
void bindIndex(ColumnWriterIndex index): Bind the writer to a writer index.
void bindListener(VariantWriter.VariantWriterListener listener)
void bindListener(WriterEvents.ColumnWriterListener listener): Bind a listener to the underlying vector writer.
void copy(ColumnReader from): Copy a single value from the given reader, which must be of the same type as this writer.
void dump(HierarchicalFormatter format)
void endArrayValue(): End a value.
void endWrite(): End a batch: finalize any vector values.
boolean hasType(TypeProtos.MinorType type): Determine if the union vector has materialized storage for the given type.
index()
boolean isProjected(): Whether this writer is projected (is backed by a materialized vector) or unprojected (is just a dummy writer). In most cases, clients can ignore whether the column is projected and just write to the writer.
int lastWriteIndex(): Return the last write position in the vector.
listener()
member(TypeProtos.MinorType type): Set the type of the present value and get the writer for that type.
memberWriter(TypeProtos.MinorType type): Create or retrieve a writer for the given type.
boolean nullable(): Whether this writer allows nulls.
void postRollover(): The vectors backing this writer rolled over.
void preRollover(): The vectors backing this writer are about to roll over.
void restartRow(): During a write to a row, rewind the current index position to restart the row.
int rowStartIndex(): Position within the vector of the first value for the current row.
void saveRow(): Saves a row.
scalar(TypeProtos.MinorType type)
schema(): Returns the schema of the column associated with this writer.
void setNull(): Set the current value to null.
void setObject(Object value): Generic technique to write data as a generic Java object.
void setType(TypeProtos.MinorType type): Explicitly set the type of the present value.
shim()
int size(): Returns the number of types in the variant.
void startRow(): Start a new row.
void startWrite(): Start a write (batch) operation.
state()
tuple()
type(): Return the object (structure) type of this writer.
variantSchema(): Metadata description of the variant that includes the set of types, along with extended properties of the types such as expected allocation sizes, expected array cardinality, etc.
int writeIndex(): Current write index for the writer.
-
Constructor Details
-
UnionWriterImpl
-
UnionWriterImpl
-
-
Method Details
-
bindIndex
Description copied from interface: WriterEvents
Bind the writer to a writer index.
- Specified by: bindIndex in interface WriterEvents
- Parameters: index - the writer index (top level or nested for arrays)
-
bindListener
-
bindListener
Description copied from interface: WriterEvents
Bind a listener to the underlying vector writer. This listener reports on vector events (overflow, growth), and so is called only when the writer is backed by a vector. The listener is ignored (and never called) for dummy (non-projected) columns. If the column is compound (such as for a nullable or repeated column, or for a map), then the writer is bound to the individual components.
- Specified by: bindListener in interface WriterEvents
- Parameters: listener - the vector event listener to bind
-
state
-
index
-
listener
-
shim
-
elementPosition
-
bindShim
-
type
Description copied from interface: ColumnWriter
Return the object (structure) type of this writer.
- Specified by: type in interface ColumnWriter
- Returns: type indicating if this is a scalar, tuple or array
-
nullable
public boolean nullable()
Description copied from interface: ColumnWriter
Whether this writer allows nulls. This is not as simple as checking for the TypeProtos.DataMode.OPTIONAL type in the schema. List entries are nullable if they are primitive, but not if they are maps or lists. Unions are nullable, regardless of cardinality.
- Specified by: nullable in interface ColumnWriter
- Returns: true if a call to ColumnWriter.setNull() is supported, false if not
-
schema
Description copied from interface: ColumnWriter
Returns the schema of the column associated with this writer.
- Specified by: schema in interface ColumnWriter
- Returns: schema for this writer's column
-
variantSchema
Description copied from interface: VariantWriter
Metadata description of the variant that includes the set of types, along with extended properties of the types such as expected allocation sizes, expected array cardinality, etc.
- Specified by: variantSchema in interface VariantWriter
- Returns: metadata for the variant
-
size
public int size()
Description copied from interface: VariantWriter
Returns the number of types in the variant. Some implementations (such as lists) impart special meaning to a variant with a single type.
- Specified by: size in interface VariantWriter
- Returns: number of types in the variant
-
hasType
Description copied from interface: VariantWriter
Determine if the union vector has materialized storage for the given type. (The storage will be created as needed during writing.)
- Specified by: hasType in interface VariantWriter
- Parameters: type - data type
- Returns: true if a value of the given type has been written and storage allocated (or storage was allocated implicitly), false otherwise
-
setNull
public void setNull()
Description copied from interface: ColumnWriter
Set the current value to null. Support depends on the underlying implementation: only nullable types support this operation. Throws IllegalStateException if called on a non-nullable value.
- Specified by: setNull in interface ColumnWriter
-
memberWriter
Description copied from interface: VariantWriter
Create or retrieve a writer for the given type. Use this form when caching writers. This form does not set the type of the current row; call VariantWriter.setType(MinorType) per row when the writers are cached. This method can be called at any time as it does not depend on an active batch.
- Specified by: memberWriter in interface VariantWriter
- Parameters: type - the type of the writer to cache
- Returns: the writer for that type without setting the type of the current row
-
member
Description copied from interface: VariantWriter
Set the type of the present value and get the writer for that type. Available only when a batch is active. Use this form to declare the type of the current row, and retrieve a writer for that value.
- Specified by: member in interface VariantWriter
- Parameters: type - type to set for the current row
- Returns: writer for the type just set
-
setType
Description copied from interface: VariantWriter
Explicitly set the type of the present value. Use this when the writers are cached. The writer must already exist.
- Specified by: setType in interface VariantWriter
- Parameters: type - type to set for the current row
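The difference between member(), memberWriter() and setType() can be sketched with a toy stand-in. Everything here is an assumption made for illustration: `ToyVariantWriter`, `Kind`, `MemberWriter` and `startBatch` are hypothetical names, not Drill classes, and the toy treats each setType() call as one row. It only mirrors the three rules stated above: memberWriter() works at any time and leaves the row type alone, setType() requires an existing writer, and member() does both steps while a batch is active.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a variant writer; not the Drill implementation.
class ToyVariantWriter {
    enum Kind { INT, VARCHAR }

    // Hypothetical per-type writer that records the values written through it.
    static class MemberWriter {
        final List<Object> values = new ArrayList<>();

        void set(Object value) {
            values.add(value);
        }
    }

    private final Map<Kind, MemberWriter> members = new EnumMap<>(Kind.class);
    private final List<Kind> rowTypes = new ArrayList<>();
    private boolean batchActive;

    void startBatch() {
        batchActive = true;
    }

    // Create or retrieve the writer; does not touch the current row's type,
    // so it is safe to call before a batch starts (the caching form).
    MemberWriter memberWriter(Kind kind) {
        return members.computeIfAbsent(kind, k -> new MemberWriter());
    }

    // Declare the current row's type; the writer must already exist.
    void setType(Kind kind) {
        if (!batchActive) {
            throw new IllegalStateException("no active batch");
        }
        if (!members.containsKey(kind)) {
            throw new IllegalStateException("no writer for " + kind);
        }
        rowTypes.add(kind);
    }

    // Convenience form: set the type and return the writer in one call.
    MemberWriter member(Kind kind) {
        MemberWriter writer = memberWriter(kind);
        setType(kind);
        return writer;
    }

    List<Kind> rowTypes() {
        return rowTypes;
    }
}
```

A caller that caches writers would call memberWriter() once per type up front, then setType() followed by a write on the cached writer for each row. A caller that does not cache can use member() per row instead.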
-
addMember
- Specified by: addMember in interface VariantWriter
-
addMember
- Specified by: addMember in interface VariantWriter
-
addMember
Add a column writer to an existing union writer. Used for implementations that support "live" schema evolution: column discovery while writing. The corresponding metadata must already have been added to the schema. Called by the shim's addMember to do writer-level tasks.
- Parameters: writer - the column writer to add
-
scalar
- Specified by: scalar in interface VariantWriter
-
tuple
- Specified by: tuple in interface VariantWriter
-
array
- Specified by: array in interface VariantWriter
-
isProjected
public boolean isProjected()
Description copied from interface: ColumnWriter
Whether this writer is projected (is backed by a materialized vector) or unprojected (is just a dummy writer). In most cases, clients can ignore whether the column is projected and just write to the writer. This flag handles those special cases where it is helpful to know if the column is projected or not.
- Specified by: isProjected in interface ColumnWriter
-
startWrite
public void startWrite()
Description copied from interface: WriterEvents
Start a write (batch) operation. Performs any vector initialization required at the start of a batch (especially for offset vectors).
- Specified by: startWrite in interface WriterEvents
-
startRow
public void startRow()
Description copied from interface: WriterEvents
Start a new row. To be called only when a row is not active. To restart a row, call WriterEvents.restartRow() instead.
- Specified by: startRow in interface WriterEvents
-
endArrayValue
public void endArrayValue()
Description copied from interface: WriterEvents
End a value. Similar to WriterEvents.saveRow(), but the save of a value is conditional on saving the row. This version is primarily of use in tuples nested inside arrays: it saves each tuple within the array, advancing to a new position in the array. The update of the array's offset vector based on the cumulative value saves is done when saving the row.
- Specified by: endArrayValue in interface WriterEvents
-
restartRow
public void restartRow()
Description copied from interface: WriterEvents
During a write to a row, rewind the current index position to restart the row. Done when abandoning the current row, such as when filtering out a row at read time.
- Specified by: restartRow in interface WriterEvents
-
saveRow
public void saveRow()
Description copied from interface: WriterEvents
Saves a row. Commits offset vector locations and advances each to the next position. Can be called only when a row is active.
- Specified by: saveRow in interface WriterEvents
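The row lifecycle described by startRow(), restartRow() and saveRow() can be pictured with a toy writer for a single repeated (array) column. This is a sketch under assumptions: `ToyRowWriter`, its list-backed storage and the `add` method are hypothetical and only imitate the preconditions stated above (start only when no row is active, restart and save only when one is).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical row writer for one repeated (array) column; not Drill code.
class ToyRowWriter {
    private final List<Integer> values = new ArrayList<>();  // flattened array elements
    private final List<Integer> offsets = new ArrayList<>(); // end offset of each saved row
    private int rowStart;      // position of the first value of the current row
    private boolean rowActive;

    void startRow() {
        if (rowActive) {
            throw new IllegalStateException("row already active; use restartRow()");
        }
        rowStart = values.size();
        rowActive = true;
    }

    void add(int value) {
        requireActiveRow();
        values.add(value);
    }

    // Abandon the values written so far for this row and rewind to its start.
    void restartRow() {
        requireActiveRow();
        values.subList(rowStart, values.size()).clear();
    }

    // Commit the row: record its end offset and close the row.
    void saveRow() {
        requireActiveRow();
        offsets.add(values.size());
        rowActive = false;
    }

    private void requireActiveRow() {
        if (!rowActive) {
            throw new IllegalStateException("no active row");
        }
    }

    List<Integer> values() {
        return values;
    }

    List<Integer> offsets() {
        return offsets;
    }
}
```

Writing 1 and 2, restarting, writing 3 and saving leaves only 3 in the column with a single committed offset, which is the effect a reader gets when it filters out a partially written row.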
-
preRollover
public void preRollover()
Description copied from interface: WriterEvents
The vectors backing this writer are about to roll over. Finish the current batch up to, but not including, the current row.
- Specified by: preRollover in interface WriterEvents
-
postRollover
public void postRollover()
Description copied from interface: WriterEvents
The vectors backing this writer rolled over. This means that data for the current row has been rolled over into a new vector. Offsets and indexes should be shifted based on the understanding that data for the current row now resides at the start of a new vector instead of its previous location elsewhere in an old vector.
- Specified by: postRollover in interface WriterEvents
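The index shift that preRollover() and postRollover() describe can be shown with plain lists. This is a minimal sketch, assuming a single flat vector; `ToyRollover`, `Result` and `rollover` are hypothetical names, and the real writers perform this bookkeeping across several vectors rather than in one static method.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of rollover bookkeeping; hypothetical names, not Drill code.
class ToyRollover {
    static final class Result {
        final List<Integer> finishedBatch; // old vector, up to but not including the current row
        final List<Integer> newVector;     // current row's data, now at the start of a new vector
        final int rowStart;                // row start after the shift
        final int writeIndex;              // next write position after the shift

        Result(List<Integer> finishedBatch, List<Integer> newVector, int rowStart, int writeIndex) {
            this.finishedBatch = finishedBatch;
            this.newVector = newVector;
            this.rowStart = rowStart;
            this.writeIndex = writeIndex;
        }
    }

    // vector: values written so far; rowStart: first value of the current row;
    // writeIndex: next position to write (here, the end of the written values).
    static Result rollover(List<Integer> vector, int rowStart, int writeIndex) {
        // "pre" step: the finished batch excludes the in-flight row.
        List<Integer> finished = new ArrayList<>(vector.subList(0, rowStart));
        // The in-flight row moves to the start of a new vector.
        List<Integer> fresh = new ArrayList<>(vector.subList(rowStart, writeIndex));
        // "post" step: every index shifts down by the old row start.
        return new Result(finished, fresh, 0, writeIndex - rowStart);
    }
}
```

With values 1 through 5 and the current row starting at position 3, the finished batch keeps the first three values, the new vector starts with 4 and 5, and the write index moves from 5 to 2.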
-
endWrite
public void endWrite()
Description copied from interface: WriterEvents
End a batch: finalize any vector values.
- Specified by: endWrite in interface WriterEvents
-
lastWriteIndex
public int lastWriteIndex()
Description copied from interface: WriterPosition
Return the last write position in the vector. This may be the same as the writer index position (if the vector was written at that point), or an earlier point. In either case, this value points to the last valid value in the vector.
- Specified by: lastWriteIndex in interface WriterPosition
- Returns: index of the last valid value in the vector
-
rowStartIndex
public int rowStartIndex()
Description copied from interface: WriterPosition
Position within the vector of the first value for the current row. Note that this is always the first value for the row, even for a writer deeply nested within a hierarchy of arrays. (The first position for the current array is not exposed in this API.)
- Specified by: rowStartIndex in interface WriterPosition
- Returns: the vector offset of the first value for the current row
-
writeIndex
public int writeIndex()
Description copied from interface: WriterPosition
Current write index for the writer. This is the global array location for arrays, same as the row index for top-level columns.
- Specified by: writeIndex in interface WriterPosition
- Returns: current write index
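For a top-level scalar column, the relationship between writeIndex() and lastWriteIndex() can be sketched as below. The class `ToyPosition` and its `write` and `nextRow` methods are hypothetical, and the initial value of -1 for "nothing written yet" is an assumption of this sketch, not something the description above specifies.

```java
// Toy position tracker for a top-level scalar column; hypothetical, not Drill code.
class ToyPosition {
    private int writeIndex;          // same as the row index for a top-level column
    private int lastWriteIndex = -1; // assumption: -1 means no value written yet

    // Record that a value was written at the current position.
    void write() {
        lastWriteIndex = writeIndex;
    }

    // Advance to the next row, whether or not the current one was written.
    void nextRow() {
        writeIndex++;
    }

    int writeIndex() {
        return writeIndex;
    }

    int lastWriteIndex() {
        return lastWriteIndex;
    }
}
```

If row 0 is written and rows 1 and 2 are skipped, the write index is 2 while the last write index is still 0, which is the "earlier point" case mentioned under lastWriteIndex().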
-
copy
Description copied from interface: ColumnWriter
Copy a single value from the given reader, which must be of the same type as this writer.
- Specified by: copy in interface ColumnWriter
- Parameters: from - reader to provide the data
-
setObject
Description copied from interface: ColumnWriter
Generic technique to write data as a generic Java object. The type of the object must match the target writer. Primarily for testing.
- Scalar: The type of the Java object must match the type of the target vector. String or byte[] can be used for Varchar vectors.
- Array: Write the array given an array of values. The object must be a Java array. The type of the array must match the type of element in the repeated vector. That is, if the vector is a Repeated Int, provide an int[] array.
- Tuple (map or row): The Java object must be an array of objects in which the members of the array have a 1:1 correspondence with the members of the tuple in the order defined by the writer metadata. That is, if the map is (Int, Varchar), provide an Object[] array like this: {10, "fred"}.
- Union: Uses the Java object type to determine the type of the backing vector. Creates a vector of the required type if needed.
- Specified by: setObject in interface ColumnWriter
- Parameters: value - value to write to the vector. The Java type of the object indicates the Drill storage type
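The idea of choosing a storage type from the Java object type can be sketched as a simple classifier. This is an illustration only: `ToyObjectDispatch`, `Shape` and `classify` are hypothetical, the set of cases covers just the examples given in the list above, and treating a Java null as a null value is an assumption of this sketch.

```java
// Toy dispatch on the Java object type; hypothetical, not the Drill implementation.
class ToyObjectDispatch {
    enum Shape { NULL, INT, VARCHAR, INT_ARRAY, TUPLE }

    static Shape classify(Object value) {
        if (value == null) {
            return Shape.NULL;      // assumption: a Java null maps to a null value
        }
        if (value instanceof Integer) {
            return Shape.INT;       // scalar: Java type matches the vector type
        }
        if (value instanceof String || value instanceof byte[]) {
            return Shape.VARCHAR;   // String or byte[] for Varchar
        }
        if (value instanceof int[]) {
            return Shape.INT_ARRAY; // int[] for a Repeated Int
        }
        if (value instanceof Object[]) {
            return Shape.TUPLE;     // Object[] whose members line up with the tuple columns
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass().getName());
    }
}
```

For example, `classify(new Object[] {10, "fred"})` yields TUPLE in this toy, matching the (Int, Varchar) map example, and each element could then be classified in turn.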
-
dump
- Specified by: dump in interface WriterEvents
-