Class AbstractTupleWriter
java.lang.Object
org.apache.drill.exec.vector.accessor.writer.AbstractTupleWriter
- All Implemented Interfaces: ColumnWriter, TupleWriter, WriterEvents, WriterPosition
- Direct Known Subclasses: DictEntryWriter, MapWriter, RowSetLoaderImpl, RowSetWriterImpl
Implementation of a writer for a tuple (a row or a map). Provides access to each
column using either a name or a numeric index.
Notes:
A tuple maintains an internal state needed to handle dynamic column additions. The state identifies the amount of "catch up" needed to get the new column into the same state as the existing columns. The state is also handy for understanding the tuple lifecycle. This lifecycle works for all three cases of:
- Top-level tuple (row).
- Nested tuple (map).
- Array of tuples (repeated map).
Public API | Tuple Event | State Transition | Child Event |
---|---|---|---|
(Start state) | — | IDLE | — |
startBatch() | startWrite() | IDLE → IN_WRITE | startWrite() |
start() (new row) | startRow() | IN_WRITE → IN_ROW | startRow() |
start() (without save) | restartRow() | IN_ROW → IN_ROW | restartRow() |
save() (array) | saveValue() | IN_ROW → IN_ROW | saveValue() |
save() (row) | saveValue() | IN_ROW → IN_ROW | saveValue() |
save() (row) | saveRow() | IN_ROW → IN_WRITE | saveRow() |
end batch | — | IN_ROW → IDLE | endWrite() |
end batch | — | IN_WRITE → IDLE | endWrite() |
- For the top-level tuple, a special case occurs with ending a batch. (The method for doing so differs depending on implementation.) If a row is active, then that row's values are discarded. Then, the batch is ended.
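The transitions in the table can be traced with a small state machine. The following is a minimal sketch only: the class `TupleLifecycleSketch` and its `childEvents` log are hypothetical names introduced for illustration and are not part of Drill. It models the top-level (row) case, including the rule that ending a batch discards an active row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the lifecycle table; this is not Drill's implementation.
class TupleLifecycleSketch {
  enum State { IDLE, IN_WRITE, IN_ROW }

  State state = State.IDLE;
  // Records the events that would be forwarded to child writers, in order.
  final List<String> childEvents = new ArrayList<>();

  void startBatch() {                      // IDLE -> IN_WRITE
    expect(State.IDLE);
    state = State.IN_WRITE;
    childEvents.add("startWrite");
  }

  void start() {
    if (state == State.IN_ROW) {           // start() without save: IN_ROW -> IN_ROW
      childEvents.add("restartRow");
      return;
    }
    expect(State.IN_WRITE);                // new row: IN_WRITE -> IN_ROW
    state = State.IN_ROW;
    childEvents.add("startRow");
  }

  void save() {                            // save() (row): saveValue, then saveRow
    expect(State.IN_ROW);
    childEvents.add("saveValue");
    childEvents.add("saveRow");
    state = State.IN_WRITE;
  }

  void endBatch() {                        // IN_ROW or IN_WRITE -> IDLE
    if (state == State.IDLE) {
      throw new IllegalStateException("No batch is active");
    }
    state = State.IDLE;                    // any active row is simply dropped
    childEvents.add("endWrite");
  }

  private void expect(State expected) {
    if (state != expected) {
      throw new IllegalStateException("Expected " + expected + " but was " + state);
    }
  }
}
```

Calling `startBatch()`, `start()`, `save()`, `start()`, `endBatch()` on this sketch ends in `IDLE` with the second row never saved, which mirrors the special case described for the top-level tuple.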
-
Nested Class Summary
Modifier and Type | Class | Description |
---|---|---|
static class | AbstractTupleWriter.TupleObjectWriter | Generic object wrapper for the tuple writer. |
static interface | AbstractTupleWriter.TupleWriterListener | Listener (callback) to handle requests to add a new column to a tuple (row or map). |
Nested classes/interfaces inherited from interface org.apache.drill.exec.vector.accessor.TupleWriter
TupleWriter.UndefinedColumnException
Nested classes/interfaces inherited from interface org.apache.drill.exec.vector.accessor.writer.WriterEvents
WriterEvents.ColumnWriterListener, WriterEvents.State
-
Field Summary
Modifier and Type | Field | Description |
---|---|---|
protected ColumnWriterIndex | childIndex | |
protected AbstractTupleWriter.TupleWriterListener | listener | |
protected static final org.slf4j.Logger | logger | |
protected WriterEvents.State | state | |
protected final TupleMetadata | tupleSchema | |
protected ColumnWriterIndex | vectorIndex | |
protected final List<AbstractObjectWriter> | writers | |
-
Constructor Summary
Modifier | Constructor | Description |
---|---|---|
protected | AbstractTupleWriter(TupleMetadata schema) | |
protected | AbstractTupleWriter(TupleMetadata schema, List<AbstractObjectWriter> writers) | |
-
Method Summary
Modifier and Type | Method | Description |
---|---|---|
int | addColumn(MaterializedField field) | |
int | addColumn(ColumnMetadata column) | Add a column to the tuple (row or map) that backs this writer. |
int | addColumnWriter(AbstractObjectWriter colWriter) | Add a column writer to an existing tuple writer. |
ArrayWriter | array(int colIndex) | |
void | bindIndex(ColumnWriterIndex index) | Bind the writer to a writer index. |
protected void | bindIndex(ColumnWriterIndex index, ColumnWriterIndex childIndex) | |
void | bindListener(AbstractTupleWriter.TupleWriterListener listener) | |
void | bindListener(WriterEvents.ColumnWriterListener listener) | Bind a listener to the underlying vector writer. |
ColumnWriter | column(int colIndex) | |
void | copy(ColumnReader from) | Copy a single value from the given reader, which must be of the same type as this writer. |
DictWriter | dict(int colIndex) | |
void | dump(HierarchicalFormatter format) | |
void | endArrayValue() | End a value. |
void | endWrite() | End a batch: finalize any vector values. |
boolean | isProjected() | Whether this writer is projected (is backed by a materialized vector), or is unprojected (is just a dummy writer). In most cases, clients can ignore whether the column is projected and just write to the writer. |
boolean | isProjected(String columnName) | Reports whether the given column is projected. |
int | lastWriteIndex() | Return the last write position in the vector. |
AbstractTupleWriter.TupleWriterListener | listener() | |
boolean | nullable() | Whether this writer allows nulls. |
void | postRollover() | The vectors backing this writer rolled over. |
void | preRollover() | The vectors backing this writer are about to roll over. |
void | restartRow() | While writing a row, rewind the current index position to restart the row. |
int | rowStartIndex() | Position within the vector of the first value for the current row. |
void | saveRow() | Saves a row. |
ScalarWriter | scalar(int colIndex) | |
void | set(int colIndex, Object value) | Write a value to the given column, automatically calling the proper setType method for the data. |
void | setNull() | Set the current value to null. |
void | setObject(Object value) | Generic technique to write data as a generic Java object. |
int | size() | |
void | startRow() | Start a new row. |
void | startWrite() | Start a write (batch) operation. |
TupleWriter | tuple(int colIndex) | |
ObjectType | type() | Return the object (structure) type of this writer. |
ObjectType | type(int colIndex) | |
VariantWriter | variant(int colIndex) | |
int | writeIndex() | Current write index for the writer. |
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.drill.exec.vector.accessor.ColumnWriter
schema
-
Field Details
-
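logger
protected static final org.slf4j.Logger logger
-
tupleSchema
protected final TupleMetadata tupleSchema
-
writers
protected final List<AbstractObjectWriter> writers
-
vectorIndex
protected ColumnWriterIndex vectorIndex
-
childIndex
protected ColumnWriterIndex childIndex
-
listener
protected AbstractTupleWriter.TupleWriterListener listener
-
state
protected WriterEvents.State state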
-
-
Constructor Details
-
AbstractTupleWriter
-
AbstractTupleWriter
-
-
Method Details
-
type
Description copied from interface: ColumnWriter
Return the object (structure) type of this writer.
- Specified by: type in interface ColumnWriter
- Returns: type indicating if this is a scalar, tuple or array
-
bindIndex
-
bindIndex
Description copied from interface: WriterEvents
Bind the writer to a writer index.
- Specified by: bindIndex in interface WriterEvents
- Parameters:
index - the writer index (top level or nested for arrays)
-
rowStartIndex
public int rowStartIndex()
Description copied from interface: WriterPosition
Position within the vector of the first value for the current row. Note that this is always the first value for the row, even for a writer deeply nested within a hierarchy of arrays. (The first position for the current array is not exposed in this API.)
- Specified by: rowStartIndex in interface WriterPosition
- Returns: the vector offset of the first value for the current row
-
addColumnWriter
Add a column writer to an existing tuple writer. Used for implementations that support "live" schema evolution: column discovery while writing. The corresponding metadata must already have been added to the schema.
- Parameters:
colWriter - the column writer to add
-
isProjected
Description copied from interface: TupleWriter
Reports whether the given column is projected. Useful for clients that can simply skip over unprojected columns.
- Specified by: isProjected in interface TupleWriter
-
addColumn
Description copied from interface: TupleWriter
Add a column to the tuple (row or map) that backs this writer. Support for this operation depends on whether the client code has registered a listener to implement the addition. Throws an exception if no listener is implemented, or if the add request is otherwise invalid (duplicate name, etc.).
- Specified by: addColumn in interface TupleWriter
- Parameters:
column - the metadata for the column to add
- Returns: the index of the newly added column, which can be used to access the newly added writer
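The listener-driven contract described for addColumn can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical and simplified: `AddColumnSketch`, `AddListener`, the use of strings as stand-ins for writers, and the specific exception types are assumptions made for illustration, not Drill's classes or its actual exceptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tuple writer that delegates column creation to a listener.
class AddColumnSketch {
  // Hypothetical callback: builds the "writer" for a newly added column.
  interface AddListener {
    String buildWriter(String columnName);
  }

  private final List<String> names = new ArrayList<>();
  private final List<String> writers = new ArrayList<>();
  private AddListener listener;

  void bindListener(AddListener listener) {
    this.listener = listener;
  }

  // Returns the index of the new column, usable to look up its writer.
  int addColumn(String name) {
    if (listener == null) {
      // Assumed exception type: no listener means additions are unsupported.
      throw new UnsupportedOperationException("No listener bound; cannot add " + name);
    }
    if (names.contains(name)) {
      // Assumed exception type for an invalid request (duplicate name).
      throw new IllegalArgumentException("Duplicate column: " + name);
    }
    names.add(name);
    writers.add(listener.buildWriter(name));
    return names.size() - 1;
  }

  String writer(int colIndex) {
    return writers.get(colIndex);
  }
}
```

The design point the sketch tries to capture is that the tuple itself does not know how to create storage for a new column; it only validates the request and records the writer that the listener hands back.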
-
addColumn
- Specified by: addColumn in interface TupleWriter
-
tupleSchema
- Specified by: tupleSchema in interface TupleWriter
-
size
public int size()
- Specified by: size in interface TupleWriter
-
nullable
public boolean nullable()
Description copied from interface: ColumnWriter
Whether this writer allows nulls. This is not as simple as checking for the TypeProtos.DataMode.OPTIONAL type in the schema. List entries are nullable if they are primitive, but not if they are maps or lists. Unions are nullable, regardless of cardinality.
- Specified by: nullable in interface ColumnWriter
- Returns: true if a call to ColumnWriter.setNull() is supported, false if not
-
setNull
public void setNull()
Description copied from interface: ColumnWriter
Set the current value to null. Support depends on the underlying implementation: only nullable types support this operation. Throws IllegalStateException if called on a non-nullable value.
- Specified by: setNull in interface ColumnWriter
-
startWrite
public void startWrite()
Description copied from interface: WriterEvents
Start a write (batch) operation. Performs any vector initialization required at the start of a batch (especially for offset vectors).
- Specified by: startWrite in interface WriterEvents
-
startRow
public void startRow()
Description copied from interface: WriterEvents
Start a new row. To be called only when a row is not active. To restart a row, call WriterEvents.restartRow() instead.
- Specified by: startRow in interface WriterEvents
-
endArrayValue
public void endArrayValue()
Description copied from interface: WriterEvents
End a value. Similar to WriterEvents.saveRow(), but the save of a value is conditional on saving the row. This version is primarily of use in tuples nested inside arrays: it saves each tuple within the array, advancing to a new position in the array. The update of the array's offset vector based on the cumulative value saves is done when saving the row.
- Specified by: endArrayValue in interface WriterEvents
-
restartRow
public void restartRow()
Description copied from interface: WriterEvents
While writing a row, rewind the current index position to restart the row. Done when abandoning the current row, such as when filtering out a row at read time.
- Specified by: restartRow in interface WriterEvents
-
saveRow
public void saveRow()
Description copied from interface: WriterEvents
Saves a row. Commits offset vector locations and advances each to the next position. Can be called only when a row is active.
- Specified by: saveRow in interface WriterEvents
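The interplay of endArrayValue, restartRow and saveRow for an array of tuples can be sketched with a plain offset list. This is an illustrative model under stated assumptions, not Drill's vector code: `RepeatedTupleSketch` and its fields are hypothetical, and a `List<Integer>` stands in for the offset vector.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a repeated-map column: one offset entry per saved row.
class RepeatedTupleSketch {
  // offsets.get(r) is the element position where row r starts;
  // the last entry is the start of the row currently being written.
  private final List<Integer> offsets = new ArrayList<>();
  private int elementIndex; // next free element position in the array

  RepeatedTupleSketch() {
    offsets.add(0);
  }

  // Saves one tuple within the array; provisional until the row is saved.
  void endArrayValue() {
    elementIndex++;
  }

  // Abandons the current row: rewind to where the row started.
  void restartRow() {
    elementIndex = rowStartIndex();
  }

  // Commits the cumulative element saves to the offset list.
  void saveRow() {
    offsets.add(elementIndex);
  }

  int rowStartIndex() {
    return offsets.get(offsets.size() - 1);
  }

  int writeIndex() {
    return elementIndex;
  }

  List<Integer> offsets() {
    return Collections.unmodifiableList(offsets);
  }
}
```

In this model, element saves made before a restartRow leave no trace in the offsets, which is the sense in which saving a value is conditional on saving the row.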
-
preRollover
public void preRollover()
Description copied from interface: WriterEvents
The vectors backing this writer are about to roll over. Finish the current batch up to, but not including, the current row.
- Specified by: preRollover in interface WriterEvents
-
postRollover
public void postRollover()
Description copied from interface: WriterEvents
The vectors backing this writer rolled over. This means that data for the current row has been rolled over into a new vector. Offsets and indexes should be shifted based on the understanding that data for the current row now resides at the start of a new vector instead of its previous location elsewhere in an old vector.
- Specified by: postRollover in interface WriterEvents
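The index shift described for preRollover and postRollover can be shown on a single int array. This is a simplified sketch under stated assumptions: `RolloverSketch`, its fields and the single `rollOver()` method are hypothetical, and a real rollover involves many vectors and offset vectors rather than one array.

```java
import java.util.Arrays;

// Hypothetical single-vector model of a rollover.
class RolloverSketch {
  int[] vector;
  int rowStart;   // position of the first value of the current row
  int writeIndex; // next position to write

  RolloverSketch(int capacity) {
    vector = new int[capacity];
  }

  void write(int value) {
    vector[writeIndex++] = value;
  }

  void saveRow() {
    rowStart = writeIndex;
  }

  // Returns the finished batch and moves the in-flight row to a new vector.
  int[] rollOver() {
    // Pre-rollover: the batch ends just before the current row.
    int[] completed = Arrays.copyOfRange(vector, 0, rowStart);

    // Rollover: copy the current row's values to the start of a fresh vector.
    int inFlight = writeIndex - rowStart;
    int[] fresh = new int[vector.length];
    System.arraycopy(vector, rowStart, fresh, 0, inFlight);
    vector = fresh;

    // Post-rollover: shift indexes so the current row begins at position 0.
    writeIndex = inFlight;
    rowStart = 0;
    return completed;
  }
}
```

After the rollover, writes to the in-flight row continue at the shifted `writeIndex`, so the caller of `write` does not need to know that the backing storage changed.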
-
endWrite
public void endWrite()
Description copied from interface: WriterEvents
End a batch: finalize any vector values.
- Specified by: endWrite in interface WriterEvents
-
copy
Description copied from interface: ColumnWriter
Copy a single value from the given reader, which must be of the same type as this writer.
- Specified by: copy in interface ColumnWriter
- Parameters:
from - reader to provide the data
-
column
- Specified by: column in interface TupleWriter
-
column
- Specified by: column in interface TupleWriter
-
set
Description copied from interface: TupleWriter
Write a value to the given column, automatically calling the proper setType method for the data. While this method is convenient for testing, it incurs quite a bit of type-checking overhead and is not suitable for production code.
- Specified by: set in interface TupleWriter
- Parameters:
colIndex - the index of the column to set
value - the value to set. The type of the object must be compatible with the type of the target column
-
setObject
Description copied from interface: ColumnWriter
Generic technique to write data as a generic Java object. The type of the object must match the target writer. Primarily for testing.
- Scalar: The type of the Java object must match the type of the target vector. String or byte[] can be used for Varchar vectors.
- Array: Write the array given an array of values. The object must be a Java array. The type of the array must match the type of element in the repeated vector. That is, if the vector is a Repeated Int, provide an int[] array.
- Tuple (map or row): The Java object must be an array of objects in which the members of the array have a 1:1 correspondence with the members of the tuple in the order defined by the writer metadata. That is, if the map is (Int, Varchar), provide an Object[] array like this: {10, "fred"}.
- Union: Uses the Java object type to determine the type of the backing vector. Creates a vector of the required type if needed.
- Specified by: setObject in interface ColumnWriter
- Parameters:
value - value to write to the vector. The Java type of the object indicates the Drill storage type
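The Java object shapes listed for setObject can be written out concretely. The sketch below only builds the values and checks the 1:1 tuple correspondence. It does not call Drill. The class `SetObjectShapesSketch`, its constants and the `matchesTuple` helper are hypothetical names introduced for illustration.

```java
// Hypothetical illustration of the Java values one would hand to setObject.
class SetObjectShapesSketch {
  // Scalar: a Java value whose type matches the target vector.
  static final Object INT_SCALAR = 10;
  static final Object VARCHAR_SCALAR = "fred";

  // Array: a Java array whose element type matches the repeated vector (Repeated Int).
  static final Object REPEATED_INT = new int[] {1, 2, 3};

  // Tuple: an Object[] with one member per tuple column, in schema order (Int, Varchar).
  static final Object MAP_INT_VARCHAR = new Object[] {10, "fred"};

  // Checks that a value is an Object[] whose members line up 1:1 with the
  // expected member types, in order. A null member does not match any type here.
  static boolean matchesTuple(Object value, Class<?>... memberTypes) {
    if (!(value instanceof Object[])) {
      return false; // note: a primitive array such as int[] is not an Object[]
    }
    Object[] members = (Object[]) value;
    if (members.length != memberTypes.length) {
      return false;
    }
    for (int i = 0; i < members.length; i++) {
      if (!memberTypes[i].isInstance(members[i])) {
        return false;
      }
    }
    return true;
  }
}
```

For example, `matchesTuple(MAP_INT_VARCHAR, Integer.class, String.class)` accepts the (Int, Varchar) value, while the same members in the opposite order are rejected because the correspondence is positional.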
-
scalar
- Specified by: scalar in interface TupleWriter
-
scalar
- Specified by: scalar in interface TupleWriter
-
tuple
- Specified by: tuple in interface TupleWriter
-
tuple
- Specified by: tuple in interface TupleWriter
-
array
- Specified by: array in interface TupleWriter
-
array
- Specified by: array in interface TupleWriter
-
variant
- Specified by: variant in interface TupleWriter
-
variant
- Specified by: variant in interface TupleWriter
-
dict
- Specified by: dict in interface TupleWriter
-
dict
- Specified by: dict in interface TupleWriter
-
type
- Specified by: type in interface TupleWriter
-
type
- Specified by: type in interface TupleWriter
-
isProjected
public boolean isProjected()
Description copied from interface: ColumnWriter
Whether this writer is projected (is backed by a materialized vector), or is unprojected (is just a dummy writer). In most cases, clients can ignore whether the column is projected and just write to the writer. This flag handles those special cases where it is helpful to know if the column is projected or not.
- Specified by: isProjected in interface ColumnWriter
-
lastWriteIndex
public int lastWriteIndex()
Description copied from interface: WriterPosition
Return the last write position in the vector. This may be the same as the writer index position (if the vector was written at that point), or an earlier point. In either case, this value points to the last valid value in the vector.
- Specified by: lastWriteIndex in interface WriterPosition
- Returns: index of the last valid value in the vector
-
writeIndex
public int writeIndex()
Description copied from interface: WriterPosition
Current write index for the writer. For arrays, this is the global array location; for top-level columns, it is the same as the row index.
- Specified by: writeIndex in interface WriterPosition
- Returns: current write index
-
bindListener
-
listener
-
bindListener
Description copied from interface: WriterEvents
Bind a listener to the underlying vector writer. This listener reports on vector events (overflow, growth), and so is called only when the writer is backed by a vector. The listener is ignored (and never called) for dummy (non-projected) columns. If the column is compound (such as for a nullable or repeated column, or for a map), then the writer is bound to the individual components.
- Specified by: bindListener in interface WriterEvents
- Parameters:
listener - the vector event listener to bind
-
dump
- Specified by: dump in interface WriterEvents
-