org.apache.drill.exec.vector.accessor.writer.AbstractTupleWriter

All Implemented Interfaces: ColumnWriter, TupleWriter, WriterEvents, WriterPosition

Direct Known Subclasses: DictEntryWriter, MapWriter, RowSetLoaderImpl, RowSetWriterImpl

public abstract class AbstractTupleWriter extends Object implements TupleWriter, WriterEvents

Implementation of a writer for a tuple (a row or a map). Provides access to each column by name or by numeric index.

A tuple maintains an internal state needed to handle dynamic column additions. The state identifies the amount of "catch up" needed to bring a new column into the same state as the existing columns. The state is also handy for understanding the tuple lifecycle. This lifecycle applies to all three cases:

Top-level tuple (row).
Nested tuple (map).
Array of tuples (repeated map).

Specifically, the transitions, for batch, row and array events, are:

Public API	Tuple Event	State Transition	Child Event
(Start state)	—	IDLE	—
startBatch()	startWrite()	IDLE → IN_WRITE	startWrite()
start() (new row)	startRow()	IN_WRITE → IN_ROW	startRow()
start() (without save)	restartRow()	IN_ROW → IN_ROW	restartRow()
save() (array)	saveValue()	IN_ROW → IN_ROW	saveValue()
save() (row)	saveValue()	IN_ROW → IN_ROW	saveValue()
save() (row)	saveRow()	IN_ROW → IN_WRITE	saveRow()
end batch	—	IN_ROW → IDLE	endWrite()
end batch	—	IN_WRITE → IDLE	endWrite()

Notes:

For the top-level tuple, a special case occurs with ending a batch. (The method for doing so differs depending on implementation.) If a row is active, then that row's values are discarded. Then, the batch is ended.
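The transitions in the table can be modeled as a small state machine. The following is a minimal sketch, not Drill's implementation: the class name `TupleLifecycleSketch` and the exception-throwing checks are assumptions made for illustration. Only the state names and the event names come from the table above.

```java
// Simplified model of the tuple lifecycle. State and event names follow the
// table above; the class itself is hypothetical and is not Drill code.
class TupleLifecycleSketch {
  enum State { IDLE, IN_WRITE, IN_ROW }

  private State state = State.IDLE;

  State state() { return state; }

  // startBatch(): IDLE -> IN_WRITE
  void startWrite() {
    require(State.IDLE);
    state = State.IN_WRITE;
  }

  // start() on a new row: IN_WRITE -> IN_ROW
  void startRow() {
    require(State.IN_WRITE);
    state = State.IN_ROW;
  }

  // start() without a save: the row is rewound and the state stays IN_ROW
  void restartRow() {
    require(State.IN_ROW);
  }

  // save() on a row: IN_ROW -> IN_WRITE
  void saveRow() {
    require(State.IN_ROW);
    state = State.IN_WRITE;
  }

  // End of batch: allowed from IN_WRITE or IN_ROW; an active row is dropped.
  void endWrite() {
    if (state == State.IDLE) {
      throw new IllegalStateException("No batch is active");
    }
    state = State.IDLE;
  }

  // Assumed validation; the real class may check transitions differently.
  private void require(State expected) {
    if (state != expected) {
      throw new IllegalStateException("Expected " + expected + " but was " + state);
    }
  }
}
```

In this model, ending a batch while a row is active simply returns to IDLE without saving, which mirrors the note that the active row's values are discarded.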

Nested Class Summary

Nested Classes

Modifier and Type	Class	Description
static class	AbstractTupleWriter.TupleObjectWriter	Generic object wrapper for the tuple writer.
static interface	AbstractTupleWriter.TupleWriterListener	Listener (callback) to handle requests to add a new column to a tuple (row or map).

Nested classes/interfaces inherited from interface org.apache.drill.exec.vector.accessor.TupleWriter
TupleWriter.UndefinedColumnException

Nested classes/interfaces inherited from interface org.apache.drill.exec.vector.accessor.writer.WriterEvents
WriterEvents.ColumnWriterListener, WriterEvents.State

Field Summary

Fields

Modifier and Type	Field	Description
protected ColumnWriterIndex	childIndex
protected AbstractTupleWriter.TupleWriterListener	listener
protected static final org.slf4j.Logger	logger
protected WriterEvents.State	state
protected final TupleMetadata	tupleSchema
protected ColumnWriterIndex	vectorIndex
protected final List<AbstractObjectWriter>	writers

Constructor Summary

Constructors

Modifier	Constructor	Description
protected	AbstractTupleWriter(TupleMetadata schema)
protected	AbstractTupleWriter(TupleMetadata schema, List<AbstractObjectWriter> writers)

Method Summary

Modifier and Type	Method	Description
int	addColumn(MaterializedField field)
int	addColumn(ColumnMetadata column)	Add a column to the tuple (row or map) that backs this writer.
int	addColumnWriter(AbstractObjectWriter colWriter)	Add a column writer to an existing tuple writer.
ArrayWriter	array(int colIndex)
ArrayWriter	array(String colName)
void	bindIndex(ColumnWriterIndex index)	Bind the writer to a writer index.
protected void	bindIndex(ColumnWriterIndex index, ColumnWriterIndex childIndex)
void	bindListener(AbstractTupleWriter.TupleWriterListener listener)
void	bindListener(WriterEvents.ColumnWriterListener listener)	Bind a listener to the underlying vector writer.
ObjectWriter	column(int colIndex)
ObjectWriter	column(String colName)
void	copy(ColumnReader from)	Copy a single value from the given reader, which must be of the same type as this writer.
DictWriter	dict(int colIndex)
DictWriter	dict(String colName)
void	dump(HierarchicalFormatter format)
void	endArrayValue()	End a value.
void	endWrite()	End a batch: finalize any vector values.
boolean	isProjected()	Whether this writer is projected (is backed by a materialized vector), or is unprojected (is just a dummy writer). In most cases, clients can ignore whether the column is projected and just write to the writer.
boolean	isProjected(String columnName)	Reports whether the given column is projected.
int	lastWriteIndex()	Return the last write position in the vector.
AbstractTupleWriter.TupleWriterListener	listener()
boolean	nullable()	Whether this writer allows nulls.
void	postRollover()	The vectors backing this writer rolled over.
void	preRollover()	The vectors backing this writer are about to roll over.
void	restartRow()	During a write to a row, rewind the current index position to restart the row.
int	rowStartIndex()	Position within the vector of the first value for the current row.
void	saveRow()	Saves a row.
ScalarWriter	scalar(int colIndex)
ScalarWriter	scalar(String colName)
void	set(int colIndex, Object value)	Write a value to the given column, automatically calling the proper setType method for the data.
void	setNull()	Set the current value to null.
void	setObject(Object value)	Generic technique to write data as a generic Java object.
int	size()
void	startRow()	Start a new row.
void	startWrite()	Start a write (batch) operation.
TupleWriter	tuple(int colIndex)
TupleWriter	tuple(String colName)
TupleMetadata	tupleSchema()
ObjectType	type()	Return the object (structure) type of this writer.
ObjectType	type(int colIndex)
ObjectType	type(String colName)
VariantWriter	variant(int colIndex)
VariantWriter	variant(String colName)
int	writeIndex()	Current write index for the writer.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.drill.exec.vector.accessor.ColumnWriter
schema

Field Details
- logger
  
  protected static final org.slf4j.Logger logger
- tupleSchema
  
  protected final TupleMetadata tupleSchema
- writers
  
  protected final List<AbstractObjectWriter> writers
- vectorIndex
  
  protected ColumnWriterIndex vectorIndex
- childIndex
  
  protected ColumnWriterIndex childIndex
- listener
  
  protected AbstractTupleWriter.TupleWriterListener listener
- state
  
  protected WriterEvents.State state
Constructor Details
- AbstractTupleWriter
  
  protected AbstractTupleWriter(TupleMetadata schema, List<AbstractObjectWriter> writers)
- AbstractTupleWriter
  
  protected AbstractTupleWriter(TupleMetadata schema)
Method Details
- type
  
  public ObjectType type()
  
  Description copied from interface: ColumnWriter
  
  Return the object (structure) type of this writer.
  
  Specified by:
  
  type in interface ColumnWriter
  
  Returns:
  
  type indicating if this is a scalar, tuple or array
- bindIndex
  
  protected void bindIndex(ColumnWriterIndex index, ColumnWriterIndex childIndex)
- bindIndex
  
  public void bindIndex(ColumnWriterIndex index)
  
  Description copied from interface: WriterEvents
  
  Bind the writer to a writer index.
  
  Specified by:
  
  bindIndex in interface WriterEvents
  
  Parameters:
  
  index - the writer index (top level or nested for arrays)
- rowStartIndex
  
  public int rowStartIndex()
  
  Description copied from interface: WriterPosition
  
  Position within the vector of the first value for the current row. Note that this is always the first value for the row, even for a writer deeply nested within a hierarchy of arrays. (The first position for the current array is not exposed in this API.)
  
  Specified by:
  
  rowStartIndex in interface WriterPosition
  
  Returns:
  
  the vector offset of the first value for the current row
- addColumnWriter
  
  public int addColumnWriter(AbstractObjectWriter colWriter)
  
  Add a column writer to an existing tuple writer. Used for implementations that support "live" schema evolution: column discovery while writing. The corresponding metadata must already have been added to the schema.
  
  Parameters:
  
  colWriter - the column writer to add
- isProjected
  
  public boolean isProjected(String columnName)
  
  Description copied from interface: TupleWriter
  
  Reports whether the given column is projected. Useful for clients that can simply skip over unprojected columns.
  
  Specified by:
  
  isProjected in interface TupleWriter
- addColumn
  
  public int addColumn(ColumnMetadata column)
  
  Description copied from interface: TupleWriter
  
  Add a column to the tuple (row or map) that backs this writer. Support for this operation depends on whether the client code has registered a listener to implement the addition. Throws an exception if no listener is registered, or if the add request is otherwise invalid (for example, a duplicate name).
  
  Specified by:
  
  addColumn in interface TupleWriter
  
  Parameters:
  
  column - the metadata for the column to add
  
  Returns:
  
  the index of the newly added column, which can be used to access the new writer
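The listener-driven addition described for `addColumn(ColumnMetadata)` can be pictured with a small stand-alone model. This is a sketch under stated assumptions, not the Drill API: `AddColumnSketch`, its nested `Listener` interface, the use of plain strings in place of column metadata and writers, and the specific exception types are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of listener-driven column addition; not Drill code.
class AddColumnSketch {
  // Stand-in for the callback that knows how to build the new column.
  interface Listener {
    String addColumn(String columnName);  // returns a stand-in "writer"
  }

  private final List<String> columnNames = new ArrayList<>();
  private final List<String> writers = new ArrayList<>();
  private Listener listener;

  void bindListener(Listener listener) { this.listener = listener; }

  // Returns the index of the new column, usable for later lookups by position.
  int addColumn(String columnName) {
    if (listener == null) {
      // Assumed exception type: without a listener the tuple cannot grow.
      throw new UnsupportedOperationException("No listener bound: " + columnName);
    }
    if (columnNames.contains(columnName)) {
      throw new IllegalArgumentException("Duplicate column: " + columnName);
    }
    String writer = listener.addColumn(columnName);
    columnNames.add(columnName);
    writers.add(writer);
    return writers.size() - 1;
  }

  String column(int colIndex) { return writers.get(colIndex); }
}
```

A caller would typically keep the returned index and use it for positional access in the per-row loop, which avoids repeated name lookups.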
- addColumn
  
  public int addColumn(MaterializedField field)
  
  Specified by:
  
  addColumn in interface TupleWriter
- tupleSchema
  
  public TupleMetadata tupleSchema()
  
  Specified by:
  
  tupleSchema in interface TupleWriter
- size
  
  public int size()
  
  Specified by:
  
  size in interface TupleWriter
- nullable
  
  public boolean nullable()
  
  Description copied from interface: ColumnWriter
  
  Whether this writer allows nulls. This is not as simple as checking for the TypeProtos.DataMode.OPTIONAL type in the schema. List entries are nullable, if they are primitive, but not if they are maps or lists. Unions are nullable, regardless of cardinality.
  
  Specified by:
  
  nullable in interface ColumnWriter
  
  Returns:
  
  true if a call to ColumnWriter.setNull() is supported, false if not
- setNull
  
  public void setNull()
  
  Description copied from interface: ColumnWriter
  
  Set the current value to null. Support depends on the underlying implementation: only nullable types support this operation. Throws an IllegalStateException if called on a non-nullable value.
  
  Specified by:
  
  setNull in interface ColumnWriter
- startWrite
  
  public void startWrite()
  
  Description copied from interface: WriterEvents
  
  Start a write (batch) operation. Performs any vector initialization required at the start of a batch (especially for offset vectors).
  
  Specified by:
  
  startWrite in interface WriterEvents
- startRow
  
  public void startRow()
  
  Description copied from interface: WriterEvents
  
  Start a new row. To be called only when a row is not active. To restart a row, call WriterEvents.restartRow() instead.
  
  Specified by:
  
  startRow in interface WriterEvents
- endArrayValue
  
  public void endArrayValue()
  
  Description copied from interface: WriterEvents
  
  End a value. Similar to WriterEvents.saveRow(), but the save of a value is conditional on saving the row. This version is primarily of use in tuples nested inside arrays: it saves each tuple within the array, advancing to a new position in the array. The update of the array's offset vector based on the cumulative value saves is done when saving the row.
  
  Specified by:
  
  endArrayValue in interface WriterEvents
- restartRow
  
  public void restartRow()
  
  Description copied from interface: WriterEvents
  
  During a write to a row, rewind the current index position to restart the row. Done when abandoning the current row, such as when filtering out a row at read time.
  
  Specified by:
  
  restartRow in interface WriterEvents
- saveRow
  
  public void saveRow()
  
  Description copied from interface: WriterEvents
  
  Saves a row. Commits offset vector locations and advances each to the next position. Can be called only when a row is active.
  
  Specified by:
  
  saveRow in interface WriterEvents
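To make the interplay of `startRow()`, `restartRow()` and `saveRow()` concrete, here is a minimal sketch of a repeated (array) column backed by a values list and an offsets list. The class `RepeatedColumnSketch` and its `addValue` method are hypothetical, and Java lists stand in for Drill's value and offset vectors; the sketch only illustrates the idea that values accumulate during a row and the offset is committed on save.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a repeated (array) column; not Drill's vector code.
class RepeatedColumnSketch {
  private final List<Integer> values = new ArrayList<>();
  private final List<Integer> offsets = new ArrayList<>();  // offsets[i] = start of row i
  private int rowStart;  // position of the first value of the current row

  // Start of batch: reset the data and seed the offsets with 0.
  void startWrite() {
    values.clear();
    offsets.clear();
    offsets.add(0);
    rowStart = 0;
  }

  // Remember where the new row's values begin.
  void startRow() { rowStart = values.size(); }

  // Appends one element; the offsets list is not touched yet.
  void addValue(int value) { values.add(value); }

  // Abandons the current row by rewinding to where it started.
  void restartRow() {
    values.subList(rowStart, values.size()).clear();
  }

  // Commits the row by recording its end offset.
  void saveRow() { offsets.add(values.size()); }

  int rowCount() { return offsets.size() - 1; }

  int rowLength(int row) { return offsets.get(row + 1) - offsets.get(row); }
}
```

Because the offset is recorded only in `saveRow()`, a restarted row leaves no trace in the committed data.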
- preRollover
  
  public void preRollover()
  
  Description copied from interface: WriterEvents
  
  The vectors backing this writer are about to roll over. Finish the current batch up to, but not including, the current row.
  
  Specified by:
  
  preRollover in interface WriterEvents
- postRollover
  
  public void postRollover()
  
  Description copied from interface: WriterEvents
  
  The vectors backing this writer rolled over. This means that data for the current row has been rolled over into a new vector. Offsets and indexes should be shifted based on the understanding that data for the current row now resides at the start of a new vector instead of its previous location elsewhere in an old vector.
  
  Specified by:
  
  postRollover in interface WriterEvents
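The index shift described for `postRollover()` amounts to simple arithmetic: the in-flight row moves to position 0 of the new vector, so every position is reduced by the old row start. The sketch below is an illustration only; `RolloverSketch` and its two public fields are hypothetical and do not reflect how Drill stores its indexes.

```java
// Hypothetical model of index adjustment after a rollover; not Drill code.
class RolloverSketch {
  int rowStartIndex;  // where the current (in-flight) row begins
  int writeIndex;     // next position to write

  RolloverSketch(int rowStartIndex, int writeIndex) {
    this.rowStartIndex = rowStartIndex;
    this.writeIndex = writeIndex;
  }

  // After rollover the current row's values sit at the start of the new
  // vector, so both positions shift down by the old row start.
  void postRollover() {
    int shift = rowStartIndex;
    writeIndex -= shift;
    rowStartIndex = 0;
  }
}
```

For example, a row that began at position 100 with three values already written resumes at write position 3 in the new vector.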
- endWrite
  
  public void endWrite()
  
  Description copied from interface: WriterEvents
  
  End a batch: finalize any vector values.
  
  Specified by:
  
  endWrite in interface WriterEvents
- copy
  
  public void copy(ColumnReader from)
  
  Description copied from interface: ColumnWriter
  
  Copy a single value from the given reader, which must be of the same type as this writer.
  
  Specified by:
  
  copy in interface ColumnWriter
  
  Parameters:
  
  from - reader to provide the data
- column
  
  public ObjectWriter column(int colIndex)
  
  Specified by:
  
  column in interface TupleWriter
- column
  
  public ObjectWriter column(String colName)
  
  Specified by:
  
  column in interface TupleWriter
- set
  
  public void set(int colIndex, Object value)
  
  Description copied from interface: TupleWriter
  
  Write a value to the given column, automatically calling the proper setType method for the data. While this method is convenient for testing, it incurs quite a bit of type-checking overhead and is not suitable for production code.
  
  Specified by:
  
  set in interface TupleWriter
  
  Parameters:
  
  colIndex - the index of the column to set
  
  value - the value to set. The type of the object must be compatible with the type of the target column
- setObject
  
  public void setObject(Object value)
  
  Description copied from interface: ColumnWriter
  
  Generic technique to write data as a generic Java object. The type of the object must match the target writer. Primarily for testing.
  
  Scalar: The type of the Java object must match the type of the target vector. String or byte[] can be used for Varchar vectors.
  
  Array: Write the array given an array of values. The object must be a Java array. The type of the array must match the type of element in the repeated vector. That is, if the vector is a Repeated Int, provide an int[] array.
  
  Tuple (map or row): The Java object must be an array of objects in which the members of the array have a 1:1 correspondence with the members of the tuple in the order defined by the writer metadata. That is, if the map is (Int, Varchar), provide an Object[] array like this: {10, "fred"}.
  
  Union: Uses the Java object type to determine the type of the backing vector. Creates a vector of the required type if needed.
  
  Specified by:
  
  setObject in interface ColumnWriter
  
  Parameters:
  
  value - value to write to the vector. The Java type of the object indicates the Drill storage type
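The tuple case of `setObject()` can be illustrated with a stand-alone model in which an `Object[]` is matched member by member against the columns in schema order. This is a sketch, not Drill code: `SetObjectSketch`, its nested `Column` class, the use of Java classes in place of Drill types, and the length and type checks are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of tuple-style setObject(); not Drill code.
class SetObjectSketch {
  // Stand-in column writer: each column accepts exactly one Java type.
  static final class Column {
    final Class<?> type;
    final List<Object> values = new ArrayList<>();

    Column(Class<?> type) { this.type = type; }

    void setObject(Object value) {
      if (!type.isInstance(value)) {
        throw new IllegalArgumentException("Expected " + type.getSimpleName());
      }
      values.add(value);
    }
  }

  private final List<Column> columns = new ArrayList<>();

  SetObjectSketch(Class<?>... types) {
    for (Class<?> type : types) {
      columns.add(new Column(type));
    }
  }

  // Tuple form: an Object[] whose members map 1:1 to the columns, in order.
  void setObject(Object value) {
    Object[] members = (Object[]) value;
    if (members.length != columns.size()) {
      throw new IllegalArgumentException("Expected " + columns.size() + " members");
    }
    for (int i = 0; i < members.length; i++) {
      columns.get(i).setObject(members[i]);
    }
  }

  Object value(int col, int row) { return columns.get(col).values.get(row); }
}
```

With an (Int, Varchar) style tuple modeled as `new SetObjectSketch(Integer.class, String.class)`, the call `setObject(new Object[] {10, "fred"})` writes 10 to the first column and "fred" to the second.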
- scalar
  
  public ScalarWriter scalar(int colIndex)
  
  Specified by:
  
  scalar in interface TupleWriter
- scalar
  
  public ScalarWriter scalar(String colName)
  
  Specified by:
  
  scalar in interface TupleWriter
- tuple
  
  public TupleWriter tuple(int colIndex)
  
  Specified by:
  
  tuple in interface TupleWriter
- tuple
  
  public TupleWriter tuple(String colName)
  
  Specified by:
  
  tuple in interface TupleWriter
- array
  
  public ArrayWriter array(int colIndex)
  
  Specified by:
  
  array in interface TupleWriter
- array
  
  public ArrayWriter array(String colName)
  
  Specified by:
  
  array in interface TupleWriter
- variant
  
  public VariantWriter variant(int colIndex)
  
  Specified by:
  
  variant in interface TupleWriter
- variant
  
  public VariantWriter variant(String colName)
  
  Specified by:
  
  variant in interface TupleWriter
- dict
  
  public DictWriter dict(int colIndex)
  
  Specified by:
  
  dict in interface TupleWriter
- dict
  
  public DictWriter dict(String colName)
  
  Specified by:
  
  dict in interface TupleWriter
- type
  
  public ObjectType type(int colIndex)
  
  Specified by:
  
  type in interface TupleWriter
- type
  
  public ObjectType type(String colName)
  
  Specified by:
  
  type in interface TupleWriter
- isProjected
  
  public boolean isProjected()
  
  Description copied from interface: ColumnWriter
  
  Whether this writer is projected (is backed by a materialized vector), or is unprojected (is just a dummy writer). In most cases, clients can ignore whether the column is projected and just write to the writer. This flag handles those special cases where it is helpful to know if the column is projected or not.
  
  Specified by:
  
  isProjected in interface ColumnWriter
- lastWriteIndex
  
  public int lastWriteIndex()
  
  Description copied from interface: WriterPosition
  
  Return the last write position in the vector. This may be the same as the writer index position (if the vector was written at that point), or an earlier point. In either case, this value points to the last valid value in the vector.
  
  Specified by:
  
  lastWriteIndex in interface WriterPosition
  
  Returns:
  
  index of the last valid value in the vector
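The difference between the write index and the last write position is easiest to see on a single column in which some rows are skipped. The sketch below is an assumption-laden illustration, not Drill code: `WritePositionSketch`, its `setInt` method, and the initial value of -1 for "nothing written yet" are hypothetical.

```java
// Hypothetical nullable-int column that tracks both positions; not Drill code.
class WritePositionSketch {
  private final Integer[] values;
  private int writeIndex;           // row currently being written
  private int lastWriteIndex = -1;  // last row that actually received a value

  WritePositionSketch(int capacity) { values = new Integer[capacity]; }

  // Writes at the current row; rows skipped earlier stay null.
  void setInt(int value) {
    values[writeIndex] = value;
    lastWriteIndex = writeIndex;
  }

  // Advances to the next row whether or not a value was written.
  void saveRow() { writeIndex++; }

  int writeIndex() { return writeIndex; }

  int lastWriteIndex() { return lastWriteIndex; }
}
```

After one written row followed by two skipped rows, the write index is 3 while the last write position is still 0, matching the "or an earlier point" case in the description.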
- writeIndex
  
  public int writeIndex()
  
  Description copied from interface: WriterPosition
  
  Current write index for the writer. This is the global array location for arrays, and is the same as the row index for top-level columns.
  
  Specified by:
  
  writeIndex in interface WriterPosition
  
  Returns:
  
  current write index
- bindListener
  
  public void bindListener(AbstractTupleWriter.TupleWriterListener listener)
- listener
  
  public AbstractTupleWriter.TupleWriterListener listener()
- bindListener
  
  public void bindListener(WriterEvents.ColumnWriterListener listener)
  
  Description copied from interface: WriterEvents
  
  Bind a listener to the underlying vector writer. This listener reports on vector events (overflow, growth), and so is called only when the writer is backed by a vector. The listener is ignored (and never called) for dummy (non-projected) columns. If the column is compound (such as for a nullable or repeated column, or for a map), then the writer is bound to the individual components.
  
  Specified by:
  
  bindListener in interface WriterEvents
  
  Parameters:
  
  listener - the vector event listener to bind
- dump
  
  public void dump(HierarchicalFormatter format)
  
  Specified by:
  
  dump in interface WriterEvents
