Class AbstractArrayWriter

All Implemented Interfaces:
ArrayWriter, ColumnWriter, WriterEvents, WriterPosition
Direct Known Subclasses:
AbstractArrayWriter.BaseArrayWriter, DummyArrayWriter

public abstract class AbstractArrayWriter extends Object implements ArrayWriter, WriterEvents
Writer for an array-valued column. This writer appends values: once a value is written, it cannot be changed. As a result, writer methods have no item index; each set advances the array to the next position.

This class represents the array as a whole. In practice that means building the offset vector. The array is associated with an element object that manages writing to the scalar, array or tuple that is the array element. Note that this representation makes little use of the methods in the "Repeated" vector class: instead it works directly with the offset and element vectors.
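
The offset-plus-element layout described above can be sketched as follows. This is a minimal illustration, not the Drill implementation: `OffsetSketch` and its methods are hypothetical names, and plain Java lists stand in for the offset and element vectors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the Drill class: models an array column as an
// offset list plus a flat element list.
class OffsetSketch {
    // offsets.get(i) is the element-list position where row i starts;
    // the final entry marks the end of the last saved row.
    final List<Integer> offsets = new ArrayList<>(List.of(0));
    final List<Integer> elements = new ArrayList<>();

    // Append one element to the array of the row being written.
    void addElement(int value) {
        elements.add(value);
    }

    // Close the current row by recording where its elements end.
    void saveRow() {
        offsets.add(elements.size());
    }

    // Number of elements in a saved row.
    int rowLength(int row) {
        return offsets.get(row + 1) - offsets.get(row);
    }

    public static void main(String[] args) {
        OffsetSketch col = new OffsetSketch();
        col.addElement(10);
        col.addElement(20);
        col.saveRow();                     // row 0: [10, 20]
        col.saveRow();                     // row 1: []
        col.addElement(30);
        col.saveRow();                     // row 2: [30]
        System.out.println(col.offsets);   // [0, 2, 2, 3]
        System.out.println(col.elements);  // [10, 20, 30]
    }
}
```

In this sketch an empty array costs only one repeated offset entry, which is why the array writer's main job is maintaining the offsets while the element writer handles the values.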

An array has a one-to-many relationship with its children. Starting an array prepares for writing the first element. Each element must be saved by calling endValue(). This is done automatically for scalars (since there is exactly one value per element), but must be done by client code for arrays of arrays or tuples. Valid state transitions:

Public API               Array Event    Offset Event   Element Event
startBatch()             startWrite()   startWrite()   startWrite()
start() (new row)        startRow()     startRow()     startRow()
start() (without save)   restartRow()   restartRow()   restartRow()
save() (array)           saveValue()    saveValue()    saveValue()
save() (row)             See subclasses.
harvest()                endWrite()     endWrite()     endWrite()

Some items to note:
  • Batch and row events are passed to the element.
  • Each element is saved via a call to save() on the array. Without this call, the element value is discarded. This is necessary because the array always has an active element: no "startElement" method is necessary. This also means that any unsaved element values can be discarded simply by omitting a call to save().
  • Since elements must be saved individually, the call to WriterEvents.saveRow() does not call saveValue(). This is an important distinction between an array and a tuple.
  • The offset and element writers are treated equally: the same events are passed to both.
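
The save rule in the notes above can be sketched with a small model. This is an illustration under stated assumptions, not the Drill code: `SaveSketch`, `set`, and the `String` element type are hypothetical, and ignoring a `save()` with nothing written is a choice made only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the save rule; not the Drill implementation.
class SaveSketch {
    private final List<String> saved = new ArrayList<>();
    private String pending;   // the always-active element; null if nothing written

    // Write into the active element. There is no index: writes are append-only.
    void set(String value) {
        pending = value;
    }

    // Commit the active element and move on to the next position.
    // Sketch-only choice: a save() with nothing written is ignored.
    void save() {
        if (pending != null) {
            saved.add(pending);
        }
        pending = null;
    }

    // End of row: deliberately does NOT save the active element,
    // so an unsaved value is dropped. Returns the row's saved elements.
    List<String> saveRow() {
        pending = null;
        List<String> row = new ArrayList<>(saved);
        saved.clear();
        return row;
    }

    public static void main(String[] args) {
        SaveSketch array = new SaveSketch();
        array.set("x");
        array.save();                          // "x" is committed
        array.set("y");                        // written but never saved
        System.out.println(array.saveRow());   // [x]
    }
}
```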

  • Method Details

    • bindListener

      public void bindListener(WriterEvents.ColumnWriterListener listener)
      Description copied from interface: WriterEvents
      Bind a listener to the underlying vector writer. This listener reports on vector events (overflow, growth), and so is called only when the writer is backed by a vector. The listener is ignored (and never called) for dummy (non-projected) columns. If the column is compound (such as for a nullable or repeated column, or for a map), then the listener is bound to the individual components.
      Specified by:
      bindListener in interface WriterEvents
      Parameters:
      listener - the vector event listener to bind
    • type

      public ObjectType type()
      Description copied from interface: ColumnWriter
      Return the object (structure) type of this writer.
      Specified by:
      type in interface ColumnWriter
      Returns:
      type indicating if this is a scalar, tuple or array
    • entryType

      public ObjectType entryType()
      Description copied from interface: ArrayWriter
      Return the object type of the array entry. All entries have the same type.
      Specified by:
      entryType in interface ArrayWriter
      Returns:
      the object type of each entry
    • schema

      public ColumnMetadata schema()
      Description copied from interface: ColumnWriter
      Returns the schema of the column associated with this writer.
      Specified by:
      schema in interface ColumnWriter
      Returns:
      schema for this writer's column
    • entry

      public ObjectWriter entry()
      Description copied from interface: ArrayWriter
      Return a generic object writer for the array entry.
      Specified by:
      entry in interface ArrayWriter
      Returns:
      generic object writer
    • scalar

      public ScalarWriter scalar()
      Specified by:
      scalar in interface ArrayWriter
    • tuple

      public TupleWriter tuple()
      Specified by:
      tuple in interface ArrayWriter
    • array

      public ArrayWriter array()
      Specified by:
      array in interface ArrayWriter
    • variant

      public VariantWriter variant()
      Specified by:
      variant in interface ArrayWriter
    • dict

      public DictWriter dict()
      Specified by:
      dict in interface ArrayWriter
    • size

      public int size()
      Description copied from interface: ArrayWriter
      Number of elements written thus far to the array.
      Specified by:
      size in interface ArrayWriter
      Returns:
      the number of elements
    • nullable

      public boolean nullable()
      Description copied from interface: ColumnWriter
      Whether this writer allows nulls. This is not as simple as checking for the TypeProtos.DataMode.OPTIONAL type in the schema. List entries are nullable if they are primitive, but not if they are maps or lists. Unions are nullable, regardless of cardinality.
      Specified by:
      nullable in interface ColumnWriter
      Returns:
      true if a call to ColumnWriter.setNull() is supported, false if not
    • isProjected

      public boolean isProjected()
      Description copied from interface: ColumnWriter
      Whether this writer is projected (is backed by a materialized vector), or is unprojected (is just a dummy writer). In most cases, clients can ignore whether the column is projected and just write to the writer. This flag handles those special cases where it is helpful to know if the column is projected or not.
      Specified by:
      isProjected in interface ColumnWriter
    • setNull

      public void setNull()
      Description copied from interface: ColumnWriter
      Set the current value to null. Support depends on the underlying implementation: only nullable types support this operation. Throws IllegalStateException if called on a non-nullable value.
      Specified by:
      setNull in interface ColumnWriter
    • rowStartIndex

      public int rowStartIndex()
      Description copied from interface: WriterPosition
      Position within the vector of the first value for the current row. Note that this is always the first value for the row, even for a writer deeply nested within a hierarchy of arrays. (The first position for the current array is not exposed in this API.)
      Specified by:
      rowStartIndex in interface WriterPosition
      Returns:
      the vector offset of the first value for the current row
    • lastWriteIndex

      public int lastWriteIndex()
      Description copied from interface: WriterPosition
      Return the last write position in the vector. This may be the same as the writer index position (if the vector was written at that point), or an earlier point. In either case, this value points to the last valid value in the vector.
      Specified by:
      lastWriteIndex in interface WriterPosition
      Returns:
      index of the last valid value in the vector
    • writeIndex

      public int writeIndex()
      Description copied from interface: WriterPosition
      Current write index for the writer. For arrays, this is the global array location; for top-level columns, it is the same as the row index.
      Specified by:
      writeIndex in interface WriterPosition
      Returns:
      current write index
    • setNull

      public void setNull(boolean isNull)
      Specified by:
      setNull in interface ArrayWriter
    • copy

      public void copy(ColumnReader from)
      Description copied from interface: ColumnWriter
      Copy a single value from the given reader, which must be of the same type as this writer.
      Specified by:
      copy in interface ColumnWriter
      Parameters:
      from - reader to provide the data
    • offsetWriter

      public OffsetVectorWriter offsetWriter()
      Return the writer for the offset vector for this array. Primarily used to handle overflow; other clients should not manipulate the offset vector directly.
      Returns:
      the writer for the offset vector associated with this array
    • dump

      public void dump(HierarchicalFormatter format)
      Specified by:
      dump in interface WriterEvents
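
The nullable() and setNull() contract described under Method Details suggests checking nullability before writing a null. The following is a minimal sketch of that guard. `NullableWriterSketch`, `StubWriter`, and `trySetNull` are hypothetical names introduced for illustration; the interface mirrors only the two ColumnWriter methods named in this page and is not the Drill interface itself.

```java
// Hypothetical stand-in that mirrors only the two ColumnWriter methods used here.
interface NullableWriterSketch {
    boolean nullable();
    void setNull();
}

// Hypothetical stub writer used to exercise the guard.
class StubWriter implements NullableWriterSketch {
    private final boolean nullable;
    int nullCount = 0;

    StubWriter(boolean nullable) {
        this.nullable = nullable;
    }

    @Override
    public boolean nullable() {
        return nullable;
    }

    @Override
    public void setNull() {
        // Mirrors the documented behavior for non-nullable values.
        if (!nullable) {
            throw new IllegalStateException("writer is not nullable");
        }
        nullCount++;
    }
}

class SetNullSketch {
    // Returns true if the null was written, false if the writer rejects nulls.
    static boolean trySetNull(NullableWriterSketch writer) {
        if (!writer.nullable()) {
            return false;   // skip the call that would throw
        }
        writer.setNull();
        return true;
    }

    public static void main(String[] args) {
        System.out.println(trySetNull(new StubWriter(true)));    // true
        System.out.println(trySetNull(new StubWriter(false)));   // false
    }
}
```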