Class ObjectArrayWriter

All Implemented Interfaces:
ArrayWriter, ColumnWriter, WriterEvents, WriterPosition
Direct Known Subclasses:
ListWriterImpl, ObjectDictWriter, RepeatedListWriter

public class ObjectArrayWriter extends AbstractArrayWriter.BaseArrayWriter
Writer for an array of either a map or another array. Here, the contents are a structure and need explicit saves. State transitions in addition to the base class are:
Public API        Array Event    Offset Event    Element Event
save() (array)    saveValue()    saveValue()     saveValue()
This class is used for arrays of maps (and for arrays of arrays). When used with a map, we have a single offset vector pointing into a group of arrays. Consider the simple case of a map of three scalars. Here, we have a hybrid of the states discussed for the BaseScalarWriter and those discussed for OffsetVectorWriterImpl. That is, the offset vector points into one map element. The individual elements can be Behind, Written or Unwritten, depending on the specific actions taken by the client.

For example:


   Offset Vector      Vector A     Vector B    Vector C       Index
       |    |   + - >   |X| < lwa    |Y|         |Z|            8
  lw > |  8 | - +       | |          |Y|         |Z|            9
   v > | 10 | - - - >   | |          |Y|         |Z|           10
       |    |           | |          |Y| < lwb   |Z|           11
       |    |      v' > | |          | |         |Z| < lwc     12
 
In the above:
  • The last write index, lw, for the current row points to the previous start position. (Recall that finishing the row writes the end position into the entry for the next row.)
  • The top-level vector index, v, points to the start position of the current row, which is offset 10 in all three data vectors.
  • The current array write position, v', is for the third element of the array that starts at position 10.
  • Since the row is active, the end position of the row has not yet been written, and so is blank in the offset vector.
  • The previous row had a two-element map array written, starting at offset 8 and ending at offset 9 (inclusive), identified by writing the next start offset (exclusive) into the following offset array slot.
  • Column A has not had data written since the first element of the previous row. It is currently in the Behind state with the last write position for A, lwa, pointing to the last write.
  • Column B is in the Unwritten state. A value was written for the previous element in the map array, but not for the current element. We see this by the fact that the last write position for B, lwb, is one behind v'.
  • Column C has been written for the current array element and is in the Written state, with the last write position, lwc, pointing to the same location as v'.
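The three states in the list above can be read directly from the distance between a column's last write position and the array write position v'. The following is a minimal sketch of that rule, not Drill code: the class ColumnStateSketch, the State enum, and the classify method are hypothetical names introduced only for illustration, and the sketch assumes the last write position never exceeds v'.

```java
public class ColumnStateSketch {

    // Hypothetical names for the three scalar writer states discussed above.
    enum State { BEHIND, UNWRITTEN, WRITTEN }

    // lastWrite: index of the last value written to the column (lwa, lwb, lwc).
    // writePos:  current array write position (v' in the diagram).
    static State classify(int lastWrite, int writePos) {
        if (lastWrite > writePos) {
            throw new IllegalArgumentException("last write is ahead of write position");
        }
        if (lastWrite == writePos) {
            return State.WRITTEN;      // value already written for this element
        }
        if (lastWrite == writePos - 1) {
            return State.UNWRITTEN;    // previous element written, current one not yet
        }
        return State.BEHIND;           // one or more earlier elements were skipped
    }

    public static void main(String[] args) {
        int writePos = 12;                                  // v' in the first diagram
        System.out.println("A: " + classify(8, writePos));  // lwa = 8
        System.out.println("B: " + classify(11, writePos)); // lwb = 11
        System.out.println("C: " + classify(12, writePos)); // lwc = 12
    }
}
```

With the values from the first diagram, the sketch reports A as BEHIND, B as UNWRITTEN and C as WRITTEN, matching the list above.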
Suppose we now write to Vector A and end the row:

   Offset Vector      Vector A     Vector B    Vector C       Index
       |    |   + - >   |X|          |Y|         |Z|            8
       |  8 | - +       |0|          |Y|         |Z|            9
  lw > | 10 | - - - >   |0|          |Y|         |Z|           10
   v > | 13 | - +       |0|          |Y| < lwb   |Z|           11
       |    |   |       |X| < lwa    | |         |Z| < lwc     12
       |    |   + - >   | |          | |         | | < v'      13
 
Here:
  • Vector A has been back-filled and the last write index advanced.
  • Vector B is now in the Behind state. Vectors A and C are in the Unwritten state.
  • The end position has been written to the offset vector, the offset vector last write position has been advanced, and the top-level vector offset has advanced.
All this happens automatically as part of the indexing mechanisms. The key reason to understand this flow is to understand what happens in vector overflow: unlike an array of scalars, in which the data vector can never be in the Behind state, when we have an array of maps then each vector can be in any of the scalar writer states.
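The transition between the two diagrams can be imitated with plain arrays. The sketch below is an illustration under stated assumptions, not the Drill implementation: IntColumn and endRow are hypothetical names, unwritten slots are marked with -1 so that the back-fill is visible, and skipped positions are assumed to be filled with zero as shown for Vector A.

```java
import java.util.Arrays;

public class BackFillSketch {

    // Hypothetical stand-in for one data vector plus its last write index.
    static final class IntColumn {
        final int[] data;
        int lastWrite = -1;

        IntColumn(int capacity) {
            data = new int[capacity];
            Arrays.fill(data, -1);     // -1 marks a slot that was never touched
        }

        // Write a value at the given position, zero-filling any skipped slots.
        void write(int index, int value) {
            for (int i = lastWrite + 1; i < index; i++) {
                data[i] = 0;           // back-fill the gap left while Behind
            }
            data[index] = value;
            lastWrite = index;
        }
    }

    // Ending a row writes the (exclusive) end position into the next offset
    // slot and returns the slot the top-level index v now points at.
    static int endRow(int[] offsets, int rowSlot, int endPosition) {
        offsets[rowSlot + 1] = endPosition;
        return rowSlot + 1;
    }

    public static void main(String[] args) {
        IntColumn a = new IntColumn(16);
        a.write(8, 7);                 // the X at index 8 (lwa = 8)
        a.write(12, 7);                // write to Vector A at v' = 12

        int[] offsets = new int[4];
        offsets[1] = 8;                // start of the previous row
        offsets[2] = 10;               // start of the current row (v)
        int v = endRow(offsets, 2, 13);

        System.out.println(Arrays.toString(Arrays.copyOfRange(a.data, 8, 14)));
        // prints [7, 0, 0, 0, 7, -1]
        System.out.println(Arrays.toString(offsets) + " v=" + v);
        // prints [0, 8, 10, 13] v=3
    }
}
```

Positions 9 through 11 of the column are zero-filled by the single write at 12, and the offset slot following the current row receives 13, which mirrors the second diagram.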
  • Constructor Details

  • Method Details

    • save

      public void save()
      Description copied from interface: ArrayWriter
      When the array contains a tuple or an array, call save() after each array value. Not necessary when writing scalars; each set operation calls save automatically.
    • setObject

      public void setObject(Object array)
      Description copied from interface: ColumnWriter
      Generic technique to write data as a generic Java object. The type of the object must match the target writer. Primarily for testing.
      • Scalar: The type of the Java object must match the type of the target vector. String or byte[] can be used for Varchar vectors.
      • Array: Write the array given an array of values. The object must be a Java array. The type of the array must match the type of element in the repeated vector. That is, if the vector is a Repeated Int, provide an int[] array.
      • Tuple (map or row): The Java object must be an array of objects in which the members of the array have a 1:1 correspondence with the members of the tuple in the order defined by the writer metadata. That is, if the map is (Int, Varchar), provide an Object[] array like this: {10, "fred"}.
      • Union: Uses the Java object type to determine the type of the backing vector. Creates a vector of the required type if needed.
      Parameters:
      array - value to write to the vector. The Java type of the object indicates the Drill storage type
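For an array of maps, the Array and Tuple rules above combine: the outer Java array holds one Object[] per map element. The sketch below shows that shape together with the save-after-each-value pattern described for save(). It does not use Drill classes. RecordingArrayWriter and its methods are hypothetical stand-ins, and the loop in its setObject is an assumption about how such a call could be decomposed, not a statement of how ObjectArrayWriter is implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class MapArrayObjectSketch {

    // Hypothetical stand-in for an array-of-maps writer; it only records
    // what was saved so the calling pattern can be inspected.
    static final class RecordingArrayWriter {
        final List<Object[]> saved = new ArrayList<>();
        private Object[] current;

        // Stage one map element, given as an Object[] in member order.
        void setElement(Object tuple) {
            current = (Object[]) tuple;
        }

        // Commit the staged element as the next array entry.
        void save() {
            saved.add(current);
            current = null;
        }

        // Assumed decomposition: one element write plus one save per entry.
        void setObject(Object array) {
            for (Object element : (Object[]) array) {
                setElement(element);
                save();
            }
        }
    }

    public static void main(String[] args) {
        // A two-element array of (Int, Varchar) maps.
        Object[] mapArray = {
            new Object[] {10, "fred"},
            new Object[] {20, "barney"}
        };

        RecordingArrayWriter writer = new RecordingArrayWriter();
        writer.setObject(mapArray);
        System.out.println(writer.saved.size()); // prints 2
    }
}
```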