public class OffsetVectorWriterImpl extends AbstractFixedWidthWriter implements OffsetVectorWriter
Note that the lastWriteIndex tracked here corresponds to the data values; it is one less than the actual offset vector last write index due to the nature of offset vector layouts. The selection of last write index basis makes roll-over processing easier as only this writer need know about the +1 translation required for writing.
The states illustrated in the base class apply here as well, remembering that the end offset for a row (or array position) is written one ahead of the vector index.
The vector index does create an interesting dynamic for the child writers. From the child writer's perspective, the states described in the super class are the only states of interest. Here we want to take the perspective of the parent.
The offset vector is an implementation of a repeat level. A repeat level can occur for a single array, or for a collection of columns within a repeated map. (A repeat level also occurs for variable-width fields, but this is a bit harder to see, so let's ignore that for now.)
The key point to realize is that each repeat level introduces an isolation level in terms of indexing. That is, empty values in the outer level have no effect on indexing in the inner level. In fact, the nature of a repeated outer level means that there are no empties in the inner level.
To illustrate:
      Offset Vector          Data Vector   Indexes
 lw, v > | 10 |   - - - - - >   | X |        10
         | 12 |   - - +         | X | < lw'  11
         |    |       + - - >   |   | < v'   12
In the above, the client has just written an array of two elements
at the current write position. The data starts at offset 10 in
the data vector, and the next write will be at 12. The end offset
is written one ahead of the vector index.
From the data vector's perspective, its last-write (lw') reflects the last element written. If this is an array of scalars, then the write index is automatically incremented, as illustrated by v'. (For map arrays, the index must be incremented by calling save() on the map array writer.)
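The layout in the diagram can be modeled with a plain int array. What follows is a minimal sketch under stated assumptions, not Drill's implementation: the class OffsetLayoutSketch, its fields, and its length() helper are hypothetical names introduced for illustration, and setNextOffset here only borrows the name of the method listed in this class.

```java
/** Hypothetical, simplified model of an offset vector; not Drill's code. */
final class OffsetLayoutSketch {
    // offsets[row] is where the row's array starts in the data vector;
    // offsets[row + 1] is where it ends.
    final int[] offsets;

    // Tracked in terms of data values (rows), so it is one less than the
    // last position actually written in the offsets array.
    int lastWriteIndex = -1;

    OffsetLayoutSketch(int rowCapacity, int firstOffset) {
        offsets = new int[rowCapacity + 1];
        offsets[0] = firstOffset;
    }

    /** Writes the end offset for a row one slot ahead of the row index. */
    void setNextOffset(int rowIndex, int endOffset) {
        offsets[rowIndex + 1] = endOffset;   // the "+1 translation"
        lastWriteIndex = rowIndex;
    }

    /** Array length is the difference between two successive offsets. */
    int length(int rowIndex) {
        return offsets[rowIndex + 1] - offsets[rowIndex];
    }

    public static void main(String[] args) {
        // Mirrors the diagram: the array starts at data offset 10 and
        // holds two elements, so the end offset 12 lands one slot ahead.
        OffsetLayoutSketch sketch = new OffsetLayoutSketch(4, 10);
        sketch.setNextOffset(0, 12);
        System.out.println(sketch.length(0));
    }
}
```

Running main prints 2, the length of the array just written. Keeping lastWriteIndex in row terms means callers never see the extra slot; only the writer itself applies the +1.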
Suppose the client now skips some arrays:
      Offset Vector          Data Vector
    lw > | 10 |   - - - - - >   | X |        10
         | 12 |   - - +         | X | < lw'  11
         |    |       + - - >   |   | < v'   12
         |    |                 |   |        13
     v > |    |                 |   |        14
The last write position does not move and there are gaps in the
offset vector. The vector index points to the current row. Note
that the data vector's last write and vector indexes do not change;
this reflects the fact that the data vector's vector index
(v') matches the tail offset.
The client now writes a three-element array:
      Offset Vector          Data Vector
         | 10 |   - - - - - >   | X |        10
         | 12 |   - - +         | X |        11
         | 12 |   - - + - - >   | Y |        12
         | 12 |   - - +         | Y |        13
 lw, v > | 12 |   - - +         | Y | < lw'  14
         | 15 |   - - - - - >   |   | < v'   15
Quite a bit just happened. The empty offset slots were back-filled
with the last write offset in the data vector. The client wrote
three values, which advanced the last write and vector indexes
in the data vector. And, the last write index in the offset
vector also moved to reflect the update of the offset vector.
Note that as a result, multiple positions in the offset vector
point to the same location in the data vector. This is fine; we
compute the number of entries as the difference between two successive
offset vector positions, so the empty positions have become 0-length
arrays.
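The back-fill step described above can be sketched as follows. This is an illustrative model only, assuming a plain int array in place of the real vector; BackFillSketch, writeArray, and length are hypothetical names and do not correspond to methods of this class.

```java
import java.util.Arrays;

/** Hypothetical model of back-filling skipped rows; not Drill's code. */
final class BackFillSketch {
    final int[] offsets;
    int lastWriteIndex;   // last row whose end offset has been written

    BackFillSketch(int[] offsets, int lastWriteIndex) {
        this.offsets = offsets;
        this.lastWriteIndex = lastWriteIndex;
    }

    /** Writes an array of elementCount values at rowIndex, filling gaps. */
    void writeArray(int rowIndex, int elementCount) {
        // The end offset of the last written row is the data vector's tail.
        int tail = offsets[lastWriteIndex + 1];
        // Skipped rows start and end at the tail: zero-length arrays.
        for (int row = lastWriteIndex + 1; row < rowIndex; row++) {
            offsets[row + 1] = tail;
        }
        offsets[rowIndex + 1] = tail + elementCount;
        lastWriteIndex = rowIndex;
    }

    /** Array length is the difference between two successive offsets. */
    int length(int rowIndex) {
        return offsets[rowIndex + 1] - offsets[rowIndex];
    }

    public static void main(String[] args) {
        // State after the first example: row 0 spans data offsets 10..12.
        int[] offsets = new int[6];
        offsets[0] = 10;
        offsets[1] = 12;
        BackFillSketch sketch = new BackFillSketch(offsets, 0);

        // The client skipped rows 1 to 3 and writes three values at row 4.
        sketch.writeArray(4, 3);
        System.out.println(Arrays.toString(sketch.offsets));
    }
}
```

Running main prints [10, 12, 12, 12, 12, 15], matching the third diagram: rows 1 through 3 have length 0 and row 4 has length 3.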
Note that, for an array of scalars, when overflow occurs, we need only worry about two states in the data vector. Either data has been written for the row (as in the third example above), and so must be moved to the roll-over vector, or no data has been written and no move is needed. We never have to worry about missing values because they cannot occur in the data vector.
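The two overflow states can be illustrated with a small sketch. It assumes the data vector is a plain int array and that the in-flight row occupies the range from its start offset to the current tail; RolloverSketch and carryOver are hypothetical names, not part of this class or of Drill's roll-over logic.

```java
import java.util.Arrays;

/** Hypothetical model of the two roll-over states; not Drill's code. */
final class RolloverSketch {
    /**
     * Returns the values of the in-flight row that must move to the
     * roll-over vector. The result is empty when nothing was written.
     */
    static int[] carryOver(int[] data, int rowStartOffset, int tailOffset) {
        if (tailOffset == rowStartOffset) {
            // State 2: no data written for the row, so no move is needed.
            return new int[0];
        }
        // State 1: data was written for the row and must be moved.
        return Arrays.copyOfRange(data, rowStartOffset, tailOffset);
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        // The row started at offset 2 and the tail is now at 5.
        System.out.println(Arrays.toString(carryOver(data, 2, 5)));
        // The row started at the tail: nothing to carry over.
        System.out.println(carryOver(data, 5, 5).length);
    }
}
```

Because the data vector has no gaps, the range between the row start and the tail is always fully populated, which is why a single contiguous copy is enough in this model.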
See ObjectArrayWriter for information about arrays of maps (arrays of multiple columns).
The second way to fill empties is in the data vector. The data vector may choose
to fill the four "empty" slots with a value, say "X". In this case, it is up to
the data vector to fill in the values, calling into this vector to set each
offset. Note that when doing this, the calls are a bit different than for writing
a regular value because we want to write at the "last write position", not the
current row position. See BaseVarWidthWriter for an example.
AbstractFixedWidthWriter.BaseFixedWidthWriter, AbstractFixedWidthWriter.BaseIntWriter
AbstractScalarWriterImpl.ScalarObjectWriter
WriterEvents.ColumnWriterListener, WriterEvents.State
Modifier and Type | Field and Description |
---|---|
protected int | nextOffset: Cached value of the end offset for the current value. |
lastWriteIndex
capacity, drillBuf, emptyValue, listener, MIN_BUFFER_SIZE
schema, vectorIndex
Constructor and Description |
---|
OffsetVectorWriterImpl(UInt4Vector vector) |
Modifier and Type | Method and Description |
---|---|
void | copy(ColumnReader from): Copy a single value from the given reader, which must be of the same type as this writer. |
void | dump(HierarchicalFormatter format) |
protected void | fillEmpties(int fillCount) |
void | fillOffset(int newOffset) |
int | nextOffset() |
void | postRollover(): The vectors backing this writer rolled over. |
int | prepareFill() |
protected int | prepareWrite(): Return the write offset, which is one greater than the index reported by the vector index. |
void | preRollover(): The vectors backing this vector are about to roll over. |
protected void | realloc(int size) |
void | restartRow(): During a write to a row, rewind the current index position to restart the row. |
void | reviseOffset(int newOffset) |
int | rowStartOffset() |
void | setDefaultValue(Object value): Set the default value to be used to fill empties for this writer. |
void | setNextOffset(int newOffset) |
void | setValue(Object value): Write value to a vector as a Java object of the "native" type for the column. |
void | setValueCount(int valueCount) |
void | skipNulls() |
void | startRow(): Start a new row. |
void | startWrite(): Start a write (batch) operation. |
ValueType | valueType(): Describe the type of the value. |
BaseDataValueVector | vector() |
int | width() |
endWrite, lastWriteIndex, mandatoryResize, resize, setBuffer, setLastWriteIndex
appendBytes, bindListener, bindSchema, canExpand, nullable, overflowed, setBoolean, setBytes, setDate, setDecimal, setDouble, setFloat, setInt, setLong, setNull, setPeriod, setString, setTime, setTimestamp
bindIndex, endArrayValue, isProjected, rowStartIndex, saveRow, schema, type, writeIndex
conversionError, extendedType, setObject, toString
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
extendedType
isProjected, nullable, schema, setNull, setObject, type
appendBytes, setBoolean, setBytes, setDate, setDecimal, setDouble, setFloat, setInt, setLong, setNull, setPeriod, setString, setTime, setTimestamp
bindIndex, bindListener, endArrayValue, endWrite, saveRow
lastWriteIndex, rowStartIndex, writeIndex
protected int nextOffset
Cached value of the end offset for the current value. See endValue().
public OffsetVectorWriterImpl(UInt4Vector vector)
public BaseDataValueVector vector()
vector in class AbstractScalarWriterImpl
public int width()
width in class AbstractFixedWidthWriter
protected void realloc(int size)
realloc in class BaseScalarWriter
public ValueType valueType()
Description copied from interface: ScalarWriter
Describe the type of the value.
valueType in interface ScalarWriter
public void startWrite()
Description copied from interface: WriterEvents
Start a write (batch) operation.
startWrite in interface WriterEvents
startWrite in class AbstractFixedWidthWriter
public int nextOffset()
nextOffset in interface OffsetVectorWriter
public int rowStartOffset()
rowStartOffset in interface OffsetVectorWriter
public void startRow()
Description copied from interface: WriterEvents
Start a new row. To restart a row, call WriterEvents.restartRow() instead.
startRow in interface WriterEvents
startRow in class AbstractScalarWriterImpl
protected final int prepareWrite()
public final int prepareFill()
protected final void fillEmpties(int fillCount)
fillEmpties in class AbstractFixedWidthWriter
public final void setNextOffset(int newOffset)
setNextOffset in interface OffsetVectorWriter
public final void reviseOffset(int newOffset)
public final void fillOffset(int newOffset)
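public final void setValue(Object value)
Description copied from interface: ValueWriter
Write value to a vector as a Java object of the "native" type for the column. Primarily to be used when the code already knows the object type.
setValue in interface ValueWriter
value - a value that matches the primary setter above, or null to set the column to null
See also: setObject for the generic case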
public void skipNulls()
skipNulls in class AbstractFixedWidthWriter
public void restartRow()
Description copied from interface: WriterEvents
During a write to a row, rewind the current index position to restart the row.
restartRow in interface WriterEvents
restartRow in class AbstractFixedWidthWriter
public void preRollover()
Description copied from interface: WriterEvents
The vectors backing this vector are about to roll over.
preRollover in interface WriterEvents
preRollover in class AbstractFixedWidthWriter
public void postRollover()
Description copied from interface: WriterEvents
The vectors backing this writer rolled over.
postRollover in interface WriterEvents
postRollover in class AbstractFixedWidthWriter
public void setValueCount(int valueCount)
setValueCount in class AbstractFixedWidthWriter
public void dump(HierarchicalFormatter format)
dump in interface OffsetVectorWriter
dump in interface WriterEvents
dump in class AbstractFixedWidthWriter
public void setDefaultValue(Object value)
Description copied from interface: ScalarWriter
Set the default value to be used to fill empties for this writer.
setDefaultValue in interface ScalarWriter
value - the value to set. Cannot be null. The type of the value must be one that is legal for ValueWriter.setValue(Object)
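public void copy(ColumnReader from)
Description copied from interface: ColumnWriter
Copy a single value from the given reader, which must be of the same type as this writer.
copy in interface ColumnWriter
from - reader to provide the data
Copyright © 1970 The Apache Software Foundation. All rights reserved.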