public abstract class BatchGroup extends Object implements VectorAccessible, AutoCloseable
The batches are defined by a schema which can change over time. When the schema changes, all existing and new batches are coerced into the new schema. Provides a uniform way to iterate over records for one or more batches whether the batches are in memory or on disk.
The BatchGroup operates in two modes, as given by its two subclasses:
- InputBatch: Used to buffer in-memory batches prior to spilling.
- SpilledRun: Holds a "memento" to a set of batches written to disk. Acts as both a reader and writer for those batches.

Modifier and Type | Field and Description |
---|---|
protected BufferAllocator | allocator |
protected VectorContainer | currentContainer |
protected int | mergeIndex: This class acts as both "holder" for a vector container and an iterator into that container when the sort enters the merge phase. |
protected BatchSchema | schema |
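The two modes described above can be illustrated with a small, self-contained sketch. It does not use the Drill classes: `RecordGroup`, `InMemoryGroup`, `SpilledGroup`, and `GroupSketch` are hypothetical stand-ins, and plain integers written to a temporary file stand in for record batches. The only point is that the caller iterates the same way whether the records are in memory or on disk.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for BatchGroup: one iteration contract, two storage modes.
abstract class RecordGroup implements Iterable<Integer>, AutoCloseable {
  @Override
  public abstract void close() throws IOException;
}

// Stand-in for InputBatch: records are buffered in memory.
final class InMemoryGroup extends RecordGroup {
  private final List<Integer> records;

  InMemoryGroup(List<Integer> records) {
    this.records = new ArrayList<>(records);
  }

  @Override
  public Iterator<Integer> iterator() {
    return records.iterator();
  }

  @Override
  public void close() {
    records.clear();
  }
}

// Stand-in for SpilledRun: keeps only a handle (the file path) to records on
// disk, and acts as both the writer and the reader of that file.
final class SpilledGroup extends RecordGroup {
  private final Path file;
  private final int count;

  SpilledGroup(List<Integer> records) throws IOException {
    file = Files.createTempFile("spill", ".bin");
    count = records.size();
    try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
      for (int r : records) {
        out.writeInt(r);
      }
    }
  }

  @Override
  public Iterator<Integer> iterator() {
    // Read the spilled records back from disk on demand.
    List<Integer> loaded = new ArrayList<>(count);
    try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
      for (int i = 0; i < count; i++) {
        loaded.add(in.readInt());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return loaded.iterator();
  }

  @Override
  public void close() throws IOException {
    Files.deleteIfExists(file);
  }
}

final class GroupSketch {
  // The caller iterates the same way regardless of where the records live.
  static int sum(RecordGroup group) {
    int total = 0;
    for (int r : group) {
      total += r;
    }
    return total;
  }

  // Builds one group of each kind from the same data and sums both.
  static int[] demo() {
    List<Integer> data = Arrays.asList(3, 1, 2);
    try (RecordGroup inMemory = new InMemoryGroup(data);
         RecordGroup spilled = new SpilledGroup(data)) {
      return new int[] {sum(inMemory), sum(spilled)};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo())); // prints [6, 6]
  }
}
```

Reading the whole file into a list inside `iterator()` is a simplification of this sketch; it says nothing about how SpilledRun reads its batches.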
Constructor and Description |
---|
BatchGroup(VectorContainer container, BufferAllocator allocator) |
Modifier and Type | Method and Description |
---|---|
void | close() |
static void | closeAll(Collection<? extends BatchGroup> groups) |
VectorContainer | getContainer() |
int | getNextIndex() |
int | getRecordCount(): Get the number of records. |
BatchSchema | getSchema(): Get the schema of the current RecordBatch. |
SelectionVector2 | getSelectionVector2() |
SelectionVector4 | getSelectionVector4() |
int | getUnfilteredRecordCount() |
VectorWrapper<?> | getValueAccessorById(Class<?> clazz, int... ids) |
TypedFieldId | getValueVectorId(SchemaPath path): Get the value vector type and id for the given schema path. |
Iterator<VectorWrapper<?>> | iterator() |
void | setSchema(BatchSchema schema): Updates the schema for this batch group. |
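Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.lang.Iterable: forEach, spliterator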
protected final BufferAllocator allocator
protected VectorContainer currentContainer
protected int mergeIndex
protected BatchSchema schema
public BatchGroup(VectorContainer container, BufferAllocator allocator)
public void setSchema(BatchSchema schema)
Updates the schema for this batch group.
Parameters: schema
public int getNextIndex()
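The field summary describes this class as both a holder for a container and an iterator into it during the merge phase. The sketch below shows one way such a holder-plus-cursor can drive a k-way merge. It is not the Drill implementation: `RunCursor`, `MergeSketch`, and the `-1` end-of-run sentinel returned by `nextIndex()` are assumptions made for illustration, and a sorted `int[]` stands in for the vector container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical holder-plus-cursor: owns one sorted run and remembers how far
// the merge has read into it.
final class RunCursor {
  final int[] sortedRun;   // stand-in for the vector container
  private int mergeIndex;  // next unread position in sortedRun

  RunCursor(int[] sortedRun) {
    this.sortedRun = sortedRun;
  }

  // Returns the next unread index and advances the cursor, or -1 when the run
  // is exhausted. The -1 sentinel is an assumption of this sketch.
  int nextIndex() {
    return mergeIndex < sortedRun.length ? mergeIndex++ : -1;
  }
}

final class MergeSketch {
  // Heap entry: a value plus the cursor it was read from.
  private static final class Head {
    final int value;
    final RunCursor source;

    Head(int value, RunCursor source) {
      this.value = value;
      this.source = source;
    }
  }

  // Merges several sorted runs by repeatedly taking the smallest head value
  // and refilling the heap from the cursor that value came from.
  static List<Integer> merge(List<RunCursor> cursors) {
    PriorityQueue<Head> heap =
        new PriorityQueue<>((a, b) -> Integer.compare(a.value, b.value));
    for (RunCursor c : cursors) {
      int i = c.nextIndex();
      if (i >= 0) {
        heap.add(new Head(c.sortedRun[i], c));
      }
    }
    List<Integer> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      Head h = heap.poll();
      out.add(h.value);
      int i = h.source.nextIndex();
      if (i >= 0) {
        heap.add(new Head(h.source.sortedRun[i], h.source));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<RunCursor> runs = Arrays.asList(
        new RunCursor(new int[] {1, 4, 9}),
        new RunCursor(new int[] {2, 3, 10}));
    System.out.println(merge(runs)); // prints [1, 2, 3, 4, 9, 10]
  }
}
```

Keeping the cursor inside the holder means the merge loop needs no separate bookkeeping per run; it only asks each run for its next index.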
public VectorContainer getContainer()
public void close() throws IOException
Specified by: close in interface AutoCloseable
Throws: IOException
public VectorWrapper<?> getValueAccessorById(Class<?> clazz, int... ids)
Specified by: getValueAccessorById in interface VectorAccessible
public TypedFieldId getValueVectorId(SchemaPath path)
Description copied from interface: VectorAccessible
Get the value vector type and id for the given schema path.
Specified by: getValueVectorId in interface VectorAccessible
Parameters: path - the path where the vector should be located.
public BatchSchema getSchema()
Description copied from interface: VectorAccessible
Get the schema of the current RecordBatch.
Specified by: getSchema in interface VectorAccessible
public int getRecordCount()
Description copied from interface: VectorAccessible
Get the number of records.
Specified by: getRecordCount in interface VectorAccessible
public int getUnfilteredRecordCount()
public Iterator<VectorWrapper<?>> iterator()
Specified by: iterator in interface Iterable<VectorWrapper<?>>
public SelectionVector2 getSelectionVector2()
Specified by: getSelectionVector2 in interface VectorAccessible
public SelectionVector4 getSelectionVector4()
Specified by: getSelectionVector4 in interface VectorAccessible
public static void closeAll(Collection<? extends BatchGroup> groups)
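This page does not document how closeAll behaves. Its signature declares no checked exception while close() throws IOException, so an implementation has to handle that exception internally. The following is a minimal sketch of one plausible approach, not the Drill implementation: `CloseAllSketch` and `Resource` are hypothetical names, and the policy shown (attempt every close, then rethrow the first failure unchecked with later failures attached as suppressed) is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collection;

final class CloseAllSketch {
  // Hypothetical resource whose close() may fail, standing in for a BatchGroup.
  static final class Resource implements AutoCloseable {
    final boolean failOnClose;
    boolean closed;

    Resource(boolean failOnClose) {
      this.failOnClose = failOnClose;
    }

    @Override
    public void close() throws IOException {
      closed = true;
      if (failOnClose) {
        throw new IOException("close failed");
      }
    }
  }

  // Attempts to close every resource. The first failure is rethrown unchecked
  // after all closes have been attempted; later failures are attached to it
  // as suppressed exceptions. This policy is an assumption of the sketch.
  static void closeAll(Collection<? extends Resource> resources) {
    IOException first = null;
    for (Resource r : resources) {
      try {
        r.close();
      } catch (IOException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw new UncheckedIOException(first);
    }
  }

  public static void main(String[] args) {
    Resource bad = new Resource(true);
    Resource ok = new Resource(false);
    try {
      closeAll(Arrays.asList(bad, ok));
    } catch (UncheckedIOException e) {
      // The failing resource did not prevent the other one from being closed.
      System.out.println("ok closed: " + ok.closed); // prints ok closed: true
    }
  }
}
```

Attempting every close before reporting a failure avoids leaking the remaining resources when an early one fails.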
Copyright © 1970 The Apache Software Foundation. All rights reserved.