All Superinterfaces: Iterable<VectorWrapper<?>>, VectorAccessible

All Known Subinterfaces: CloseableRecordBatch

All Known Implementing Classes: AbstractBinaryRecordBatch, AbstractHashBinaryRecordBatch, AbstractRecordBatch, AbstractSingleRecordBatch, AbstractTableFunctionRecordBatch, AbstractUnaryRecordBatch, DrillRecordReader, ExternalSortBatch, FilterRecordBatch, FlattenRecordBatch, HashAggBatch, HashJoinBatch, HashSetOpRecordBatch, InsertWriterRecordBatch, IteratorValidatorBatchIterator, LateralJoinBatch, LimitRecordBatch, MergeJoinBatch, MergingRecordBatch, MetadataControllerBatch, MetadataHandlerBatch, MetadataHashAggBatch, MetadataStreamAggBatch, NestedLoopJoinBatch, OperatorRecordBatch, OrderedPartitionRecordBatch, PartitionLimitRecordBatch, ProducerConsumerBatch, ProjectRecordBatch, RangePartitionRecordBatch, RemovingRecordBatch, RowKeyJoinBatch, RuntimeFilterRecordBatch, ScanBatch, SchemalessBatch, SimpleRecordBatch, SortBatch, SpilledRecordBatch, StatisticsAggBatch, StatisticsMergeBatch, StatisticsWriterRecordBatch, StreamingAggBatch, TopNBatch, TopNBatch.SimpleSV4RecordBatch, TraceRecordBatch, UnionAllRecordBatch, UnnestRecordBatch, UnorderedReceiverBatch, UnpivotMapsRecordBatch, WindowFrameRecordBatch, WriterRecordBatch

public interface RecordBatch extends VectorAccessible

A record batch contains a set of field values for a particular range of records.

When a record batch is composed of ValueVectors, the batch should ideally fit within the L2 cache (about 256 kB per core). The set of value vectors does not change except during a call to next() that returns RecordBatch.IterOutcome.OK_NEW_SCHEMA.
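
The contract above suggests how a consumer might drive a batch: call next() repeatedly, refresh schema-dependent state only on OK_NEW_SCHEMA, and stop at NONE. The sketch below is a minimal illustration using stand-in types rather than the real Drill classes; Outcome, FakeBatch and drain are hypothetical, and only the outcome names OK_NEW_SCHEMA, OK and NONE come from this page.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class NextLoopSketch {
    // Stand-in for RecordBatch.IterOutcome; only values named on this page are modeled.
    enum Outcome { OK_NEW_SCHEMA, OK, NONE }

    // Hypothetical upstream that replays a scripted sequence of outcomes.
    static final class FakeBatch {
        private final Deque<Outcome> script;

        FakeBatch(Outcome... outcomes) {
            script = new ArrayDeque<>(Arrays.asList(outcomes));
        }

        Outcome next() {
            return script.isEmpty() ? Outcome.NONE : script.poll();
        }
    }

    // Drains the upstream and returns {batches consumed, schema rebuilds}.
    static int[] drain(FakeBatch upstream) {
        int batches = 0;
        int schemaRebuilds = 0;
        while (true) {
            Outcome outcome = upstream.next();
            if (outcome == Outcome.NONE) {
                break; // end of data: next() must not be called again
            }
            if (outcome == Outcome.OK_NEW_SCHEMA) {
                schemaRebuilds++; // the set of vectors changed: re-resolve schema-dependent state
            }
            batches++; // in this sketch, OK and OK_NEW_SCHEMA both count as a readable batch
        }
        return new int[] { batches, schemaRebuilds };
    }

    public static void main(String[] args) {
        FakeBatch upstream =
            new FakeBatch(Outcome.OK_NEW_SCHEMA, Outcome.OK, Outcome.OK, Outcome.NONE);
        int[] counts = drain(upstream);
        System.out.println("batches=" + counts[0] + ", schemaRebuilds=" + counts[1]);
    }
}
```

A real consumer would cache vector references after each OK_NEW_SCHEMA and reuse them across subsequent OK results, since the vector set is stable in between.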

The Iterator provided by a record batch must yield its entries in the same ordinal positions as the field IDs returned by getValueVectorId(org.apache.drill.common.expression.SchemaPath).
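
To illustrate the alignment rule, the sketch below models a batch as an ordered list of column names: the ID returned for a path equals that column's position in iteration order, and an unknown path yields null. FieldIdSketch and idFor are hypothetical, and plain String and Integer stand in for SchemaPath and TypedFieldId.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FieldIdSketch implements Iterable<String> {
    private final List<String> columns;

    FieldIdSketch(String... columns) {
        this.columns = Arrays.asList(columns);
    }

    @Override
    public Iterator<String> iterator() {
        return columns.iterator();
    }

    // Stand-in for getValueVectorId: the ID is the ordinal position in iteration order.
    Integer idFor(String path) {
        int index = columns.indexOf(path);
        if (index < 0) {
            return null; // no field matches this path
        }
        return index;
    }

    public static void main(String[] args) {
        FieldIdSketch batch = new FieldIdSketch("id", "name", "price");
        int position = 0;
        for (String column : batch) {
            // The ID looked up by path should equal the position seen while iterating.
            System.out.println(column + ": id=" + batch.idFor(column) + ", position=" + position);
            position++;
        }
        System.out.println("missing: id=" + batch.idFor("missing"));
    }
}
```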

Nested Class Summary

Nested Classes

Modifier and Type

Interface

Description

static enum

RecordBatch.IterOutcome

Describes the outcome of incrementing RecordBatch forward by a call to next().
Field Summary

Fields

Modifier and Type

Field

Description

static final int

MAX_BATCH_ROW_COUNT

Maximum number of rows in a batch, limited by the 2-byte entries of SV2: 65536 = 2^16.
Method Summary

Modifier and Type

Method

Description

void

cancel()

Informs child operators that no more data is needed.

void

dump()

Dumps this batch's state to the logs.

VectorContainer

getContainer()

Returns the internal vector container.

FragmentContext

getContext()

Gets the FragmentContext of the current query fragment.

VectorContainer

getOutgoingContainer()

BatchSchema

getSchema()

Gets the current schema of this record batch.

VectorWrapper<?>

getValueAccessorById(Class<?> clazz, int... ids)

TypedFieldId

getValueVectorId(SchemaPath path)

Gets the value vector type and ID for the given schema path.

WritableBatch

getWritableBatch()

Gets a writable version of this batch.

RecordBatch.IterOutcome

next()

Updates the data in each Field reading interface for the next range of records.

Methods inherited from interface java.lang.Iterable
forEach, iterator, spliterator

Methods inherited from interface org.apache.drill.exec.record.VectorAccessible
getRecordCount, getSelectionVector2, getSelectionVector4

Field Details
- MAX_BATCH_ROW_COUNT
  
  static final int MAX_BATCH_ROW_COUNT
  
  Maximum number of rows in a batch, limited by the 2-byte entries of SV2: 65536 = 2^16.
  See Also:
  
  Constant Field Values
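
The limit follows from the entry width: a 2-byte unsigned index can address 2^16 distinct row positions. The snippet below is a quick check of that arithmetic, using Java's 16-bit unsigned char as a stand-in for a 2-byte entry. The class and constant names are illustrative and are not part of Drill.

```java
public class BatchLimitSketch {
    // Number of distinct values representable in 2 bytes (16 bits).
    static final int TWO_BYTE_VALUES = 1 << (2 * Byte.SIZE);

    public static void main(String[] args) {
        System.out.println(TWO_BYTE_VALUES);                 // 65536
        System.out.println((int) Character.MAX_VALUE + 1);   // 65536, char being 16-bit unsigned
    }
}
```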
Method Details
- getContext
  
  FragmentContext getContext()
  
  Gets the FragmentContext of the current query fragment. Useful for reporting failure information or other query-level information.
- getSchema
  
  BatchSchema getSchema()
  
  Gets the current schema of this record batch.
  May be called only when the most recent call to next(), if any, returned RecordBatch.IterOutcome.OK_NEW_SCHEMA or RecordBatch.IterOutcome.OK.
  
  The schema changes when and only when next() returns RecordBatch.IterOutcome.OK_NEW_SCHEMA.
  
  Specified by:
  
  getSchema in interface VectorAccessible
  
  Returns:
  
  schema of the current batch
- cancel
  
  void cancel()
  
  Informs child operators that no more data is needed. Only called for "normal" cancellation to avoid unnecessary compute in any worker threads. For the error case, the fragment executor will call close() on each child automatically.
  The operator which triggers the cancel MUST send a NONE status downstream, or throw an exception. It is not legal to call next() on an operator after calling its cancel() method.
- getOutgoingContainer
  
  VectorContainer getOutgoingContainer()
- getContainer
  
  VectorContainer getContainer()
  
  Returns the internal vector container.
  
  Returns:
  
  the internal vector container
- getValueVectorId
  
  TypedFieldId getValueVectorId(SchemaPath path)
  
  Gets the value vector type and ID for the given schema path. The TypedFieldId should store a fieldId which is the same as the ordinal position of the field within the Iterator provided by this class's implementation of Iterable<VectorWrapper<?>>.
  
  Specified by:
  
  getValueVectorId in interface VectorAccessible
  
  Parameters:
  
  path - The path where the vector should be located.
  
  Returns:
  
  The local field ID associated with this vector. If no field matches this path, this returns null.
- getValueAccessorById
  
  VectorWrapper<?> getValueAccessorById(Class<?> clazz, int... ids)
  
  Specified by:
  
  getValueAccessorById in interface VectorAccessible
- next
  
  RecordBatch.IterOutcome next()
  
  Updates the data in each Field reading interface for the next range of records.
  Once a RecordBatch's next() has returned RecordBatch.IterOutcome.NONE or RecordBatch.IterOutcome.STOP, the consumer should no longer call next(). Behavior at this point is undefined and likely to throw an exception.
  
  See RecordBatch.IterOutcome for the protocol (possible sequences of return values).
  
  Returns:
  
  An IterOutcome describing the result of the iteration.
- getWritableBatch
  
  WritableBatch getWritableBatch()
  
  Gets a writable version of this batch. Takes over ownership of existing buffers.
- dump
  
  void dump()
  
  Dumps this batch's state to the logs.
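
The cancel() contract described in the method details can be sketched with stand-in types: an operator that has received enough data cancels its child, never calls the child's next() again, and reports NONE downstream. Child, LimitLikeOperator and the batch counting are hypothetical; only cancel(), next() and the NONE outcome come from this page.

```java
public class CancelSketch {
    // Stand-in for RecordBatch.IterOutcome, reduced to the two values this sketch needs.
    enum Outcome { OK, NONE }

    // Hypothetical child operator that rejects next() after cancel().
    static final class Child {
        private boolean cancelled;
        int nextCalls;

        Outcome next() {
            if (cancelled) {
                throw new IllegalStateException("next() called after cancel()");
            }
            nextCalls++;
            return Outcome.OK; // pretend the child always has another batch
        }

        void cancel() {
            cancelled = true;
        }
    }

    // Hypothetical operator that needs only a fixed number of batches from its child.
    static final class LimitLikeOperator {
        private final Child child;
        private int remaining;
        private boolean done;

        LimitLikeOperator(Child child, int batchesWanted) {
            this.child = child;
            this.remaining = batchesWanted;
        }

        Outcome next() {
            if (done) {
                return Outcome.NONE;
            }
            if (remaining == 0) {
                child.cancel();      // normal cancellation: no more data is needed
                done = true;
                return Outcome.NONE; // the operator that cancels reports NONE downstream
            }
            remaining--;
            return child.next();
        }
    }

    public static void main(String[] args) {
        Child child = new Child();
        LimitLikeOperator limit = new LimitLikeOperator(child, 2);
        int delivered = 0;
        while (limit.next() != Outcome.NONE) {
            delivered++;
        }
        System.out.println("delivered=" + delivered + ", child next() calls=" + child.nextCalls);
    }
}
```

Throwing from Child.next() after cancellation is a choice made for this sketch to surface protocol violations early; the page itself only states that such a call is not legal.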
