Interface RecordBatch

All Superinterfaces:
Iterable<VectorWrapper<?>>, VectorAccessible
All Known Subinterfaces:
CloseableRecordBatch
All Known Implementing Classes:
AbstractBinaryRecordBatch, AbstractHashBinaryRecordBatch, AbstractRecordBatch, AbstractSingleRecordBatch, AbstractTableFunctionRecordBatch, AbstractUnaryRecordBatch, DrillRecordReader, ExternalSortBatch, FilterRecordBatch, FlattenRecordBatch, HashAggBatch, HashJoinBatch, HashSetOpRecordBatch, InsertWriterRecordBatch, IteratorValidatorBatchIterator, LateralJoinBatch, LimitRecordBatch, MergeJoinBatch, MergingRecordBatch, MetadataControllerBatch, MetadataHandlerBatch, MetadataHashAggBatch, MetadataStreamAggBatch, NestedLoopJoinBatch, OperatorRecordBatch, OrderedPartitionRecordBatch, PartitionLimitRecordBatch, ProducerConsumerBatch, ProjectRecordBatch, RangePartitionRecordBatch, RemovingRecordBatch, RowKeyJoinBatch, RuntimeFilterRecordBatch, ScanBatch, SchemalessBatch, SimpleRecordBatch, SortBatch, SpilledRecordBatch, StatisticsAggBatch, StatisticsMergeBatch, StatisticsWriterRecordBatch, StreamingAggBatch, TopNBatch, TopNBatch.SimpleSV4RecordBatch, TraceRecordBatch, UnionAllRecordBatch, UnnestRecordBatch, UnorderedReceiverBatch, UnpivotMapsRecordBatch, WindowFrameRecordBatch, WriterRecordBatch

public interface RecordBatch extends VectorAccessible
A record batch contains a set of field values for a particular range of records.

When a record batch is composed of ValueVectors, a batch should ideally fit within the L2 cache (~256 kB per core). The set of value vectors changes only during a call to next() that returns RecordBatch.IterOutcome.OK_NEW_SCHEMA.

The Iterator provided by a record batch must align with the rank positions of the field IDs returned by getValueVectorId(org.apache.drill.common.expression.SchemaPath): the field ID of a vector equals its ordinal position in the iteration.
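
The alignment rule can be illustrated without any Drill classes. The following is a minimal sketch, assuming a batch that simply stores column names in a list; OrdinalBatchSketch, add, fieldIdOf and iteratorMatchesFieldIds are hypothetical names introduced for illustration and are not part of the Drill API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a batch; not a Drill class.
// Field IDs are handed out as the ordinal position in iteration order.
final class OrdinalBatchSketch implements Iterable<String> {
    private final List<String> columns = new ArrayList<>();

    // Appends a column and returns its field ID (its position in the list).
    int add(String name) {
        columns.add(name);
        return columns.size() - 1;
    }

    // Returns the field ID for a column name, or null when no column matches.
    // This loosely mirrors the "null TypedFieldId" case of getValueVectorId.
    Integer fieldIdOf(String name) {
        int i = columns.indexOf(name);
        return i < 0 ? null : i;
    }

    @Override
    public Iterator<String> iterator() {
        return columns.iterator();
    }

    // Checks the invariant: the n-th element of the iteration has field ID n.
    boolean iteratorMatchesFieldIds() {
        int ordinal = 0;
        for (String name : this) {
            Integer id = fieldIdOf(name);
            if (id == null || id != ordinal) {
                return false;
            }
            ordinal++;
        }
        return true;
    }

    public static void main(String[] args) {
        OrdinalBatchSketch batch = new OrdinalBatchSketch();
        batch.add("id");
        batch.add("name");
        System.out.println(batch.fieldIdOf("name"));          // 1
        System.out.println(batch.fieldIdOf("missing"));       // null
        System.out.println(batch.iteratorMatchesFieldIds());  // true
    }
}
```

A consumer that caches a field ID can then index into the iteration order directly, which is why the two orderings must not diverge.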

  • Field Details

    • MAX_BATCH_ROW_COUNT

      static final int MAX_BATCH_ROW_COUNT
      Maximum number of rows in a batch, limited by the 2-byte entry length of an SV2 (two-byte selection vector): 2^16 = 65536.
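
      The arithmetic behind this limit can be checked in plain Java. This is a small sketch, assuming that an SV2 entry is an unsigned 16-bit index, which Java's char type models; Sv2LimitSketch and distinctIndexes are hypothetical names, not Drill code.

```java
// Illustrative sketch only: shows why a 2-byte index bounds a batch at 65536 rows.
// Sv2LimitSketch is a hypothetical name, not a Drill class.
final class Sv2LimitSketch {
    // Number of distinct values representable in the given number of bytes.
    static int distinctIndexes(int bytes) {
        return 1 << (8 * bytes);
    }

    public static void main(String[] args) {
        int limit = distinctIndexes(2);          // 2^16
        int largestIndex = Character.MAX_VALUE;  // char is Java's unsigned 16-bit type
        System.out.println(limit);                      // 65536
        System.out.println(largestIndex + 1 == limit);  // true
    }
}
```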
  • Method Details

    • getContext

      FragmentContext getContext()
      Gets the FragmentContext of the current query fragment. Useful for reporting failure information or other query-level information.
    • getSchema

      BatchSchema getSchema()
      Gets the current schema of this record batch.

      May be called only when the most recent call to next(), if any, returned RecordBatch.IterOutcome.OK_NEW_SCHEMA or RecordBatch.IterOutcome.OK.

      The schema changes when and only when next() returns RecordBatch.IterOutcome.OK_NEW_SCHEMA.

      Specified by:
      getSchema in interface VectorAccessible
      Returns:
      schema of the current batch
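
      The two rules above (when getSchema() may be called, and when the schema may change) can be expressed as a small state guard. The sketch below is an assumption-laden illustration: SchemaGuardSketch, recordNext and the string-valued schema are hypothetical, and only three outcome values are modeled.

```java
// Hypothetical guard around schema access; not a Drill class.
final class SchemaGuardSketch {
    // Only the outcomes needed for this illustration are modeled.
    enum Outcome { OK_NEW_SCHEMA, OK, NONE }

    private Outcome last;                 // null until next() has been called
    private String schema = "schema-v0";  // placeholder for a real schema object

    // Records an outcome as if next() had just returned it.
    void recordNext(Outcome outcome, String newSchema) {
        last = outcome;
        // The schema changes when and only when the outcome is OK_NEW_SCHEMA.
        if (outcome == Outcome.OK_NEW_SCHEMA) {
            schema = newSchema;
        }
    }

    // Legal before any next() call, or after OK / OK_NEW_SCHEMA.
    String getSchema() {
        boolean legal = last == null
                || last == Outcome.OK
                || last == Outcome.OK_NEW_SCHEMA;
        if (!legal) {
            throw new IllegalStateException("getSchema() not allowed after " + last);
        }
        return schema;
    }

    public static void main(String[] args) {
        SchemaGuardSketch guard = new SchemaGuardSketch();
        guard.recordNext(Outcome.OK_NEW_SCHEMA, "schema-v1");
        System.out.println(guard.getSchema());  // schema-v1
        guard.recordNext(Outcome.OK, "ignored");
        System.out.println(guard.getSchema());  // schema-v1
    }
}
```

      Throwing IllegalStateException is a choice made for this sketch; the text above only states when the call is permitted, not how a violation is reported.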
    • cancel

      void cancel()
      Informs child operators that no more data is needed. Only called for "normal" cancellation to avoid unnecessary compute in any worker threads. For the error case, the fragment executor will call close() on each child automatically.

      The operator which triggers the cancel MUST send a NONE status downstream, or throw an exception. It is not legal to call next() on an operator after calling its cancel() method.
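
      The cancel contract can be sketched with two toy operators. The code below is a simplified model rather than Drill's implementation: CancelSketch, Source and Limit are hypothetical types, and throwing IllegalStateException when next() follows cancel() is an assumption made here to make the rule visible.

```java
// Hypothetical miniature of the cancel protocol; none of these types are Drill classes.
final class CancelSketch {
    enum Outcome { OK, NONE }

    // Upstream operator that produces a fixed number of batches.
    static final class Source {
        private int remaining;
        private boolean cancelled;

        Source(int batches) {
            this.remaining = batches;
        }

        Outcome next() {
            if (cancelled) {
                // Calling next() after cancel() is not legal; this sketch throws.
                throw new IllegalStateException("next() called after cancel()");
            }
            if (remaining == 0) {
                return Outcome.NONE;
            }
            remaining--;
            return Outcome.OK;
        }

        void cancel() {
            cancelled = true;
        }

        boolean isCancelled() {
            return cancelled;
        }
    }

    // Downstream operator that needs only `limit` batches from its child.
    static final class Limit {
        private final Source child;
        private int left;
        private boolean done;

        Limit(Source child, int limit) {
            this.child = child;
            this.left = limit;
        }

        Outcome next() {
            if (done) {
                return Outcome.NONE;
            }
            if (left == 0) {
                child.cancel();       // tell the child no more data is needed
                done = true;          // never call child.next() again
                return Outcome.NONE;  // the canceller reports NONE downstream
            }
            Outcome outcome = child.next();
            if (outcome == Outcome.NONE) {
                done = true;
                return outcome;
            }
            left--;
            return outcome;
        }
    }

    public static void main(String[] args) {
        Source source = new Source(5);
        Limit limit = new Limit(source, 2);
        System.out.println(limit.next());          // OK
        System.out.println(limit.next());          // OK
        System.out.println(limit.next());          // NONE
        System.out.println(source.isCancelled());  // true
    }
}
```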

    • getOutgoingContainer

      VectorContainer getOutgoingContainer()
    • getContainer

      VectorContainer getContainer()
      Returns the internal vector container.
      Returns:
      The internal vector container
    • getValueVectorId

      TypedFieldId getValueVectorId(SchemaPath path)
      Gets the value vector type and ID for the given schema path. The TypedFieldId should store a fieldId equal to the ordinal position of the field within the Iterator provided by this class's implementation of Iterable<VectorWrapper<?>>.
      Specified by:
      getValueVectorId in interface VectorAccessible
      Parameters:
      path - The path where the vector should be located.
      Returns:
      The local field ID associated with this vector. If no field matches this path, returns a null TypedFieldId.
    • getValueAccessorById

      VectorWrapper<?> getValueAccessorById(Class<?> clazz, int... ids)
      Specified by:
      getValueAccessorById in interface VectorAccessible
    • next

      RecordBatch.IterOutcome next()
      Updates the data in each Field reading interface for the next range of records.

      Once a RecordBatch's next() has returned RecordBatch.IterOutcome.NONE or RecordBatch.IterOutcome.STOP, the consumer must not call next() again. Behavior after that point is undefined, and a further call is likely to throw an exception.

      See RecordBatch.IterOutcome for the protocol (possible sequences of return values).

      Returns:
      An IterOutcome describing the result of the iteration.
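
      A consumer typically drives next() in a loop until a terminal outcome arrives. The following is a minimal sketch of such a loop, assuming only the three outcomes named on this page (OK_NEW_SCHEMA, OK, NONE); the real RecordBatch.IterOutcome may define further values. NextLoopSketch, ScriptedBatch and drain are hypothetical names, and the upstream simply replays a scripted sequence.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical consumer loop; none of these types are Drill classes.
final class NextLoopSketch {
    // Only the outcomes needed for this illustration are modeled.
    enum Outcome { OK_NEW_SCHEMA, OK, NONE }

    // Upstream that replays a scripted sequence of outcomes.
    static final class ScriptedBatch {
        private final Deque<Outcome> script;

        ScriptedBatch(Outcome... outcomes) {
            this.script = new ArrayDeque<>(Arrays.asList(outcomes));
        }

        Outcome next() {
            if (script.isEmpty()) {
                // Models the "undefined, likely to throw" behavior after the end.
                throw new IllegalStateException("next() called after the script ended");
            }
            return script.poll();
        }
    }

    // Drains the upstream and returns {schemaChanges, batchesSeen}.
    static int[] drain(ScriptedBatch upstream) {
        int schemaChanges = 0;
        int batchesSeen = 0;
        while (true) {
            Outcome outcome = upstream.next();
            switch (outcome) {
                case OK_NEW_SCHEMA:
                    schemaChanges++;  // a real consumer would re-read the schema here
                    batchesSeen++;
                    break;
                case OK:
                    batchesSeen++;    // same schema as before; just process the data
                    break;
                case NONE:
                    // Terminal outcome: stop and never call next() again.
                    return new int[] { schemaChanges, batchesSeen };
            }
        }
    }

    public static void main(String[] args) {
        ScriptedBatch upstream = new ScriptedBatch(
                Outcome.OK_NEW_SCHEMA, Outcome.OK, Outcome.OK,
                Outcome.OK_NEW_SCHEMA, Outcome.NONE);
        int[] result = drain(upstream);
        System.out.println(result[0]);  // 2
        System.out.println(result[1]);  // 4
    }
}
```

      Counting an OK_NEW_SCHEMA outcome as a batch is a simplification of this sketch; consult RecordBatch.IterOutcome for what each outcome actually carries.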
    • getWritableBatch

      WritableBatch getWritableBatch()
      Gets a writable version of this batch. Takes over ownership of existing buffers.
    • dump

      void dump()
      Dumps this batch's state to the logs.