Class AbstractHashBinaryRecordBatch<T extends PhysicalOperator>

java.lang.Object
org.apache.drill.exec.record.AbstractRecordBatch<T>
org.apache.drill.exec.record.AbstractBinaryRecordBatch<T>
org.apache.drill.exec.physical.impl.join.AbstractHashBinaryRecordBatch<T>
All Implemented Interfaces:
AutoCloseable, Iterable<VectorWrapper<?>>, CloseableRecordBatch, RecordBatch, VectorAccessible
Direct Known Subclasses:
HashJoinBatch, HashSetOpRecordBatch

public abstract class AbstractHashBinaryRecordBatch<T extends PhysicalOperator> extends AbstractBinaryRecordBatch<T>
Base class for the runtime execution implementation of the Hash-Join and Hash-SetOp operators.

This implementation splits the incoming Build side rows into multiple Partitions, thus allowing some of these partitions to be spilled to disk if memory gets tight. Each partition is implemented as a HashPartition. After the build phase is over, in the most general case, some partitions have been spilled while the others remain in memory. A HashTable is then built for each partition that remains in memory.
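
The build phase described above can be illustrated with a minimal sketch. It is not Drill's implementation: the class BuildPartitionSketch, its methods, the row-count memory limit, and the "spill the largest partition" policy are all assumptions made for illustration, and "spilling" is only simulated by a flag.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a build-side partition; not Drill's HashPartition.
final class BuildPartitionSketch {
  final List<Integer> rows = new ArrayList<>(); // join keys routed to this partition
  boolean spilled;                              // true once the partition was "written to disk"
  Map<Integer, Integer> hashTable;              // key -> occurrence count; stays null if spilled

  // Route a key to a partition by its hash value.
  static int partitionOf(int key, int numPartitions) {
    return Math.floorMod(Integer.hashCode(key), numPartitions);
  }

  // Pick the in-memory partition holding the most rows (the first one wins on ties).
  static BuildPartitionSketch largestInMemory(BuildPartitionSketch[] parts) {
    BuildPartitionSketch best = null;
    for (BuildPartitionSketch p : parts) {
      if (!p.spilled && (best == null || p.rows.size() > best.rows.size())) {
        best = p;
      }
    }
    return best;
  }

  // Assumes maxRowsInMemory >= 0; the limit is counted in rows, not bytes.
  static BuildPartitionSketch[] build(int[] buildKeys, int numPartitions, int maxRowsInMemory) {
    BuildPartitionSketch[] parts = new BuildPartitionSketch[numPartitions];
    for (int i = 0; i < numPartitions; i++) {
      parts[i] = new BuildPartitionSketch();
    }
    int inMemoryRows = 0;
    for (int key : buildKeys) {
      BuildPartitionSketch p = parts[partitionOf(key, numPartitions)];
      p.rows.add(key);
      if (p.spilled) {
        continue; // rows of an already spilled partition do not occupy memory
      }
      inMemoryRows++;
      while (inMemoryRows > maxRowsInMemory) {
        // Memory is tight: spill the largest in-memory partition.
        BuildPartitionSketch victim = largestInMemory(parts);
        victim.spilled = true;
        inMemoryRows -= victim.rows.size();
      }
    }
    // Build phase is over: only the partitions still in memory get a hash table.
    for (BuildPartitionSketch p : parts) {
      if (!p.spilled) {
        p.hashTable = new HashMap<>();
        for (int key : p.rows) {
          p.hashTable.merge(key, 1, Integer::sum);
        }
      }
    }
    return parts;
  }
}
```

For example, BuildPartitionSketch.build(new int[] {0, 1, 2, 3, 4, 5, 6, 7}, 4, 4) leaves two partitions spilled and two in memory, each of the latter with its own hash table.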

Next the Probe side is read, and each row is key matched with a Build partition. If that partition is in memory, then the key is used to probe and perform the operation, and the results are added to the outgoing batch. But if that build side partition was spilled, then the matching Probe side partition is spilled as well.
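
The probe-side routing can be sketched as follows. This is an illustrative model under stated assumptions, not the actual Probe interface: the class ProbeRoutingSketch is hypothetical, a null entry in buildTables stands for a spilled build partition, and the operation performed is a plain inner-join key match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of probe-side routing; not Drill's Probe interface.
final class ProbeRoutingSketch {
  final List<Integer> output = new ArrayList<>();             // matched keys (the "outgoing batch")
  final List<List<Integer>> spilledProbe = new ArrayList<>(); // one simulated spill file per partition

  ProbeRoutingSketch(int numPartitions) {
    for (int i = 0; i < numPartitions; i++) {
      spilledProbe.add(new ArrayList<>());
    }
  }

  // buildTables.get(i) is the in-memory key set of build partition i,
  // or null when that build partition was spilled (an assumption of this sketch).
  // Its size is expected to equal the partition count passed to the constructor.
  void probe(int[] probeKeys, List<Set<Integer>> buildTables) {
    int numPartitions = buildTables.size();
    for (int key : probeKeys) {
      // The probe row must be routed with the same hash as the build side.
      int part = Math.floorMod(Integer.hashCode(key), numPartitions);
      Set<Integer> table = buildTables.get(part);
      if (table == null) {
        spilledProbe.get(part).add(key); // build partition spilled: spill the probe row too
      } else if (table.contains(key)) {
        output.add(key);                 // build partition in memory: probe and emit the match
      }
    }
  }
}
```

After the call, each non-empty entry of spilledProbe pairs up with a spilled build partition, which is the input for the next stage.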

After the entire Probe side has been processed, we are left with pairs of spilled partitions. Each pair is then processed individually (its Build partition should be smaller than the original Build input, and hence is likely to fit entirely in memory to allow probing; if not, see below).

Processing of each spilled pair is exactly like processing the original Build/Probe incomings. (In fact, the innerNext() method calls itself recursively.) Thus the spilled build partition is read and divided into new partitions, which in turn may spill again (and again...). The code tracks these spilling "cycles". Normally any such "again" (i.e., a cycle of 2 or greater) is a waste, indicating that the number of partitions chosen was too small.
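
The recursive handling of spilled pairs and the notion of a spilling cycle can be sketched as below. This is a simplified model, not the innerNext() logic: the class SpillCycleSketch, the MAX_CYCLES cap, the row-count limit, and the cycle-dependent hash are assumptions of the sketch, and every oversized build input is split entirely instead of keeping some partitions in memory. The original input is numbered cycle 0 here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of recursive spill cycles; not Drill's innerNext() implementation.
final class SpillCycleSketch {
  static final int MAX_CYCLES = 8; // assumed cap so that heavily duplicated keys cannot recurse forever
  int deepestCycle = 0;            // deepest spilling cycle reached so far

  // Inner-join key match of build against probe; cycle 0 is the original input.
  List<Integer> join(List<Integer> build, List<Integer> probe,
                     int numPartitions, int maxBuildRows, int cycle) {
    deepestCycle = Math.max(deepestCycle, cycle);
    List<Integer> out = new ArrayList<>();
    if (build.size() <= maxBuildRows || cycle >= MAX_CYCLES) {
      // The build side fits in memory (or the cap was hit): build a hash table and probe it.
      Set<Integer> table = new HashSet<>(build);
      for (int key : probe) {
        if (table.contains(key)) {
          out.add(key);
        }
      }
      return out;
    }
    // The build side is too large: split both sides the same way, then treat every
    // (build, probe) pair exactly like an original input, one cycle deeper.
    List<List<Integer>> buildParts = split(build, numPartitions, cycle);
    List<List<Integer>> probeParts = split(probe, numPartitions, cycle);
    for (int i = 0; i < numPartitions; i++) {
      out.addAll(join(buildParts.get(i), probeParts.get(i), numPartitions, maxBuildRows, cycle + 1));
    }
    return out;
  }

  // Each cycle looks at a different part of the hash value, so rows that collided
  // in an earlier cycle can be separated in a later one.
  static List<List<Integer>> split(List<Integer> keys, int numPartitions, int cycle) {
    List<List<Integer>> parts = new ArrayList<>();
    for (int i = 0; i < numPartitions; i++) {
      parts.add(new ArrayList<>());
    }
    for (int key : keys) {
      int h = Integer.hashCode(key);
      for (int c = 0; c < cycle; c++) {
        h /= numPartitions; // discard the part of the hash consumed by earlier cycles
      }
      parts.get(Math.floorMod(h, numPartitions)).add(key);
    }
    return parts;
  }
}
```

With only 2 partitions and room for 2 build rows, joining the build keys 0..7 reaches deepestCycle 2, the situation the description above calls a waste; a larger partition count would finish the same input at cycle 1.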

  • Field Details

    • logger

      protected final org.slf4j.Logger logger
    • semiJoin

      protected boolean semiJoin
    • joinIsLeftOrFull

      protected boolean joinIsLeftOrFull
    • joinIsRightOrFull

      protected boolean joinIsRightOrFull
    • isRowKeyJoin

      protected boolean isRowKeyJoin
    • enableRuntimeFilter

      protected boolean enableRuntimeFilter
    • runtimeFilterDef

      protected RuntimeFilterDef runtimeFilterDef
    • rightExpr

      protected List<NamedExpression> rightExpr
    • buildJoinColumns

      protected Set<String> buildJoinColumns
      Names of the join columns. These names are used to help estimate the size of the HashTables.
    • skipHashTableBuild

      protected boolean skipHashTableBuild
    • RECORDS_PER_BATCH

      protected final int RECORDS_PER_BATCH
      The maximum number of records within each internal batch.
    • rkJoinState

      protected RowKeyJoin.RowKeyJoinState rkJoinState
    • probe

      protected Probe probe
    • numPartitions

      protected int numPartitions
      The number of HashPartitions. This is configured via a system option and set in partitionNumTuning(int, HashJoinMemoryCalculator.BuildSidePartitioning).
    • baseHashTable

      protected ChainedHashTable baseHashTable
      The master class used to generate HashTables.
    • buildSideIsEmpty

      protected final org.apache.commons.lang3.mutable.MutableBoolean buildSideIsEmpty
    • probeSideIsEmpty

      protected final org.apache.commons.lang3.mutable.MutableBoolean probeSideIsEmpty
    • canSpill

      protected boolean canSpill
    • wasKilled

      protected boolean wasKilled
    • partitions

      protected HashPartition[] partitions
      This array holds the currently active HashPartitions.
    • outputRecords

      protected int outputRecords
    • buildSchema

      protected BatchSchema buildSchema
    • probeSchema

      protected BatchSchema probeSchema
    • buildComplete

      protected boolean buildComplete
    • firstOutputBatch

      protected boolean firstOutputBatch
    • rightHVColPosition

      protected int rightHVColPosition
    • allocator

      protected final BufferAllocator allocator
    • buildBatch

      protected RecordBatch buildBatch
    • probeBatch

      protected RecordBatch probeBatch
    • prefetchedBuild

      protected final org.apache.commons.lang3.mutable.MutableBoolean prefetchedBuild
      Flag indicating whether or not the first data-holding build batch needs to be fetched.
    • prefetchedProbe

      protected final org.apache.commons.lang3.mutable.MutableBoolean prefetchedProbe
      Flag indicating whether or not the first data-holding probe batch needs to be fetched.
    • spillSet

      protected final SpillSet spillSet
    • originalPartition

      protected final int originalPartition
    • maxBatchesInMemory

      protected final int maxBatchesInMemory
    • probeFields

      protected final List<String> probeFields
    • runtimeFilterReporter

      protected RuntimeFilterReporter runtimeFilterReporter
    • hash64

      protected ValueVectorHashHelper.Hash64 hash64
    • bloomFilter2buildId

      protected final Map<BloomFilter,Integer> bloomFilter2buildId
    • bloomFilterDef2buildId

      protected final Map<BloomFilterDef,Integer> bloomFilterDef2buildId
    • bloomFilters

      protected final List<BloomFilter> bloomFilters
    • bloomFiltersGenerated

      protected boolean bloomFiltersGenerated
    • spilledState

      Queue of spilled partitions to process.
    • spilledInners

  • Constructor Details

  • Method Details