Class AbstractHashBinaryRecordBatch<T extends PhysicalOperator>
- All Implemented Interfaces:
AutoCloseable
,Iterable<VectorWrapper<?>>
,CloseableRecordBatch
,RecordBatch
,VectorAccessible
- Direct Known Subclasses:
HashJoinBatch
,HashSetOpRecordBatch
This implementation splits the incoming Build side rows into multiple
Partitions, thus allowing spilling of some of these partitions to disk if
memory gets tight. Each partition is implemented as a HashPartition.
After the build phase is over, in the most general case, some partitions
have been spilled and the others remain in memory. Each in-memory
partition then gets a HashTable built.
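The build phase described above can be sketched roughly as follows. This is a toy model, not Drill's code: `BuildPartitioningSketch`, its `Partition` class, the row-count memory budget, and the "spill the largest in-memory partition" policy are all assumptions made for illustration. The real class works on `HashPartition` objects and value vectors.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the build phase; every name here is hypothetical. */
class BuildPartitioningSketch {

  /** Stand-in for a partition: rows are {key, value} pairs. */
  static final class Partition {
    final List<int[]> rows = new ArrayList<>(); // rows still held in memory
    int spilledRows = 0;                        // rows "written to disk" (only counted here)
    boolean spilled = false;

    void add(int[] row) {
      if (spilled) {
        spilledRows++;   // a spilled partition keeps spilling new rows
      } else {
        rows.add(row);
      }
    }

    void spill() {
      spilledRows += rows.size();
      rows.clear();
      spilled = true;
    }
  }

  /** Routes a key to a partition by its hash. */
  static int partitionOf(int key, int numPartitions) {
    return Math.floorMod(Integer.hashCode(key), numPartitions);
  }

  /**
   * Splits the build rows into partitions. When more than maxRowsInMemory rows
   * are held in memory, the largest in-memory partition is spilled
   * (an assumed policy; maxRowsInMemory is assumed to be non-negative).
   */
  static Partition[] build(List<int[]> buildRows, int numPartitions, int maxRowsInMemory) {
    Partition[] parts = new Partition[numPartitions];
    for (int i = 0; i < numPartitions; i++) {
      parts[i] = new Partition();
    }
    int inMemory = 0;
    for (int[] row : buildRows) {
      Partition p = parts[partitionOf(row[0], numPartitions)];
      boolean wasSpilled = p.spilled;
      p.add(row);
      if (!wasSpilled) {
        inMemory++;
      }
      if (inMemory > maxRowsInMemory) {
        Partition victim = null;
        for (Partition q : parts) {
          if (!q.spilled && (victim == null || q.rows.size() > victim.rows.size())) {
            victim = q;
          }
        }
        inMemory -= victim.rows.size();
        victim.spill();
      }
    }
    return parts;
  }
}
```

After `build` returns, the partitions with `spilled == false` are the ones that would each get a hash table.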
Next the Probe side is read, and each row is key matched with a Build partition. If that partition is in memory, then the key is used to probe and perform the operation, and the results are added to the outgoing batch. But if that Build side partition was spilled, then the matching Probe side partition is spilled as well.
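A minimal sketch of this probe-side routing, under stated assumptions: `ProbeRoutingSketch` and `Result` are hypothetical names, a `null` entry stands for a spilled build partition, a plain `Map` stands in for the hash table, and the operation is an inner join on integer keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of the probe phase; every name here is hypothetical. */
class ProbeRoutingSketch {

  /** Joined output plus, per partition, the probe keys deferred because the build side was spilled. */
  static final class Result {
    final List<String> output = new ArrayList<>();
    final List<List<Integer>> spilledProbe = new ArrayList<>();
  }

  /**
   * buildTables.get(i) is the in-memory table of build partition i,
   * or null if that partition was spilled.
   */
  static Result probe(List<Map<Integer, String>> buildTables, List<Integer> probeKeys) {
    int numPartitions = buildTables.size();
    Result result = new Result();
    for (int i = 0; i < numPartitions; i++) {
      result.spilledProbe.add(new ArrayList<>());
    }
    for (int key : probeKeys) {
      // The probe row must use the same routing as the build side.
      int part = Math.floorMod(Integer.hashCode(key), numPartitions);
      Map<Integer, String> table = buildTables.get(part);
      if (table == null) {
        // Build partition was spilled: defer this probe row as well.
        result.spilledProbe.get(part).add(key);
      } else {
        String match = table.get(key);
        if (match != null) {
          result.output.add(key + "=" + match);
        }
      }
    }
    return result;
  }
}
```

The deferred keys in `spilledProbe.get(i)` together with the spilled build partition `i` form one of the spilled pairs that are processed later.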
After the entire Probe side has been processed, we are left with pairs of spilled partitions. Each pair is then processed individually (the Build partition of a pair should be smaller than the original, and hence likely to fit entirely in memory to allow probing; if not, see below).
Processing of each spilled pair is EXACTLY like processing the original
Build/Probe incomings. (In fact, the innerNext() method calls itself
recursively.) Thus the spilled build partition is read and divided
into new partitions, which in turn may spill again (and again...). The code
tracks these spilling "cycles". Normally any such "again" (i.e. cycle of 2 or
greater) is a waste, indicating that the number of partitions chosen was too
small.
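The recursion and the cycle counter can be illustrated with the sketch below. It is a simplification under several assumptions: `SpillCycleSketch`, `NUM_PARTITIONS`, and `MAX_CYCLES` are hypothetical; an oversized build input is treated as if every one of its partitions spilled; the cycle is numbered 0 for the original inputs; and each cycle partitions on a different pair of key bits so that a re-partitioned input actually gets smaller. The method counts probe keys that find a build-side match.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of recursive processing of spilled pairs; every name here is hypothetical. */
class SpillCycleSketch {
  static final int NUM_PARTITIONS = 4; // assumed; must be a power of two for the bit mask below
  static final int MAX_CYCLES = 8;     // assumed guard against endless re-partitioning

  int maxCycleSeen = 0;

  /** Returns the number of probe keys that have a match on the build side. */
  int process(List<Integer> build, List<Integer> probe, int memoryLimit, int cycle) {
    maxCycleSeen = Math.max(maxCycleSeen, cycle);
    if (build.size() <= memoryLimit) {
      // Build side fits in memory: build a table and probe it directly.
      Set<Integer> table = new HashSet<>(build);
      int matches = 0;
      for (int key : probe) {
        if (table.contains(key)) {
          matches++;
        }
      }
      return matches;
    }
    if (cycle >= MAX_CYCLES) {
      throw new IllegalStateException("too many spill cycles");
    }
    // Too large: split both sides the same way and handle each pair
    // exactly like the original inputs, one cycle deeper.
    List<List<Integer>> buildParts = split(build, cycle);
    List<List<Integer>> probeParts = split(probe, cycle);
    int matches = 0;
    for (int i = 0; i < NUM_PARTITIONS; i++) {
      matches += process(buildParts.get(i), probeParts.get(i), memoryLimit, cycle + 1);
    }
    return matches;
  }

  /** Partitions keys on a different pair of bits for each cycle. */
  static List<List<Integer>> split(List<Integer> keys, int cycle) {
    List<List<Integer>> parts = new ArrayList<>();
    for (int i = 0; i < NUM_PARTITIONS; i++) {
      parts.add(new ArrayList<>());
    }
    for (int key : keys) {
      int part = (key >>> (2 * cycle)) & (NUM_PARTITIONS - 1);
      parts.get(part).add(key);
    }
    return parts;
  }
}
```

In this model, a `maxCycleSeen` of 2 or more after a run corresponds to the wasteful "again" case: the first split did not make the build partitions small enough.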
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic enum
static class
This holds information about the spilled partitions for the build and probe side.class
Nested classes/interfaces inherited from class org.apache.drill.exec.record.AbstractRecordBatch
AbstractRecordBatch.BatchState
Nested classes/interfaces inherited from interface org.apache.drill.exec.record.RecordBatch
RecordBatch.IterOutcome
-
Field Summary
Modifier and TypeFieldDescriptionprotected final BufferAllocator
protected ChainedHashTable
The master class used to generateHashTable
s.protected final Map<BloomFilter,
Integer> protected final Map<BloomFilterDef,
Integer> protected final List<BloomFilter>
protected boolean
protected RecordBatch
protected boolean
Names of the join columns.protected BatchSchema
protected final org.apache.commons.lang3.mutable.MutableBoolean
protected boolean
protected boolean
protected boolean
protected ValueVectorHashHelper.Hash64
protected boolean
protected boolean
protected boolean
protected final org.slf4j.Logger
protected final int
protected int
The number ofHashPartition
s.protected final int
protected int
protected HashPartition[]
This array holds the currently activeHashPartition
s.protected final org.apache.commons.lang3.mutable.MutableBoolean
Flag indicating whether or not the first data holding build batch needs to be fetched.protected final org.apache.commons.lang3.mutable.MutableBoolean
Flag indicating whether or not the first data holding probe batch needs to be fetched.protected Probe
protected RecordBatch
protected BatchSchema
protected final org.apache.commons.lang3.mutable.MutableBoolean
protected final int
The maximum number of records within each internal batch.protected List<NamedExpression>
protected int
protected RowKeyJoin.RowKeyJoinState
protected RuntimeFilterDef
protected RuntimeFilterReporter
protected boolean
protected boolean
protected AbstractHashBinaryRecordBatch.SpilledPartition[]
protected final SpilledState<AbstractHashBinaryRecordBatch.SpilledPartition>
Queue of spilled partitions to process.protected final SpillSet
protected boolean
Fields inherited from class org.apache.drill.exec.record.AbstractBinaryRecordBatch
batchMemoryManager, left, LEFT_INDEX, leftUpstream, numInputs, right, RIGHT_INDEX, rightUpstream
Fields inherited from class org.apache.drill.exec.record.AbstractRecordBatch
batchStatsContext, container, context, oContext, state, stats, unionTypeEnabled
Fields inherited from interface org.apache.drill.exec.record.RecordBatch
MAX_BATCH_ROW_COUNT
-
Constructor Summary
ConstructorDescriptionAbstractHashBinaryRecordBatch
(T popConfig, FragmentContext context, RecordBatch left, RecordBatch right) The constructor -
Method Summary
Modifier and TypeMethodDescriptionprotected abstract HashTableConfig
protected void
protected void
void
close()
abstract Probe
Execute the BUILD phase; first read incoming and split rows into partitions; may decide to spill some of the partitionsDetermines the memory calculator to use.int
Get the number of records.boolean
isSpilledInner
(int part) This creates a string that summarizes the memory usage of the operator.abstract void
void
Methods inherited from class org.apache.drill.exec.record.AbstractBinaryRecordBatch
checkForEarlyFinish, getBatchMemoryManager, prefetchFirstBatchFromBothSides, updateBatchMemoryManagerStats, verifyOutcomeToSetBatchState
Methods inherited from class org.apache.drill.exec.record.AbstractRecordBatch
cancel, checkContinue, getContainer, getContext, getOutgoingContainer, getPopConfig, getRecordBatchStatsContext, getSchema, getSelectionVector2, getSelectionVector4, getValueAccessorById, getValueVectorId, getWritableBatch, isRecordBatchStatsLoggingEnabled, iterator, next, next, next, schemaChangeException, schemaChangeException
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.lang.Iterable
forEach, spliterator
Methods inherited from interface org.apache.drill.exec.record.RecordBatch
dump
-
Field Details
-
logger
protected final org.slf4j.Logger logger -
semiJoin
protected boolean semiJoin -
joinIsLeftOrFull
protected boolean joinIsLeftOrFull -
joinIsRightOrFull
protected boolean joinIsRightOrFull -
isRowKeyJoin
protected boolean isRowKeyJoin -
enableRuntimeFilter
protected boolean enableRuntimeFilter -
runtimeFilterDef
-
rightExpr
-
buildJoinColumns
Names of the join columns. These names are used to help estimate the size of the HashTables. -
skipHashTableBuild
protected boolean skipHashTableBuild -
RECORDS_PER_BATCH
protected final int RECORDS_PER_BATCH
The maximum number of records within each internal batch. -
rkJoinState
-
probe
-
numPartitions
protected int numPartitions
The number of HashPartitions. This is configured via a system option and set in partitionNumTuning(int, HashJoinMemoryCalculator.BuildSidePartitioning). -
baseHashTable
The master class used to generate HashTables. -
buildSideIsEmpty
protected final org.apache.commons.lang3.mutable.MutableBoolean buildSideIsEmpty -
probeSideIsEmpty
protected final org.apache.commons.lang3.mutable.MutableBoolean probeSideIsEmpty -
canSpill
protected boolean canSpill -
wasKilled
protected boolean wasKilled -
partitions
This array holds the currently active HashPartitions. -
outputRecords
protected int outputRecords -
buildSchema
-
probeSchema
-
buildComplete
protected boolean buildComplete -
firstOutputBatch
protected boolean firstOutputBatch -
rightHVColPosition
protected int rightHVColPosition -
allocator
-
buildBatch
-
probeBatch
-
prefetchedBuild
protected final org.apache.commons.lang3.mutable.MutableBoolean prefetchedBuild
Flag indicating whether or not the first data-holding build batch needs to be fetched. -
prefetchedProbe
protected final org.apache.commons.lang3.mutable.MutableBoolean prefetchedProbe
Flag indicating whether or not the first data-holding probe batch needs to be fetched. -
spillSet
-
originalPartition
protected final int originalPartition- See Also:
-
maxBatchesInMemory
protected final int maxBatchesInMemory -
probeFields
-
runtimeFilterReporter
-
hash64
-
bloomFilter2buildId
-
bloomFilterDef2buildId
-
bloomFilters
-
bloomFiltersGenerated
protected boolean bloomFiltersGenerated -
spilledState
Queue of spilled partitions to process. -
spilledInners
-
-
Constructor Details
-
AbstractHashBinaryRecordBatch
public AbstractHashBinaryRecordBatch(T popConfig, FragmentContext context, RecordBatch left, RecordBatch right) throws OutOfMemoryException
The constructor
- Parameters:
popConfig - T
context - FragmentContext
left - probe/outer side incoming input
right - build/inner side incoming input
- Throws:
OutOfMemoryException - out of memory exception
-
-
Method Details
-
getRecordCount
public int getRecordCount()
Description copied from interface: VectorAccessible
Get the number of records.
- Returns:
- number of records
-
buildSchema
protected void buildSchema()
- Overrides:
buildSchema in class AbstractRecordBatch<T extends PhysicalOperator>
-
getCalculatorImpl
Determines the memory calculator to use. If maxNumBatches is configured, simple batch counting is used to decide when to spill. Otherwise, memory calculations are used to determine when to spill.
- Returns:
- The memory calculator to use.
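The choice between the two strategies might look like the following sketch. It does not use Drill's calculator classes: `SpillDeciderSketch`, the `SpillDecider` interface, and the convention that a batch limit of 0 means "not configured" are assumptions for illustration only.

```java
/** Toy model of choosing a spill strategy; every name here is hypothetical. */
class SpillDeciderSketch {

  /** Decides whether the operator should spill, given its current in-memory footprint. */
  interface SpillDecider {
    boolean shouldSpill(int batchesInMemory, long bytesInMemory);
  }

  /**
   * If a batch limit is configured (assumed here to mean greater than 0),
   * spill purely by counting batches; otherwise spill based on memory use.
   */
  static SpillDecider choose(int maxBatchesInMemory, long memoryLimitBytes) {
    if (maxBatchesInMemory > 0) {
      return (batches, bytes) -> batches > maxBatchesInMemory;
    }
    return (batches, bytes) -> bytes > memoryLimitBytes;
  }
}
```

Batch counting is the simpler rule and is easy to reason about in tests, while the memory-based rule adapts to batches of varying size.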
-
innerNext
- Specified by:
innerNext in class AbstractRecordBatch<T extends PhysicalOperator>
-
executeBuildPhase
Execute the BUILD phase: first read the incoming rows and split them into partitions; may decide to spill some of the partitions.
- Returns:
- A RecordBatch.IterOutcome if a termination condition is reached; otherwise null.
- Throws:
SchemaChangeException - schema change exception
-
isSpilledInner
public boolean isSpilledInner(int part) -
makeDebugString
This creates a string that summarizes the memory usage of the operator.
- Returns:
- A memory dump string.
-
cancelIncoming
protected void cancelIncoming()
- Overrides:
cancelIncoming in class AbstractBinaryRecordBatch<T extends PhysicalOperator>
-
updateMetrics
public void updateMetrics() -
close
public void close()
- Specified by:
close in interface AutoCloseable
- Overrides:
close in class AbstractRecordBatch<T extends PhysicalOperator>
-
createProbe
-
setupProbe
- Throws:
SchemaChangeException
-
buildHashTableConfig
-