Class HashPartition
java.lang.Object
org.apache.drill.exec.physical.impl.common.HashPartition
- All Implemented Interfaces:
HashJoinMemoryCalculator.PartitionStat
Overview
Created to represent an active partition for the Hash-Join operator (active means: currently receiving data, or its data is being probed; as opposed to fully spilled partitions). After all the build/inner data is read for this partition - if all its data is in memory, then a hash table and a helper are created, and later this data would be probed. If all this partition's build/inner data was spilled, then it begins to work as an outer partition (see the flag "processingOuter") -- reusing some of the fields (e.g., currentBatch, currHVVector, writer, spillFile, partitionBatchesCount) for the outer.
-
Field Summary
Fields -
Constructor Summary
ConstructorsConstructorDescriptionHashPartition(FragmentContext context, BufferAllocator allocator, ChainedHashTable baseHashTable, RecordBatch buildBatch, RecordBatch probeBatch, boolean semiJoin, int recordsPerBatch, SpillSet spillSet, int partNum, int cycleNum, int numPartitions) -
Method Summary
Modifier and TypeMethodDescriptionvoidAllocate a new current Vector Container and current HV vectorvoidappendBatch(VectorAccessible batch) Append the incoming batch (actually only the vectors of that batch) into the tmp listvoidappendInnerRow(VectorContainer buildContainer, int ind, int hashCode, HashJoinMemoryCalculator.BuildSidePartitioning calc) Spills if neededvoidappendOuterRow(int hashCode, int recordsProcessed) Outer always spills when batch is fullvoidCreates the hash table and join helper for this partition.voidcleanup(boolean deleteFile) Free all in-memory allocated structures.voidclose()voidClose the writer without deleting the spill filevoidcompleteAnInnerBatch(boolean toInitialize, boolean needsSpill) voidcompleteAnOuterBatch(boolean toInitialize) voiddecreaseRecordNumForKey(int currentIndex) intgetBuildHashCode(int ind) longintgetNextIndex(int compositeIndex) com.carrotsearch.hppc.IntArrayListintlongintintintgetProbeHashCode(int ind) intgetRecordNumForKey(int currentIndex) getStartIndex(int probeIndex) voidgetStats(HashTableStats newStats) booleanCreates a debugging string containing information about memory usage.org.apache.commons.lang3.tuple.Pair<VectorContainer, Integer> intprobeForKey(int recordsProcessed, int hashCode) booleansetRecordMatched(int compositeIndex) voidsetRecordNumForKey(int currentIndex, int num) voidvoidvoidupdateProbeRecordsPerBatch(int newRecordsPerBatch) Configure a different temporary batch size when spilling probe batches.
-
Field Details
-
HASH_VALUE_COLUMN_NAME
- See Also:
-
HVtype
-
-
Constructor Details
-
HashPartition
public HashPartition(FragmentContext context, BufferAllocator allocator, ChainedHashTable baseHashTable, RecordBatch buildBatch, RecordBatch probeBatch, boolean semiJoin, int recordsPerBatch, SpillSet spillSet, int partNum, int cycleNum, int numPartitions)
-
-
Method Details
-
updateProbeRecordsPerBatch
public void updateProbeRecordsPerBatch(int newRecordsPerBatch) Configure a different temporary batch size when spilling probe batches.- Parameters:
newRecordsPerBatch- The new temporary batch size to use.
-
allocateNewCurrentBatchAndHV
public void allocateNewCurrentBatchAndHV()Allocate a new current Vector Container and current HV vector -
appendInnerRow
public void appendInnerRow(VectorContainer buildContainer, int ind, int hashCode, HashJoinMemoryCalculator.BuildSidePartitioning calc) Spills if needed -
appendOuterRow
public void appendOuterRow(int hashCode, int recordsProcessed) Outer always spills when batch is full -
completeAnOuterBatch
public void completeAnOuterBatch(boolean toInitialize) -
completeAnInnerBatch
public void completeAnInnerBatch(boolean toInitialize, boolean needsSpill) -
appendBatch
Append the incoming batch (actually only the vectors of that batch) into the tmp list -
spillThisPartition
public void spillThisPartition() -
probeForKey
- Throws:
SchemaChangeException
-
getRecordNumForKey
public int getRecordNumForKey(int currentIndex) -
setRecordNumForKey
public void setRecordNumForKey(int currentIndex, int num) -
decreaseRecordNumForKey
public void decreaseRecordNumForKey(int currentIndex) -
getStartIndex
-
getNextIndex
public int getNextIndex(int compositeIndex) -
setRecordMatched
public boolean setRecordMatched(int compositeIndex) -
getNextUnmatchedIndex
public com.carrotsearch.hppc.IntArrayList getNextUnmatchedIndex() -
getBuildHashCode
- Throws:
SchemaChangeException
-
getProbeHashCode
- Throws:
SchemaChangeException
-
getContainers
-
updateBatches
- Throws:
SchemaChangeException
-
nextBatch
-
getInMemoryBatches
- Specified by:
getInMemoryBatchesin interfaceHashJoinMemoryCalculator.PartitionStat
-
getNumInMemoryBatches
public int getNumInMemoryBatches()- Specified by:
getNumInMemoryBatchesin interfaceHashJoinMemoryCalculator.PartitionStat
-
isSpilled
public boolean isSpilled()- Specified by:
isSpilledin interfaceHashJoinMemoryCalculator.PartitionStat
-
getNumInMemoryRecords
public long getNumInMemoryRecords()- Specified by:
getNumInMemoryRecordsin interfaceHashJoinMemoryCalculator.PartitionStat
-
getInMemorySize
public long getInMemorySize()- Specified by:
getInMemorySizein interfaceHashJoinMemoryCalculator.PartitionStat
-
getSpillFile
-
getPartitionBatchesCount
public int getPartitionBatchesCount() -
getPartitionNum
public int getPartitionNum() -
closeWriter
public void closeWriter()Close the writer without deleting the spill file -
buildContainersHashTableAndHelper
Creates the hash table and join helper for this partition. This method should only be called after all the build side records have been consumed.- Throws:
SchemaChangeException
-
getStats
-
cleanup
public void cleanup(boolean deleteFile) Free all in-memory allocated structures.- Parameters:
deleteFile- - whether to delete the spill file or not
-
close
public void close() -
makeDebugString
Creates a debugging string containing information about memory usage.- Returns:
- A debugging string.
-