Class HashTableTemplate
java.lang.Object
org.apache.drill.exec.physical.impl.common.HashTableTemplate
- All Implemented Interfaces:
HashTable
-
Nested Class Summary
Nested Classes
HashTableTemplate.BatchHolder
Nested classes/interfaces inherited from interface org.apache.drill.exec.physical.impl.common.HashTable
HashTable.PutStatus
-
Field Summary
Fields
Modifier and Type | Field
protected ClassGenerator<?> | cg
protected FragmentContext | context
static final int | MAX_VARCHAR_SIZE
Fields inherited from interface org.apache.drill.exec.physical.impl.common.HashTable
BATCH_MASK, BATCH_SIZE, DEFAULT_LOAD_FACTOR, MAXIMUM_CAPACITY, TEMPLATE_DEFINITION
-
Constructor Summary
Constructors
HashTableTemplate()
-
Method Summary
Modifier and Type | Method | Description
void | clear() | Frees all the direct memory consumed by the HashTable.
void | decreaseRecordNumForKey(int currentIndex) | Decrease the count of records for a specific key by one.
protected abstract void | doSetup(VectorContainer incomingBuild, RecordBatch incomingProbe) |
void | enlargeEmptyHashTableIfNeeded(int newNum) | Resize the hash table up if needed (to hold newNum entries).
long | getActualSize() | The amount of direct memory consumed by the hash table.
int | getBuildHashCode(int incomingRowIdx) | Return the hash value for the row in the build incoming batch at index incomingRowIdx. (For Hash Aggregate there is no "build" side, only one batch: this one.)
protected abstract int | getHashBuild(int incomingRowIdx, int seedValue) |
protected abstract int | getHashProbe(int incomingRowIdx, int seedValue) |
int | getProbeHashCode(int incomingRowIdx) | Return the hash value for the row in the probe incoming batch at index incomingRowIdx.
int | getRecordNumForKey(int currentIndex) |
void | getStats(HashTableStats stats) |
int | getTargetBatchRowCount() |
protected HashTableTemplate.BatchHolder | injectMembers(HashTableTemplate.BatchHolder batchHolder) |
boolean | isEmpty() |
String | makeDebugString() | Returns a message containing memory usage statistics.
protected HashTableTemplate.BatchHolder | newBatchHolder(int index, int newBatchHolderSize) |
org.apache.commons.lang3.tuple.Pair<VectorContainer, Integer> | nextBatch() |
int | numBuckets() |
int | numResizing() |
boolean | outputKeys(int batchIdx, VectorContainer outContainer, int numRecords) | Retrieves the key columns and transfers them to the output container.
int | probeForKey(int incomingRowIdx, int hashCode) | Return -1 if the probe-side key is not found in the (build-side) hash table.
HashTable.PutStatus | put(int incomingRowIdx, IndexPointer htIdxHolder, int hashCode, int targetBatchRowCount) | put() uses the hash code (from getBuildHashCode()) to insert the key(s) from the incoming row into the hash table.
void | reset() | Reinitialize the hash table to its original size and clear all of its prior batch holders.
void | setRecordNumForKey(int currentIndex, int num) | Set the count of records for a specific key to num.
void | setTargetBatchRowCount(int batchRowCount) |
void | setup(HashTableConfig htConfig, BufferAllocator allocator, VectorContainer incomingBuild, RecordBatch incomingProbe, RecordBatch outgoing, VectorContainer htContainerOrig, FragmentContext context, ClassGenerator<?> cg) | HashTable.setup(HashTableConfig, BufferAllocator, VectorContainer, RecordBatch, RecordBatch, VectorContainer, FragmentContext, ClassGenerator<?>) must be called before anything can be done to the HashTable.
int | size() |
void | updateBatches() | Updates the incoming (build and probe side) value vector references in the HashTableTemplate.BatchHolders.
void | updateIncoming(VectorContainer newIncoming, RecordBatch newIncomingProbe) | Changes the incoming probe and build side batches, and then updates all the value vector references in the HashTableTemplate.BatchHolders.
void | updateInitialCapacity(int initialCapacity) | Update the initial capacity for the hash table.
-
Field Details
-
MAX_VARCHAR_SIZE
public static final int MAX_VARCHAR_SIZE
-
context
protected FragmentContext context
-
cg
protected ClassGenerator<?> cg
-
-
Constructor Details
-
HashTableTemplate
public HashTableTemplate()
-
-
Method Details
-
setup
public void setup(HashTableConfig htConfig, BufferAllocator allocator, VectorContainer incomingBuild, RecordBatch incomingProbe, RecordBatch outgoing, VectorContainer htContainerOrig, FragmentContext context, ClassGenerator<?> cg)
Description copied from interface: HashTable
HashTable.setup(HashTableConfig, BufferAllocator, VectorContainer, RecordBatch, RecordBatch, VectorContainer, FragmentContext, ClassGenerator<?>) must be called before anything can be done to the HashTable.
-
updateInitialCapacity
public void updateInitialCapacity(int initialCapacity)
Description copied from interface: HashTable
Update the initial capacity for the hash table. This method will be removed after the key vectors are removed from the hash table. It is used to allocate HashTableTemplate.BatchHolders of appropriate size when the final size of the HashTable is known. Warning: call this method only before any elements have been inserted into the HashTable.
Specified by:
updateInitialCapacity in interface HashTable
Parameters:
initialCapacity - The new initial capacity to use.
-
updateBatches
public void updateBatches() throws SchemaChangeException
Description copied from interface: HashTable
Updates the incoming (build and probe side) value vector references in the HashTableTemplate.BatchHolders. This is useful on OK_NEW_SCHEMA (need to verify).
Specified by:
updateBatches in interface HashTable
Throws:
SchemaChangeException
-
numBuckets
public int numBuckets()
-
numResizing
public int numResizing()
-
size
public int size()
-
getStats
public void getStats(HashTableStats stats)
-
isEmpty
public boolean isEmpty()
-
clear
public void clear()
Description copied from interface: HashTable
Frees all the direct memory consumed by the HashTable.
-
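getBuildHashCode
public int getBuildHashCode(int incomingRowIdx) throws SchemaChangeException
Return the hash value for the row in the build incoming batch at index incomingRowIdx. (For Hash Aggregate there is no "build" side, only one batch: this one.)
Specified by:
getBuildHashCode in interface HashTable
Parameters:
incomingRowIdx - position of the incoming row
Returns:
The hash value for the row.
Throws:
SchemaChangeException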
-
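getProbeHashCode
public int getProbeHashCode(int incomingRowIdx) throws SchemaChangeException
Return the hash value for the row in the probe incoming batch at index incomingRowIdx.
Specified by:
getProbeHashCode in interface HashTable
Parameters:
incomingRowIdx - position of the incoming row
Returns:
The hash value for the row.
Throws:
SchemaChangeException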
-
put
public HashTable.PutStatus put(int incomingRowIdx, IndexPointer htIdxHolder, int hashCode, int targetBatchRowCount) throws SchemaChangeException, RetryAfterSpillException
put() uses the hash code (from getBuildHashCode() above) to insert the key(s) from the incoming row into the hash table. The code selects the bucket in startIndices, then places the keys into the chained list by storing the key values into a batch and updating its "links" member. Finally, it sets the index holder to the batch offset so that the caller can store the remaining parts of the row into a matching batch (outside the hash table).
Specified by:
put in interface HashTable
Parameters:
incomingRowIdx - position of the incoming row
htIdxHolder - to return batch + batch-offset (for caller to manage a matching batch)
hashCode - computed over the key(s) by calling getBuildHashCode()
Returns:
Status - the key(s) was ADDED or was already PRESENT
Throws:
SchemaChangeException
RetryAfterSpillException
-
probeForKey
public int probeForKey(int incomingRowIdx, int hashCode) throws SchemaChangeException
Return -1 if the probe-side key is not found in the (build-side) hash table. Otherwise, return the global index of the key.
Specified by:
probeForKey in interface HashTable
Parameters:
incomingRowIdx - position of the incoming row
hashCode - The hash code for the probe-side key
Returns:
-1 if the key is not found, else the global index of the key
Throws:
SchemaChangeException
-
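The put() and probeForKey() descriptions above can be illustrated with a toy chained table. This is a minimal sketch under stated assumptions, not Drill's generated code: the class ChainedKeyTableSketch, its int-only keys, its int[] index holder (standing in for IndexPointer), and its two-value PutStatus enum are all hypothetical. Only the startIndices and links roles come from the description.

```java
import java.util.Arrays;

/** Toy chained hash table over int keys. Hypothetical sketch, not Drill's implementation. */
final class ChainedKeyTableSketch {
    /** Hypothetical stand-in for HashTable.PutStatus, reduced to the two documented outcomes. */
    enum PutStatus { ADDED, PRESENT }

    static final int EMPTY = -1;

    private final int[] startIndices; // bucket -> index of the first entry in its chain
    private final int[] keys;         // stored key values (a single "batch" in this sketch)
    private final int[] links;        // entry -> next entry in the same chain
    private int count = 0;

    /** numBuckets must be a power of two; no resizing or bounds handling is attempted. */
    ChainedKeyTableSketch(int numBuckets, int maxEntries) {
        startIndices = new int[numBuckets];
        keys = new int[maxEntries];
        links = new int[maxEntries];
        Arrays.fill(startIndices, EMPTY);
        Arrays.fill(links, EMPTY);
    }

    /** Illustrative hash function only. */
    static int hash(int key) {
        return key * 0x9E3779B9;
    }

    private int bucketOf(int hashCode) {
        return hashCode & (startIndices.length - 1);
    }

    /** Inserts the key if absent; idxHolder[0] receives the entry index in both cases. */
    PutStatus put(int key, int hashCode, int[] idxHolder) {
        int bucket = bucketOf(hashCode);
        int last = EMPTY;
        for (int i = startIndices[bucket]; i != EMPTY; i = links[i]) {
            if (keys[i] == key) {
                idxHolder[0] = i;
                return PutStatus.PRESENT;
            }
            last = i;
        }
        int idx = count++;
        keys[idx] = key;
        if (last == EMPTY) {
            startIndices[bucket] = idx; // first entry of this bucket
        } else {
            links[last] = idx;          // append to the end of the chain
        }
        idxHolder[0] = idx;
        return PutStatus.ADDED;
    }

    /** Returns the entry index of the key, or -1 if the key is not in the table. */
    int probeForKey(int key, int hashCode) {
        for (int i = startIndices[bucketOf(hashCode)]; i != EMPTY; i = links[i]) {
            if (keys[i] == key) {
                return i;
            }
        }
        return -1;
    }
}
```

The caller would use the index written to idxHolder to store the non-key columns of the row at the same position in its own matching batch.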
getRecordNumForKey
public int getRecordNumForKey(int currentIndex)
Specified by:
getRecordNumForKey in interface HashTable
Parameters:
currentIndex - The composite index of the key in the hash table (index of BatchHolder and record in BatchHolder).
Returns:
-1 if the count of records for the key has not been computed; otherwise the count of records for the key.
-
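The "composite index" mentioned for currentIndex packs a BatchHolder index and a record offset into one int. The sketch below shows one common way to do that packing. The constant values (a 65536-record batch and a 0xFFFF mask) are assumptions made for illustration; the real values are whatever HashTable.BATCH_SIZE and HashTable.BATCH_MASK define. The class and method names are hypothetical.

```java
/** Hypothetical sketch of splitting a composite index; constant values are assumed. */
final class CompositeIndexSketch {
    static final int BATCH_SHIFT = 16;               // assumed: log2 of the batch size
    static final int BATCH_SIZE = 1 << BATCH_SHIFT;  // assumed: 65536 records per BatchHolder
    static final int BATCH_MASK = BATCH_SIZE - 1;    // assumed: 0xFFFF

    /** Packs a BatchHolder index and a record offset into one composite index. */
    static int compose(int batchIdx, int offsetInBatch) {
        return (batchIdx << BATCH_SHIFT) | (offsetInBatch & BATCH_MASK);
    }

    /** Upper bits: which BatchHolder the key lives in. */
    static int batchIndex(int currentIndex) {
        return currentIndex >>> BATCH_SHIFT;
    }

    /** Lower bits: the record position inside that BatchHolder. */
    static int offsetInBatch(int currentIndex) {
        return currentIndex & BATCH_MASK;
    }
}
```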
setRecordNumForKey
public void setRecordNumForKey(int currentIndex, int num)
Description copied from interface: HashTable
Set the count of records for a specific key to num.
Specified by:
setRecordNumForKey in interface HashTable
Parameters:
currentIndex - The composite index of the key in the hash table (index of BatchHolder and record in BatchHolder).
num - The count of records for a specific key to be set.
-
decreaseRecordNumForKey
public void decreaseRecordNumForKey(int currentIndex)
Description copied from interface: HashTable
Decrease the count of records for a specific key by one.
Specified by:
decreaseRecordNumForKey in interface HashTable
Parameters:
currentIndex - The composite index of the key in the hash table (index of BatchHolder and record in BatchHolder).
-
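The three record-count methods (get, set, decrease) can be pictured as operations on a per-key counter array in which -1 means "not computed yet". The following is a minimal sketch of that contract only. RecordCountSketch is a hypothetical class using a flat array, and throwing on a decrement of an uncomputed count is an assumption of this sketch, not documented behavior.

```java
import java.util.Arrays;

/** Hypothetical sketch of the per-key record-count contract; not Drill's storage layout. */
final class RecordCountSketch {
    static final int NOT_COMPUTED = -1;

    private final int[] counts;

    RecordCountSketch(int capacity) {
        counts = new int[capacity];
        Arrays.fill(counts, NOT_COMPUTED); // no key has a computed count initially
    }

    /** Returns -1 if the count for this key has not been set. */
    int getRecordNumForKey(int currentIndex) {
        return counts[currentIndex];
    }

    void setRecordNumForKey(int currentIndex, int num) {
        counts[currentIndex] = num;
    }

    /** Assumption of this sketch: decrementing an uncomputed count is treated as an error. */
    void decreaseRecordNumForKey(int currentIndex) {
        if (counts[currentIndex] == NOT_COMPUTED) {
            throw new IllegalStateException("count not computed for index " + currentIndex);
        }
        counts[currentIndex]--;
    }
}
```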
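newBatchHolder
protected HashTableTemplate.BatchHolder newBatchHolder(int index, int newBatchHolderSize)
-
injectMembers
protected HashTableTemplate.BatchHolder injectMembers(HashTableTemplate.BatchHolder batchHolder)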
-
enlargeEmptyHashTableIfNeeded
public void enlargeEmptyHashTableIfNeeded(int newNum)
Resize the hash table up if needed (to hold newNum entries).
-
reset
public void reset()
Reinitialize the hash table to its original size and clear all of its prior batch holders.
-
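To make "resize up if needed (to hold newNum entries)" concrete, here is a minimal sketch of one way a target capacity could be chosen: double a power-of-two capacity until newNum fits under a load-factor threshold. The values 0.75 and 1 << 30 are assumptions standing in for HashTable.DEFAULT_LOAD_FACTOR and HashTable.MAXIMUM_CAPACITY, and the doubling policy itself is an assumption, not a statement of what enlargeEmptyHashTableIfNeeded does internally.

```java
/** Hypothetical capacity calculation; constants and doubling policy are assumed. */
final class CapacitySketch {
    static final float LOAD_FACTOR = 0.75f;      // assumed value
    static final int MAXIMUM_CAPACITY = 1 << 30; // assumed value

    /**
     * Returns the smallest capacity, reached by doubling currentCapacity, whose
     * threshold (capacity * LOAD_FACTOR) can hold newNum entries.
     * currentCapacity must be a positive power of two.
     */
    static int capacityFor(int newNum, int currentCapacity) {
        int capacity = currentCapacity;
        while (capacity < MAXIMUM_CAPACITY && newNum > (int) (capacity * LOAD_FACTOR)) {
            capacity <<= 1;
        }
        return capacity;
    }
}
```

With these assumed constants, a table of 16 buckets holds up to 12 entries without growing, and a request for 13 entries doubles it to 32.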
updateIncoming
public void updateIncoming(VectorContainer newIncoming, RecordBatch newIncomingProbe)
Description copied from interface: HashTable
Changes the incoming probe and build side batches, and then updates all the value vector references in the HashTableTemplate.BatchHolders.
Specified by:
updateIncoming in interface HashTable
Parameters:
newIncoming - The new build side batch.
newIncomingProbe - The new probe side batch.
-
outputKeys
public boolean outputKeys(int batchIdx, VectorContainer outContainer, int numRecords)
Description copied from interface: HashTable
Retrieves the key columns and transfers them to the output container. Note that this operation removes the key columns from the HashTable.
Specified by:
outputKeys in interface HashTable
Parameters:
batchIdx - The index of a HashTableTemplate.BatchHolder in the HashTable.
outContainer - The destination container for the key columns.
numRecords - The number of key records to transfer.
-
nextBatch
public org.apache.commons.lang3.tuple.Pair<VectorContainer, Integer> nextBatch()
-
doSetup
protected abstract void doSetup(@Named("incomingBuild") VectorContainer incomingBuild, @Named("incomingProbe") RecordBatch incomingProbe) throws SchemaChangeException
Throws:
SchemaChangeException
-
getHashBuild
protected abstract int getHashBuild(@Named("incomingRowIdx") int incomingRowIdx, @Named("seedValue") int seedValue) throws SchemaChangeException
Throws:
SchemaChangeException
-
getHashProbe
protected abstract int getHashProbe(@Named("incomingRowIdx") int incomingRowIdx, @Named("seedValue") int seedValue) throws SchemaChangeException
Throws:
SchemaChangeException
-
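The abstract doSetup, getHashBuild, and getHashProbe methods follow the template-method pattern: the template declares the hooks and a concrete (in Drill, generated) subclass supplies the column-specific logic. The sketch below illustrates the pattern with plain Java. SeededHashSketch and IntColumnHash are hypothetical names, the int[] column stands in for a value vector, and having getBuildHashCode delegate to getHashBuild with a seed of 0 is an assumption of this sketch.

```java
/** Hypothetical template: the public method delegates to an abstract, seeded hook. */
abstract class SeededHashSketch {
    /** Hook that a concrete subclass implements for its key column(s). */
    protected abstract int getHashBuild(int incomingRowIdx, int seedValue);

    /** Assumption of this sketch: the public entry point uses a seed of 0. */
    public int getBuildHashCode(int incomingRowIdx) {
        return getHashBuild(incomingRowIdx, 0);
    }
}

/** Hypothetical concrete subclass hashing a single int key column. */
final class IntColumnHash extends SeededHashSketch {
    private final int[] column; // stands in for the build-side key vector

    IntColumnHash(int[] column) {
        this.column = column;
    }

    @Override
    protected int getHashBuild(int incomingRowIdx, int seedValue) {
        // Combine the seed with the key's hash; illustrative mixing only.
        return 31 * seedValue + Integer.hashCode(column[incomingRowIdx]);
    }
}
```

Rows that carry the same key value produce the same hash code, which is the property the put and probe paths rely on.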
getActualSize
public long getActualSize()
Description copied from interface: HashTable
The amount of direct memory consumed by the hash table.
Specified by:
getActualSize in interface HashTable
Returns:
The amount of direct memory consumed by the hash table.
-
makeDebugString
public String makeDebugString()
Description copied from interface: HashTable
Returns a message containing memory usage statistics. Intended to be used for printing debugging or error messages.
Specified by:
makeDebugString in interface HashTable
Returns:
A debug string.
-
setTargetBatchRowCount
public void setTargetBatchRowCount(int batchRowCount)
Specified by:
setTargetBatchRowCount in interface HashTable
-
getTargetBatchRowCount
public int getTargetBatchRowCount()
Specified by:
getTargetBatchRowCount in interface HashTable
-