Class HashTableTemplate
java.lang.Object
org.apache.drill.exec.physical.impl.common.HashTableTemplate
- All Implemented Interfaces:
HashTable
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.drill.exec.physical.impl.common.HashTable
HashTable.PutStatus
-
Field Summary
Modifier and Type    Field    Description
protected ClassGenerator<?>    cg
protected FragmentContext    context
static final int    MAX_VARCHAR_SIZE
Fields inherited from interface org.apache.drill.exec.physical.impl.common.HashTable
BATCH_MASK, BATCH_SIZE, DEFAULT_LOAD_FACTOR, MAXIMUM_CAPACITY, TEMPLATE_DEFINITION
-
Constructor Summary
-
Method Summary
Modifier and Type    Method    Description
void    clear()
    Frees all the direct memory consumed by the HashTable.
void    decreaseRecordNumForKey(int currentIndex)
    Decrease the count of records for a specific key by one.
protected abstract void    doSetup(VectorContainer incomingBuild, RecordBatch incomingProbe)
void    enlargeEmptyHashTableIfNeeded(int newNum)
    Resize up the hash table if needed (to hold newNum entries).
long    getActualSize()
    The amount of direct memory consumed by the hash table.
int    getBuildHashCode(int incomingRowIdx)
    Return the hash value for the row in the build-side incoming batch at the given index. (For Hash Aggregate there is no "Build" side; there is only one batch, this one.)
protected abstract int    getHashBuild(int incomingRowIdx, int seedValue)
protected abstract int    getHashProbe(int incomingRowIdx, int seedValue)
int    getProbeHashCode(int incomingRowIdx)
    Return the hash value for the row in the probe-side incoming batch at the given index.
int    getRecordNumForKey(int currentIndex)
void    getStats(HashTableStats stats)
int    getTargetBatchRowCount()
protected HashTableTemplate.BatchHolder    injectMembers(HashTableTemplate.BatchHolder batchHolder)
boolean    isEmpty()
String    makeDebugString()
    Returns a message containing memory usage statistics.
protected HashTableTemplate.BatchHolder    newBatchHolder(int index, int newBatchHolderSize)
org.apache.commons.lang3.tuple.Pair<VectorContainer,Integer>    nextBatch()
int    numBuckets()
int    numResizing()
boolean    outputKeys(int batchIdx, VectorContainer outContainer, int numRecords)
    Retrieves the key columns and transfers them to the output container.
int    probeForKey(int incomingRowIdx, int hashCode)
    Return -1 if the probe-side key is not found in the (build-side) hash table.
HashTable.PutStatus    put(int incomingRowIdx, IndexPointer htIdxHolder, int hashCode, int targetBatchRowCount)
    put() uses the hash code (from getBuildHashCode()) to insert the key(s) from the incoming row into the hash table.
void    reset()
    Reinitialize the hash table to its original size and clear all of its prior batch holders.
void    setRecordNumForKey(int currentIndex, int num)
    Set the count of records for a specific key to num.
void    setTargetBatchRowCount(int batchRowCount)
void    setup(HashTableConfig htConfig, BufferAllocator allocator, VectorContainer incomingBuild, RecordBatch incomingProbe, RecordBatch outgoing, VectorContainer htContainerOrig, FragmentContext context, ClassGenerator<?> cg)
    HashTable.setup(HashTableConfig, BufferAllocator, VectorContainer, RecordBatch, RecordBatch, VectorContainer, FragmentContext, ClassGenerator<?>) must be called before anything can be done to the HashTable.
int    size()
void    updateBatches()
    Updates the incoming (build and probe side) value vector references in the HashTableTemplate.BatchHolders.
void    updateIncoming(VectorContainer newIncoming, RecordBatch newIncomingProbe)
    Changes the incoming probe and build side batches, and then updates all the value vector references in the HashTableTemplate.BatchHolders.
void    updateInitialCapacity(int initialCapacity)
    Update the initial capacity for the hash table.
-
Field Details
-
MAX_VARCHAR_SIZE
public static final int MAX_VARCHAR_SIZE
- See Also:
-
context
-
cg
-
-
Constructor Details
-
HashTableTemplate
public HashTableTemplate()
-
-
Method Details
-
setup
public void setup(HashTableConfig htConfig, BufferAllocator allocator, VectorContainer incomingBuild, RecordBatch incomingProbe, RecordBatch outgoing, VectorContainer htContainerOrig, FragmentContext context, ClassGenerator<?> cg)
Description copied from interface: HashTable
HashTable.setup(HashTableConfig, BufferAllocator, VectorContainer, RecordBatch, RecordBatch, VectorContainer, FragmentContext, ClassGenerator<?>) must be called before anything can be done to the HashTable.
-
updateInitialCapacity
public void updateInitialCapacity(int initialCapacity)
Description copied from interface: HashTable
Update the initial capacity for the hash table. This method will be removed after the key vectors are removed from the hash table. It is used to allocate HashTableTemplate.BatchHolders of appropriate size when the final size of the HashTable is known. Warning! Only call this method before you have inserted elements into the HashTable.
- Specified by:
updateInitialCapacity in interface HashTable
- Parameters:
initialCapacity - The new initial capacity to use.
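The warning above can be made concrete with a small caller-side guard. The sketch below is a hypothetical stand-in, not Drill code: the class CapacityGuardSketch, its fields, and the decision to throw IllegalStateException are all assumptions for illustration. This page does not say whether HashTableTemplate itself enforces the "empty table only" rule.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a table whose initial capacity may change only while it is empty. */
final class CapacityGuardSketch {
    private int initialCapacity;
    private final List<Integer> keys = new ArrayList<>();

    CapacityGuardSketch(int initialCapacity) {
        this.initialCapacity = initialCapacity;
    }

    /** Accepts a new initial capacity only before any element has been inserted. */
    void updateInitialCapacity(int newCapacity) {
        if (newCapacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive: " + newCapacity);
        }
        if (!keys.isEmpty()) {
            // Assumption of this sketch: fail loudly instead of silently resizing.
            throw new IllegalStateException("capacity can only be updated on an empty table");
        }
        this.initialCapacity = newCapacity;
    }

    void put(int key) {
        keys.add(key);
    }

    int initialCapacity() {
        return initialCapacity;
    }

    public static void main(String[] args) {
        CapacityGuardSketch table = new CapacityGuardSketch(16);
        table.updateInitialCapacity(64);      // allowed: nothing inserted yet
        table.put(42);
        try {
            table.updateInitialCapacity(128); // rejected: the table is no longer empty
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("capacity = " + table.initialCapacity());
    }
}
```

A caller that learns the final size late would call the update first and only then start inserting.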
-
updateBatches
public void updateBatches() throws SchemaChangeException
Description copied from interface: HashTable
Updates the incoming (build and probe side) value vector references in the HashTableTemplate.BatchHolders. This is useful on OK_NEW_SCHEMA (need to verify).
- Specified by:
updateBatches in interface HashTable
- Throws:
SchemaChangeException
-
numBuckets
public int numBuckets()
-
numResizing
public int numResizing()
-
size
public int size()
-
getStats
public void getStats(HashTableStats stats)
-
isEmpty
public boolean isEmpty()
-
clear
public void clear()
Description copied from interface: HashTable
Frees all the direct memory consumed by the HashTable.
-
getBuildHashCode
public int getBuildHashCode(int incomingRowIdx) throws SchemaChangeException
Return the hash value for the row in the build-side incoming batch at index incomingRowIdx. (For Hash Aggregate there is no "Build" side; there is only one batch, this one.)
- Specified by:
getBuildHashCode in interface HashTable
- Parameters:
incomingRowIdx -
- Returns:
- Throws:
SchemaChangeException
-
getProbeHashCode
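public int getProbeHashCode(int incomingRowIdx) throws SchemaChangeException
Return the hash value for the row in the probe-side incoming batch at index incomingRowIdx.
- Specified by:
getProbeHashCode in interface HashTable
- Parameters:
incomingRowIdx -
- Returns:
- Throws:
SchemaChangeException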
-
put
public HashTable.PutStatus put(int incomingRowIdx, IndexPointer htIdxHolder, int hashCode, int targetBatchRowCount) throws SchemaChangeException, RetryAfterSpillException
put() uses the hash code (from getBuildHashCode()) to insert the key(s) from the incoming row into the hash table. The code selects the bucket in startIndices, then places the keys into the chained list by storing the key values in a batch and updating its "links" member. Finally, it sets the index holder to the batch offset so that the caller can store the remaining parts of the row in a matching batch (outside the hash table).
- Specified by:
put in interface HashTable
- Parameters:
incomingRowIdx - position of the incoming row
htIdxHolder - to return batch + batch-offset (for caller to manage a matching batch)
hashCode - computed over the key(s) by calling getBuildHashCode()
- Returns:
Status - the key(s) was ADDED or was already PRESENT
- Throws:
SchemaChangeException
RetryAfterSpillException
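The bucket-then-chain insertion described for put() can be sketched with plain int arrays. This is a simplified model under stated assumptions, not the Drill implementation: keys are single ints, there is one flat "batch", there is no resizing or spilling, and the names ChainedPutSketch, Status, and the int[] index holder are hypothetical stand-ins for the real types.

```java
import java.util.Arrays;

/** Simplified, hypothetical model of bucket heads ("startIndices") plus per-entry "links". */
final class ChainedPutSketch {
    enum Status { ADDED, PRESENT }

    static final int EMPTY = -1;

    private final int[] startIndices; // bucket -> index of the first entry in its chain, or EMPTY
    private final int[] links;        // entry  -> index of the next entry in the same chain, or EMPTY
    private final int[] keys;         // entry  -> stored key value
    private int count = 0;

    ChainedPutSketch(int numBuckets, int maxEntries) {
        // A power-of-two bucket count lets (hashCode & mask) select the bucket.
        if (numBuckets <= 0 || Integer.bitCount(numBuckets) != 1) {
            throw new IllegalArgumentException("numBuckets must be a power of two");
        }
        startIndices = new int[numBuckets];
        links = new int[maxEntries];
        keys = new int[maxEntries];
        Arrays.fill(startIndices, EMPTY);
        Arrays.fill(links, EMPTY);
    }

    /** Inserts the key if absent; either way, writes the entry index into idxHolder[0]. */
    Status put(int key, int hashCode, int[] idxHolder) {
        int bucket = hashCode & (startIndices.length - 1);
        int last = EMPTY;
        // Walk the chain looking for an existing copy of the key.
        for (int i = startIndices[bucket]; i != EMPTY; i = links[i]) {
            if (keys[i] == key) {
                idxHolder[0] = i;
                return Status.PRESENT;
            }
            last = i;
        }
        if (count == keys.length) {
            throw new IllegalStateException("sketch has no resize or spill logic");
        }
        int idx = count++;
        keys[idx] = key;
        if (last == EMPTY) {
            startIndices[bucket] = idx; // first entry in this bucket
        } else {
            links[last] = idx;          // append to the end of the chain
        }
        idxHolder[0] = idx;
        return Status.ADDED;
    }

    public static void main(String[] args) {
        ChainedPutSketch table = new ChainedPutSketch(4, 8);
        int[] holder = new int[1];
        // 7 and 11 collide: both map to bucket 3 when masked with 3.
        System.out.println(table.put(7, 7, holder) + " at " + holder[0]);
        System.out.println(table.put(11, 11, holder) + " at " + holder[0]);
        System.out.println(table.put(7, 7, holder) + " at " + holder[0]);
    }
}
```

The index written into the holder is what lets a caller store the non-key columns of the row at the matching position in its own side structure.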
-
probeForKey
public int probeForKey(int incomingRowIdx, int hashCode) throws SchemaChangeException
Return -1 if the probe-side key is not found in the (build-side) hash table. Otherwise, return the global index of the key.
- Specified by:
probeForKey in interface HashTable
- Parameters:
incomingRowIdx -
hashCode - The hash code for the probe-side key
- Returns:
-1 if the key is not found, else the global index of the key
- Throws:
SchemaChangeException
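The -1 contract is easiest to see on a tiny prebuilt table. The sketch below is hypothetical and heavily simplified: keys are single ints, the arrays are built by hand, and ProbeSketch and its static probeForKey are illustrative names rather than the Drill method, which works on value vectors and takes a row index.

```java
/** Hypothetical probe over a hand-built chained table; not the Drill implementation. */
final class ProbeSketch {
    static final int NOT_FOUND = -1;

    /** Walks the chain of the selected bucket and returns the entry index, or NOT_FOUND. */
    static int probeForKey(int[] startIndices, int[] links, int[] keys, int key, int hashCode) {
        int bucket = hashCode & (startIndices.length - 1); // assumes a power-of-two bucket count
        for (int i = startIndices[bucket]; i != NOT_FOUND; i = links[i]) {
            if (keys[i] == key) {
                return i;
            }
        }
        return NOT_FOUND;
    }

    public static void main(String[] args) {
        // Two buckets: keys 4 and 6 share bucket 0 (chained 0 -> 2), key 5 sits in bucket 1.
        int[] keys = {4, 5, 6};
        int[] startIndices = {0, 1};
        int[] links = {2, NOT_FOUND, NOT_FOUND};

        for (int probeKey : new int[] {6, 5, 8}) {
            int idx = probeForKey(startIndices, links, keys, probeKey, probeKey);
            if (idx == NOT_FOUND) {
                System.out.println(probeKey + ": no build-side match");
            } else {
                System.out.println(probeKey + ": matched entry " + idx);
            }
        }
    }
}
```

Callers need to test for -1 before using the result as an index, as the loop in main does.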
-
getRecordNumForKey
public int getRecordNumForKey(int currentIndex)
- Specified by:
getRecordNumForKey in interface HashTable
- Parameters:
currentIndex - The composite index of the key in the hash table (index of BatchHolder and record in BatchHolder).
- Returns:
-1 if the count of records for the key has not been computed; otherwise the count of records for the key.
-
setRecordNumForKey
public void setRecordNumForKey(int currentIndex, int num)
Description copied from interface: HashTable
Set the count of records for a specific key to num.
- Specified by:
setRecordNumForKey in interface HashTable
- Parameters:
currentIndex - The composite index of the key in the hash table (index of BatchHolder and record in BatchHolder).
num - The count of records for a specific key to be set.
-
decreaseRecordNumForKey
public void decreaseRecordNumForKey(int currentIndex)
Description copied from interface: HashTable
Decrease the count of records for a specific key by one.
- Specified by:
decreaseRecordNumForKey in interface HashTable
- Parameters:
currentIndex - The composite index of the key in the hash table (index of BatchHolder and record in BatchHolder).
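The three record-count methods all take a composite index that names both a BatchHolder and a record inside it. One way to picture that is a single int split into a high part and a low part, as sketched below. The 16-bit split, the constant values, and the class RecordCountSketch are assumptions chosen to match the BATCH_SIZE and BATCH_MASK field names; this page does not state the real values or layout.

```java
import java.util.Arrays;

/** Hypothetical per-key record counts addressed by a composite (batch, offset) index. */
final class RecordCountSketch {
    static final int BATCH_SHIFT = 16;               // assumed for this sketch
    static final int BATCH_SIZE = 1 << BATCH_SHIFT;  // assumed for this sketch
    static final int BATCH_MASK = BATCH_SIZE - 1;
    static final int NOT_COMPUTED = -1;

    private final int[][] counts; // [batch holder][record within holder]

    RecordCountSketch(int numBatchHolders) {
        counts = new int[numBatchHolders][BATCH_SIZE];
        for (int[] batch : counts) {
            Arrays.fill(batch, NOT_COMPUTED);
        }
    }

    static int compose(int batchIdx, int offset) {
        return (batchIdx << BATCH_SHIFT) | (offset & BATCH_MASK);
    }

    static int batchIndex(int currentIndex) {
        return currentIndex >>> BATCH_SHIFT;
    }

    static int offsetInBatch(int currentIndex) {
        return currentIndex & BATCH_MASK;
    }

    int getRecordNumForKey(int currentIndex) {
        return counts[batchIndex(currentIndex)][offsetInBatch(currentIndex)];
    }

    void setRecordNumForKey(int currentIndex, int num) {
        counts[batchIndex(currentIndex)][offsetInBatch(currentIndex)] = num;
    }

    void decreaseRecordNumForKey(int currentIndex) {
        // The sketch does not guard against decrementing a count that was never set.
        counts[batchIndex(currentIndex)][offsetInBatch(currentIndex)]--;
    }

    public static void main(String[] args) {
        RecordCountSketch sketch = new RecordCountSketch(2);
        int idx = compose(1, 5); // record 5 of batch holder 1
        System.out.println("before set: " + sketch.getRecordNumForKey(idx));
        sketch.setRecordNumForKey(idx, 3);
        sketch.decreaseRecordNumForKey(idx);
        System.out.println("after set and one decrease: " + sketch.getRecordNumForKey(idx));
    }
}
```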
-
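newBatchHolder
protected HashTableTemplate.BatchHolder newBatchHolder(int index, int newBatchHolderSize)
-
injectMembers
protected HashTableTemplate.BatchHolder injectMembers(HashTableTemplate.BatchHolder batchHolder)
-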
enlargeEmptyHashTableIfNeeded
public void enlargeEmptyHashTableIfNeeded(int newNum)
Resize up the hash table if needed (to hold newNum entries).
-
reset
public void reset()
Reinitialize the hash table to its original size and clear all of its prior batch holders.
updateIncoming
public void updateIncoming(VectorContainer newIncoming, RecordBatch newIncomingProbe)
Description copied from interface: HashTable
Changes the incoming probe and build side batches, and then updates all the value vector references in the HashTableTemplate.BatchHolders.
- Specified by:
updateIncoming in interface HashTable
- Parameters:
newIncoming - The new build side batch.
newIncomingProbe - The new probe side batch.
-
outputKeys
public boolean outputKeys(int batchIdx, VectorContainer outContainer, int numRecords)
Description copied from interface: HashTable
Retrieves the key columns and transfers them to the output container. Note this operation removes the key columns from the HashTable.
- Specified by:
outputKeys in interface HashTable
- Parameters:
batchIdx - The index of a HashTableTemplate.BatchHolder in the HashTable.
outContainer - The destination container for the key columns.
numRecords - The number of key records to transfer.
- Returns:
-
nextBatch
-
doSetup
protected abstract void doSetup(@Named("incomingBuild") VectorContainer incomingBuild, @Named("incomingProbe") RecordBatch incomingProbe) throws SchemaChangeException
- Throws:
SchemaChangeException
-
getHashBuild
protected abstract int getHashBuild(@Named("incomingRowIdx") int incomingRowIdx, @Named("seedValue") int seedValue) throws SchemaChangeException
- Throws:
SchemaChangeException
-
getHashProbe
protected abstract int getHashProbe(@Named("incomingRowIdx") int incomingRowIdx, @Named("seedValue") int seedValue) throws SchemaChangeException
- Throws:
SchemaChangeException
-
getActualSize
public long getActualSize()
Description copied from interface: HashTable
The amount of direct memory consumed by the hash table.
- Specified by:
getActualSize in interface HashTable
- Returns:
-
makeDebugString
public String makeDebugString()
Description copied from interface: HashTable
Returns a message containing memory usage statistics. Intended to be used for printing debugging or error messages.
- Specified by:
makeDebugString in interface HashTable
- Returns:
A debug string.
-
setTargetBatchRowCount
public void setTargetBatchRowCount(int batchRowCount)
- Specified by:
setTargetBatchRowCount in interface HashTable
-
getTargetBatchRowCount
public int getTargetBatchRowCount()
- Specified by:
getTargetBatchRowCount in interface HashTable
-