Interface HashTable

All Known Implementing Classes:
HashTableTemplate

public interface HashTable
  • Field Details

  • Method Details

    • setup

      void setup(HashTableConfig htConfig, BufferAllocator allocator, VectorContainer incomingBuild, RecordBatch incomingProbe, RecordBatch outgoing, VectorContainer htContainerOrig, FragmentContext context, ClassGenerator<?> cg)
      Parameters:
      htConfig -
      allocator -
      incomingBuild -
      incomingProbe -
      outgoing -
      htContainerOrig -
      context -
      cg -
    • updateBatches

      void updateBatches() throws SchemaChangeException
      Updates the references to the incoming (build and probe side) value vectors in the HashTableTemplate.BatchHolders. This is intended for use on OK_NEW_SCHEMA (needs verification).
      Throws:
      SchemaChangeException
    • getBuildHashCode

      int getBuildHashCode(int incomingRowIdx) throws SchemaChangeException
      Computes the hash code for the record at the given index in the build side batch.
      Parameters:
      incomingRowIdx - The index of the build side record of interest.
      Returns:
      The hash code for the record at the given index in the build side batch.
      Throws:
      SchemaChangeException
    • getProbeHashCode

      int getProbeHashCode(int incomingRowIdx) throws SchemaChangeException
      Computes the hash code for the record at the given index in the probe side batch.
      Parameters:
      incomingRowIdx - The index of the probe side record of interest.
      Returns:
      The hash code for the record at the given index in the probe side batch.
      Throws:
      SchemaChangeException
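      The two hash methods above are only useful together if equal keys produce equal hash codes on both sides. A minimal sketch of that property follows, assuming single int key columns; the class SideHashSketch, its arrays, and the mixing function are hypothetical stand-ins, not the hashing used by HashTableTemplate.

```java
// Illustrative stand-in only: plain int arrays replace the build and probe batches.
final class SideHashSketch {
    private final int[] buildKeys; // hypothetical build side key column
    private final int[] probeKeys; // hypothetical probe side key column

    SideHashSketch(int[] buildKeys, int[] probeKeys) {
        this.buildKeys = buildKeys;
        this.probeKeys = probeKeys;
    }

    // Hash of the build side record at the given row index.
    int getBuildHashCode(int incomingRowIdx) {
        return hash(buildKeys[incomingRowIdx]);
    }

    // Hash of the probe side record at the given row index.
    int getProbeHashCode(int incomingRowIdx) {
        return hash(probeKeys[incomingRowIdx]);
    }

    // Both sides share one function, so equal keys always hash equally.
    // The multiplier and shift are an arbitrary choice made for this sketch.
    private static int hash(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        SideHashSketch s = new SideHashSketch(new int[] {10, 20, 30}, new int[] {30, 10});
        // Build row 2 and probe row 0 both hold the key 30.
        System.out.println(s.getBuildHashCode(2) == s.getProbeHashCode(0)); // prints "true"
    }
}
```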
    • put

      HashTable.PutStatus put(int incomingRowIdx, IndexPointer htIdxHolder, int hashCode, int batchSize) throws SchemaChangeException, RetryAfterSpillException
      Throws:
      SchemaChangeException
      RetryAfterSpillException
    • probeForKey

      int probeForKey(int incomingRowIdx, int hashCode) throws SchemaChangeException
      Parameters:
      incomingRowIdx - The index of the key in the probe batch.
      hashCode - The hashCode of the key.
      Returns:
      -1 if the key at the given incomingRowIdx in the probe batch is not in the hash table. Otherwise, the composite index of the key in the hash table (the index of the BatchHolder and the index of the record within that BatchHolder).
      Throws:
      SchemaChangeException
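      The composite index returned by probeForKey packs two numbers into one int. One way to do that is sketched below; the 16-bit split, the class CompositeIndexSketch, and its method names are assumptions made for illustration, since this page does not state the actual layout.

```java
// Sketch only: the bit layout here is an assumption, not taken from this page.
final class CompositeIndexSketch {
    static final int BATCH_SHIFT = 16;     // hypothetical: upper bits hold the BatchHolder index
    static final int RECORD_MASK = 0xFFFF; // hypothetical: lower 16 bits hold the record index
    static final int NOT_FOUND = -1;       // the documented "key not in the hash table" value

    // Packs a BatchHolder index and a record index into one non-negative int.
    static int compose(int batchHolderIdx, int recordIdx) {
        if (batchHolderIdx < 0 || batchHolderIdx > 0x7FFF) {
            throw new IllegalArgumentException("batchHolderIdx out of range: " + batchHolderIdx);
        }
        if (recordIdx < 0 || recordIdx > RECORD_MASK) {
            throw new IllegalArgumentException("recordIdx out of range: " + recordIdx);
        }
        return (batchHolderIdx << BATCH_SHIFT) | recordIdx;
    }

    // Extracts the BatchHolder index from a composite index.
    static int batchHolderIndex(int composite) {
        return composite >>> BATCH_SHIFT;
    }

    // Extracts the record index within the BatchHolder.
    static int recordIndex(int composite) {
        return composite & RECORD_MASK;
    }

    public static void main(String[] args) {
        int composite = compose(2, 5);
        System.out.println(batchHolderIndex(composite) + " " + recordIndex(composite)); // prints "2 5"
    }
}
```

      Capping the BatchHolder index at 0x7FFF keeps every valid composite index non-negative, so it can never be confused with the -1 result.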
    • getRecordNumForKey

      int getRecordNumForKey(int currentIndex)
      Parameters:
      currentIndex - The composite index of the key in the hash table (the index of the BatchHolder and the index of the record within that BatchHolder).
      Returns:
      Returns -1 if the count of records for a specific key is not computed. Otherwise returns the count of records for a specific key.
    • setRecordNumForKey

      void setRecordNumForKey(int currentIndex, int num)
      Set the count of records for a specific key to num.
      Parameters:
      currentIndex - The composite index of the key in the hash table (the index of the BatchHolder and the index of the record within that BatchHolder).
      num - The count of records for a specific key to be set.
    • decreaseRecordNumForKey

      void decreaseRecordNumForKey(int currentIndex)
      Decrease the count of records for a specific key by one.
      Parameters:
      currentIndex - The composite index of the key in the hash table (the index of the BatchHolder and the index of the record within that BatchHolder).
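      The three record-count methods share one convention: -1 means the count has not been computed yet. The sketch below models that with a flat int array; the class RecordCountSketch is hypothetical, it treats currentIndex as a plain array index rather than a composite index, and rejecting a decrease on an uncomputed count is an assumption of the sketch, not documented behavior.

```java
import java.util.Arrays;

// Illustrative model of per-key record counts with a -1 "not computed" sentinel.
final class RecordCountSketch {
    static final int NOT_COMPUTED = -1;
    private final int[] counts;

    RecordCountSketch(int capacity) {
        counts = new int[capacity];
        Arrays.fill(counts, NOT_COMPUTED); // every key starts without a computed count
    }

    // Returns -1 until a count has been set for this key.
    int getRecordNumForKey(int currentIndex) {
        return counts[currentIndex];
    }

    // Sets the count of records for this key to num.
    void setRecordNumForKey(int currentIndex, int num) {
        counts[currentIndex] = num;
    }

    // Decreases the count by one; the guard is an assumption of this sketch.
    void decreaseRecordNumForKey(int currentIndex) {
        if (counts[currentIndex] == NOT_COMPUTED) {
            throw new IllegalStateException("count not computed for index " + currentIndex);
        }
        counts[currentIndex]--;
    }

    public static void main(String[] args) {
        RecordCountSketch sketch = new RecordCountSketch(4);
        sketch.setRecordNumForKey(1, 3);
        sketch.decreaseRecordNumForKey(1);
        System.out.println(sketch.getRecordNumForKey(1)); // prints "2"
    }
}
```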
    • getStats

      void getStats(HashTableStats stats)
    • size

      int size()
    • isEmpty

      boolean isEmpty()
    • clear

      void clear()
      Frees all the direct memory consumed by the HashTable.
    • updateInitialCapacity

      void updateInitialCapacity(int initialCapacity)
      Updates the initial capacity of the hash table. It is used to allocate HashTableTemplate.BatchHolders of the appropriate size once the final size of the HashTable is known. This method will be removed after the key vectors are removed from the hash table. Warning: call this method only before any elements have been inserted into the HashTable.
      Parameters:
      initialCapacity - The new initial capacity to use.
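      The warning on updateInitialCapacity describes a call-order constraint. A caller-side guard could look like the sketch below; the class InitialCapacitySketch and its put() stand-in are hypothetical, and throwing IllegalStateException is a choice made for this sketch, since this page does not say what happens if the method is called too late.

```java
// Illustrative model of the "only before any insert" rule for updateInitialCapacity.
final class InitialCapacitySketch {
    private int initialCapacity;
    private int size; // number of keys inserted so far

    InitialCapacitySketch(int initialCapacity) {
        this.initialCapacity = initialCapacity;
    }

    // Accepts a new capacity only while the table is still empty.
    void updateInitialCapacity(int newCapacity) {
        if (size > 0) {
            throw new IllegalStateException("capacity must be set before the first insert");
        }
        if (newCapacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive: " + newCapacity);
        }
        initialCapacity = newCapacity;
    }

    // Hypothetical stand-in for inserting one key.
    void put() {
        size++;
    }

    int initialCapacity() {
        return initialCapacity;
    }

    int size() {
        return size;
    }

    boolean isEmpty() {
        return size == 0;
    }

    public static void main(String[] args) {
        InitialCapacitySketch table = new InitialCapacitySketch(16);
        table.updateInitialCapacity(1024); // allowed: nothing inserted yet
        table.put();
        System.out.println(table.initialCapacity() + " " + table.isEmpty()); // prints "1024 false"
    }
}
```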
    • updateIncoming

      void updateIncoming(VectorContainer newIncoming, RecordBatch newIncomingProbe)
      Changes the incoming probe and build side batches, and then updates all the value vector references in the HashTableTemplate.BatchHolders.
      Parameters:
      newIncoming - The new build side batch.
      newIncomingProbe - The new probe side batch.
    • reset

      void reset()
      Clears all the memory used by the HashTable and re-initializes it.
    • outputKeys

      boolean outputKeys(int batchIdx, VectorContainer outContainer, int numRecords)
      Retrieves the key columns and transfers them to the output container. Note this operation removes the key columns from the HashTable.
      Parameters:
      batchIdx - The index of a HashTableTemplate.BatchHolder in the HashTable.
      outContainer - The destination container for the key columns.
      numRecords - The number of key records to transfer.
      Returns:
    • makeDebugString

      String makeDebugString()
      Returns a message containing memory usage statistics. Intended to be used for printing debugging or error messages.
      Returns:
      A debug string.
    • getActualSize

      long getActualSize()
      Returns the amount of direct memory consumed by the hash table.
      Returns:
      The amount of direct memory consumed by the hash table.
    • setTargetBatchRowCount

      void setTargetBatchRowCount(int batchRowCount)
    • getTargetBatchRowCount

      int getTargetBatchRowCount()
    • nextBatch

      org.apache.commons.lang3.tuple.Pair<VectorContainer,Integer> nextBatch()