Class PriorityQueueCopierWrapper.BatchMerger
- All Implemented Interfaces:
- AutoCloseable, SortImpl.SortResults
- Enclosing class:
- PriorityQueueCopierWrapper
Input. Here the top line is a selection vector of indexes. The second line is a set of batch groups (separated by underscores) with letters indicating individual records:
[3 7 4 8 0 6 1] [5 3 6 8 2 0]
[eh_ad_ibf]     [r_qm_kn_p]
Output, assuming blocks of 5 records. The brackets represent batches; the line as a whole represents the set of batches copied to the spill file.
[abcde] [fhikm] [npqr]
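The selection vector indirection described above can be sketched in a few lines. This is only an illustration, not Drill code: the class and method names are hypothetical, a batch is modeled as a plain string, and each underscore is assumed to occupy an index slot (an interpretation of the notation, not something the text states). Under that reading, the second input group yields its records in sorted order.

```java
public class SelectionVectorSketch {

    /**
     * Returns the records of a batch in the order given by a selection
     * vector. Each entry of sv is an index into the batch; slots that the
     * vector never references (the underscores here) are skipped.
     */
    public static String applySelectionVector(String batch, int[] sv) {
        StringBuilder out = new StringBuilder();
        for (int index : sv) {
            out.append(batch.charAt(index));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Second group from the example: indexes [5 3 6 8 2 0] over "r_qm_kn_p".
        String sorted = applySelectionVector("r_qm_kn_p", new int[] {5, 3, 6, 8, 2, 0});
        System.out.println(sorted); // prints "kmnpqr"
    }
}
```

The underlying batch is never reordered; only the vector of indexes carries the sort order, which is why the copier has to read through it when it writes out dense batches.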
The copying operation does a merge as well: it copies values from the sources in sorted order. Consider a different example in which we want to merge two input batches to produce a single output batch:
Input: [aceg] [bdfh]
Output: [abcdefgh]
In the above, the input consists of two sorted batches. (In reality, the input batches have an associated selection vector, but that is omitted here and only the sorted values are shown.) The output is a single batch with the merged records (indicated by letters) from the two input batches.
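The merge in this example can be sketched with a priority queue of cursors, one per sorted input. This is a minimal illustration under stated assumptions, not the copier's actual implementation: `MergeSketch`, `Cursor`, and `merge` are hypothetical names, and each batch is modeled as a string of single-letter records.

```java
import java.util.List;
import java.util.PriorityQueue;

public class MergeSketch {

    /** Read position within one sorted input batch (hypothetical helper). */
    private static final class Cursor {
        final String batch;
        int pos;

        Cursor(String batch) {
            this.batch = batch;
        }

        char current() {
            return batch.charAt(pos);
        }

        /** Moves to the next record; returns false when the batch is exhausted. */
        boolean advance() {
            return ++pos < batch.length();
        }
    }

    /** Merges already-sorted batches into one sorted output batch. */
    public static String merge(List<String> sortedBatches) {
        // The queue orders cursors by the record each one currently points at.
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>((a, b) -> Character.compare(a.current(), b.current()));
        for (String batch : sortedBatches) {
            if (!batch.isEmpty()) {
                queue.add(new Cursor(batch));
            }
        }
        StringBuilder out = new StringBuilder();
        while (!queue.isEmpty()) {
            Cursor smallest = queue.poll();   // input holding the smallest record
            out.append(smallest.current());
            if (smallest.advance()) {
                queue.add(smallest);          // re-queue under its next record
            }
        }
        return out.toString();
    }
}
```

Calling `MergeSketch.merge(List.of("aceg", "bdfh"))` returns `"abcdefgh"`, matching the example. A cursor is always removed from the queue before it is advanced, so the queue never holds an entry whose sort key has changed underneath it.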
Here we bind the copier to the batchGroupList of sorted, buffered batches to be merged. We bind the copier output to outputContainer: the copier will write its merged "batches" of records to that container.
Calls to the next() method sequentially return merged batches of the desired row count.
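The calling pattern, repeated calls to next() that each yield one merged batch of at most the target row count, might look like the sketch below. Everything in it is an assumption made for illustration: `BatchMergerSketch`, `currentBatch()`, and `batchesProduced()` are hypothetical names, batches are plain strings, and close() merely drops references where the real class would release buffers.

```java
import java.util.List;
import java.util.PriorityQueue;

public class BatchMergerSketch implements AutoCloseable {
    private final List<String> inputs;          // sorted input batches
    private final int targetRecordCount;        // desired rows per output batch
    private final PriorityQueue<int[]> queue;   // cursors: {input index, position}
    private String current = "";
    private int batchCount;

    public BatchMergerSketch(List<String> sortedInputs, int targetRecordCount) {
        this.inputs = sortedInputs;
        this.targetRecordCount = targetRecordCount;
        this.queue = new PriorityQueue<>(
            (a, b) -> Character.compare(valueAt(a), valueAt(b)));
        for (int i = 0; i < inputs.size(); i++) {
            if (!inputs.get(i).isEmpty()) {
                queue.add(new int[] {i, 0});
            }
        }
    }

    private char valueAt(int[] cursor) {
        return inputs.get(cursor[0]).charAt(cursor[1]);
    }

    /** Builds the next merged batch; returns false once all inputs are drained. */
    public boolean next() {
        StringBuilder out = new StringBuilder();
        while (out.length() < targetRecordCount && !queue.isEmpty()) {
            int[] cursor = queue.poll();
            out.append(valueAt(cursor));
            if (++cursor[1] < inputs.get(cursor[0]).length()) {
                queue.add(cursor);
            }
        }
        if (out.length() == 0) {
            return false;
        }
        current = out.toString();
        batchCount++;
        return true;
    }

    public String currentBatch() {
        return current;
    }

    public int batchesProduced() {
        return batchCount;
    }

    @Override
    public void close() {
        queue.clear();   // stand-in for releasing buffers
        current = "";
    }

    public static void main(String[] args) {
        try (BatchMergerSketch merger =
                 new BatchMergerSketch(List.of("aceg", "bdfh"), 3)) {
            while (merger.next()) {
                System.out.println(merger.currentBatch());
            }
        }
    }
}
```

With a target of 3 rows, the two inputs from the earlier example produce the batches `abc`, `def`, and `gh`; only the last batch is short. Because the sketch implements AutoCloseable, a try-with-resources block closes it even if a call to next() throws.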
-
Method Summary
Modifier and Type   Method               Description
void                close()
int                 getBatchCount()
                    getContainer()       Container into which results are delivered.
long                getEstBatchSize()    Gets the estimated batch size, in bytes.
int                 getRecordCount()
                    getSv2()
                    getSv4()
boolean             next()               Read the next merged batch.
void                updateOutputContainer(VectorContainer container, SelectionVector4 sv4, RecordBatch.IterOutcome outcome, BatchSchema schema)
-
Method Details
-
next
public boolean next()
Read the next merged batch. The batch holds the specified row count, but may hold fewer rows if this is the last batch.
- Specified by:
- next in interface SortImpl.SortResults
- Returns:
- true if a merged batch with at least one row was read, or false if no more batches are available
-
close
public void close()
- Specified by:
- close in interface AutoCloseable
- Specified by:
- close in interface SortImpl.SortResults
-
getRecordCount
public int getRecordCount()
- Specified by:
- getRecordCount in interface SortImpl.SortResults
-
getBatchCount
public int getBatchCount()
- Specified by:
- getBatchCount in interface SortImpl.SortResults
-
getEstBatchSize
public long getEstBatchSize()
Gets the estimated batch size, in bytes. Use this value to estimate the memory needed to process the batches that this operator created.
- Returns:
- the size of the largest batch created by this operation, in bytes
-
getSv4
- Specified by:
- getSv4 in interface SortImpl.SortResults
-
updateOutputContainer
public void updateOutputContainer(VectorContainer container, SelectionVector4 sv4, RecordBatch.IterOutcome outcome, BatchSchema schema)
- Specified by:
- updateOutputContainer in interface SortImpl.SortResults
-
getSv2
- Specified by:
- getSv2 in interface SortImpl.SortResults
-
getContainer
Description copied from interface: SortImpl.SortResults
Container into which results are delivered. May be the original operator container, or may be a different one. This is the container that should be sent downstream. This is a fixed value for all returned results.
- Specified by:
- getContainer in interface SortImpl.SortResults
- Returns:
-