Class MergeSortWrapper

All Implemented Interfaces:
SortImpl.SortResults

public class MergeSortWrapper extends BaseSortWrapper implements SortImpl.SortResults
Wrapper around the "MSorter" (in-memory merge sorter). As batches arrive at the sort, each is sorted individually and buffered in memory. If, at the completion of the sort, we detect that no batches were spilled to disk, we can merge the in-memory batches using the efficient memory-based approach implemented here.

Since all batches are in memory, we don't want to use the usual merge algorithm, because it copies the original batches (which, in that case, were read from a spill file) to produce an output batch. Instead, we want to use the in-memory batches as-is. To do this, we use a selection vector 4 (SV4) as a global index into the collection of batches. Each SV4 entry uses its upper two bytes as the batch index and its lower two bytes as the offset of a record within that batch.
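The two-byte/two-byte layout described above can be illustrated with a little bit arithmetic. This is a minimal sketch only: the class Sv4Encoding and its methods are hypothetical names chosen for illustration, not part of this class or of the selection vector API, and a plain int stands in for one SV4 entry.

```java
final class Sv4Encoding {
    // Hypothetical helper: pack a batch index into the upper 16 bits
    // and a record offset into the lower 16 bits of one 32-bit entry.
    static int encode(int batchIndex, int recordOffset) {
        return (batchIndex << 16) | (recordOffset & 0xFFFF);
    }

    // Recover the batch index; the unsigned shift keeps large indexes positive.
    static int batchIndex(int entry) {
        return entry >>> 16;
    }

    // Recover the record offset within the batch.
    static int recordOffset(int entry) {
        return entry & 0xFFFF;
    }
}
```

Under this layout each field is limited to 16 bits, so an entry can address at most 65,536 batches and 65,536 records per batch.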

Conceptually, the merger ("MSorter") populates the SV4 by repeatedly scanning the set of in-memory batches, searching for the one whose next record has the lowest value of the sort key. That record's batch number and offset are placed into the SV4. The process continues until all records from all batches have an entry in the SV4.

The actual implementation uses an iterative merge to perform the above efficiently.
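One way such an iterative merge could work is sketched below, under stated assumptions: each batch is modeled as an already sorted int array, the sort key is the int value itself in ascending order, and the index vector is a plain int array using the upper/lower 16-bit layout. The class IndexMergeSketch and all of its members are hypothetical and are not the generated MSorter code. The sketch fills the index with every record in batch order, then repeatedly merges adjacent sorted runs of index entries until one run remains; the batches themselves are never copied.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexMergeSketch {
    // Hypothetical entry layout: batch index in the upper 16 bits, offset in the lower 16.
    static int encode(int batch, int offset) {
        return (batch << 16) | (offset & 0xFFFF);
    }

    // Look up the sort key that an index entry points to.
    static int key(int[][] batches, int entry) {
        return batches[entry >>> 16][entry & 0xFFFF];
    }

    // Returns an index vector that lists every record in ascending key order.
    static int[] buildIndex(int[][] batches) {
        int total = 0;
        List<Integer> runStarts = new ArrayList<>();
        for (int[] batch : batches) {
            runStarts.add(total);          // each sorted batch starts out as one run
            total += batch.length;
        }
        int[] index = new int[total];
        int pos = 0;
        for (int b = 0; b < batches.length; b++) {
            for (int o = 0; o < batches[b].length; o++) {
                index[pos++] = encode(b, o);
            }
        }

        int[] aux = new int[total];
        while (runStarts.size() > 1) {
            List<Integer> nextStarts = new ArrayList<>();
            // Merge runs pairwise; an unpaired last run is copied through unchanged.
            for (int r = 0; r < runStarts.size(); r += 2) {
                int left = runStarts.get(r);
                int mid = r + 1 < runStarts.size() ? runStarts.get(r + 1) : total;
                int end = r + 2 < runStarts.size() ? runStarts.get(r + 2) : total;
                nextStarts.add(left);
                int i = left, j = mid, out = left;
                while (i < mid && j < end) {
                    // "<=" prefers the left run on ties, which keeps the merge stable.
                    aux[out++] = key(batches, index[i]) <= key(batches, index[j])
                            ? index[i++] : index[j++];
                }
                while (i < mid) aux[out++] = index[i++];
                while (j < end) aux[out++] = index[j++];
            }
            int[] tmp = index;             // swap buffers for the next pass
            index = aux;
            aux = tmp;
            runStarts = nextStarts;
        }
        return index;
    }
}
```

Each pass halves the number of runs, so the number of passes grows with the logarithm of the batch count rather than requiring a scan of every batch for every output record.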

A sort performs only a single merge, so we do not attempt to share the generated class; we generate it internally and discard it when the merge completes.

The merge sorter only makes sense when we have at least one row. The caller must handle the special case of no rows.