public class OutputBatchBuilder extends Object
Handles maps, which can overlap at the map level (two inputs can hold a map column named `m`, say), but the map members must be disjoint. Applies the same rule recursively to nested maps.
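The disjointness rule can be sketched as a recursive merge. This is a minimal illustration under stated assumptions, not Drill's implementation: nested `LinkedHashMap`s are a hypothetical stand-in for `TupleMetadata`, a value that is itself a map represents a map column, and any other value represents a leaf column type. The class and method names are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: nested LinkedHashMaps stand in for Drill's TupleMetadata.
class MapMergeSketch {

  // Merges `source` into `target`. Map columns with the same name are merged
  // recursively; any other name collision violates the disjointness rule.
  @SuppressWarnings("unchecked")
  static void merge(Map<String, Object> target, Map<String, Object> source) {
    for (Map.Entry<String, Object> e : source.entrySet()) {
      Object existing = target.get(e.getKey());
      Object incoming = e.getValue();
      if (existing == null) {
        target.put(e.getKey(), copy(incoming));
      } else if (existing instanceof Map && incoming instanceof Map) {
        merge((Map<String, Object>) existing, (Map<String, Object>) incoming);
      } else {
        throw new IllegalStateException("Duplicate column: " + e.getKey());
      }
    }
  }

  // Deep-copies nested maps so that later merges never mutate an input schema.
  @SuppressWarnings("unchecked")
  static Object copy(Object value) {
    if (!(value instanceof Map)) {
      return value;
    }
    Map<String, Object> result = new LinkedHashMap<>();
    merge(result, (Map<String, Object>) value);
    return result;
  }

  public static void main(String[] args) {
    // Two inputs that both hold a map column `m`, with disjoint members.
    Map<String, Object> mA = new LinkedHashMap<>();
    mA.put("a", "INT");
    Map<String, Object> inputA = new LinkedHashMap<>();
    inputA.put("m", mA);

    Map<String, Object> mB = new LinkedHashMap<>();
    mB.put("b", "VARCHAR");
    Map<String, Object> inputB = new LinkedHashMap<>();
    inputB.put("m", mB);

    Map<String, Object> output = new LinkedHashMap<>();
    merge(output, inputA);
    merge(output, inputB);
    System.out.println(output); // prints {m={a=INT, b=VARCHAR}}
  }
}
```

Merging `inputA` a second time would fail in this sketch, because member `a` of `m` would then appear in two inputs.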
Maps must be built with members in the same order as the corresponding schema. Though maps are usually thought of as unordered name/value pairs, they are actually tuples, with both a name and a defined ordering.
This code uses a name lookup in maps because the semantics of maps do not guarantee a uniform numbering of members from `0` to `n-1`, where `n` is the number of map members. Map members are ordered, but the ordinal used by the map vector is not necessarily sequential.
Once the output container is built, the same value vectors reside in the input and output containers. This works because Drill requires vector persistence: the same vectors must be presented downstream in every batch until a schema change occurs.
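A toy illustration of this sharing, using plain `int[]` arrays as hypothetical stand-ins for value vectors and `List`s as stand-ins for containers. It shows only that the output holds references to the input's vectors rather than copies; it does not use Drill's `VectorContainer` API, and the names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: int[] stands in for a value vector, List for a container.
class SharedVectorSketch {

  // Builds an output "container" holding the same vector objects as the
  // input, arranged in the requested order. No data is copied.
  static List<int[]> buildOutput(List<int[]> input, int[] order) {
    List<int[]> output = new ArrayList<>();
    for (int index : order) {
      output.add(input.get(index));
    }
    return output;
  }

  public static void main(String[] args) {
    List<int[]> input = new ArrayList<>();
    input.add(new int[] {0, 0});
    input.add(new int[] {0, 0});
    List<int[]> output = buildOutput(input, new int[] {1, 0});

    // A later "batch" written into the input vector is visible through the
    // output without rebuilding it, because both hold the same object.
    input.get(1)[0] = 42;
    System.out.println(output.get(0)[0]); // prints 42
  }
}
```

If an input replaced a vector with a new object, the output built here would go stale, which is why the scheme depends on vector persistence.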
Consider two groups of input columns:
[ 1 | 2 | 3 | 4 ] Table columns in table order
[ A | B | C ] Static columns
Now, we wish to project them into select order.
Let's say that the SELECT clause looked like this, with "t"
indicating table columns:
SELECT t2, t3, C, B, t1, A, t2 ...
Then the projection looks like this:
[ 2 | 3 | C | B | 1 | A | 2 ]
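The reordering above can be sketched as a name lookup across the two inputs. This is an illustrative sketch, not Drill's code: the column names (`t1` to `t4`, `A` to `C`), the `resolve` helper, and the use of string lists for schemas are all assumptions made for the example. Each SELECT item is resolved to a `{source, index}` pair, where source 0 is the table input and source 1 is the static input.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: lists of column names stand in for input schemas.
class ProjectionSketch {

  // Returns, for each SELECT item, a {source, index} pair locating the
  // column in one of the inputs. A column may be selected more than once.
  static int[][] resolve(List<String> select, List<List<String>> sources) {
    int[][] mapping = new int[select.size()][];
    for (int i = 0; i < select.size(); i++) {
      mapping[i] = find(select.get(i), sources);
    }
    return mapping;
  }

  // Searches the inputs in order and fails if no input holds the column.
  private static int[] find(String name, List<List<String>> sources) {
    for (int s = 0; s < sources.size(); s++) {
      int index = sources.get(s).indexOf(name);
      if (index >= 0) {
        return new int[] {s, index};
      }
    }
    throw new IllegalArgumentException("Unknown column: " + name);
  }

  public static void main(String[] args) {
    List<String> table = Arrays.asList("t1", "t2", "t3", "t4");
    List<String> statics = Arrays.asList("A", "B", "C");
    List<String> select = Arrays.asList("t2", "t3", "C", "B", "t1", "A", "t2");

    int[][] mapping = resolve(select, Arrays.asList(table, statics));
    // prints [[0, 1], [0, 2], [1, 2], [1, 1], [0, 0], [1, 0], [0, 1]]
    System.out.println(Arrays.deepToString(mapping));
  }
}
```

Indexes here are 0-based, so `t2` resolves to index 1 of the table input.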
Often, not all table columns are projected. In this case, the
result set loader presents the full table schema to the reader,
but actually writes only the projected columns. Suppose we
have:
SELECT t3, C, B, t1, A ...
Then the abbreviated table schema looks like this:
[ 1 | 3 ]
Note that table columns retain their table ordering.
The projection looks like this:
[ 2 | C | B | 1 | A ]
The projector is created once per schema, then can be reused for any number of batches.
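The once-per-schema, many-batches split can be sketched as follows. This is a hedged illustration rather than Drill's design: `ReusableProjectorSketch` is a hypothetical class, `Object[]` arrays stand in for batches, and the per-batch step copies references only to make the reuse visible.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: Object[] arrays of column values stand in for batches.
class ReusableProjectorSketch {

  // Maps each output position to an input position.
  private final int[] sourceIndex;

  // Name resolution happens once, when the schema is known.
  ReusableProjectorSketch(List<String> inputSchema, List<String> outputSchema) {
    sourceIndex = new int[outputSchema.size()];
    for (int i = 0; i < sourceIndex.length; i++) {
      int index = inputSchema.indexOf(outputSchema.get(i));
      if (index < 0) {
        throw new IllegalArgumentException("Unknown column: " + outputSchema.get(i));
      }
      sourceIndex[i] = index;
    }
  }

  // Per-batch work is a plain index lookup, with no name resolution.
  Object[] project(Object[] batch) {
    Object[] out = new Object[sourceIndex.length];
    for (int i = 0; i < out.length; i++) {
      out[i] = batch[sourceIndex[i]];
    }
    return out;
  }

  public static void main(String[] args) {
    ReusableProjectorSketch projector = new ReusableProjectorSketch(
        Arrays.asList("a", "b", "c"), Arrays.asList("c", "a"));

    // The same projector instance serves every batch with this schema.
    System.out.println(Arrays.toString(projector.project(new Object[] {1, 2, 3})));       // prints [3, 1]
    System.out.println(Arrays.toString(projector.project(new Object[] {"x", "y", "z"}))); // prints [z, x]
  }
}
```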
Merging is done in one of two ways, depending on the input source:
Modifier and Type | Class and Description |
---|---|
static class | `OutputBatchBuilder.BatchSource` Describes an input batch with a schema and a vector container. |
static class | `OutputBatchBuilder.MapSource` Source map as a map schema and map vector. |
Constructor and Description |
---|
`OutputBatchBuilder(TupleMetadata outputSchema, List<OutputBatchBuilder.BatchSource> sources, BufferAllocator allocator)` |
Modifier and Type | Method and Description |
---|---|
void | `close()` Release per-reader resources. |
protected void | `defineSourceBatchMapping(TupleMetadata schema, int source)` Define the mapping for one of the sources. |
ValueVector | `getVector(org.apache.drill.exec.physical.impl.scan.v3.lifecycle.OutputBatchBuilder.VectorSource source)` |
void | `load(int rowCount)` |
VectorContainer | `outputContainer()` |
public OutputBatchBuilder(TupleMetadata outputSchema, List<OutputBatchBuilder.BatchSource> sources, BufferAllocator allocator)
protected void defineSourceBatchMapping(TupleMetadata schema, int source)
public ValueVector getVector(org.apache.drill.exec.physical.impl.scan.v3.lifecycle.OutputBatchBuilder.VectorSource source)
public void load(int rowCount)
public VectorContainer outputContainer()
public void close()
Copyright © 1970 The Apache Software Foundation. All rights reserved.