Class ResultVectorCacheImpl
java.lang.Object
org.apache.drill.exec.physical.resultSet.impl.ResultVectorCacheImpl
All Implemented Interfaces: ResultVectorCache
Manages an inventory of value vectors used across row batch readers.
Drill's batch semantics are complex. Each operator logically returns
a batch of records on each call to the next() operation of Drill's
Volcano iterator protocol. However, the "returned" batches are not
separate objects. Instead, Drill enforces the following semantics:
- If a next() call returns OK then the set of vectors in the "returned" batch must be identical to those in the prior batch. Not just the same type; they must be the same ValueVector objects. (The buffers within the vectors will be different.)
- If the set of vectors changes in any way (add a vector, remove a vector, change the type of a vector), then the next() call must return OK_NEW_SCHEMA.
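The two rules above can be sketched as a small check. This is a minimal illustration, not Drill's code: `Outcome` and `BatchProtocolSketch` are hypothetical names, plain `Object`s stand in for ValueVector instances, and comparing vectors position by position is an assumption made to keep the example short.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the iterator outcome; only the two values named above are modeled.
enum Outcome { OK, OK_NEW_SCHEMA }

final class BatchProtocolSketch {
  private BatchProtocolSketch() {}

  // Returns OK only when both batches hold the very same vector objects.
  static Outcome outcomeFor(List<?> priorVectors, List<?> currentVectors) {
    if (priorVectors.size() != currentVectors.size()) {
      return Outcome.OK_NEW_SCHEMA; // a vector was added or removed
    }
    for (int i = 0; i < priorVectors.size(); i++) {
      // Identity comparison (!=), not equals(): a fresh vector of the
      // same type still counts as a schema change.
      if (priorVectors.get(i) != currentVectors.get(i)) {
        return Outcome.OK_NEW_SCHEMA;
      }
    }
    return Outcome.OK;
  }

  public static void main(String[] args) {
    Object a = new Object();
    Object b = new Object();
    System.out.println(outcomeFor(Arrays.asList(a, b), Arrays.asList(a, b)));            // OK
    System.out.println(outcomeFor(Arrays.asList(a, b), Arrays.asList(a, new Object()))); // OK_NEW_SCHEMA
  }
}
```

The point of the identity check is that downstream operators hold references to the vector objects themselves, so replacing an object is visible to them even when its type is unchanged.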
The ResultSetLoader class handles this by managing the set of vectors
used by a single reader.
Readers are independent: each may read a distinct schema (as in JSON). Yet the Drill protocol requires minimizing spurious OK_NEW_SCHEMA events. As a result, two readers run by the same scan operator must share the same set of vectors, even though they may have different schemas and thus different ResultSetLoaders.
The purpose of this inventory is to persist vectors across readers, even when, say, reader B does not use a vector that reader A created.
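As a rough illustration of that idea, the toy inventory below hands the same object to two successive readers that ask for the same column. It is a sketch under stated assumptions, not the class's implementation: `ToyVector`, `ToyVectorInventory`, and the string type labels are hypothetical, and no Drill types are used.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a value vector: just a column name and a type label.
final class ToyVector {
  final String name;
  final String type;

  ToyVector(String name, String type) {
    this.name = name;
    this.type = type;
  }
}

// Toy inventory shared by every reader of one scan (hypothetical, not Drill's code).
final class ToyVectorInventory {
  private final Map<String, ToyVector> vectors = new HashMap<>();

  // Returns the cached vector when name and type both match; otherwise
  // creates a new vector and replaces any stale entry for that name.
  ToyVector vectorFor(String name, String type) {
    ToyVector existing = vectors.get(name);
    if (existing != null && existing.type.equals(type)) {
      return existing;
    }
    ToyVector created = new ToyVector(name, type);
    vectors.put(name, created);
    return created;
  }

  int size() {
    return vectors.size();
  }
}

final class SharedInventoryDemo {
  public static void main(String[] args) {
    ToyVectorInventory inventory = new ToyVectorInventory();

    // Reader A creates two columns.
    ToyVector idFromA = inventory.vectorFor("id", "INT:REQUIRED");
    inventory.vectorFor("name", "VARCHAR:OPTIONAL");

    // Reader B reads only "id"; the "name" vector stays in the inventory.
    ToyVector idFromB = inventory.vectorFor("id", "INT:REQUIRED");

    System.out.println(idFromA == idFromB); // true
    System.out.println(inventory.size());   // 2
  }
}
```

Because reader B receives the object reader A created, the batch it produces can satisfy the identity requirement described earlier.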
The semantics supported by this class include:
- Ability to "pre-declare" columns based on columns that appear in an explicit select list. This ensures that the columns are known (but not their types).
- Ability to reuse a vector across readers if the column retains the same name and type (minor type and mode.)
- Ability to flush unused vectors for readers with changing schemas if a schema change occurs.
- Support schema "hysteresis"; that is, a "sticky" schema that minimizes spurious changes. Once a vector is declared, it can be included in all subsequent batches (provided the column is nullable or an array).
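The last two items in this list can be pictured with a toy model that tracks which columns the current reader has touched. Everything here is hypothetical (`ToyMode`, `StickyCacheSketch`, and their methods are illustrative names, not Drill's API). The rule that only non-required columns are carried forward follows the parenthetical above; the likely rationale, stated here as an assumption, is that a nullable column can be filled with nulls and an array column with empty arrays.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mirror of a column's cardinality.
enum ToyMode { REQUIRED, OPTIONAL, REPEATED }

final class StickyCacheSketch {
  // One cached column: its mode and whether the current reader used it.
  private static final class Slot {
    final ToyMode mode;
    boolean touched = true;

    Slot(ToyMode mode) {
      this.mode = mode;
    }
  }

  private final Map<String, Slot> slots = new LinkedHashMap<>();

  // Record that the current reader uses this column.
  void use(String name, ToyMode mode) {
    Slot slot = slots.get(name);
    if (slot == null || slot.mode != mode) {
      slots.put(name, new Slot(mode)); // new column, or the mode changed
    } else {
      slot.touched = true;
    }
  }

  // Start of a new reader's output: nothing has been touched yet.
  void newBatch() {
    for (Slot slot : slots.values()) {
      slot.touched = false;
    }
  }

  // Untouched columns that could still appear in the batch (the "sticky" part).
  List<String> carriedForward() {
    List<String> names = new ArrayList<>();
    for (Map.Entry<String, Slot> entry : slots.entrySet()) {
      Slot slot = entry.getValue();
      if (!slot.touched && slot.mode != ToyMode.REQUIRED) {
        names.add(entry.getKey());
      }
    }
    return names;
  }

  // Flush everything the current reader did not touch.
  void trimUnused() {
    slots.values().removeIf(slot -> !slot.touched);
  }

  int size() {
    return slots.size();
  }

  public static void main(String[] args) {
    StickyCacheSketch cache = new StickyCacheSketch();
    cache.use("a", ToyMode.REQUIRED);
    cache.use("b", ToyMode.OPTIONAL);
    cache.use("c", ToyMode.REPEATED);

    cache.newBatch();
    cache.use("a", ToyMode.REQUIRED); // the next reader only has "a"

    System.out.println(cache.carriedForward()); // [b, c]
    cache.trimUnused();
    System.out.println(cache.size());           // 1
  }
}
```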
Constructor Summary
ResultVectorCacheImpl(BufferAllocator allocator)
ResultVectorCacheImpl(BufferAllocator allocator, boolean permissiveMode)
Method Summary
childCache(String colName)
void close()
boolean isPermissive()
void newBatch()
void predefine
void trimUnused()
vectorFor(MaterializedField colSchema)
Constructor Details
ResultVectorCacheImpl(BufferAllocator allocator)
ResultVectorCacheImpl(BufferAllocator allocator, boolean permissiveMode)
Method Details

allocator
Specified by: allocator in interface ResultVectorCache

predefine

newBatch
public void newBatch()

trimUnused
public void trimUnused()

vectorFor
Specified by: vectorFor in interface ResultVectorCache

getType
Specified by: getType in interface ResultVectorCache

close
public void close()

isPermissive
public boolean isPermissive()
Specified by: isPermissive in interface ResultVectorCache

childCache
Specified by: childCache in interface ResultVectorCache