org.apache.drill.exec.physical.resultSet.impl.ResultVectorCacheImpl

All Implemented Interfaces: ResultVectorCache

public class ResultVectorCacheImpl extends Object implements ResultVectorCache

Manages an inventory of value vectors used across row batch readers. Drill semantics for batches are complex. Each operator logically returns a batch of records on each call of the next() operation of the Drill Volcano iterator protocol. However, the batches "returned" are not separate objects. Instead, Drill enforces the following semantics:

- If a next() call returns OK then the set of vectors in the "returned" batch must be identical to those in the prior batch. Not just the same type; they must be the same ValueVector objects. (The buffers within the vectors will be different.)
- If the set of vectors changes in any way (add a vector, remove a vector, change the type of a vector), then the next() call must return OK_NEW_SCHEMA.
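
The two rules above can be illustrated with a minimal sketch. It is a toy model, not Drill code: `BatchProtocolSketch`, `ToyVector`, and `outcomeFor` are hypothetical names invented for this example, and treating the first batch (no prior batch) as a schema change is an assumption of the sketch. The point it shows is that the comparison is on object identity, so an equal-looking copy of a vector counts as a schema change.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Toy model of the batch rules; hypothetical, not part of Drill. */
class BatchProtocolSketch {
  enum Outcome { OK, OK_NEW_SCHEMA }

  /** Hypothetical stand-in for a ValueVector: identity matters, not contents. */
  static final class ToyVector {
    final String name;
    final String type;
    ToyVector(String name, String type) {
      this.name = name;
      this.type = type;
    }
  }

  /** Collects the vectors into a set that compares members by reference. */
  private static Set<ToyVector> identitySet(Collection<ToyVector> vectors) {
    Set<ToyVector> set = Collections.newSetFromMap(new IdentityHashMap<>());
    set.addAll(vectors);
    return set;
  }

  /**
   * Returns OK only when the current batch holds exactly the same vector
   * objects as the prior batch. A null prior batch (assumption: the first
   * batch) is reported as a schema change.
   */
  static Outcome outcomeFor(Collection<ToyVector> prior, Collection<ToyVector> current) {
    if (prior == null) {
      return Outcome.OK_NEW_SCHEMA;
    }
    Set<ToyVector> priorSet = identitySet(prior);
    Set<ToyVector> currentSet = identitySet(current);
    boolean same = priorSet.size() == currentSet.size()
        && priorSet.containsAll(currentSet);
    return same ? Outcome.OK : Outcome.OK_NEW_SCHEMA;
  }
}
```

Under this model, passing the same two `ToyVector` objects twice yields `OK`, while replacing one of them with a new object of the same name and type yields `OK_NEW_SCHEMA`.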

These rules create interesting constraints for the scan operator. Conceptually, each batch is distinct, yet all batches must share the same vectors. The ResultSetLoader class handles this by managing the set of vectors used by a single reader.

Readers are independent: each may read a distinct schema (as in JSON). Yet the Drill protocol requires minimizing spurious OK_NEW_SCHEMA events. As a result, two readers run by the same scan operator must share the same set of vectors, even though they may have different schemas and thus different ResultSetLoaders.

The purpose of this inventory is to persist vectors across readers, even when, say, reader B does not use a vector that reader A created.

The semantics supported by this class include:

- Ability to "pre-declare" columns based on columns that appear in an explicit select list. This ensures that the columns are known (but not their types).
- Ability to reuse a vector across readers if the column retains the same name and type (minor type and mode).
- Ability to flush unused vectors for readers with changing schemas if a schema change occurs.
- Support for schema "hysteresis"; that is, a "sticky" schema that minimizes spurious changes. Once a vector is declared, it can be included in all subsequent batches (provided the column is nullable or an array).
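
The semantics listed above can be sketched with a small, self-contained model. This is not the Drill implementation: `VectorCacheSketch` and `ToyVector` are hypothetical, the type is reduced to a plain string standing in for minor type plus mode, and the method bodies are guesses. Only the method names (`predefine`, `newBatch`, `vectorFor`, `trimUnused`) are borrowed from the summary on this page.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a vector inventory shared across readers; not Drill code. */
class VectorCacheSketch {
  /** Hypothetical stand-in for a ValueVector. */
  static final class ToyVector {
    final String name;
    final String type; // stands in for minor type plus mode, e.g. "INT:OPTIONAL"
    ToyVector(String name, String type) {
      this.name = name;
      this.type = type;
    }
  }

  private final Set<String> declared = new LinkedHashSet<>();          // known column names
  private final Map<String, ToyVector> vectors = new LinkedHashMap<>(); // name -> cached vector
  private final Set<String> usedSinceNewBatch = new HashSet<>();

  /** Pre-declares select-list columns: the names become known, the types do not. */
  void predefine(List<String> selected) {
    declared.addAll(selected);
  }

  /** Starts tracking which vectors get requested from this point on. */
  void newBatch() {
    usedSinceNewBatch.clear();
  }

  /** Reuses the cached vector when name and type match; otherwise replaces it. */
  ToyVector vectorFor(String name, String type) {
    declared.add(name);
    usedSinceNewBatch.add(name);
    ToyVector cached = vectors.get(name);
    if (cached != null && cached.type.equals(type)) {
      return cached; // same name and type: hand back the same object
    }
    ToyVector fresh = new ToyVector(name, type);
    vectors.put(name, fresh); // a type change discards the stale vector
    return fresh;
  }

  /** Drops every cached vector not requested since the last newBatch(). */
  void trimUnused() {
    vectors.keySet().retainAll(usedSinceNewBatch);
  }

  boolean isDeclared(String name) {
    return declared.contains(name);
  }

  /** Returns the cached type, or null when no vector exists for the name. */
  String typeOf(String name) {
    ToyVector cached = vectors.get(name);
    return cached == null ? null : cached.type;
  }
}
```

In this model a second reader that asks for column `a` with an unchanged type receives the very object the first reader created, which is what keeps the downstream vector set stable. Asking for `a` with a different type produces a new object, which is the case that has to surface as a schema change.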

Constructor Summary

- ResultVectorCacheImpl(BufferAllocator allocator)
- ResultVectorCacheImpl(BufferAllocator allocator, boolean permissiveMode)

Method Summary

- BufferAllocator allocator()
- ResultVectorCache childCache(String colName)
- void close()
- TypeProtos.MajorType getType(String name)
- boolean isPermissive()
- void newBatch()
- void predefine(List<String> selected)
- void trimUnused()
- ValueVector vectorFor(MaterializedField colSchema)

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- ResultVectorCacheImpl
  
  public ResultVectorCacheImpl(BufferAllocator allocator)
- ResultVectorCacheImpl
  
  public ResultVectorCacheImpl(BufferAllocator allocator, boolean permissiveMode)

Method Details
- allocator
  
  public BufferAllocator allocator()
  
  Specified by:
  
  allocator in interface ResultVectorCache
- predefine
  
  public void predefine(List<String> selected)
- newBatch
  
  public void newBatch()
- trimUnused
  
  public void trimUnused()
- vectorFor
  
  public ValueVector vectorFor(MaterializedField colSchema)
  
  Specified by:
  
  vectorFor in interface ResultVectorCache
- getType
  
  public TypeProtos.MajorType getType(String name)
  
  Specified by:
  
  getType in interface ResultVectorCache
- close
  
  public void close()
- isPermissive
  
  public boolean isPermissive()
  
  Specified by:
  
  isPermissive in interface ResultVectorCache
- childCache
  
  public ResultVectorCache childCache(String colName)
  
  Specified by:
  
  childCache in interface ResultVectorCache