Package org.apache.drill.exec.vector.accessor.reader
Provides the reader hierarchy as explained in the API package.
Structure
The reader implementation divides into four parts:
- The readers themselves, which start with scalar readers to decode data from vectors, then build up to nullable, array, union and list readers. Readers are built up via composition, often using the (internal) offset vector reader.
- The column index abstraction that steps through items in a collection. At the top level, the index points to the current row. The top level may include an indirection (an SV2 or SV4) which is handled by the column index. Within arrays, the column index points to each element of the array.
- The vector accessor, which provides a unified interface for both the single-batch and hyper-batch cases. The single-batch versions simply hold onto the vector itself. The hyper-batch versions either provide access to a specific vector within a hyper-vector (for top-level vectors), or navigate from a top-level vector down to an inner vector (for nested vectors).
- The null state abstraction, which provides a uniform way to detect nullability. For example, within the reader system, the readers for nullable and required vectors differ only in the associated null state reader. Unions and lists have complex null state logic: the nullability of a value depends on the nullability of the list, the union, and the value itself. The null state class implements this logic independently of the reader structure.
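The way these parts combine can be illustrated with a minimal sketch. Every type name below (`ColumnIndexSketch`, `NullStateSketch`, `IntReaderSketch`, and so on) is hypothetical and invented for illustration; none is a class in this package, and plain Java arrays stand in for vectors and for the SV2 indirection. The sketch assumes a single batch and omits the vector accessor layer.

```java
// Hypothetical sketch only: these are not Drill classes.

/** Steps through positions and reports the vector offset of the current one. */
interface ColumnIndexSketch {
    boolean next();
    int offset();
}

/** Top-level index without indirection: the position is the offset. */
final class DirectIndexSketch implements ColumnIndexSketch {
    private final int rowCount;
    private int position = -1;

    DirectIndexSketch(int rowCount) { this.rowCount = rowCount; }

    @Override public boolean next() {
        if (position + 1 >= rowCount) { return false; }
        position++;
        return true;
    }

    @Override public int offset() { return position; }
}

/** Top-level index with an indirection (an int[] stands in for an SV2). */
final class IndirectIndexSketch implements ColumnIndexSketch {
    private final int[] selection;
    private int position = -1;

    IndirectIndexSketch(int[] selection) { this.selection = selection; }

    @Override public boolean next() {
        if (position + 1 >= selection.length) { return false; }
        position++;
        return true;
    }

    @Override public int offset() { return selection[position]; }
}

/** Uniform way to ask whether the value at an offset is null. */
interface NullStateSketch {
    boolean isNull(int offset);
}

/** Null state for a required column: the value is never null. */
final class RequiredStateSketch implements NullStateSketch {
    @Override public boolean isNull(int offset) { return false; }
}

/** Null state for a nullable column, backed by an "is set" flag per value. */
final class BitsStateSketch implements NullStateSketch {
    private final boolean[] isSet;

    BitsStateSketch(boolean[] isSet) { this.isSet = isSet; }

    @Override public boolean isNull(int offset) { return !isSet[offset]; }
}

/** One reader class serves required and nullable data; only the null state differs. */
final class IntReaderSketch {
    private final int[] data;
    private final NullStateSketch nullState;
    private ColumnIndexSketch index;

    IntReaderSketch(int[] data, NullStateSketch nullState) {
        this.data = data;
        this.nullState = nullState;
    }

    void bindIndex(ColumnIndexSketch index) { this.index = index; }

    boolean isNull() { return nullState.isNull(index.offset()); }

    int getInt() { return data[index.offset()]; }
}
```

In this sketch the reader never learns whether an indirection is present or whether the column is nullable; both concerns are delegated to the composed index and null state objects.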
Composition
The result is that reader structure makes heavy use of composition: readers are built up from each of the above components. The number of actual reader classes is small, but the methods to build the readers are complex. Most structure is built at build time. Indexes, however, are provided at a later "bind" time, at which a bind call traverses the reader tree to associate an index with each reader and vector accessor. When a reader is for an array, the bind step creates the index for the array elements.
Construction
Construction of readers is a multi-part process:
- Start with a single or hyper-vector batch.
- The reader builders in another package parse the batch structure, create the required metadata, wrap the (single or hyper) vectors in a vector accessor, and call methods in this package.
- Methods here perform the final construction based on the specific type of the reader.
The work divides into two main categories:
- The work which is based on the vector structure and single/hyper-vector structure, which is done elsewhere.
- The work which is based on the structure of the readers (with vector cardinality factored out), which is done here.
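The split between build time and bind time can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: all type and method names (`BindableSketch`, `ArrayNodeSketch`, `reposition`, and so on) are hypothetical, plain `int[]` arrays stand in for data and offset vectors, and the vector accessor layer is left out.

```java
import java.util.List;

// Hypothetical sketch only: these are not Drill classes.

/** Something that can report the current vector offset. */
interface PositionSketch {
    int offset();
}

/** Top-level index: points at the current row. */
final class RowPositionSketch implements PositionSketch {
    private int row;

    void set(int row) { this.row = row; }

    @Override public int offset() { return row; }
}

/** Index over the elements of one array value. */
final class ElementPositionSketch implements PositionSketch {
    private int start;
    private int length;
    private int position = -1;

    void reset(int start, int length) {
        this.start = start;
        this.length = length;
        this.position = -1;
    }

    boolean next() {
        if (position + 1 >= length) { return false; }
        position++;
        return true;
    }

    @Override public int offset() { return start + position; }
}

/** Every reader in the tree accepts an index at bind time. */
interface BindableSketch {
    void bindIndex(PositionSketch index);
}

/** Scalar reader: structure (the data) is fixed at build time, the index arrives later. */
final class ScalarNodeSketch implements BindableSketch {
    private final int[] values;
    private PositionSketch index;

    ScalarNodeSketch(int[] values) { this.values = values; }

    @Override public void bindIndex(PositionSketch index) { this.index = index; }

    int getInt() { return values[index.offset()]; }
}

/** Array reader: on bind, it creates the element index and hands it to its element reader. */
final class ArrayNodeSketch implements BindableSketch {
    private final int[] offsets; // offsets[i] .. offsets[i + 1] bound the elements of row i
    private final ScalarNodeSketch elements;
    private PositionSketch rowIndex;
    private ElementPositionSketch elementIndex;

    ArrayNodeSketch(int[] offsets, ScalarNodeSketch elements) {
        this.offsets = offsets;
        this.elements = elements;
    }

    @Override public void bindIndex(PositionSketch index) {
        rowIndex = index;
        elementIndex = new ElementPositionSketch();
        elements.bindIndex(elementIndex);
    }

    /** Points the element index at the slice that belongs to the current row. */
    void reposition() {
        int row = rowIndex.offset();
        elementIndex.reset(offsets[row], offsets[row + 1] - offsets[row]);
    }

    boolean next() { return elementIndex.next(); }

    ScalarNodeSketch entry() { return elements; }
}

/** Tuple reader: the bind call is passed down to every column. */
final class TupleNodeSketch implements BindableSketch {
    private final List<BindableSketch> columns;

    TupleNodeSketch(List<BindableSketch> columns) { this.columns = columns; }

    @Override public void bindIndex(PositionSketch index) {
        for (BindableSketch column : columns) {
            column.bindIndex(index);
        }
    }
}
```

A caller would first build the tree (a tuple holding a scalar and an array), then call `bindIndex` once on the root with the row index. The design choice illustrated here is that the same reader tree could be rebound to a different index without being rebuilt.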
Class descriptions
- Reader for a tuple (a row or a map). Provides access to each column using either a name or a numeric index.
- Reader for an array-valued column.
- Object representation of an array reader.
- Index into the vector of elements for a repeated vector.
- Column reader implementation that acts as the basis for the generated, vector-specific implementations.
- Provide access to the DrillBuf for the data vector.
- Specialized reader for bit columns.
- Gather generated reader classes into a set of class tables to allow rapid run-time creation of readers.
- Reader for a Dict entry.
- Reader for a Drill Map type.
- Internal mechanism to detect if a value is null.
- Handle the awkward situation with complex types.
- Holder for the NullableVector wrapper around a bits vector and a data vector.
- Null state that handles the strange union semantics that both the union and the values can be null.
- Holder for the NullableVector wrapper around a bits vector and a data vector.
- Dummy implementation of a null state reader for cases in which the value is never null.
- Extract null state from the union vector's type vector.
- Reader for an offset vector.
- Internal operations to wire up a set of readers.
- Reader for a union vector.
- Collection of vector accessors.
- Vector accessor for RepeatedVector → data vector
- Vector accessor for RepeatedVector → offsets vector
- Vector accessor used by the column accessors to obtain the vector for each column value.
- Vector accessor for ListVector → bits vector
- Vector accessor for AbstractMapVector → member vector
- Vector accessor for NullableVector → bits vector
- Vector accessor for NullableVector → values vector
- Vector accessor for UnionVector → data vector
- Vector accessor for UnionVector → type vector
- Vector accessor for VariableWidthVector → offsets vector