Creates and populates null columns for the case in which a SELECT statement
refers to columns that do not exist in the actual table. Nullable and array
types are suitable for null columns. (Drill treats an empty array as
equivalent to a null array: not strictly true, but the best available at
present.) Required types cannot be used because there is no meaningful value
to store in such a column.
Seeks to preserve "vector continuity" by reusing vectors when possible.
Cases:
- A column was available in a prior reader (or batch), but is no longer
available, and is thus null. Reuses the type and vector of the prior reader
(or batch) to prevent trivial schema changes.
- A column has an implied type (specified in the metadata about the
column provided by the reader). That type information is used instead of
the defined null column type.
- A column has no type information. The type becomes the null column type
defined by the reader (or nullable int by default).
- Required columns are not suitable. If any of the above cases yields a
required type, the type is converted to nullable.
- The resulting column and type, whatever it turned out to be, is placed
into the vector cache so that it can be reused by the next reader or batch,
to again preserve vector continuity.
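
The cases above can be read as a simple precedence order. The following is a
minimal sketch of that order, not Drill's actual implementation: the class
NullColumnSketch, the ColType and Mode types, and the methods
recordPriorColumn and resolveNullColumn are all hypothetical names invented
for illustration, and a plain map stands in for the vector cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; none of these names are Drill's actual API. */
public class NullColumnSketch {

  enum Mode { REQUIRED, OPTIONAL, REPEATED }

  /** Hypothetical stand-in for a column type: a type name plus a mode. */
  static final class ColType {
    final String name;
    final Mode mode;

    ColType(String name, Mode mode) {
      this.name = name;
      this.mode = mode;
    }

    /** Required becomes nullable; nullable and array types pass through. */
    ColType asNullable() {
      return mode == Mode.REQUIRED ? new ColType(name, Mode.OPTIONAL) : this;
    }
  }

  // Hypothetical stand-in for the vector cache, keyed by column name.
  private final Map<String, ColType> cache = new HashMap<>();

  // Null column type defined by the reader; may be null if none was given.
  private final ColType readerNullType;

  NullColumnSketch(ColType readerNullType) {
    this.readerNullType = readerNullType;
  }

  /** Records a column that a prior reader (or batch) actually provided. */
  void recordPriorColumn(String column, ColType type) {
    cache.put(column, type);
  }

  /** Picks the type for a null column; impliedType may be null. */
  ColType resolveNullColumn(String column, ColType impliedType) {
    // Case 1: reuse the type seen in a prior reader or batch.
    ColType type = cache.get(column);
    // Case 2: otherwise use the implied type from the column metadata.
    if (type == null) {
      type = impliedType;
    }
    // Case 3: otherwise use the reader's null type, or nullable int.
    if (type == null) {
      type = readerNullType;
    }
    if (type == null) {
      type = new ColType("INT", Mode.OPTIONAL);
    }
    // Case 4: a required type cannot hold nulls, so make it nullable.
    type = type.asNullable();
    // Case 5: cache the result so the next reader or batch can reuse it.
    cache.put(column, type);
    return type;
  }
}
```

In this sketch, a column recorded as required by a prior reader comes back as
nullable once it goes missing, so the type is reused but its mode changes.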
The above rules eliminate "trivial" schema changes, but can still result in
"hard" schema changes if a required type is replaced by a nullable type.