Interface RowSet
- All Known Subinterfaces: RowSet.ExtendableRowSet, RowSet.HyperRowSet, RowSet.SingleRowSet
- All Known Implementing Classes: AbstractRowSet, AbstractSingleRowSet, DirectRowSet, HyperRowSetImpl, IndirectRowSet
A row set encapsulates a set of vectors and provides access to Drill's
various "views" of vectors: VectorContainer, VectorAccessible, etc.
The row set wraps a TupleModel
which holds the vectors and column metadata. This form is optimized
for easy use in testing; use other implementations for production code.
A row set is defined by a TupleMetadata. For testing purposes, a row
set has a fixed schema; we don't allow changing the set of vectors
dynamically.
The row set also provides a simple way to write and read records using the
RowSetWriter and RowSetReader interfaces. As per Drill
conventions, a row set can be written (once), read many times, and finally
cleared.
Drill provides a large number of vector (data) types. Each requires a
type-specific way to set data. The row set writer uses a ColumnWriter
to set each value in a way unique to the specific data type. Similarly, the
row set reader provides a ScalarReader
interface. In both cases, columns can be accessed by index number
(as defined in the schema) or by name.
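The idea of reaching the same column by schema position or by name can be sketched in plain Java. This is a minimal illustration only: `ColumnLookupSketch`, `TupleSchema`, and `TupleReader` are hypothetical stand-ins invented for this example, not Drill classes, and the sketch stores plain objects rather than value vectors.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT Drill's API: one tuple whose columns can be
// addressed either by schema position or by column name.
final class ColumnLookupSketch {

    /** Ordered column names; the position in the list is the column index. */
    static final class TupleSchema {
        private final List<String> names = new ArrayList<>();
        private final Map<String, Integer> indexByName = new HashMap<>();

        TupleSchema add(String name) {
            if (indexByName.containsKey(name)) {
                throw new IllegalArgumentException("Duplicate column: " + name);
            }
            indexByName.put(name, names.size());
            names.add(name);
            return this;
        }

        int index(String name) {
            Integer index = indexByName.get(name);
            if (index == null) {
                throw new IllegalArgumentException("Unknown column: " + name);
            }
            return index;
        }

        int size() {
            return names.size();
        }
    }

    /** One row of values laid out in schema order. */
    static final class TupleReader {
        private final TupleSchema schema;
        private final Object[] values;

        TupleReader(TupleSchema schema, Object... values) {
            if (values.length != schema.size()) {
                throw new IllegalArgumentException("Value count does not match schema");
            }
            this.schema = schema;
            this.values = values.clone();
        }

        /** Access by index, as defined by the schema order. */
        Object column(int index) {
            return values[index];
        }

        /** Access by name; resolves the name to an index first. */
        Object column(String name) {
            return values[schema.index(name)];
        }
    }

    public static void main(String[] args) {
        TupleSchema schema = new TupleSchema().add("id").add("name");
        TupleReader row = new TupleReader(schema, 10, "fred");
        System.out.println(row.column(1));      // access by index
        System.out.println(row.column("name")); // access by name, same value
    }
}
```

Resolving a name to an index once and then reading by index is the usual reason to offer both forms: name lookup is convenient in tests, while index lookup avoids a map probe per value.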
A row set follows a schema. The schema starts as a BatchSchema,
but is parsed and restructured into a variety of
forms. In the original form, maps contain their value vectors. In the
flattened form, all vectors for all maps (and the top-level tuple) are
collected into a single structure. Since this structure is for testing,
this somewhat-static structure works just fine; we don't need the added
complexity that comes from building the schema and data dynamically.
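The difference between the original (nested) form and the flattened form can be sketched as follows. This is an illustration under stated assumptions: `SchemaFlattenSketch` and `Column` are hypothetical types made up for this example, and the dotted path names (`m.b`) are an assumption of the sketch, not something the row set is documented to produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in, NOT Drill's API: a nested schema in which maps
// contain their child columns, plus a routine that flattens it.
final class SchemaFlattenSketch {

    /** A column is either a scalar leaf or a map holding child columns. */
    static final class Column {
        final String name;
        final boolean isMap;
        final List<Column> children;

        private Column(String name, boolean isMap, List<Column> children) {
            this.name = name;
            this.isMap = isMap;
            this.children = children;
        }

        static Column scalar(String name) {
            return new Column(name, false, Collections.<Column>emptyList());
        }

        static Column map(String name, Column... children) {
            return new Column(name, true, Arrays.asList(children.clone()));
        }
    }

    /** Collects every leaf column of the tuple and of all nested maps into one list. */
    static List<String> flatten(List<Column> tuple) {
        List<String> out = new ArrayList<>();
        collect("", tuple, out);
        return out;
    }

    private static void collect(String prefix, List<Column> tuple, List<String> out) {
        for (Column col : tuple) {
            // Dotted paths are an assumption of this sketch.
            String path = prefix.isEmpty() ? col.name : prefix + "." + col.name;
            if (col.isMap) {
                collect(path, col.children, out); // descend into the map
            } else {
                out.add(path);                    // leaf: one entry per vector
            }
        }
    }

    public static void main(String[] args) {
        List<Column> schema = Arrays.asList(
            Column.scalar("a"),
            Column.map("m", Column.scalar("b"), Column.map("n", Column.scalar("c"))),
            Column.scalar("d"));
        System.out.println(flatten(schema)); // prints [a, m.b, m.n.c, d]
    }
}
```

Because the schema is fixed up front, the flattened list can be computed once and reused, which is the simplification the static structure buys.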
Putting this all together, the typical life-cycle flow is:
- Define the schema using SchemaBuilder.
- Create the row set from the schema.
- Populate the row set using a writer from RowSet.ExtendableRowSet.writer(int).
- Process the vector container using the code under test.
- Retrieve the results using a reader from reader().
- Dispose of vector memory with clear().
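The write-once, read-many, then clear life cycle described above can be sketched with a small self-contained model. Everything here is hypothetical: `RowSetLifecycleSketch`, `MiniRowSet`, and its `Writer` and `Reader` only mimic the shape of the flow and are not Drill's `RowSet`, `RowSetWriter`, or `RowSetReader`. In particular, the sketch's `writer()` takes no arguments and rows are held on the Java heap rather than in value vectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, NOT Drill's API: models only the life cycle
// (define schema, create, write once, read many times, clear).
final class RowSetLifecycleSketch {

    static final class MiniRowSet {
        private final List<String> columns;
        private final List<Object[]> rows = new ArrayList<>();
        private boolean written;
        private boolean cleared;

        /** Steps 1 and 2: the schema is fixed when the row set is created. */
        MiniRowSet(String... columns) {
            this.columns = Arrays.asList(columns.clone());
        }

        /** Step 3: a writer may be obtained only once. */
        Writer writer() {
            if (written || cleared) {
                throw new IllegalStateException("Row set can be written only once");
            }
            written = true;
            return new Writer();
        }

        /** Step 5: readers may be created any number of times until clear(). */
        Reader reader() {
            if (cleared) {
                throw new IllegalStateException("Row set has been cleared");
            }
            return new Reader();
        }

        int rowCount() {
            return rows.size();
        }

        /** Step 6: release the data. */
        void clear() {
            rows.clear();
            cleared = true;
        }

        final class Writer {
            Writer addRow(Object... values) {
                if (values.length != columns.size()) {
                    throw new IllegalArgumentException("Value count does not match schema");
                }
                rows.add(values.clone());
                return this;
            }
        }

        final class Reader {
            private int position = -1;

            /** Advances to the next row; returns false once past the last row. */
            boolean next() {
                return ++position < rows.size();
            }

            Object column(int index) {
                return rows.get(position)[index];
            }
        }
    }

    /** Step 4: stand-in for "the code under test"; sums the first column. */
    static int sumFirstColumn(MiniRowSet rowSet) {
        int total = 0;
        MiniRowSet.Reader reader = rowSet.reader();
        while (reader.next()) {
            total += (Integer) reader.column(0);
        }
        return total;
    }

    public static void main(String[] args) {
        MiniRowSet rowSet = new MiniRowSet("id", "name");
        rowSet.writer().addRow(1, "a").addRow(2, "b");
        System.out.println(sumFirstColumn(rowSet)); // first read
        System.out.println(sumFirstColumn(rowSet)); // reading again is allowed
        rowSet.clear();
    }
}
```

In a real test the final `clear()` matters more than it does in this model, since the source describes it as disposing of vector memory; placing it in a `finally` block or a test teardown is one way to make sure it always runs.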
Nested Class Summary
| Modifier and Type | Interface | Description |
| --- | --- | --- |
| static interface | RowSet.ExtendableRowSet | Single row set which is empty and allows writing. |
| static interface | RowSet.HyperRowSet | Row set comprised of multiple single row sets, along with an indirection vector (SV4). |
| static interface | | |
| static interface | RowSet.SingleRowSet | Row set that manages a single batch of rows. |
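Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| BufferAllocator | allocator() | |
| BatchSchema | batchSchema() | |
| void | clear() | |
| VectorContainer | container() | |
| BatchSchema.SelectionVectorMode | indirectionType() | |
| boolean | isExtendable() | |
| boolean | isWritable() | |
| void | print() | Debug-only tool to visualize a row set for inspection. |
| RowSetReader | reader() | |
| int | rowCount() | |
| TupleMetadata | schema() | |
| long | size() | Return the size in memory of this record set, including indirection vectors, null vectors, offset vectors and the entire (used and unused) data vectors. |
| VectorAccessible | vectorAccessible() | |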
Method Details
- isExtendable
boolean isExtendable()
- isWritable
boolean isWritable()
- vectorAccessible
VectorAccessible vectorAccessible()
- container
VectorContainer container()
- rowCount
int rowCount()
- reader
RowSetReader reader()
- clear
void clear()
- schema
TupleMetadata schema()
- allocator
BufferAllocator allocator()
- indirectionType
BatchSchema.SelectionVectorMode indirectionType()
- print
void print()
Debug-only tool to visualize a row set for inspection. Do not use this in production code.
- size
long size()
Return the size in memory of this record set, including indirection vectors, null vectors, offset vectors and the entire (used and unused) data vectors.
Returns: memory size in bytes
- batchSchema
BatchSchema batchSchema()