public class ScanLifecycle extends Object
The options provided in the ScanLifecycleBuilder are sufficient to drive the entire scan operator. Schema resolution and projection are done generically and are the same for all data sources. Only the reader (created via the factory class) differs from one type of file to another.
The framework achieves the work described below by composing a set of detailed classes, each of which performs some specific task. This structure leaves the reader to simply infer schema and read data.
A reader may be "late schema", that is, true "schema on read." In this case, the reader simply tells the result set loader to create a new column writer on the fly. The framework works out whether that new column is projected and returns either a real column writer (projected column) or a dummy column writer (unprojected column).
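The late-schema behavior described above can be sketched in isolation. The following is a minimal illustration only, not Drill's implementation: `ColumnWriter`, `RealWriter`, `DummyWriter`, and `Loader` are hypothetical stand-ins invented for this sketch, and the projection is assumed to be a plain set of column names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class LateSchemaSketch {

  /** Hypothetical stand-in for a column writer; not Drill's actual interface. */
  interface ColumnWriter {
    void setString(String value);
    boolean isProjected();
  }

  /** Real writer: keeps every value it is given. */
  static final class RealWriter implements ColumnWriter {
    final List<String> values = new ArrayList<>();
    @Override public void setString(String value) { values.add(value); }
    @Override public boolean isProjected() { return true; }
  }

  /** Dummy writer: accepts values and silently discards them. */
  static final class DummyWriter implements ColumnWriter {
    @Override public void setString(String value) { }
    @Override public boolean isProjected() { return false; }
  }

  /** Hypothetical stand-in for the result set loader. */
  static final class Loader {
    private final Set<String> projection;

    Loader(Set<String> projection) { this.projection = projection; }

    /** The reader discovers a column mid-read and asks for a writer. */
    ColumnWriter addColumn(String name) {
      return projection.contains(name) ? new RealWriter() : new DummyWriter();
    }
  }

  public static void main(String[] args) {
    Loader loader = new Loader(Set.of("name"));
    ColumnWriter name = loader.addColumn("name");    // projected
    ColumnWriter extra = loader.addColumn("extra");  // not projected
    // The reader writes to both without checking projection itself.
    name.setString("fred");
    extra.setString("ignored");
    System.out.println(name.isProjected());
    System.out.println(extra.isProjected());
  }
}
```

The point of the design is that the reader code is identical for projected and unprojected columns; the decision is made once, when the writer is created.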
The ScanSchemaTracker resolves the scan schema over the lifetime of the scan. See ScanSchemaTracker for details about how the scan schema evolves over the scan lifecycle.
Implicit columns are unique to each storage plugin. At present, they are defined only for the file system plugin. To handle such variation, each extension defines a subclass of the ScanLifecycleBuilder class to create the implicit columns manager (and schema negotiator) unique to a certain kind of scan.
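The extension pattern described above, in which a builder subclass supplies the implicit columns manager for its kind of scan, might look roughly like the sketch below. All types here (`ImplicitColumnsManager`, `BaseScanBuilder`, `FileScanBuilder`) are hypothetical stand-ins, not the Drill classes, and the column names are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BuilderExtensionSketch {

  /** Hypothetical: reports which implicit columns a scan can provide. */
  interface ImplicitColumnsManager {
    List<String> implicitColumnNames();
  }

  /** Hypothetical base builder: a generic scan defines no implicit columns. */
  static class BaseScanBuilder {
    ImplicitColumnsManager implicitColumnsManager() {
      return () -> Collections.emptyList();
    }
  }

  /** Hypothetical file-scan extension: overrides the factory method. */
  static class FileScanBuilder extends BaseScanBuilder {
    @Override
    ImplicitColumnsManager implicitColumnsManager() {
      // Illustrative file-related column names, assumed for this sketch.
      return () -> Arrays.asList("filename", "filepath", "suffix");
    }
  }

  public static void main(String[] args) {
    BaseScanBuilder generic = new BaseScanBuilder();
    BaseScanBuilder fileScan = new FileScanBuilder();
    // The scan framework only sees the base type; the subclass decides
    // which implicit columns exist.
    System.out.println(generic.implicitColumnsManager().implicitColumnNames());
    System.out.println(fileScan.implicitColumnsManager().implicitColumnNames());
  }
}
```

Keeping the variation inside the builder subclass lets the generic scan code stay unaware of which storage plugin it is serving.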
Each reader is tracked by a ReaderLifecycle, which handles the ResultSetLoader for the reader.

Constructor and Description |
---|
ScanLifecycle(OperatorContext context, ScanLifecycleBuilder builder) |
Modifier and Type | Method and Description |
---|---|
BufferAllocator | allocator() |
int | batchCount() |
void | close() |
OperatorContext | context() |
CustomErrorContext | errorContext() |
boolean | hasOutputSchema() |
protected SchemaNegotiatorImpl | newNegotiator(ReaderLifecycle readerLifecycle) |
RowBatchReader | nextReader() |
ScanLifecycleBuilder | options() |
TupleMetadata | outputSchema() |
ReaderFactory<?> | readerFactory() |
long | rowCount() |
ScanSchemaTracker | schemaTracker() |
void | tallyBatch(int rowCount) |
ResultVectorCacheImpl | vectorCache() |
public ScanLifecycle(OperatorContext context, ScanLifecycleBuilder builder)
public OperatorContext context()
public ScanLifecycleBuilder options()
public ScanSchemaTracker schemaTracker()
public ResultVectorCacheImpl vectorCache()
public ReaderFactory<?> readerFactory()
public boolean hasOutputSchema()
public CustomErrorContext errorContext()
public BufferAllocator allocator()
public int batchCount()
public long rowCount()
public void tallyBatch(int rowCount)
public RowBatchReader nextReader()
protected SchemaNegotiatorImpl newNegotiator(ReaderLifecycle readerLifecycle)
public TupleMetadata outputSchema()
public void close()
Copyright © 1970 The Apache Software Foundation. All rights reserved.