Class ReadState
java.lang.Object
org.apache.drill.exec.store.parquet.columnreaders.ReadState
Internal state for reading from a Parquet file. Tracks information
required from one call of next() to the next.
At present, this class is a bit of a muddle because it holds all read state; it is a snapshot of an ongoing refactoring effort. Subsequent passes will move state into specific readers where possible.
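To make "information required from one call of next() to the next" concrete, here is a minimal, self-contained sketch of counters that must outlive a single batch. BatchCounters and everything inside it are hypothetical names invented for this illustration. The logic is an assumption about how such bookkeeping could work, not Drill's actual implementation, even though some method names echo the accessors listed on this page.

```java
/**
 * Hypothetical sketch only: counters that persist across next() calls.
 * This is not Drill's ReadState; the names and logic are assumptions.
 */
final class BatchCounters {
  private final long totalRecordsToRead; // fixed when the scan is set up
  private long recordsRead;              // running total across all batches
  private int valuesReadInCurrentPass;   // cleared at the start of each batch

  BatchCounters(long totalRecordsToRead) {
    this.totalRecordsToRead = totalRecordsToRead;
  }

  /** Begin a new batch: only the per-batch counter is cleared. */
  void resetBatch() {
    valuesReadInCurrentPass = 0;
  }

  /** Record that readCount more rows were read in the current batch. */
  void updateCounts(int readCount) {
    recordsRead += readCount;
    valuesReadInCurrentPass += readCount;
  }

  long recordsRead() {
    return recordsRead;
  }

  int valuesReadInCurrentPass() {
    return valuesReadInCurrentPass;
  }

  /** Rows still to be read, clamped to the range [0, Integer.MAX_VALUE]. */
  int remaining() {
    long left = totalRecordsToRead - recordsRead;
    return (int) Math.max(0L, Math.min((long) Integer.MAX_VALUE, left));
  }

  /** Simulates repeated next() calls, each reading at most batchCapacity rows. */
  static int countBatches(long total, int batchCapacity) {
    if (batchCapacity <= 0) {
      throw new IllegalArgumentException("batchCapacity must be positive");
    }
    BatchCounters state = new BatchCounters(total);
    int batches = 0;
    while (state.remaining() > 0) {
      state.resetBatch();
      state.updateCounts(Math.min(batchCapacity, state.remaining()));
      batches++;
    }
    return batches;
  }

  public static void main(String[] args) {
    // 10 rows at up to 4 rows per batch: batches of 4, 4 and 2 rows.
    System.out.println(countBatches(10L, 4)); // prints 3
  }
}
```

The running total and the remaining count survive between calls, while the per-batch counter is reset each time. That split is the kind of state a reader has to keep somewhere between invocations.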
Constructor Summary
ReadState(ParquetSchema schema, RecordBatchSizerManager batchSizerMgr, ParquetReaderStats parquetReaderStats, long numRecordsToRead, boolean useAsyncColReader)
Method Summary
void buildReader(ParquetRecordReader reader, OutputMutator output)
    Create the readers needed to read columns: fixed-length or variable-length.
void close()
void fillNullVectors(int readCount)
    When the SELECT clause references columns that do not exist in the Parquet file, we don't issue an error; instead we simply make up a column and fill it with nulls.
ColumnReader<?> getFirstColumnReader()
    Several readers use the first column reader to get information about the whole record or group (such as row count).
List<ColumnReader<?>> getFixedLenColumnReaders()
int getRemainingValuesToRead()
long getTotalRecordsToRead()
int getValuesReadInCurrentPass()
long recordsRead()
void resetBatch()
schema()
void setValuesReadInCurrentPass(int valuesReadInCurrentBatch)
void updateCounts(int readCount)
boolean useAsyncColReader()
Constructor Details
ReadState
public ReadState(ParquetSchema schema, RecordBatchSizerManager batchSizerMgr, ParquetReaderStats parquetReaderStats, long numRecordsToRead, boolean useAsyncColReader)
Method Details
buildReader
public void buildReader(ParquetRecordReader reader, OutputMutator output) throws Exception
Create the readers needed to read columns: fixed-length or variable-length.
Parameters:
    reader - parquet record reader
    output - output mutator
Throws:
    Exception

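The fixed-length versus variable-length split mentioned for buildReader can be pictured with a small, self-contained sketch. ReaderPartitionSketch, Column, Partition, and the rule that a non-positive width means variable length are all hypothetical and chosen only for this example. Drill's real reader construction depends on Parquet metadata and Drill types that are not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: grouping columns by width class. Not Drill code. */
final class ReaderPartitionSketch {

  /** Hypothetical column descriptor; a non-positive width means variable length. */
  static final class Column {
    final String name;
    final int fixedWidthBytes;

    Column(String name, int fixedWidthBytes) {
      this.name = name;
      this.fixedWidthBytes = fixedWidthBytes;
    }

    boolean isFixedLength() {
      return fixedWidthBytes > 0;
    }
  }

  /** Column names split into the two groups, in their original order. */
  static final class Partition {
    final List<String> fixedLength = new ArrayList<>();
    final List<String> variableLength = new ArrayList<>();
  }

  /** Assigns each column to exactly one of the two groups. */
  static Partition partition(List<Column> columns) {
    Partition result = new Partition();
    for (Column c : columns) {
      if (c.isFixedLength()) {
        result.fixedLength.add(c.name);
      } else {
        result.variableLength.add(c.name);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Partition p = partition(Arrays.asList(
        new Column("id", 8), new Column("name", 0), new Column("price", 4)));
    System.out.println(p.fixedLength + " " + p.variableLength);
    // prints: [id, price] [name]
  }
}
```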
getFirstColumnReader
public ColumnReader<?> getFirstColumnReader()
Several readers use the first column reader to get information about the whole record or group (such as row count).
Returns:
    the reader for the first column

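Consulting only the first reader works because every column in a Parquet row group covers the same rows, so any one column can answer questions about the whole group. The sketch below illustrates that idea with a hypothetical ColumnReaderStub; it does not use Drill's ColumnReader class, and the rowCount field is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: asking the first column reader about the whole group. */
final class FirstReaderSketch {

  /** Hypothetical stand-in for a column reader that knows its group's row count. */
  static final class ColumnReaderStub {
    final String column;
    final long rowCount;

    ColumnReaderStub(String column, long rowCount) {
      this.column = column;
      this.rowCount = rowCount;
    }
  }

  /** Row count of the group, taken from the first reader; 0 if there are none. */
  static long groupRowCount(List<ColumnReaderStub> readers) {
    return readers.isEmpty() ? 0L : readers.get(0).rowCount;
  }

  public static void main(String[] args) {
    List<ColumnReaderStub> readers = Arrays.asList(
        new ColumnReaderStub("id", 1000L), new ColumnReaderStub("name", 1000L));
    System.out.println(groupRowCount(readers)); // prints 1000
    System.out.println(groupRowCount(Collections.<ColumnReaderStub>emptyList())); // prints 0
  }
}
```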
resetBatch
public void resetBatch()

schema

batchSizerMgr

getFixedLenColumnReaders
public List<ColumnReader<?>> getFixedLenColumnReaders()

recordsRead
public long recordsRead()

varLengthReader

getTotalRecordsToRead
public long getTotalRecordsToRead()

useAsyncColReader
public boolean useAsyncColReader()

parquetReaderStats

getValuesReadInCurrentPass
public int getValuesReadInCurrentPass()
Returns:
    values read within the latest batch

getRemainingValuesToRead
public int getRemainingValuesToRead()
Returns:
    remaining values to read

setValuesReadInCurrentPass
public void setValuesReadInCurrentPass(int valuesReadInCurrentBatch)
Parameters:
    valuesReadInCurrentBatch - the number of values read in the current batch

fillNullVectors
public void fillNullVectors(int readCount)
When the SELECT clause references columns that do not exist in the Parquet file, we don't issue an error; instead we simply make up a column and fill it with nulls. This method does the work of null-filling the made-up vectors.
Parameters:
    readCount - the number of rows read in the present record batch, which is the number of null column values to create

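The null-filling behavior described for fillNullVectors can be sketched without any Drill classes. In the example below, NullFillSketch and fillNullColumns are hypothetical names, and plain Java lists stand in for value vectors; it only illustrates the rule "projected but absent from the file means readCount nulls" and is not how Drill implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: making up all-null columns for missing projections. */
final class NullFillSketch {

  /**
   * For each projected column that is absent from the file, builds a list of
   * readCount nulls. Columns that exist in the file are left out of the result.
   */
  static Map<String, List<Object>> fillNullColumns(
      List<String> projected, Set<String> fileColumns, int readCount) {
    Map<String, List<Object>> madeUp = new LinkedHashMap<>();
    for (String column : projected) {
      if (!fileColumns.contains(column)) {
        List<Object> values = new ArrayList<>(readCount);
        for (int i = 0; i < readCount; i++) {
          values.add(null); // one null per row read in this batch
        }
        madeUp.put(column, values);
      }
    }
    return madeUp;
  }

  public static void main(String[] args) {
    Set<String> inFile = new HashSet<>(Arrays.asList("id", "name"));
    Map<String, List<Object>> result =
        fillNullColumns(Arrays.asList("id", "missing_col"), inFile, 3);
    System.out.println(result); // prints {missing_col=[null, null, null]}
  }
}
```

A query that selects id and missing_col therefore still succeeds: id comes from the file, and missing_col is returned as a column of nulls with one entry per row in the batch.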
updateCounts
public void updateCounts(int readCount)

close
public void close()