java.lang.Object
org.apache.drill.exec.store.parquet.columnreaders.ReadState

public class ReadState extends Object
Internal state for reading from a Parquet file. Tracks information required from one call of next() to the next.

At present this class is something of a catch-all: it holds all read state, and it reflects a snapshot of an ongoing refactoring effort. Subsequent passes will move state into specific readers where possible.

  • Constructor Details

  • Method Details

    • buildReader

      public void buildReader(ParquetRecordReader reader, OutputMutator output) throws Exception
      Creates the readers needed to read columns, both fixed-length and variable-length.
      Parameters:
      reader - parquet record reader
      output - output mutator
      Throws:
      Exception
    • getFirstColumnReader

      public ColumnReader<?> getFirstColumnReader()
      Several readers use the first column reader to get information about the whole record or group (such as the row count).
      Returns:
      the reader for the first column
    • resetBatch

      public void resetBatch()
    • schema

      public ParquetSchema schema()
    • batchSizerMgr

      public RecordBatchSizerManager batchSizerMgr()
    • getFixedLenColumnReaders

      public List<ColumnReader<?>> getFixedLenColumnReaders()
    • recordsRead

      public long recordsRead()
    • varLengthReader

      public VarLenBinaryReader varLengthReader()
    • getTotalRecordsToRead

      public long getTotalRecordsToRead()
    • useAsyncColReader

      public boolean useAsyncColReader()
    • parquetReaderStats

      public ParquetReaderStats parquetReaderStats()
    • getValuesReadInCurrentPass

      public int getValuesReadInCurrentPass()
      Returns:
      values read within the latest batch
    • getRemainingValuesToRead

      public int getRemainingValuesToRead()
      Returns:
      remaining values to read
    • setValuesReadInCurrentPass

      public void setValuesReadInCurrentPass(int valuesReadInCurrentBatch)
      Parameters:
      valuesReadInCurrentBatch - the number of values read in the current pass (batch)
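      The per-pass and remaining-value accessors above can be pictured with a small, self-contained sketch. It assumes that the remaining count is simply the total requested minus the rows already delivered, and that the per-pass count is cleared at the start of each batch. The class PassCounters and its fields are hypothetical and are not taken from Drill's source; only the method names mirror this page.

```java
/** Hypothetical stand-in for the counting portion of a read state; not Drill code. */
public class PassCounters {
    private final long totalToRead;       // rows requested for the whole scan
    private long totalRead;               // rows delivered by completed passes
    private int valuesReadInCurrentPass;  // rows recorded for the pass in progress

    public PassCounters(long totalToRead) {
        if (totalToRead < 0) {
            throw new IllegalArgumentException("totalToRead must be non-negative");
        }
        this.totalToRead = totalToRead;
    }

    /** Clears the per-pass counter before a new batch is read. */
    public void resetBatch() { valuesReadInCurrentPass = 0; }

    public void setValuesReadInCurrentPass(int n) { valuesReadInCurrentPass = n; }

    public int getValuesReadInCurrentPass() { return valuesReadInCurrentPass; }

    /** Folds a finished pass into the running total. */
    public void updateCounts(int readCount) { totalRead += readCount; }

    public long recordsRead() { return totalRead; }

    /** Assumed definition: total requested minus rows already delivered. */
    public int getRemainingValuesToRead() {
        // toIntExact throws ArithmeticException rather than silently overflowing
        return Math.toIntExact(totalToRead - totalRead);
    }

    public static void main(String[] args) {
        PassCounters counters = new PassCounters(10);
        int batchSize = 4;
        while (counters.getRemainingValuesToRead() > 0) {
            counters.resetBatch();
            int n = Math.min(batchSize, counters.getRemainingValuesToRead());
            counters.setValuesReadInCurrentPass(n);
            counters.updateCounts(n);
            System.out.println("pass read " + n
                + ", remaining " + counters.getRemainingValuesToRead());
        }
    }
}
```

      The int return type on the remaining-value accessor matches the signature listed here; the overflow check is a choice made for this sketch only.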
    • fillNullVectors

      public void fillNullVectors(int readCount)
      When the SELECT clause references columns that do not exist in the Parquet file, Drill does not raise an error; instead it creates a placeholder column and fills it with nulls. This method does the work of null-filling those placeholder vectors.
      Parameters:
      readCount - the number of rows read in the present record batch, which is the number of null column values to create
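      The null-filling behavior described above can be sketched without any Drill types. The example below is only an illustration under stated assumptions: columns are modeled as plain Object arrays, and the class NullFillSketch and the helper missingColumns are hypothetical names, not part of Drill. It shows the two steps the description implies: find the projected columns that the file lacks, then produce readCount nulls for each.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of null-filling projected columns a file lacks; not Drill code. */
public class NullFillSketch {

    /** Returns the projected column names that are absent from the file schema. */
    static List<String> missingColumns(List<String> projected, Set<String> fileColumns) {
        List<String> missing = new ArrayList<>();
        for (String name : projected) {
            if (!fileColumns.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Builds one all-null column of length readCount for each missing name. */
    static Map<String, Object[]> fillNullVectors(List<String> missing, int readCount) {
        Map<String, Object[]> filled = new LinkedHashMap<>();
        for (String name : missing) {
            // A freshly allocated reference array already holds readCount nulls.
            filled.put(name, new Object[readCount]);
        }
        return filled;
    }

    public static void main(String[] args) {
        Set<String> fileColumns = new HashSet<>(Arrays.asList("id", "name"));
        List<String> projected = Arrays.asList("id", "name", "nickname");

        List<String> missing = missingColumns(projected, fileColumns);
        Map<String, Object[]> filled = fillNullVectors(missing, 3);

        for (Map.Entry<String, Object[]> e : filled.entrySet()) {
            System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
        }
    }
}
```

      In this sketch the row count passed in plays the role of readCount: each placeholder column receives exactly as many nulls as rows were read in the batch, so all columns in the batch stay the same length.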
    • updateCounts

      public void updateCounts(int readCount)
    • close

      public void close()