Class ParquetRecordReader
java.lang.Object
org.apache.drill.exec.store.AbstractRecordReader
org.apache.drill.exec.store.CommonParquetRecordReader
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader
- All Implemented Interfaces:
AutoCloseable, RecordReader
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.drill.exec.store.CommonParquetRecordReader
CommonParquetRecordReader.Metric
-
Field Summary
Fields inherited from class org.apache.drill.exec.store.CommonParquetRecordReader
footer, fragmentContext, NUM_RECORDS_TO_READ_NOT_SPECIFIED, operatorContext, parquetReaderStats
Fields inherited from class org.apache.drill.exec.store.AbstractRecordReader
DEFAULT_TEXT_COLS_TO_READ
Fields inherited from interface org.apache.drill.exec.store.RecordReader
ALLOCATOR_INITIAL_RESERVATION, ALLOCATOR_MAX_RESERVATION
-
Constructor Summary
Constructor
ParquetRecordReader(FragmentContext fragmentContext, long numRecordsToRead, org.apache.hadoop.fs.Path path, int rowGroupIndex, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)
ParquetRecordReader(FragmentContext fragmentContext, org.apache.hadoop.fs.Path path, int rowGroupIndex, long numRecordsToRead, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)
ParquetRecordReader(FragmentContext fragmentContext, org.apache.hadoop.fs.Path path, int rowGroupIndex, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)
-
Method Summary
Modifier and Type, Method, Description
void allocate(Map<String, ValueVector> vectorMap)
void close()
org.apache.parquet.compression.CompressionCodecFactory getCodecFactory()
getDateCorruptionStatus()
    Flag indicating if the old non-standard data format appears in this file, see DRILL-4203.
protected List<SchemaPath> getDefaultColumnsToRead()
org.apache.hadoop.fs.FileSystem getFileSystem()
org.apache.hadoop.fs.Path getHadoopPath()
int getRowGroupIndex()
int next()
    Read the next record batch from the file using the reader and read state created previously.
void setup(OperatorContext operatorContext, OutputMutator output)
    Prepare the Parquet reader.
String toString()
boolean useBulkReader()
Methods inherited from class org.apache.drill.exec.store.CommonParquetRecordReader
closeStats, handleAndRaise, initNumRecordsToRead, updateRowGroupsStats
Methods inherited from class org.apache.drill.exec.store.AbstractRecordReader
getColumns, hasNext, isSkipQuery, isStarQuery, setColumns, transformColumns
-
Constructor Details
-
ParquetRecordReader
public ParquetRecordReader(FragmentContext fragmentContext, org.apache.hadoop.fs.Path path, int rowGroupIndex, long numRecordsToRead, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)
-
ParquetRecordReader
public ParquetRecordReader(FragmentContext fragmentContext, org.apache.hadoop.fs.Path path, int rowGroupIndex, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)
-
ParquetRecordReader
public ParquetRecordReader(FragmentContext fragmentContext, long numRecordsToRead, org.apache.hadoop.fs.Path path, int rowGroupIndex, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)
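Two of the constructors take numRecordsToRead and one omits it, and the class inherits a NUM_RECORDS_TO_READ_NOT_SPECIFIED field. One plausible reading is that the shorter overload passes that constant as a sentinel. The sketch below illustrates that pattern only; RowLimitSketch, effectiveLimit, the sentinel value -1, and the "cap at the row group's row count" rule are all assumptions made for illustration, not details taken from this page.

```java
// Hypothetical stand-in used to illustrate the overload pattern; not the Drill class.
final class RowLimitSketch {
    // Assumed sentinel value; this page does not state the real constant's value.
    static final long NUM_RECORDS_TO_READ_NOT_SPECIFIED = -1L;

    final int rowGroupIndex;
    final long numRecordsToRead;

    // Overload without a row limit: delegate with the sentinel.
    RowLimitSketch(int rowGroupIndex) {
        this(rowGroupIndex, NUM_RECORDS_TO_READ_NOT_SPECIFIED);
    }

    // Overload with an explicit row limit.
    RowLimitSketch(int rowGroupIndex, long numRecordsToRead) {
        this.rowGroupIndex = rowGroupIndex;
        this.numRecordsToRead = numRecordsToRead;
    }

    // Assumed rule: no limit means "read the whole row group";
    // an explicit limit is capped at the number of rows available.
    long effectiveLimit(long rowsInRowGroup) {
        if (numRecordsToRead == NUM_RECORDS_TO_READ_NOT_SPECIFIED) {
            return rowsInRowGroup;
        }
        return Math.min(numRecordsToRead, rowsInRowGroup);
    }
}
```

Under these assumptions, new RowLimitSketch(0, 10L).effectiveLimit(1000L) resolves to the explicit limit, while new RowLimitSketch(0).effectiveLimit(1000L) resolves to the full row count.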
-
-
Method Details
-
getDateCorruptionStatus
Flag indicating if the old non-standard data format appears in this file, see DRILL-4203.
- Returns:
true if the dates are corrupted and need to be corrected
-
getCodecFactory
public org.apache.parquet.compression.CompressionCodecFactory getCodecFactory()
-
getHadoopPath
public org.apache.hadoop.fs.Path getHadoopPath()
-
getFileSystem
public org.apache.hadoop.fs.FileSystem getFileSystem()
-
getRowGroupIndex
public int getRowGroupIndex()
-
getBatchSizesMgr
-
getOperatorContext
-
getFragmentContext
-
useBulkReader
public boolean useBulkReader()
- Returns:
true if Parquet reader bulk processing is enabled; false otherwise
-
getReadState
-
setup
public void setup(OperatorContext operatorContext, OutputMutator output) throws ExecutionSetupException
Prepare the Parquet reader. First, determine the set of columns to read (the schema for this read). Then create a state object to track the read across calls to the reader's next() method. Finally, create one of three readers to read batches, depending on whether this scan covers only fixed-width fields, contains at least one variable-width field, or is a "mock" scan consisting only of null fields (fields in the SELECT clause but not in the Parquet file).
- Parameters:
operatorContext - operator context for the reader
output - the place where output for a particular scan should be written. The record reader is responsible for mutating the set of schema values for that particular record.
- Throws:
ExecutionSetupException
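The three-way reader choice described for setup can be sketched as a small decision function. This is a minimal illustration of the rule as stated in the description, not Drill's implementation: ReaderChoiceSketch, ReaderKind, choose, and the map from column name to a "variable width" flag are hypothetical names introduced here.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the reader selection rule; not the Drill implementation.
final class ReaderChoiceSketch {
    enum ReaderKind { FIXED_WIDTH, VARIABLE_WIDTH, MOCK }

    // selected: column names requested by the query.
    // fileColumnIsVariableWidth: columns present in the file, mapped to
    // true when the column holds variable-width values.
    static ReaderKind choose(List<String> selected,
                             Map<String, Boolean> fileColumnIsVariableWidth) {
        boolean anyPresent = false;
        boolean anyVariable = false;
        for (String name : selected) {
            Boolean variable = fileColumnIsVariableWidth.get(name);
            if (variable != null) {          // column exists in the file
                anyPresent = true;
                if (variable) {
                    anyVariable = true;
                }
            }
        }
        if (!anyPresent) {
            return ReaderKind.MOCK;          // only null fields would be produced
        }
        return anyVariable ? ReaderKind.VARIABLE_WIDTH : ReaderKind.FIXED_WIDTH;
    }
}
```

In this sketch a selected column that is missing from the file does not by itself force the mock path; only a scan in which none of the selected columns exist is treated as a mock scan, which matches the "consisting only of null fields" wording.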
-
allocate
public void allocate(Map<String, ValueVector> vectorMap) throws OutOfMemoryException
- Specified by:
allocate in interface RecordReader
- Overrides:
allocate in class AbstractRecordReader
- Throws:
OutOfMemoryException
-
next
public int next()
Read the next record batch from the file using the reader and read state created previously.
- Returns:
The number of additional records added to the output.
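Because next() reports how many records each call added and the class implements AutoCloseable, a caller typically loops over next() inside a try-with-resources block. The sketch below shows that loop against a hypothetical stand-in interface; BatchSource, ReadLoopSketch, drain, and fixedBatches are illustrative names, and the convention that a return value of 0 means "no more records" is an assumption not stated on this page.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the reader contract; not a Drill type.
interface BatchSource extends AutoCloseable {
    int next();                 // number of records added by this call
    @Override
    void close();               // narrowed so callers need no catch block
}

final class ReadLoopSketch {
    // Drains the source and returns the total record count,
    // assuming next() returns 0 once no records remain.
    static long drain(BatchSource source) {
        long total = 0;
        try (BatchSource s = source) {
            int n;
            while ((n = s.next()) > 0) {
                total += n;
            }
        }
        return total;
    }

    // Small in-memory source that hands out preset batch sizes, then 0.
    static BatchSource fixedBatches(int... sizes) {
        Deque<Integer> remaining = new ArrayDeque<>();
        for (int size : sizes) {
            remaining.add(size);
        }
        return new BatchSource() {
            @Override
            public int next() {
                return remaining.isEmpty() ? 0 : remaining.poll();
            }

            @Override
            public void close() {
                remaining.clear();
            }
        };
    }
}
```

A call such as ReadLoopSketch.drain(ReadLoopSketch.fixedBatches(4096, 4096, 100)) sums the three batches and closes the source even if next() throws.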
-
close
public void close()
-
getDefaultColumnsToRead
protected List<SchemaPath> getDefaultColumnsToRead()
- Overrides:
getDefaultColumnsToRead in class AbstractRecordReader
-
toString
public String toString()
- Overrides:
toString in class AbstractRecordReader
-