public abstract class EasyFormatPlugin<T extends FormatPluginConfig>
extends Object
implements FormatPlugin

Type Parameters:
T - the format plugin config for this reader

Provides a bridge between the legacy RecordReader-style readers and the newer ManagedReader style. Over time, split the class, or provide a cleaner way to handle the differences.
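Since the class is abstract, each format supplies a small concrete subclass. Below is a minimal sketch using the legacy constructor documented later on this page; FooFormatPlugin and FooFormatConfig are hypothetical names used only for illustration and are not part of Drill.

```java
import java.util.Collections;

import org.apache.drill.common.logical.StoragePluginConfig;
import org.apache.drill.exec.server.DrillbitContext;
import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin;
import org.apache.hadoop.conf.Configuration;

// Hypothetical "foo" format plugin. FooFormatConfig is an assumed
// FormatPluginConfig implementation holding the user-defined options.
public class FooFormatPlugin extends EasyFormatPlugin<FooFormatConfig> {

  public FooFormatPlugin(String name, DrillbitContext context,
      Configuration fsConf, StoragePluginConfig storageConfig,
      FooFormatConfig formatConfig) {
    super(name, context, fsConf, storageConfig, formatConfig,
        true,   // readable
        false,  // writable
        false,  // blockSplittable
        true,   // compressible: also match e.g. foo.gz
        Collections.singletonList("foo"),  // file extensions
        "foo"); // defaultName
  }
}
```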
Nested Class Summary

Modifier and Type | Class and Description
---|---
static class | EasyFormatPlugin.EasyFormatConfig: Defines the static, programmer-defined options for this plugin.
static class | EasyFormatPlugin.EasyFormatConfigBuilder
Field Summary

Modifier and Type | Field and Description
---|---
protected T | formatConfig
Constructor Summary

Modifier | Constructor and Description
---|---
protected | EasyFormatPlugin(String name, DrillbitContext context, org.apache.hadoop.conf.Configuration fsConf, StoragePluginConfig storageConfig, T formatConfig, boolean readable, boolean writable, boolean blockSplittable, boolean compressible, List<String> extensions, String defaultName): Legacy constructor.
protected | EasyFormatPlugin(String name, EasyFormatPlugin.EasyFormatConfig config, DrillbitContext context, StoragePluginConfig storageConfig, T formatConfig): Revised constructor in which settings are gathered into a configuration object.
Method Summary

Modifier and Type | Method and Description
---|---
EasyFormatPlugin.EasyFormatConfig | easyConfig()
protected FileScanFramework.FileScanBuilder | frameworkBuilder(OptionManager options, EasySubScan scan): Create the plugin-specific framework that manages the scan.
T | getConfig()
DrillbitContext | getContext()
org.apache.hadoop.conf.Configuration | getFsConf()
AbstractGroupScan | getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns)
AbstractGroupScan | getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, MetadataProviderManager metadataProviderManager)
FormatMatcher | getMatcher()
String | getName()
Set<StoragePluginOptimizerRule> | getOptimizerRules()
protected CloseableRecordBatch | getReaderBatch(FragmentContext context, EasySubScan scan)
String | getReaderOperatorType()
RecordReader | getRecordReader(FragmentContext context, DrillFileSystem dfs, FileWork fileWork, List<SchemaPath> columns, String userName): Return a record reader for the specific file format, when using the original ScanBatch scanner.
RecordWriter | getRecordWriter(FragmentContext context, EasyWriter writer)
protected ScanStats | getScanStats(PlannerSettings settings, EasyGroupScan scan)
StatisticsRecordWriter | getStatisticsRecordWriter(FragmentContext context, EasyWriter writer)
StoragePluginConfig | getStorageConfig()
AbstractWriter | getWriter(PhysicalOperator child, String location, List<String> partitionColumns)
CloseableRecordBatch | getWriterBatch(FragmentContext context, RecordBatch incoming, EasyWriter writer)
String | getWriterOperatorType()
protected void | initScanBuilder(FileScanFramework.FileScanBuilder builder, EasySubScan scan): Initialize the scan framework builder with standard options.
boolean | isBlockSplittable(): Whether or not you can split the format based on blocks within file boundaries.
boolean | isCompressible(): Indicates whether or not this format could also be in a compression container (for example: csv.gz versus csv).
boolean | isStatisticsRecordWriter(FragmentContext context, EasyWriter writer)
ManagedReader<? extends FileScanFramework.FileSchemaNegotiator> | newBatchReader(EasySubScan scan, OptionManager options)
DrillStatsTable.TableStatistics | readStatistics(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path statsTablePath)
boolean | supportsAutoPartitioning(): Indicates whether this FormatPlugin supports auto-partitioning for CTAS statements.
boolean | supportsFileImplicitColumns(): Whether this format plugin supports implicit file columns.
boolean | supportsLimitPushdown(): Does this plugin support pushing the limit down to the batch reader? If so, the reader itself should have logic to stop reading the file as soon as the limit has been reached.
boolean | supportsPushDown(): Does this plugin support projection push down? That is, can the reader itself handle the tasks of projecting table columns, creating null columns for missing table columns, and so on?
boolean | supportsRead()
boolean | supportsStatistics()
boolean | supportsWrite()
protected boolean | useEnhancedScan(): Choose whether to use the enhanced scan based on the row set and scan framework, or the "traditional" ad-hoc structure based on ScanBatch.
void | writeStatistics(DrillStatsTable.TableStatistics statistics, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path statsTablePath)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface FormatPlugin
getGroupScan, getGroupScan, getOptimizerRules
Field Detail

protected final T formatConfig
Constructor Detail

protected EasyFormatPlugin(String name, DrillbitContext context, org.apache.hadoop.conf.Configuration fsConf, StoragePluginConfig storageConfig, T formatConfig, boolean readable, boolean writable, boolean blockSplittable, boolean compressible, List<String> extensions, String defaultName)

Legacy constructor.

protected EasyFormatPlugin(String name, EasyFormatPlugin.EasyFormatConfig config, DrillbitContext context, StoragePluginConfig storageConfig, T formatConfig)

Revised constructor in which settings are gathered into a configuration object.

Parameters:
name - name of the plugin
config - configuration options for this plugin which determine developer-defined runtime behavior
context - the global server-wide Drillbit context
storageConfig - the configuration for the storage plugin that owns this format plugin
formatConfig - the Jackson-serialized format configuration as created by the user in the Drill web console. Holds user-defined options
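The same hypothetical subclass can be written against the revised constructor. This sketch assumes EasyFormatPlugin.EasyFormatConfig exposes a builder (via the EasyFormatConfigBuilder nested class) with setters mirroring the legacy constructor's parameters; verify the actual builder method names in your Drill version.

```java
// Hedged sketch of the revised constructor style. The builder methods
// shown are assumptions modeled on the legacy constructor's parameters.
public FooFormatPlugin(String name, DrillbitContext context,
    Configuration fsConf, StoragePluginConfig storageConfig,
    FooFormatConfig formatConfig) {
  super(name,
      EasyFormatConfig.builder()
          .readable(true)
          .writable(false)
          .blockSplittable(false)
          .compressible(true)
          .extensions(Collections.singletonList("foo"))
          .fsConf(fsConf)
          .defaultName("foo")
          .build(),
      context, storageConfig, formatConfig);
}
```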
Method Detail

public org.apache.hadoop.conf.Configuration getFsConf()

Specified by:
getFsConf in interface FormatPlugin

public DrillbitContext getContext()

Specified by:
getContext in interface FormatPlugin

public EasyFormatPlugin.EasyFormatConfig easyConfig()

public String getName()

Specified by:
getName in interface FormatPlugin
public boolean supportsLimitPushdown()

Does this plugin support pushing the limit down to the batch reader? If so, the reader itself should have logic to stop reading the file as soon as the limit has been reached.

public boolean supportsPushDown()

Does this plugin support projection push down? That is, can the reader itself handle the tasks of projecting table columns, creating null columns for missing table columns, and so on?
Returns:
true if the plugin supports projection push-down, false if Drill should do the task by adding a project operator

public boolean supportsFileImplicitColumns()

Whether this format plugin supports implicit file columns.

Returns:
true if the plugin supports implicit file columns, false otherwise

public boolean isBlockSplittable()

Whether or not you can split the format based on blocks within file boundaries.

Returns:
true if splittable.

public boolean isCompressible()

Indicates whether or not this format could also be in a compression container (for example: csv.gz versus csv).
Returns:
true if it is compressible
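Subclasses advertise these capabilities by overriding the methods above. The return values below are illustrative choices for the hypothetical plugin, not defaults:

```java
@Override
public boolean supportsPushDown() {
  return true;  // the reader projects table columns itself
}

@Override
public boolean supportsFileImplicitColumns() {
  return true;  // expose implicit file columns such as filename
}

@Override
public boolean supportsLimitPushdown() {
  return true;  // the reader stops reading once the limit is reached
}
```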
public RecordReader getRecordReader(FragmentContext context, DrillFileSystem dfs, FileWork fileWork, List<SchemaPath> columns, String userName) throws ExecutionSetupException

Return a record reader for the specific file format, when using the original ScanBatch scanner.

Parameters:
context - fragment context
dfs - Drill file system
fileWork - metadata about the file to be scanned
columns - list of projected columns (or may just contain the wildcard)
userName - the name of the user running the query
Throws:
ExecutionSetupException - for many reasons
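On the original ScanBatch path this override usually just constructs the plugin's reader from the arguments above. FooRecordReader is a hypothetical RecordReader implementation, not part of Drill:

```java
// Hedged sketch: hand the file and projection list to a plugin-specific
// reader. FooRecordReader is hypothetical.
@Override
public RecordReader getRecordReader(FragmentContext context,
    DrillFileSystem dfs, FileWork fileWork, List<SchemaPath> columns,
    String userName) throws ExecutionSetupException {
  return new FooRecordReader(context, dfs, fileWork.getPath(), columns,
      formatConfig);
}
```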
protected CloseableRecordBatch getReaderBatch(FragmentContext context, EasySubScan scan) throws ExecutionSetupException

Throws:
ExecutionSetupException
protected boolean useEnhancedScan()

Choose whether to use the enhanced scan based on the row set and scan framework, or the "traditional" ad-hoc structure based on ScanBatch.
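A subclass opts in to the enhanced path with a one-line override:

```java
// Use the row-set based scan framework instead of the legacy ScanBatch.
// With this in place, frameworkBuilder() drives the scan.
@Override
protected boolean useEnhancedScan() {
  return true;
}
```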
protected void initScanBuilder(FileScanFramework.FileScanBuilder builder, EasySubScan scan)

Initialize the scan framework builder with standard options. Call this from the plugin-specific frameworkBuilder(OptionManager, EasySubScan) method. The plugin can then customize/revise options as needed.

Parameters:
builder - the scan framework builder you create in the frameworkBuilder(OptionManager, EasySubScan) method
scan - the physical scan operator definition passed to the frameworkBuilder(OptionManager, EasySubScan) method
public ManagedReader<? extends FileScanFramework.FileSchemaNegotiator> newBatchReader(EasySubScan scan, OptionManager options) throws ExecutionSetupException

Throws:
ExecutionSetupException
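On the enhanced path the per-file reader is a ManagedReader rather than a RecordReader. A hedged sketch; FooBatchReader is a hypothetical ManagedReader<FileScanFramework.FileSchemaNegotiator> implementation:

```java
// Hedged sketch: FooBatchReader is hypothetical, not part of Drill. It
// would negotiate its schema through the FileSchemaNegotiator it receives.
@Override
public ManagedReader<? extends FileScanFramework.FileSchemaNegotiator>
    newBatchReader(EasySubScan scan, OptionManager options)
        throws ExecutionSetupException {
  return new FooBatchReader(formatConfig);
}
```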
protected FileScanFramework.FileScanBuilder frameworkBuilder(OptionManager options, EasySubScan scan) throws ExecutionSetupException

Create the plugin-specific framework that manages the scan.

Parameters:
scan - the physical operation definition for the scan operation. Contains one or more files to read. (The Easy format plugin works only for files.)
Throws:
ExecutionSetupException - for all setup failures
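A hedged sketch of a frameworkBuilder() override showing where initScanBuilder(FileScanFramework.FileScanBuilder, EasySubScan) fits into the sequence; any further builder customization (reader factory, null column type, and so on) is plugin-specific and elided:

```java
// Hedged sketch: create the builder, apply the standard options, then
// customize. Plugin-specific steps are left as comments.
@Override
protected FileScanFramework.FileScanBuilder frameworkBuilder(
    OptionManager options, EasySubScan scan) throws ExecutionSetupException {
  FileScanFramework.FileScanBuilder builder =
      new FileScanFramework.FileScanBuilder();

  // Standard options: projection, implicit columns, and so on.
  initScanBuilder(builder, scan);

  // Plugin-specific customization would go here (assumed API calls such
  // as installing a reader factory that creates one batch reader per file).
  return builder;
}
```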
public boolean isStatisticsRecordWriter(FragmentContext context, EasyWriter writer)

public RecordWriter getRecordWriter(FragmentContext context, EasyWriter writer) throws IOException

Throws:
IOException

public StatisticsRecordWriter getStatisticsRecordWriter(FragmentContext context, EasyWriter writer) throws IOException

Throws:
IOException

public CloseableRecordBatch getWriterBatch(FragmentContext context, RecordBatch incoming, EasyWriter writer) throws ExecutionSetupException

Throws:
ExecutionSetupException

protected ScanStats getScanStats(PlannerSettings settings, EasyGroupScan scan)

public AbstractWriter getWriter(PhysicalOperator child, String location, List<String> partitionColumns)
Specified by:
getWriter in interface FormatPlugin

public AbstractGroupScan getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns) throws IOException

Specified by:
getGroupScan in interface FormatPlugin
Throws:
IOException

public AbstractGroupScan getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, MetadataProviderManager metadataProviderManager) throws IOException

Specified by:
getGroupScan in interface FormatPlugin
Throws:
IOException
public T getConfig()

Specified by:
getConfig in interface FormatPlugin

public StoragePluginConfig getStorageConfig()

Specified by:
getStorageConfig in interface FormatPlugin

public boolean supportsRead()

Specified by:
supportsRead in interface FormatPlugin

public boolean supportsWrite()

Specified by:
supportsWrite in interface FormatPlugin
public boolean supportsAutoPartitioning()
Description copied from interface: FormatPlugin
Indicates whether this FormatPlugin supports auto-partitioning for CTAS statements.

Specified by:
supportsAutoPartitioning in interface FormatPlugin
public FormatMatcher getMatcher()

Specified by:
getMatcher in interface FormatPlugin

public Set<StoragePluginOptimizerRule> getOptimizerRules()

Specified by:
getOptimizerRules in interface FormatPlugin
public String getReaderOperatorType()
public String getWriterOperatorType()
public boolean supportsStatistics()
Specified by:
supportsStatistics in interface FormatPlugin

public DrillStatsTable.TableStatistics readStatistics(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path statsTablePath) throws IOException

Specified by:
readStatistics in interface FormatPlugin
Throws:
IOException

public void writeStatistics(DrillStatsTable.TableStatistics statistics, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path statsTablePath) throws IOException

Specified by:
writeStatistics in interface FormatPlugin
Throws:
IOException