Class TextFormatPlugin
java.lang.Object
org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin<TextFormatConfig>
org.apache.drill.exec.store.easy.text.TextFormatPlugin
- All Implemented Interfaces:
FormatPlugin
Text format plugin for CSV and other delimited text formats.
Supports a "provided schema", including table properties
on that schema that override the "static" (or "default") properties
defined in the plugin config. This allows, for example, ".csv" files
in which some have no headers (the default) and some do have
headers (as specified via table properties in the provided schema).
Uses the scan framework and the result set loader mechanism to allow tight control over the size of produced batches (and to support the provided schema).
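The override behavior described above can be pictured with a small, self-contained sketch. It is not Drill's implementation: the class name, the helper method, and the property key string are all hypothetical (this page does not show the actual value of HAS_HEADERS_PROP). It only illustrates the rule that a table property, when present, takes precedence over the plugin-config default.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: hypothetical names, not part of the Drill API. */
final class TextOptionsSketch {

    // Hypothetical property key, assumed for this example only.
    static final String HAS_HEADERS_KEY = "example.text.hasHeaders";

    /** Returns the table property if one is set; otherwise the plugin-config default. */
    static boolean resolveBoolean(Map<String, String> tableProps, String key, boolean configDefault) {
        String value = tableProps.get(key);
        return value == null ? configDefault : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        boolean pluginDefault = false; // plugin config: ".csv" files have no headers

        // A table whose provided schema sets the header property.
        Map<String, String> withHeaders = new HashMap<>();
        withHeaders.put(HAS_HEADERS_KEY, "true");

        // No table property: the plugin default applies.
        System.out.println(resolveBoolean(new HashMap<>(), HAS_HEADERS_KEY, pluginDefault));
        // Table property present: it overrides the default.
        System.out.println(resolveBoolean(withHeaders, HAS_HEADERS_KEY, pluginDefault));
    }
}
```

In this sketch the two calls share one plugin default, yet resolve differently per table, which mirrors the case of mixed ".csv" files with and without headers.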
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin
EasyFormatPlugin.EasyFormatConfig, EasyFormatPlugin.EasyFormatConfigBuilder, EasyFormatPlugin.ScanFrameworkVersion
-
Field Summary
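Modifier and Type    Field    Description
static final String    COMMENT_CHAR_PROP
static final String    DELIMITER_PROP
static final String    HAS_HEADERS_PROP
static final String    LINE_DELIM_PROP
static final int    MAX_CHARS_PER_COLUMN
static final int    MAXIMUM_NUMBER_COLUMNS
static final char    NULL_CHAR
static final String    PARSE_UNESCAPED_QUOTES_PROP
static final String    QUOTE_ESCAPE_PROP
static final String    QUOTE_PROP
static final String    SKIP_FIRST_LINE_PROP
static final String    TEXT_PREFIX
static final String    TRIM_WHITESPACE_PROP
static final String    WRITER_OPERATOR_TYPE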
Fields inherited from class org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin
formatConfig
-
Constructor Summary
Constructor    Description
TextFormatPlugin(String name, DrillbitContext context, org.apache.hadoop.conf.Configuration fsConf, StoragePluginConfig config, TextFormatConfig formatPluginConfig)
-
Method Summary
Modifier and Type    Method    Description
protected void    configureScan(FileScanLifecycleBuilder builder, EasySubScan scan)    Configure an EVF (v2) scan, which must at least include the factory to create readers.
AbstractGroupScan    getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, MetadataProviderManager metadataProviderManager)
AbstractGroupScan    getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, OptionManager options, MetadataProviderManager metadataProviderManager)
RecordWriter    getRecordWriter(FragmentContext context, EasyWriter writer)
protected ScanStats    getScanStats(PlannerSettings settings, EasyGroupScan scan)
Methods inherited from class org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin
easyConfig, frameworkBuilder, getConfig, getContext, getFsConf, getGroupScan, getMatcher, getName, getOptimizerRules, getReaderBatch, getReaderOperatorType, getRecordReader, getStatisticsRecordWriter, getStorageConfig, getWriter, getWriterBatch, getWriterOperatorType, initScanBuilder, isBlockSplittable, isCompressible, isStatisticsRecordWriter, newBatchReader, readStatistics, scanVersion, supportsAutoPartitioning, supportsFileImplicitColumns, supportsLimitPushdown, supportsPushDown, supportsRead, supportsStatistics, supportsWrite, writeStatistics
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.drill.exec.store.dfs.FormatPlugin
getGroupScan, getOptimizerRules
-
Field Details
-
MAXIMUM_NUMBER_COLUMNS
public static final int MAXIMUM_NUMBER_COLUMNS
-
MAX_CHARS_PER_COLUMN
public static final int MAX_CHARS_PER_COLUMN
-
NULL_CHAR
public static final char NULL_CHAR
-
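TEXT_PREFIX
public static final String TEXT_PREFIX
-
HAS_HEADERS_PROP
public static final String HAS_HEADERS_PROP
-
SKIP_FIRST_LINE_PROP
public static final String SKIP_FIRST_LINE_PROP
-
DELIMITER_PROP
public static final String DELIMITER_PROP
-
COMMENT_CHAR_PROP
public static final String COMMENT_CHAR_PROP
-
QUOTE_PROP
public static final String QUOTE_PROP
-
QUOTE_ESCAPE_PROP
public static final String QUOTE_ESCAPE_PROP
-
LINE_DELIM_PROP
public static final String LINE_DELIM_PROP
-
TRIM_WHITESPACE_PROP
public static final String TRIM_WHITESPACE_PROP
-
PARSE_UNESCAPED_QUOTES_PROP
public static final String PARSE_UNESCAPED_QUOTES_PROP
-
WRITER_OPERATOR_TYPE
public static final String WRITER_OPERATOR_TYPE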
-
-
Constructor Details
-
TextFormatPlugin
public TextFormatPlugin(String name, DrillbitContext context, org.apache.hadoop.conf.Configuration fsConf, StoragePluginConfig config, TextFormatConfig formatPluginConfig)
-
-
Method Details
-
getGroupScan
public AbstractGroupScan getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, MetadataProviderManager metadataProviderManager) throws IOException
- Specified by:
getGroupScan in interface FormatPlugin
- Overrides:
getGroupScan in class EasyFormatPlugin<TextFormatConfig>
- Throws:
IOException
-
getGroupScan
public AbstractGroupScan getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, OptionManager options, MetadataProviderManager metadataProviderManager) throws IOException
- Throws:
IOException
-
configureScan
protected void configureScan(FileScanLifecycleBuilder builder, EasySubScan scan)
Description copied from class: EasyFormatPlugin
Configure an EVF (v2) scan, which must at least include the factory to create readers.
- Overrides:
configureScan in class EasyFormatPlugin<TextFormatConfig>
- Parameters:
builder - the builder with default options already set, and which allows the plugin implementation to set others
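The contract stated above (the builder arrives with defaults set, and the plugin must supply at least a reader factory) can be sketched with stand-in types. Everything in this example is hypothetical: the Builder and Reader types, the maxBatchRows option, and its default value are invented for illustration and are not the FileScanLifecycleBuilder API.

```java
import java.util.function.Function;

/** Illustrative only: hypothetical types, not the Drill scan framework API. */
final class ScanConfigSketch {

    /** Hypothetical reader; one instance is created per file. */
    interface Reader {
        String describe();
    }

    /** Hypothetical builder that is handed to the plugin with defaults already set. */
    static final class Builder {
        private int maxBatchRows = 4096;                // assumed default, for illustration
        private Function<String, Reader> readerFactory; // must be supplied by the plugin

        Builder maxBatchRows(int rows) { this.maxBatchRows = rows; return this; }
        Builder readerFactory(Function<String, Reader> factory) { this.readerFactory = factory; return this; }
        int maxBatchRows() { return maxBatchRows; }

        /** Fails if the plugin never supplied the one mandatory setting. */
        Reader openReader(String fileName) {
            if (readerFactory == null) {
                throw new IllegalStateException("configureScan must set a reader factory");
            }
            return readerFactory.apply(fileName);
        }
    }

    /** Plays the role of a plugin's configureScan override: sets only the reader factory. */
    static void configureScan(Builder builder) {
        builder.readerFactory(file -> () -> "text reader for " + file);
    }

    public static void main(String[] args) {
        Builder builder = new Builder(); // defaults already in place
        configureScan(builder);          // plugin adds the mandatory reader factory
        System.out.println(builder.openReader("a.csv").describe());
    }
}
```

The design point is that the framework owns the defaults and the plugin owns the reader factory, so a plugin that overrides nothing else still yields a usable scan.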
-
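getRecordWriter
public RecordWriter getRecordWriter(FragmentContext context, EasyWriter writer) throws IOException
- Overrides:
getRecordWriter in class EasyFormatPlugin<TextFormatConfig>
- Throws:
IOException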
-
getScanStats
protected ScanStats getScanStats(PlannerSettings settings, EasyGroupScan scan)
- Overrides:
getScanStats in class EasyFormatPlugin<TextFormatConfig>
-