Class LogFormatPlugin

All Implemented Interfaces:
FormatPlugin

public class LogFormatPlugin extends EasyFormatPlugin<LogFormatConfig>
  • Field Details

  • Constructor Details

  • Method Details

    • configureScan

      protected void configureScan(FileScanLifecycleBuilder builder, EasySubScan scan)
      Build a file scan framework for this plugin.

      This plugin predates the "provided schema" feature, but it nonetheless supports both a provided schema and table properties. The code here handles the resulting combinations of plugin config and provided schema.

      For the regex and max errors:

      • Use the table property, if a schema is provided and the table property is set.
      • Else, use the format config property.
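
      The precedence above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the class name, the method names, and the property keys "regex" and "maxErrors" are hypothetical stand-ins, and the table properties are modeled as a plain map (null when no schema is provided).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Drill: shows only the precedence rule.
final class LogPropertyResolution {
    // Assumed property keys; the keys Drill actually uses may differ.
    static final String REGEX_PROP = "regex";
    static final String MAX_ERRORS_PROP = "maxErrors";

    // tableProps is null when no schema is provided.
    static String resolveRegex(Map<String, String> tableProps, String configRegex) {
        if (tableProps != null) {
            String value = tableProps.get(REGEX_PROP);
            if (value != null && !value.isEmpty()) {
                return value; // table property wins when set
            }
        }
        return configRegex; // else fall back to the format config
    }

    static int resolveMaxErrors(Map<String, String> tableProps, int configMaxErrors) {
        if (tableProps != null) {
            String value = tableProps.get(MAX_ERRORS_PROP);
            if (value != null) {
                // Throws NumberFormatException if the property is not an integer.
                return Integer.parseInt(value.trim());
            }
        }
        return configMaxErrors;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(REGEX_PROP, "(\\w+) (\\w+)");
        System.out.println(resolveRegex(props, "(\\d+)"));
        System.out.println(resolveMaxErrors(props, 10));
    }
}
```

      In this sketch the regex comes from the table property, while max errors falls back to the format config value because no table property is set for it.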

      For columns:

      • If no schema is provided (or the schema contains only table properties with no columns), use the column names and types from the plugin config.
      • If a schema is provided, and the plugin config defines columns, use the column types from the provided schema. Columns are matched by name. Types in the provided schema override any types specified in the plugin config.
      • If a schema is provided, and the plugin config defines no columns, use the column names and types from the provided schema. The columns are assumed to appear in the same order as regex fields.
      • If the regex has more groups than there are columns defined (in either the plugin config or the provided schema), name the extra fields field_n and give them type VARCHAR.
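
      The column rules above can be sketched as a single merge step. This is an illustration under stated assumptions, not the plugin's implementation: columns are modeled as ordered name-to-type maps rather than Drill's TupleMetadata, name matching is exact, a config column with no type defaults to VARCHAR, and the n in field_n is taken to be the zero-based group index. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Drill: merges column definitions.
final class LogColumnResolution {
    // Both maps are column name -> type name in declaration order;
    // pass null or an empty map when that source defines no columns.
    static Map<String, String> resolve(Map<String, String> configCols,
                                       Map<String, String> schemaCols,
                                       int regexGroupCount) {
        boolean hasConfig = configCols != null && !configCols.isEmpty();
        boolean hasSchema = schemaCols != null && !schemaCols.isEmpty();
        Map<String, String> result = new LinkedHashMap<>();

        if (hasConfig) {
            // Names come from the plugin config; a provided schema type,
            // matched by name, overrides the config type.
            for (Map.Entry<String, String> col : configCols.entrySet()) {
                String type = col.getValue();
                if (hasSchema && schemaCols.containsKey(col.getKey())) {
                    type = schemaCols.get(col.getKey());
                }
                // Assumption: an untyped config column defaults to VARCHAR.
                result.put(col.getKey(), type == null ? "VARCHAR" : type);
            }
        } else if (hasSchema) {
            // Names and types both come from the provided schema, assumed
            // to be in the same order as the regex groups.
            result.putAll(schemaCols);
        }

        // Fill any remaining regex groups with generated VARCHAR fields.
        // Assumption: n is the zero-based group index.
        for (int i = result.size(); i < regexGroupCount; i++) {
            result.put("field_" + i, "VARCHAR");
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("ts", "VARCHAR");
        config.put("level", null);
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("ts", "TIMESTAMP");
        System.out.println(resolve(config, schema, 3));
    }
}
```

      The sketch does not cover the case where more columns are defined than the regex has groups, since the rules above do not say how that case is handled.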

      Typical use cases:

      • Minimum config: only a regex in either plugin config or table properties.
      • Plugin config defines the regex, field names and types. (The typical approach in Drill 1.16 and before.)
      • Plugin config defines the regex and field names. The provided schema defines types. (Separates physical and logical table definitions.)
      • Provided schema defines the regex and columns. May simplify configuration as all table information is in one place. Allows different regex patterns for different tables of the same file suffix.
      Overrides:
      configureScan in class EasyFormatPlugin<LogFormatConfig>
      Parameters:
      builder - the builder with default options already set, and which allows the plugin implementation to set others
    • maxErrors

      public int maxErrors(TupleMetadata providedSchema)