Interface OperatorExec

All Known Implementing Classes:
ScanOperatorExec

public interface OperatorExec
Core protocol for a Drill operator execution.

Lifecycle

  • Creation via an operator-specific constructor in the corresponding RecordBatchCreator.
  • bind() called to provide the operator services.
  • buildSchema() called to define the schema before fetching the first record batch.
  • next() called repeatedly to prepare each new record batch until EOF or until cancellation.
  • cancel() called if the operator should quit early.
  • close() called to release resources. close() is called in each of the following cases:
    • At EOF.
    • After cancel().
    • After an exception is thrown.
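
The lifecycle above can be sketched as a small driver loop. This is a minimal illustration under stated assumptions, not Drill's actual driver: SketchOperatorExec is a simplified stand-in for OperatorExec (bind() takes no context and batchAccessor() is omitted), and RecordingOperator and LifecycleDriver are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for OperatorExec (an assumption of this sketch):
// the real bind() takes an OperatorContext, and batchAccessor() is omitted.
interface SketchOperatorExec {
  void bind();
  boolean buildSchema();
  boolean next();
  void cancel();
  void close();
}

// Hypothetical operator that records each call and yields a fixed number of batches.
class RecordingOperator implements SketchOperatorExec {
  final List<String> calls = new ArrayList<>();
  private int remaining;

  RecordingOperator(int batchCount) {
    this.remaining = batchCount;
  }

  @Override public void bind() { calls.add("bind"); }
  @Override public boolean buildSchema() { calls.add("buildSchema"); return true; }
  @Override public boolean next() { calls.add("next"); return remaining-- > 0; }
  @Override public void cancel() { calls.add("cancel"); }
  @Override public void close() { calls.add("close"); }
}

// Hypothetical driver that plays the role of the enclosing class.
class LifecycleDriver {
  // Runs the operator to EOF and returns the number of batches it produced.
  static int run(SketchOperatorExec op) {
    int batches = 0;
    try {
      op.bind();
      if (op.buildSchema()) {
        while (op.next()) {
          batches++;
        }
      }
    } finally {
      // close() runs at EOF and also when any earlier call throws.
      op.close();
    }
    return batches;
  }

  public static void main(String[] args) {
    RecordingOperator op = new RecordingOperator(2);
    System.out.println(run(op) + " batches, calls: " + op.calls);
  }
}
```

The try/finally block is one way to guarantee that close() is reached on every path; a cancellation path would call cancel() before leaving the loop.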

Error Handling

Any method can throw an (unchecked) exception. (Drill does not use checked exceptions.) Preferably, the code throws a UserException that explains the error to the user. If any other kind of exception is thrown, the enclosing class wraps it in a generic UserException that says only that "something went wrong", which gives the user little to act on.
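
The difference between the two cases can be sketched as follows. SketchUserException is a hypothetical stand-in for Drill's UserException (the real class is richer and is not reproduced here), and ErrorWrapping, runStep and parseRowLimit are names invented for this illustration.

```java
// Hypothetical stand-in for Drill's UserException, reduced to a message and a cause.
class SketchUserException extends RuntimeException {
  SketchUserException(String message, Throwable cause) {
    super(message, cause);
  }
}

class ErrorWrapping {
  // Role of the enclosing class: pass user exceptions through unchanged
  // and wrap every other unchecked exception in a generic one.
  static void runStep(Runnable step) {
    try {
      step.run();
    } catch (SketchUserException e) {
      throw e;
    } catch (RuntimeException e) {
      throw new SketchUserException("Something went wrong", e);
    }
  }

  // Preferred style inside an operator: explain the failure where it occurs.
  static int parseRowLimit(String text) {
    try {
      return Integer.parseInt(text);
    } catch (NumberFormatException e) {
      throw new SketchUserException("Invalid row limit: '" + text + "'", e);
    }
  }
}
```

In the first case the user sees the specific message; in the second, the original exception survives only as the cause of the generic wrapper.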

Result Set

The operator "publishes" a result set in response to returning true from next() by populating a BatchAccessor provided via batchAccessor(). For compatibility with other Drill operators, the set of vectors within the batch must be the same from one batch to the next.
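
As a rough illustration of the stable-schema requirement, the sketch below reduces a batch to a list of column names and a row count. This is a deliberate simplification: real Drill batches hold value vectors, and SketchBatch, FixedSchemaOperator and SchemaCheck are hypothetical names that do not exist in Drill.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, much-reduced stand-in for a batch accessor: column names plus a row count.
class SketchBatch {
  final List<String> columns;
  final int rowCount;

  SketchBatch(List<String> columns, int rowCount) {
    this.columns = columns;
    this.rowCount = rowCount;
  }
}

// Hypothetical operator that publishes the same columns for every batch.
class FixedSchemaOperator {
  private final List<String> columns = Arrays.asList("id", "name");
  private int remaining;
  private SketchBatch current;

  FixedSchemaOperator(int batchCount) {
    this.remaining = batchCount;
  }

  // Publishes an empty batch that carries only the schema.
  boolean buildSchema() {
    current = new SketchBatch(columns, 0);
    return true;
  }

  // Publishes the next batch with the same columns, or reports EOF.
  boolean next() {
    if (remaining == 0) {
      current = null;
      return false;
    }
    remaining--;
    current = new SketchBatch(columns, 100);
    return true;
  }

  SketchBatch batchAccessor() {
    return current;
  }
}

class SchemaCheck {
  // Returns true if every batch has the columns announced by buildSchema().
  static boolean schemaIsStable(FixedSchemaOperator op) {
    if (!op.buildSchema()) {
      return true;
    }
    List<String> expected = op.batchAccessor().columns;
    while (op.next()) {
      if (!expected.equals(op.batchAccessor().columns)) {
        return false;
      }
    }
    return true;
  }
}
```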
  • Method Summary

    Modifier and Type   Method                         Description
    BatchAccessor       batchAccessor()                Provides a generic access mechanism to the batch's output data.
    void                bind(OperatorContext context)  Bind this operator to the context.
    boolean             buildSchema()                  Retrieves the schema of the batch before the first actual batch of data.
    void                cancel()                       Alerts the operator that the query was cancelled.
    void                close()                        Close the operator by releasing all resources that the operator held.
    boolean             next()                         Retrieves the next batch of data.
  • Method Details

    • bind

      void bind(OperatorContext context)
      Bind this operator to the context. The context provides access to per-operator, per-fragment and per-Drillbit services, as well as to the operator definition (AKA "pop config") for this operator.
      Parameters:
      context - operator context
    • batchAccessor

      BatchAccessor batchAccessor()
      Provides a generic access mechanism to the batch's output data. This method is called after a successful return from buildSchema() or next(). The batch itself can be held in a standard VectorContainer, or in some other structure more convenient for this operator.
      Returns:
      the accessor for the batch's output container
    • buildSchema

      boolean buildSchema()
      Retrieves the schema of the batch before the first actual batch of data. The schema is returned via an empty batch (no rows, only schema) from batchAccessor().
      Returns:
      true if a schema is available, false if the operator reached EOF before a schema was found
    • next

      boolean next()
      Retrieves the next batch of data. The data is returned via the batchAccessor() method.
      Returns:
      true if another batch of data is available, false if EOF was reached and no more data is available
    • cancel

      void cancel()
      Alerts the operator that the query was cancelled. Generally optional, but allows the operator to realize that a cancellation was requested.
    • close

      void close()
      Close the operator by releasing all resources that the operator held. Called after cancel() and after buildSchema() or next() returns false.

      Note that there may be a significant delay between the last call to next() and the call to close() during which downstream operators do their work. A tidy operator will release resources immediately after EOF to avoid holding onto memory or other resources that could be used by downstream operators.
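
      One way to be "tidy" is to release resources as soon as next() detects EOF and to make close() safe to call afterwards. The sketch below assumes a single in-memory buffer as the only resource; TidyOperator and its members are hypothetical and are not part of Drill.

```java
// Hypothetical operator that holds one resource (a buffer) while it produces batches.
class TidyOperator {
  private byte[] buffer = new byte[1024];
  private int remaining;
  int releaseCount = 0;

  TidyOperator(int batchCount) {
    this.remaining = batchCount;
  }

  // Returns true while batches remain; releases resources as soon as EOF is seen.
  boolean next() {
    if (remaining == 0) {
      release();
      return false;
    }
    remaining--;
    return true;
  }

  // Safe to call after EOF, after cancellation, or after a failure.
  void close() {
    release();
  }

  boolean holdsResources() {
    return buffer != null;
  }

  // Idempotent: the buffer is dropped at most once.
  private void release() {
    if (buffer != null) {
      buffer = null;
      releaseCount++;
    }
  }
}
```

      Because release() is idempotent, the later call to close() is harmless even though the buffer was already dropped at EOF.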