java.lang.Object

org.apache.drill.exec.record.RecordBatchSizer.ColumnSize

Enclosing class: RecordBatchSizer

public class RecordBatchSizer.ColumnSize extends Object

Column size information.

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| final MaterializedField | metadata | |
| final String | prefix | |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| ColumnSize(ValueVector v, String prefix) | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | allocateVector(ValueVector vector, int recordCount) | |
| void | buildVectorInitializer(VectorInitializer initializer) | Add a single vector initializer to a collection for the entire batch. |
| int | getAllocSizePerEntry() | Returns the actual entry size if rowCount > 0, or the allocation size otherwise. |
| float | getCardinality() | |
| Map<String,RecordBatchSizer.ColumnSize> | getChildren() | |
| int | getDataSizePerEntry() | The average actual per-entry data size in bytes. |
| int | getElementCount() | |
| int | getNetSizePerEntry() | The average per-entry size of the pure data plus the overhead of the additional vectors added on top, such as the bits vector and offset vector. |
| int | getStdDataSizePerEntry() | Standard (std) pure data size per entry, derived from Drill metadata based on the column type. |
| int | getStdNetOrNetSizePerEntry() | Returns the std net size if an accurate one is available; otherwise returns the net size. |
| int | getStdNetSizePerEntry() | Standard (std) net size per entry, including the additional metadata vectors added for variable length, cardinality, etc. |
| int | getTotalDataSize() | The total data size for the column, including children for map columns. |
| int | getTotalNetSize() | The total net size for the column, including children for map columns. |
| int | getValueCount() | |
| boolean | hasStdDataSize() | Returns true if there is an accurate std size. |
| boolean | isComplex() | |
| boolean | isRepeatedList() | |
| boolean | isVariableWidth() | |
| String | toString() | |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Details
- prefix
  
  public final String prefix
- metadata
  
  public final MaterializedField metadata
Constructor Details
- ColumnSize
  
  public ColumnSize(ValueVector v, String prefix)
Method Details
- hasStdDataSize
  
  public boolean hasStdDataSize()
  
  Indicates whether an accurate standard (std) size is available for this column.

  Returns:

  true if there is an accurate std size, false otherwise
- getStdDataSizePerEntry
  
  public int getStdDataSizePerEntry()
  
  Standard (std) pure data size per entry, derived from Drill metadata based on the column type. Does not include the overhead of the metadata vectors added for cardinality, variable length, etc. For variable-width columns, 50 is used as the std entry width. For repeated columns, a repetition of 10 is assumed.
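
As a worked illustration of the two defaults above, the following sketch computes a std data size from type information alone. The class and method names are hypothetical, and the way the two defaults are combined for a repeated variable-width column is an assumption, not something this page states.

```java
final class StdDataSizeSketch {
    // Defaults quoted in the description above (assumed here to be a byte width and an element count).
    static final int STD_VARIABLE_WIDTH = 50;
    static final int STD_REPETITION = 10;

    /**
     * Hypothetical helper: estimates a std data size per entry from type information only.
     * fixedWidth is the width of a fixed-width type and is ignored for variable-width columns.
     */
    static int stdDataSize(boolean variableWidth, boolean repeated, int fixedWidth) {
        int valueWidth = variableWidth ? STD_VARIABLE_WIDTH : fixedWidth;
        // Assumption: a repeated column is sized as STD_REPETITION values per entry.
        return repeated ? valueWidth * STD_REPETITION : valueWidth;
    }

    public static void main(String[] args) {
        System.out.println(stdDataSize(false, false, 4)); // 4-byte fixed-width column: 4
        System.out.println(stdDataSize(true, false, 0));  // variable-width column: 50
        System.out.println(stdDataSize(true, true, 0));   // repeated variable-width column: 500
    }
}
```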
- getStdNetSizePerEntry
  
  public int getStdNetSizePerEntry()
  
  Standard (std) net size per entry, including the additional metadata vectors added for variable length, cardinality, etc. For variable-width columns, 50 is used as the std data size for the entry width. For repeated columns, a repetition of 10 is assumed.
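
To make the difference between "data" and "net" concrete, the sketch below adds per-value overhead on top of the std data size. The overhead widths (4 bytes per offset-vector entry, 1 byte per bits-vector entry), the `Mode` enum, and the method name are assumptions made for illustration; the exact formula used by this class is not given on this page.

```java
final class StdNetSizeSketch {
    // Illustrative stand-in for a column's data mode.
    enum Mode { REQUIRED, OPTIONAL, REPEATED }

    static final int STD_VARIABLE_WIDTH = 50; // from the description above
    static final int STD_REPETITION = 10;     // from the description above
    static final int OFFSET_ENTRY = 4;        // assumption: bytes per offset-vector entry
    static final int BITS_ENTRY = 1;          // assumption: bytes per bits-vector entry

    /** Hypothetical helper: std data size plus the assumed metadata-vector overhead. */
    static int stdNetSize(boolean variableWidth, Mode mode, int fixedWidth) {
        // Each variable-width value carries one offset entry.
        int perValue = variableWidth ? STD_VARIABLE_WIDTH + OFFSET_ENTRY : fixedWidth;
        switch (mode) {
            case OPTIONAL:
                // Nullable columns add one bits-vector entry per value.
                return perValue + BITS_ENTRY;
            case REPEATED:
                // Repeated columns add one offset entry per array.
                return perValue * STD_REPETITION + OFFSET_ENTRY;
            default:
                return perValue;
        }
    }

    public static void main(String[] args) {
        System.out.println(stdNetSize(false, Mode.OPTIONAL, 4)); // 5
        System.out.println(stdNetSize(true, Mode.REQUIRED, 0));  // 54
        System.out.println(stdNetSize(true, Mode.REPEATED, 0));  // 544
    }
}
```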
- getDataSizePerEntry
  
  public int getDataSizePerEntry()
  
  This is the average actual per-entry data size in bytes. It does not include any overhead of metadata vectors. For repeated columns, it is the average for the whole repeated array, not for an individual element in the array.
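
The distinction between a per-entry and a per-element average for repeated columns can be shown with a small calculation. This is a sketch only: the helper name is hypothetical, and rounding up is an assumption about how the average is taken.

```java
final class DataSizePerEntrySketch {
    /** Hypothetical helper: average bytes per item, rounded up (an assumption); 0 when count is 0. */
    static int averagePerEntry(long totalDataBytes, int count) {
        if (count <= 0) {
            return 0;
        }
        return (int) ((totalDataBytes + count - 1) / count);
    }

    public static void main(String[] args) {
        long totalDataBytes = 48; // 12 four-byte elements in total
        int rowCount = 3;         // spread across 3 rows (arrays)
        int elementCount = 12;

        // Per entry (per array), which is what the description above refers to: 16
        System.out.println(averagePerEntry(totalDataBytes, rowCount));
        // Per individual element, shown only for contrast: 4
        System.out.println(averagePerEntry(totalDataBytes, elementCount));
    }
}
```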
- getNetSizePerEntry
  
  public int getNetSizePerEntry()
  
  This is the average per-entry size of the pure data plus the overhead of the additional vectors added on top, such as the bits vector and offset vector. It is larger than the actual data size because it includes the per-column overhead of the vectors added for cardinality, variable length, etc.
- getAllocSizePerEntry
  
  public int getAllocSizePerEntry()
  
  Returns the actual entry size if rowCount > 0, or the allocation size otherwise. Use this when you might receive empty batches that carry a schema and you still need to do memory calculations based on the schema alone.
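
The fallback described above matters when a row width must be estimated for an empty batch. The sketch below models it with a hypothetical `Column` stand-in; the field names and the way the row width is summed are assumptions for illustration, not the class's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

final class AllocSizeSketch {
    /** Hypothetical stand-in for one column's sizing numbers. */
    static final class Column {
        final int observedSize;   // measured per-entry size; meaningful only when rows exist
        final int allocationSize; // schema-based fallback size

        Column(int observedSize, int allocationSize) {
            this.observedSize = observedSize;
            this.allocationSize = allocationSize;
        }
    }

    /** Uses the observed size when the batch has rows, the allocation size otherwise. */
    static int sizePerEntry(int rowCount, Column column) {
        return rowCount > 0 ? column.observedSize : column.allocationSize;
    }

    /** Sums the per-entry sizes of all columns to estimate a row width. */
    static int rowWidth(int rowCount, List<Column> columns) {
        int width = 0;
        for (Column column : columns) {
            width += sizePerEntry(rowCount, column);
        }
        return width;
    }

    public static void main(String[] args) {
        List<Column> columns = Arrays.asList(new Column(6, 4), new Column(37, 54));
        System.out.println(rowWidth(100, columns)); // batch with rows: 43
        System.out.println(rowWidth(0, columns));   // empty batch, schema only: 58
    }
}
```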
- getStdNetOrNetSizePerEntry
  
  public int getStdNetOrNetSizePerEntry()
  
  Returns the std net size if an accurate one is available; otherwise returns the net size.

  Returns:

  the std net size if it is accurate, otherwise the net size
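
One way to read this method is as a composition of three accessors documented on this page. The sketch below uses a hypothetical `SizeInfo` interface that mirrors those accessor names; treating `hasStdDataSize()` as the test for "accurate std net size" is an assumption.

```java
final class StdNetOrNetSketch {
    /** Hypothetical stand-in exposing three accessors named on this page. */
    interface SizeInfo {
        boolean hasStdDataSize();
        int getStdNetSizePerEntry();
        int getNetSizePerEntry();
    }

    /** Prefers the std net size when it is accurate, else falls back to the observed net size. */
    static int stdNetOrNet(SizeInfo column) {
        return column.hasStdDataSize()
            ? column.getStdNetSizePerEntry()
            : column.getNetSizePerEntry();
    }

    /** Builds a fixed-value SizeInfo for demonstration purposes. */
    static SizeInfo of(final boolean hasStd, final int stdNet, final int net) {
        return new SizeInfo() {
            @Override public boolean hasStdDataSize() { return hasStd; }
            @Override public int getStdNetSizePerEntry() { return stdNet; }
            @Override public int getNetSizePerEntry() { return net; }
        };
    }

    public static void main(String[] args) {
        System.out.println(stdNetOrNet(of(true, 4, 9)));    // accurate std size: 4
        System.out.println(stdNetOrNet(of(false, 54, 23))); // no accurate std size: 23
    }
}
```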
- getTotalDataSize
  
  public int getTotalDataSize()
  
  This is the total data size for the column, including children for map columns. Does not include any overhead of metadata vectors.
- getTotalNetSize
  
  public int getTotalNetSize()
  
  This is the total net size for the column, including children for map columns. Includes overhead of metadata vectors.
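
"Including children for map columns" suggests a recursive total over the tree returned by getChildren(). The sketch below shows that shape with a hypothetical `Column` node; how the real class accumulates child sizes is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TotalSizeSketch {
    /** Hypothetical column node: its own bytes plus named child columns. */
    static final class Column {
        final int ownBytes;
        final Map<String, Column> children = new LinkedHashMap<>();

        Column(int ownBytes) {
            this.ownBytes = ownBytes;
        }

        /** Adds a child and returns this node so calls can be chained. */
        Column child(String name, Column column) {
            children.put(name, column);
            return this;
        }

        /** Total for this column: its own bytes plus the totals of all descendants. */
        int totalBytes() {
            int total = ownBytes;
            for (Column c : children.values()) {
                total += c.totalBytes();
            }
            return total;
        }
    }

    public static void main(String[] args) {
        Column address = new Column(0)
            .child("city", new Column(25))
            .child("zip", new Column(15));
        Column row = new Column(0)
            .child("id", new Column(40))
            .child("address", address);
        System.out.println(row.totalBytes()); // 40 + (25 + 15) = 80
    }
}
```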
- getValueCount
  
  public int getValueCount()
- getElementCount
  
  public int getElementCount()
- getCardinality
  
  public float getCardinality()
- isVariableWidth
  
  public boolean isVariableWidth()
- getChildren
  
  public Map<String,RecordBatchSizer.ColumnSize> getChildren()
- isComplex
  
  public boolean isComplex()
- isRepeatedList
  
  public boolean isRepeatedList()
- allocateVector
  
  public void allocateVector(ValueVector vector, int recordCount)
- toString
  
  public String toString()
  
  Overrides:
  
  toString in class Object
- buildVectorInitializer
  
  public void buildVectorInitializer(VectorInitializer initializer)
  
  Add a single vector initializer to a collection for the entire batch. Uses the observed column size information to predict the size needed when allocating a new vector for the same data. Adds a hint only for variable-width or repeated types; no extra information is needed for fixed-width, non-repeated columns.
  
  Parameters:
  
  initializer - the vector initializer to hold the hints for this column
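
The rule "hint only for variable-width or repeated types" can be sketched as follows. The `Hint` class and the map of hints are hypothetical stand-ins and do not reproduce the VectorInitializer API; the sketch only shows which columns would receive a hint and what information each hint would plausibly carry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InitializerHintSketch {
    /** Hypothetical hint; not Drill's VectorInitializer. */
    static final class Hint {
        final int width;         // predicted width per value, 0 if not needed
        final float cardinality; // predicted elements per entry, 0 if not needed

        Hint(int width, float cardinality) {
            this.width = width;
            this.cardinality = cardinality;
        }
    }

    /** Records a hint only when the type alone does not determine the allocation size. */
    static void addHint(Map<String, Hint> hints, String name, boolean variableWidth,
                        boolean repeated, int observedWidth, float observedCardinality) {
        if (!variableWidth && !repeated) {
            return; // fixed-width, non-repeated: no extra information needed
        }
        hints.put(name, new Hint(variableWidth ? observedWidth : 0,
                                 repeated ? observedCardinality : 0f));
    }

    public static void main(String[] args) {
        Map<String, Hint> hints = new LinkedHashMap<>();
        addHint(hints, "id", false, false, 4, 1f);  // skipped
        addHint(hints, "name", true, false, 20, 1f);
        addHint(hints, "tags", true, true, 8, 3.5f);
        System.out.println(hints.keySet()); // [name, tags]
    }
}
```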
