public final class BatchSizingMemoryUtil extends Object
Modifier and Type | Class and Description |
---|---|
static class | BatchSizingMemoryUtil.ColumnMemoryUsageInfo: A container class that holds memory usage information for a column batch. |
static class | BatchSizingMemoryUtil.VectorMemoryUsageInfo: Container class that holds memory usage information about a variable length ValueVector; all values are in bytes. |
Modifier and Type | Field and Description |
---|---|
static int | BYTE_VALUE_WIDTH: BYTE in-memory width |
static int | DEFAULT_VL_COLUMN_AVG_PRECISION: Default variable length column average precision; computed in such a way that 64k values will fit within one MB to minimize internal fragmentation |
static int | INT_VALUE_WIDTH: INT in-memory width |
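The description of DEFAULT_VL_COLUMN_AVG_PRECISION can be checked with simple arithmetic. The sketch below assumes that "64k values" means 64 * 1024 values and that "one MB" means 1024 * 1024 bytes of data. This page does not state the constant's actual value, so the result is only what that reading implies. The class and method names are hypothetical.

```java
public class DefaultPrecisionSketch {
    // Assumption: "64k values" is read as 64 * 1024 values.
    static final int VALUES_PER_BATCH = 64 * 1024;
    // Assumption: "one MB" is read as 1024 * 1024 bytes.
    static final int TARGET_DATA_BYTES = 1024 * 1024;

    /** Average number of data bytes available per value under the two assumptions above. */
    static int bytesPerValue() {
        return TARGET_DATA_BYTES / VALUES_PER_BATCH;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerValue()); // prints 16
    }
}
```

Under this reading, an average precision of 16 bytes lets 64k values fill a one MB data buffer exactly, which leaves no unused tail in the buffer.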
Modifier and Type | Method and Description |
---|---|
static boolean | canAddNewData(BatchSizingMemoryUtil.ColumnMemoryUsageInfo columnMemoryUsage, long newBitsMemory, long newOffsetsMemory, long newDataMemory): Indicates whether the new data can be added to the column. This method also loads detailed information about the column's current memory usage (with regard to the value vectors). |
static long | computeFixedLengthVectorMemory(ParquetColumnMetadata column, int valueCount) |
static long | computeVariableLengthVectorMemory(ParquetColumnMetadata column, long averagePrecision, int valueCount) |
static int | getAvgVariableLengthColumnTypePrecision(ParquetColumnMetadata column): Returns a default value for variable length columns; it aims at minimizing internal fragmentation. |
static int | getFixedColumnTypePrecision(ParquetColumnMetadata column) |
static void | getMemoryUsage(ValueVector sourceVector, int currValueCount, BatchSizingMemoryUtil.VectorMemoryUsageInfo vectorMemoryUsage): Loads memory usage information for a variable length value vector. |
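The shape of a check like canAddNewData can be illustrated with a minimal sketch. This is not the library's implementation: the ColumnUsage class, its fields, and the notion of a per-column byte quota are assumptions made for illustration. The sketch also takes the current usage as given, whereas the documented method refreshes it from the value vectors.

```java
public class CanAddSketch {
    /** Hypothetical stand-in for ColumnMemoryUsageInfo; all field names are illustrative. */
    static final class ColumnUsage {
        final long quotaBytes;       // assumed memory budget for this column
        final long bitsBytesUsed;    // bytes already used by the nullability (bits) buffer
        final long offsetsBytesUsed; // bytes already used by the offsets buffer
        final long dataBytesUsed;    // bytes already used by the data buffer

        ColumnUsage(long quotaBytes, long bitsBytesUsed, long offsetsBytesUsed, long dataBytesUsed) {
            this.quotaBytes = quotaBytes;
            this.bitsBytesUsed = bitsBytesUsed;
            this.offsetsBytesUsed = offsetsBytesUsed;
            this.dataBytesUsed = dataBytesUsed;
        }

        long totalUsed() {
            return bitsBytesUsed + offsetsBytesUsed + dataBytesUsed;
        }
    }

    /** Returns true when current usage plus the three new amounts stays within the quota. */
    static boolean canAdd(ColumnUsage usage, long newBits, long newOffsets, long newData) {
        long required = usage.totalUsed() + newBits + newOffsets + newData;
        return required <= usage.quotaBytes;
    }

    public static void main(String[] args) {
        ColumnUsage usage = new ColumnUsage(1000, 100, 200, 500); // 800 of 1000 bytes used
        System.out.println(canAdd(usage, 10, 40, 100)); // 950 <= 1000, prints true
        System.out.println(canAdd(usage, 10, 40, 200)); // 1050 > 1000, prints false
    }
}
```

Keeping the three buffers separate mirrors the three "new memory" parameters of canAddNewData, so a caller can account for nullability, offsets, and data independently.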
public static final int BYTE_VALUE_WIDTH
public static final int INT_VALUE_WIDTH
public static final int DEFAULT_VL_COLUMN_AVG_PRECISION
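The width constants hint at how a vector's footprint might be estimated from a value count. The sketch below is an assumption, not the behavior documented on this page: it supposes one byte per value for nullability, one 4-byte offset per value plus one extra entry for variable length vectors, and rounding of each buffer up to a power of two. All names are hypothetical.

```java
public class VectorMemorySketch {
    static final int BYTE_WIDTH = 1; // a Java byte occupies 1 byte; assumed width of a nullability entry
    static final int INT_WIDTH = 4;  // a Java int occupies 4 bytes; assumed width of an offset entry

    /** Smallest power of two that is greater than or equal to n (returns 1 for n <= 1). */
    static long nextPowerOfTwo(long n) {
        if (n <= 1) {
            return 1;
        }
        return Long.highestOneBit(n - 1) << 1;
    }

    /** Assumed estimate for a fixed length vector: data buffer plus optional bits buffer. */
    static long fixedLength(int valueWidth, int valueCount, boolean nullable) {
        long memory = nextPowerOfTwo((long) valueWidth * valueCount);
        if (nullable) {
            memory += nextPowerOfTwo((long) BYTE_WIDTH * valueCount);
        }
        return memory;
    }

    /** Assumed estimate for a variable length vector: data, offsets, and optional bits buffers. */
    static long variableLength(long averagePrecision, int valueCount, boolean nullable) {
        long data = nextPowerOfTwo(averagePrecision * valueCount);
        long offsets = nextPowerOfTwo((long) INT_WIDTH * (valueCount + 1));
        long bits = nullable ? nextPowerOfTwo((long) BYTE_WIDTH * valueCount) : 0;
        return data + offsets + bits;
    }

    public static void main(String[] args) {
        System.out.println(fixedLength(4, 100, false)); // 400 rounds up to 512
        System.out.println(variableLength(16, 3, true)); // 64 + 16 + 4 = 84
    }
}
```

Rounding each buffer to a power of two is one common way to model allocator behavior; whether the real methods do so is not stated here.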
public static boolean canAddNewData(BatchSizingMemoryUtil.ColumnMemoryUsageInfo columnMemoryUsage, long newBitsMemory, long newOffsetsMemory, long newDataMemory)
columnMemoryUsage - container which holds the column's memory usage information (usage information will be automatically updated by this method)
newBitsMemory - new nullable data which might be inserted when processing a new input chunk
newOffsetsMemory - new offsets data which might be inserted when processing a new input chunk
newDataMemory - new data which might be inserted when processing a new input chunk
public static void getMemoryUsage(ValueVector sourceVector, int currValueCount, BatchSizingMemoryUtil.VectorMemoryUsageInfo vectorMemoryUsage)
sourceVector - source value vector
currValueCount - current value count
vectorMemoryUsage - result object which contains source vector memory usage information
public static int getFixedColumnTypePrecision(ParquetColumnMetadata column)
column - fixed column's metadata
public static int getAvgVariableLengthColumnTypePrecision(ParquetColumnMetadata column)
Note that the TypeHelper uses a large default value which might not always be appropriate.
column - variable length column's metadata
public static long computeFixedLengthVectorMemory(ParquetColumnMetadata column, int valueCount)
column - column's metadata
valueCount - number of column values
public static long computeVariableLengthVectorMemory(ParquetColumnMetadata column, long averagePrecision, int valueCount)
column - variable length column's metadata
averagePrecision - variable length (VL) column average precision
valueCount - number of column values
Copyright © 1970 The Apache Software Foundation. All rights reserved.