public class RecordBatchSizer extends Object
Modifier and Type | Class and Description |
---|---|
class |
RecordBatchSizer.ColumnSize
Column size information.
|
Modifier and Type | Field and Description |
---|---|
int |
maxSize
Maximum width of a column; used for memory estimation for Varchar columns
|
int |
nullableCount
Count of nullable columns; used for memory estimation
|
SelectionVector2 |
sv2 |
Constructor and Description |
---|
RecordBatchSizer(RecordBatch batch) |
RecordBatchSizer(VectorAccessible va)
Create empirical metadata for a record batch given a vector accessible
(basically, an iterator over the vectors in the batch).
|
RecordBatchSizer(VectorAccessible va,
SelectionVector2 sv2)
Create empirical metadata for a record batch given a vector accessible
(basically, an iterator over the vectors in the batch) along with a
selection vector for those records.
|
Modifier and Type | Method and Description |
---|---|
void |
allocateVectors(VectorContainer container,
int recordCount) |
void |
applySv2() |
VectorInitializer |
buildVectorInitializer()
The column size information gathered here represents empirically-derived
schema metadata.
|
Map<String,RecordBatchSizer.ColumnSize> |
columns() |
List<RecordBatchSizer.ColumnSize> |
columnsList()
This is a convenience method to get the sizes of columns in the same order that the corresponding value vectors
are stored within a
VectorAccessible. |
long |
getActualSize() |
int |
getAvgDensity() |
RecordBatchSizer.ColumnSize |
getColumn(String name) |
int |
getGrossRowWidth() |
int |
getMaxAvgColumnSize() |
long |
getNetBatchSize() |
int |
getNetRowWidth() |
int |
getNetRowWidthCap50()
Compute the "real" width of the row, taking into account the size of each varchar column
(historically capped at 50 and rounded up to a power of 2 to match DrillBuf allocation)
and the null-marking columns.
|
int |
getRowAllocWidth() |
static int |
getStdNetSizePerEntryCommon(TypeProtos.MajorType majorType,
boolean isOptional,
boolean isRepeated,
boolean isRepeatedList,
Map<String,RecordBatchSizer.ColumnSize> children) |
int |
getStdRowWidth() |
boolean |
hasSv2() |
static long |
multiplyByFactor(long size,
double factor) |
static long |
multiplyByFactors(long size,
double... factors) |
int |
rowCount() |
static int |
safeDivide(int num,
double denom) |
static int |
safeDivide(int num,
float denom) |
static int |
safeDivide(int num,
int denom) |
static int |
safeDivide(long num,
long denom) |
String |
toString() |
public SelectionVector2 sv2
public int maxSize
public int nullableCount
public RecordBatchSizer(RecordBatch batch)
public RecordBatchSizer(VectorAccessible va)
va - iterator over the batch's vectors
public RecordBatchSizer(VectorAccessible va, SelectionVector2 sv2)
va - iterator over the batch's vectors
sv2 - selection vector associated with this batch
public static long multiplyByFactors(long size, double... factors)
public static long multiplyByFactor(long size, double factor)
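The two signatures above suggest scaling a size by one or more multipliers. The following is a minimal sketch of that idea, not the library's implementation: it assumes the product is truncated toward zero when cast back to `long` and that several factors are applied one after another. Neither detail is stated on this page, and the class name `ScaleSketch` is hypothetical.

```java
final class ScaleSketch {
    // Assumption: scale, then truncate toward zero with a long cast.
    static long multiplyByFactor(long size, double factor) {
        return (long) (size * factor);
    }

    // Assumption: factors are applied in order, truncating after each step.
    // With no factors the size is returned unchanged.
    static long multiplyByFactors(long size, double... factors) {
        long result = size;
        for (double factor : factors) {
            result = multiplyByFactor(result, factor);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(multiplyByFactor(1000L, 1.5));       // 1500
        System.out.println(multiplyByFactors(1000L, 1.5, 2.0)); // 3000
    }
}
```

Truncating after each step keeps every intermediate value a whole number of bytes, at the cost of a slightly smaller result than multiplying all factors first.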
public static int getStdNetSizePerEntryCommon(TypeProtos.MajorType majorType, boolean isOptional, boolean isRepeated, boolean isRepeatedList, Map<String,RecordBatchSizer.ColumnSize> children)
public RecordBatchSizer.ColumnSize getColumn(String name)
public void applySv2()
public static int safeDivide(long num, long denom)
public static int safeDivide(int num, int denom)
public static int safeDivide(int num, float denom)
public static int safeDivide(int num, double denom)
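The name `safeDivide` suggests a division that tolerates a zero denominator, which can occur when a batch has no rows. A minimal sketch of one plausible behavior follows. It assumes a zero denominator yields 0 and that the quotient is rounded up so sizes are not under-estimated. Both are assumptions rather than documented behavior, and `SafeDivideSketch` is a hypothetical name. Only the `(long, long)` variant is shown.

```java
final class SafeDivideSketch {
    // Assumption: return 0 rather than throw when the denominator is zero.
    // Assumption: round the quotient up to the next whole number.
    static int safeDivide(long num, long denom) {
        if (denom == 0) {
            return 0;
        }
        return (int) Math.ceil((double) num / denom);
    }

    public static void main(String[] args) {
        System.out.println(safeDivide(7L, 2L)); // 4
        System.out.println(safeDivide(8L, 2L)); // 4
        System.out.println(safeDivide(5L, 0L)); // 0
    }
}
```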
public int rowCount()
public int getStdRowWidth()
public int getRowAllocWidth()
public long getActualSize()
public int getGrossRowWidth()
public int getAvgDensity()
public int getNetRowWidth()
public Map<String,RecordBatchSizer.ColumnSize> columns()
public List<RecordBatchSizer.ColumnSize> columnsList()
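This is a convenience method to get the sizes of columns in the same order that the corresponding value vectors are stored within a VectorAccessible.
public int getNetRowWidthCap50()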
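The summary for `getNetRowWidthCap50()` mentions capping each varchar column's size at 50 and rounding up to a power of 2. The arithmetic can be sketched as below. This illustrates the description only and is not the library's code: the order of operations (cap first, then round) is an assumption, and the names `Cap50Sketch`, `roundUpToPowerOf2`, and `cappedVarcharWidth` are hypothetical.

```java
final class Cap50Sketch {
    // The cap of 50 is taken from the method description.
    static final int VARCHAR_CAP = 50;

    // Smallest power of two that is >= n, for n up to 2^30.
    // Values of n below 1 are treated as 1.
    static int roundUpToPowerOf2(int n) {
        if (n <= 1) {
            return 1;
        }
        return Integer.highestOneBit(n - 1) << 1;
    }

    // Assumption: apply the cap first, then round up to a power of two.
    static int cappedVarcharWidth(int avgWidth) {
        return roundUpToPowerOf2(Math.min(avgWidth, VARCHAR_CAP));
    }

    public static void main(String[] args) {
        System.out.println(cappedVarcharWidth(20));  // 32
        System.out.println(cappedVarcharWidth(200)); // 64
    }
}
```

Under these assumptions, any varchar column averaging 50 bytes or more contributes 64 bytes to the row width.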
public boolean hasSv2()
public long getNetBatchSize()
public int getMaxAvgColumnSize()
public VectorInitializer buildVectorInitializer()
public void allocateVectors(VectorContainer container, int recordCount)
Copyright © 1970 The Apache Software Foundation. All rights reserved.