java.lang.Object
org.apache.lucene.util.ScalarQuantizer
Scalar quantizes float vectors into `int8` byte values. This is a lossy transformation.
Scalar quantization works by first calculating the quantiles of the float vector values. The
quantiles are calculated using the configured confidence interval. The range [minQuantile, maxQuantile]
is then used to scale the values into the range [0, 127], and each scaled value is bucketed into the nearest byte
value.
How Scalar Quantization Works
The basic mathematical equations behind this are fairly straightforward and are based on min/max normalization. Given a float vector `v` and a confidenceInterval `q`, we can calculate the quantiles of the vector values [minQuantile, maxQuantile].
byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
float = (maxQuantile - minQuantile)/127 * byte + minQuantile
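The two equations above can be sketched for a single value as follows. This is a minimal illustration, not the Lucene implementation: the class name `ScalarQuantizationSketch`, the clamping of out-of-range values to the quantile range, and the use of `Math.round` for bucketing are assumptions made for the example.

```java
final class ScalarQuantizationSketch {

    /** Maps a float into [0, 127]. Values outside [minQuantile, maxQuantile] are clamped (an assumption of this sketch). */
    public static byte quantize(float value, float minQuantile, float maxQuantile) {
        float clamped = Math.max(minQuantile, Math.min(maxQuantile, value));
        float scale = 127f / (maxQuantile - minQuantile);
        // byte = (float - minQuantile) * 127/(maxQuantile - minQuantile), rounded to the nearest bucket
        return (byte) Math.round((clamped - minQuantile) * scale);
    }

    /** Approximate inverse: float = (maxQuantile - minQuantile)/127 * byte + minQuantile. */
    public static float deQuantize(byte quantized, float minQuantile, float maxQuantile) {
        float alpha = (maxQuantile - minQuantile) / 127f;
        return alpha * quantized + minQuantile;
    }

    public static void main(String[] args) {
        float min = -1f;
        float max = 1f;
        for (float v : new float[] {-1f, -0.25f, 0f, 0.5f, 1f, 3f}) {
            byte b = quantize(v, min, max);
            // The round trip is lossy: the result differs from v by up to half a bucket, more if v was clamped.
            System.out.println(v + " -> " + b + " -> " + deQuantize(b, min, max));
        }
    }
}
```

For values inside the quantile range, the round-trip error is at most half a bucket width, that is `(maxQuantile - minQuantile)/127/2`.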
This then means to multiply two float values together (e.g. dot_product) we can do the following:
float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
let alpha = (maxQuantile - minQuantile)/127
float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
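Summing the per-component expansion over all dimensions gives a dot product that needs only integer arithmetic on the bytes plus a few float corrections. The sketch below illustrates this under stated assumptions: the class `QuantizedDotProductSketch` and its helpers are hypothetical names, both vectors share the same quantiles, and this is not how Lucene itself stores or applies the corrective terms.

```java
final class QuantizedDotProductSketch {

    /** Quantizes each component into [0, 127], clamping to the quantile range (an assumption of this sketch). */
    public static byte[] quantize(float[] v, float minQuantile, float maxQuantile) {
        float scale = 127f / (maxQuantile - minQuantile);
        byte[] out = new byte[v.length];
        for (int i = 0; i < v.length; i++) {
            float clamped = Math.max(minQuantile, Math.min(maxQuantile, v[i]));
            out[i] = (byte) Math.round((clamped - minQuantile) * scale);
        }
        return out;
    }

    /** Estimates the float dot product from two byte vectors quantized with the same quantiles. */
    public static float quantizedDotProduct(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
        float alpha = (maxQuantile - minQuantile) / 127f;
        int byteDot = 0;
        int sumA = 0;
        int sumB = 0;
        for (int i = 0; i < a.length; i++) {
            byteDot += a[i] * b[i];
            sumA += a[i];
            sumB += b[i];
        }
        // Sum over dimensions of: byte1*byte2*alpha^2 + byte1*minQuantile*alpha + byte2*minQuantile*alpha + minQuantile^2
        return alpha * alpha * byteDot
            + alpha * minQuantile * (sumA + sumB)
            + a.length * minQuantile * minQuantile;
    }

    /** Plain float dot product, for comparison. */
    public static float dotProduct(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {0.1f, -0.5f, 0.9f, 0.3f};
        float[] b = {-0.2f, 0.4f, 0.7f, -0.8f};
        byte[] qa = quantize(a, -1f, 1f);
        byte[] qb = quantize(b, -1f, 1f);
        System.out.println("exact:     " + dotProduct(a, b));
        System.out.println("quantized: " + quantizedDotProduct(qa, qb, -1f, 1f));
    }
}
```

Only the `byteDot` term depends on both vectors at once; the terms involving `sumA` and `sumB` depend on a single vector each, so they can be computed once per vector rather than once per comparison.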
The expansion for square distance is much simpler:
square_distance = (float1 - float2)^2
(float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
= (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)(alpha*byte2 + minQuantile)
this can be simplified to:
= alpha^2 (byte1 - byte2)^2
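Because minQuantile cancels, the square distance needs no corrective terms at all. A minimal sketch, assuming both byte vectors were quantized with the same quantiles (the class and method names are hypothetical, chosen for illustration):

```java
final class QuantizedSquareDistanceSketch {

    /** Estimates the float square distance as alpha^2 * sum((byte1 - byte2)^2). */
    public static float quantizedSquareDistance(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
        float alpha = (maxQuantile - minQuantile) / 127f;
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            int diff = a[i] - b[i];
            sum += diff * diff;
        }
        return alpha * alpha * sum;
    }

    public static void main(String[] args) {
        // With quantiles [-1, 1], byte 0 represents -1 and byte 127 represents 1.
        byte[] a = {0, 127};   // represents (-1, 1)
        byte[] b = {127, 0};   // represents (1, -1)
        System.out.println(quantizedSquareDistance(a, b, -1f, 1f));
    }
}
```

In the `main` example the represented float vectors are (-1, 1) and (1, -1), whose exact square distance is 8, so the printed estimate should be very close to that value.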
-
Nested Class Summary
Nested Classes -
Field Summary
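Fields
| Modifier and Type | Field |
| --- | --- |
| private final float | alpha |
| private final float | confidenceInterval |
| private final float | maxQuantile |
| private final float | minQuantile |
| private static final Random | random |
| static final int | SCALAR_QUANTIZATION_SAMPLE_SIZE |
| private final float | scale |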
-
Constructor Summary
Constructors
ScalarQuantizer(float minQuantile, float maxQuantile, float confidenceInterval)
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | deQuantize(byte[] src, float[] dest) | Dequantize a byte vector into a float vector. |
| static ScalarQuantizer | fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval) | See fromVectors(FloatVectorValues, float, int) for details on how the quantiles are calculated. |
| static ScalarQuantizer | fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval, int totalVectorCount) | This will read the float vector values and calculate the quantiles. |
| (package private) static ScalarQuantizer | fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval, int totalVectorCount, int quantizationSampleSize) | |
| float | getConfidenceInterval() | |
| float | getConstantMultiplier() | |
| float | getLowerQuantile() | |
| (package private) static float[] | getUpperAndLowerQuantile(float[] arr, float confidenceInterval) | Takes an array of floats, sorted or not, and returns a minimum and maximum value. |
| float | getUpperQuantile() | |
| float | quantize(float[] src, byte[] dest, VectorSimilarityFunction similarityFunction) | Quantize a float vector into a byte vector. |
| float | recalculateCorrectiveOffset(byte[] quantizedVector, ScalarQuantizer oldQuantizer, VectorSimilarityFunction similarityFunction) | Recalculate the old score corrective value given new current quantiles. |
| (package private) static int[] | reservoirSampleIndices(int numFloatVecs, int sampleSize) | |
| (package private) static float[] | sampleVectors(FloatVectorValues floatVectorValues, int[] vectorsToTake) | |
| String | toString() | |
-
Field Details
-
SCALAR_QUANTIZATION_SAMPLE_SIZE
public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE
-
alpha
private final float alpha -
scale
private final float scale -
minQuantile
private final float minQuantile -
maxQuantile
private final float maxQuantile -
confidenceInterval
private final float confidenceInterval -
random
private static final Random random
-
-
Constructor Details
-
ScalarQuantizer
public ScalarQuantizer(float minQuantile, float maxQuantile, float confidenceInterval)
Parameters:
minQuantile - the lower quantile of the distribution
maxQuantile - the upper quantile of the distribution
confidenceInterval - the configured confidence interval used to calculate the quantiles
-
-
Method Details
-
quantize
public float quantize(float[] src, byte[] dest, VectorSimilarityFunction similarityFunction)
Quantize a float vector into a byte vector.
Parameters:
src - the source vector
dest - the destination vector
similarityFunction - the similarity function used to calculate the quantile
Returns:
the corrective offset that needs to be applied to the score
-
recalculateCorrectiveOffset
public float recalculateCorrectiveOffset(byte[] quantizedVector, ScalarQuantizer oldQuantizer, VectorSimilarityFunction similarityFunction)
Recalculate the old score corrective value given new current quantiles.
Parameters:
quantizedVector - the old vector
oldQuantizer - the old quantizer
similarityFunction - the similarity function used to calculate the quantile
Returns:
the new offset
-
deQuantize
public void deQuantize(byte[] src, float[] dest)
Dequantize a byte vector into a float vector.
Parameters:
src - the source vector
dest - the destination vector
-
getLowerQuantile
public float getLowerQuantile() -
getUpperQuantile
public float getUpperQuantile() -
getConfidenceInterval
public float getConfidenceInterval() -
getConstantMultiplier
public float getConstantMultiplier() -
toString
public String toString()
-
reservoirSampleIndices
static int[] reservoirSampleIndices(int numFloatVecs, int sampleSize) -
sampleVectors
static float[] sampleVectors(FloatVectorValues floatVectorValues, int[] vectorsToTake) throws IOException
Throws:
IOException
-
fromVectors
public static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval) throws IOException
See fromVectors(FloatVectorValues, float, int) for details on how the quantiles are calculated. NOTE: If there are deleted vectors in the index, do not use this method; use fromVectors(FloatVectorValues, float, int) instead, because its totalVectorCount parameter is used to account for deleted documents when sampling.
Throws:
IOException
-
fromVectors
public static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval, int totalVectorCount) throws IOException
This will read the float vector values and calculate the quantiles. If the number of float vectors is less than SCALAR_QUANTIZATION_SAMPLE_SIZE, then all the values will be read and the quantiles calculated. If the number of float vectors is greater than SCALAR_QUANTIZATION_SAMPLE_SIZE, then a random sample of SCALAR_QUANTIZATION_SAMPLE_SIZE vectors will be read and the quantiles calculated from that sample.
Parameters:
floatVectorValues - the float vector values from which to calculate the quantiles
confidenceInterval - the confidence interval used to calculate the quantiles
totalVectorCount - the total number of live float vectors in the index. This is vital for accounting for deleted documents when calculating the quantiles.
Returns:
A new ScalarQuantizer instance
Throws:
IOException - if there is an error reading the float vector values
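The sampling step can be illustrated with classic reservoir sampling (Algorithm R), which picks a uniform random subset of indices in a single pass. The method name reservoirSampleIndices on this class suggests a reservoir-style approach, but the sketch below is not the Lucene implementation: the class `ReservoirSamplingSketch`, the explicit `Random` parameter, and the final sort are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Random;

final class ReservoirSamplingSketch {

    /** Returns min(total, sampleSize) distinct indices from [0, total), chosen uniformly at random, in ascending order. */
    public static int[] sampleIndices(int total, int sampleSize, Random random) {
        int k = Math.min(total, sampleSize);
        int[] reservoir = new int[k];
        // Start with the first k indices.
        for (int i = 0; i < k; i++) {
            reservoir[i] = i;
        }
        // Each later index i replaces a random reservoir slot with probability k / (i + 1).
        for (int i = k; i < total; i++) {
            int j = random.nextInt(i + 1);
            if (j < k) {
                reservoir[j] = i;
            }
        }
        // Sorted output lets a forward-only reader visit the chosen vectors in order.
        Arrays.sort(reservoir);
        return reservoir;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sampleIndices(1_000, 10, new Random())));
    }
}
```

When `total` is not larger than `sampleSize`, every index is returned, which mirrors the documented behavior of reading all values when there are fewer vectors than the sample size.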
-
fromVectors
static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval, int totalVectorCount, int quantizationSampleSize) throws IOException
Throws:
IOException
-
getUpperAndLowerQuantile
static float[] getUpperAndLowerQuantile(float[] arr, float confidenceInterval)
Takes an array of floats, sorted or not, and returns a minimum and maximum value. These values are such that they reside on the `(1 - confidenceInterval)/2` and `1 - (1 - confidenceInterval)/2` percentiles. Example: providing floats `[0..100]` and asking for the `90`% confidence interval (`confidenceInterval = 0.9`) will return `5` and `95`.
Parameters:
arr - array of floats
confidenceInterval - the configured confidence interval
Returns:
lower and upper quantile values
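The example in the description (floats 0 through 100, a 90% interval, result 5 and 95) can be reproduced with a simple sort-based sketch. This is an illustration only: the class `QuantileSketch` is a hypothetical name, sorting a copy is chosen for clarity rather than speed, and the rounding of the cut-off index is an assumption of this example.

```java
import java.util.Arrays;

final class QuantileSketch {

    /** Returns {lower, upper} after trimming (1 - confidenceInterval)/2 of the values from each end. Requires a non-empty array. */
    public static float[] upperAndLowerQuantile(float[] arr, float confidenceInterval) {
        float[] sorted = arr.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        // Number of values to skip at each end, rounded to the nearest integer.
        int cut = (int) (sorted.length * (1f - confidenceInterval) / 2f + 0.5f);
        // Never trim past the middle of the array.
        cut = Math.min(cut, (sorted.length - 1) / 2);
        return new float[] {sorted[cut], sorted[sorted.length - 1 - cut]};
    }

    public static void main(String[] args) {
        float[] values = new float[101];
        for (int i = 0; i < values.length; i++) {
            values[i] = 100 - i; // deliberately unsorted (descending)
        }
        float[] q = upperAndLowerQuantile(values, 0.9f);
        System.out.println(q[0] + " " + q[1]); // prints "5.0 95.0"
    }
}
```

A selection algorithm would avoid the full sort for large inputs; the sort is used here only to keep the sketch short.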
-