DML_ELEMENT_WISE_QUANTIZE_LINEAR_OPERATOR_DESC structure (directml.h)

Performs the following linear quantization function on every element in InputTensor with respect to its corresponding element in ScaleTensor and ZeroPointTensor, placing the results in the corresponding element of OutputTensor.

// For uint8 output, Min = 0, Max = 255
// For int8 output, Min = -128, Max = 127
f(input, scale, zero_point) = clamp(round(input / scale) + zero_point, Min, Max)

Quantizing involves converting to a lower-precision data type in order to accelerate arithmetic. It's a common way to increase performance at the cost of precision; for example, a group of 8-bit values can be processed faster than a group of 32-bit values.
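For illustration, here is a minimal C++ sketch of the per-element computation for a UINT8 output. The function name and parameters are hypothetical (this is not part of the DirectML API), and the hardware's rounding behavior may differ in edge cases; DirectML performs this work on the GPU.

#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical reference implementation of the per-element quantization
// for a UINT8 output (Min = 0, Max = 255).
uint8_t QuantizeLinearToUint8(float input, float scale, uint8_t zeroPoint)
{
    float value = std::round(input / scale) + static_cast<float>(zeroPoint);
    return static_cast<uint8_t>(std::clamp(value, 0.0f, 255.0f));
}

// Example: input = 12.5, scale = 0.5, zero point = 10.
// round(12.5 / 0.5) + 10 = 25 + 10 = 35, which already lies within [0, 255],
// so the output element is 35.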

Syntax

struct DML_ELEMENT_WISE_QUANTIZE_LINEAR_OPERATOR_DESC {
  const DML_TENSOR_DESC *InputTensor;
  const DML_TENSOR_DESC *ScaleTensor;
  const DML_TENSOR_DESC *ZeroPointTensor;
  const DML_TENSOR_DESC *OutputTensor;
};
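As an illustration of how this description might be used, the following sketch creates the operator for a FLOAT32 input quantized to a UINT8 output. It assumes directml.h and wrl/client.h are included, that dmlDevice is a valid IDMLDevice*, and that the 1 x 1 x 3 x 4 sizes are placeholders; error handling is omitted.

// Placeholder sizes; all four tensors must share the same DimensionCount and Sizes.
UINT sizes[4] = { 1, 1, 3, 4 };
UINT elementCount = 1 * 1 * 3 * 4;

DML_BUFFER_TENSOR_DESC inputBuffer = {};
inputBuffer.DataType = DML_TENSOR_DATA_TYPE_FLOAT32;
inputBuffer.DimensionCount = 4;
inputBuffer.Sizes = sizes;
inputBuffer.TotalTensorSizeInBytes = elementCount * sizeof(float);
DML_TENSOR_DESC inputTensor = { DML_TENSOR_TYPE_BUFFER, &inputBuffer };

// FLOAT32 input, so the scales are also FLOAT32 (see ScaleTensor below).
DML_BUFFER_TENSOR_DESC scaleBuffer = inputBuffer;
DML_TENSOR_DESC scaleTensor = { DML_TENSOR_TYPE_BUFFER, &scaleBuffer };

// UINT8 zero points and output; total size is rounded up to a 4-byte multiple.
DML_BUFFER_TENSOR_DESC zeroPointBuffer = {};
zeroPointBuffer.DataType = DML_TENSOR_DATA_TYPE_UINT8;
zeroPointBuffer.DimensionCount = 4;
zeroPointBuffer.Sizes = sizes;
zeroPointBuffer.TotalTensorSizeInBytes = (static_cast<UINT64>(elementCount) + 3) & ~3ull;
DML_TENSOR_DESC zeroPointTensor = { DML_TENSOR_TYPE_BUFFER, &zeroPointBuffer };

DML_BUFFER_TENSOR_DESC outputBuffer = zeroPointBuffer;
DML_TENSOR_DESC outputTensor = { DML_TENSOR_TYPE_BUFFER, &outputBuffer };

DML_ELEMENT_WISE_QUANTIZE_LINEAR_OPERATOR_DESC quantizeDesc = {};
quantizeDesc.InputTensor = &inputTensor;
quantizeDesc.ScaleTensor = &scaleTensor;
quantizeDesc.ZeroPointTensor = &zeroPointTensor;
quantizeDesc.OutputTensor = &outputTensor;

DML_OPERATOR_DESC opDesc = { DML_OPERATOR_ELEMENT_WISE_QUANTIZE_LINEAR, &quantizeDesc };

Microsoft::WRL::ComPtr<IDMLOperator> dmlOperator;
HRESULT hr = dmlDevice->CreateOperator(&opDesc, IID_PPV_ARGS(&dmlOperator));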

Members

InputTensor

Type: const DML_TENSOR_DESC*

The tensor containing the inputs.

ScaleTensor

Type: const DML_TENSOR_DESC*

The tensor containing the scales. If InputTensor is INT32, then ScaleTensor must be FLOAT32. Otherwise, ScaleTensor must have the same DataType as InputTensor.

ZeroPointTensor

Type: const DML_TENSOR_DESC*

The tensor containing the desired zero point for the quantization. Starting with DML_FEATURE_LEVEL_6_2, this tensor is optional (see Tensor support).

OutputTensor

Type: const DML_TENSOR_DESC*

The output tensor to write the results to.

Availability

This operator was introduced in DML_FEATURE_LEVEL_1_0.

Tensor constraints

  • InputTensor, OutputTensor, ScaleTensor, and ZeroPointTensor must have the same DimensionCount and Sizes.
  • OutputTensor and ZeroPointTensor must have the same DataType.

Tensor support

DML_FEATURE_LEVEL_6_2 and above

Tensor           Kind            Supported dimension counts  Supported data types
InputTensor      Input           1 to 8                      FLOAT32, FLOAT16, INT32
ScaleTensor      Input           1 to 8                      FLOAT32, FLOAT16
ZeroPointTensor  Optional input  1 to 8                      INT8, UINT8
OutputTensor     Output          1 to 8                      INT8, UINT8

DML_FEATURE_LEVEL_6_0 and above

Tensor           Kind            Supported dimension counts  Supported data types
InputTensor      Input           1 to 8                      FLOAT32, FLOAT16, INT32
ScaleTensor      Input           1 to 8                      FLOAT32, FLOAT16
ZeroPointTensor  Input           1 to 8                      INT8, UINT8
OutputTensor     Output          1 to 8                      INT8, UINT8

DML_FEATURE_LEVEL_3_0 and above

Tensor           Kind            Supported dimension counts  Supported data types
InputTensor      Input           1 to 8                      FLOAT32, INT32
ScaleTensor      Input           1 to 8                      FLOAT32
ZeroPointTensor  Input           1 to 8                      INT8, UINT8
OutputTensor     Output          1 to 8                      INT8, UINT8

DML_FEATURE_LEVEL_2_1 and above

Tensor           Kind            Supported dimension counts  Supported data types
InputTensor      Input           4                           FLOAT32, INT32
ScaleTensor      Input           4                           FLOAT32
ZeroPointTensor  Input           4                           INT8, UINT8
OutputTensor     Output          4                           INT8, UINT8

DML_FEATURE_LEVEL_1_0 and above

Tensor           Kind            Supported dimension counts  Supported data types
InputTensor      Input           4                           FLOAT32
ScaleTensor      Input           4                           FLOAT32
ZeroPointTensor  Input           4                           UINT8
OutputTensor     Output          4                           UINT8

Requirements

Requirement  Value
Header       directml.h

See also

DML_ELEMENT_WISE_DEQUANTIZE_LINEAR_OPERATOR_DESC