hidet.ops
Todo
We are still working on the documentation of operators.
Functions:
- hidet.ops.adaptive_avg_pool1d(x, output_size)
- hidet.ops.adaptive_avg_pool2d(x, output_size)
- hidet.ops.adaptive_avg_pool3d(x, output_size)
- hidet.ops.adaptive_max_pool1d(x, output_size)
- hidet.ops.adaptive_max_pool2d(x, output_size)
- hidet.ops.adaptive_max_pool3d(x, output_size)
- hidet.ops.add(x, y)
- hidet.ops.all(x, /, *, axis=None, keepdims=False)
Check whether all elements along the given axis evaluate to True.
- Parameters:
x (Tensor) – The input tensor.
axis (int or Sequence[int], optional) – The axis or axes along which to perform the logical AND. None indicates that the reduction is performed over the whole tensor. When an integer or a sequence of integers is given, each must be in the range [-N, N), where N is the rank of the input tensor.
keepdims (bool, default=False) – Whether to keep the reduced dimensions.
- Returns:
ret – The result of logical AND reduction with bool data type.
- Return type:
Tensor
- hidet.ops.all_gather(x, nranks, comm_id=0)
- hidet.ops.all_reduce(x, op, comm_id=0)
- hidet.ops.any(x, /, *, axis=None, keepdims=False)
Check whether any element along the given axis evaluates to True.
- Parameters:
x (Tensor) – The input tensor.
axis (int or Sequence[int], optional) – The axis or axes along which to perform the logical OR. None indicates that the reduction is performed over the whole tensor. When an integer or a sequence of integers is given, each must be in the range [-N, N), where N is the rank of the input tensor.
keepdims (bool, default=False) – Whether to keep the reduced dimensions.
- Returns:
ret – The result of logical OR reduction with bool data type.
- Return type:
Tensor
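A minimal usage sketch covering both all and any (hidet.asarray is used here to build a small boolean tensor; the commented results follow the semantics described above):

```python
import hidet

x = hidet.asarray([[True, False], [True, True]])

hidet.ops.all(x, axis=1)   # -> [False, True]
hidet.ops.any(x, axis=0)   # -> [True, True]
hidet.ops.all(x)           # axis=None reduces the whole tensor -> False
```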
- hidet.ops.argmax(x, dim, keep_dim=False)
- hidet.ops.argmin(x, dim, keep_dim=False)
- hidet.ops.attention(q, k, v, mask=None, is_causal=False)
- hidet.ops.avg_pool2d(x, kernel, stride, padding, ceil_mode=False)
- hidet.ops.barrier(x)
The barrier operator is an identity operator: it returns the same tensor as its input. During graph-level optimizations, it prevents the producer of its input tensor and the consumer of its output tensor from being fused. The operator is eliminated at the end of graph-level optimizations.
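A sketch of where a barrier might be placed when tracing a graph, to keep two stages from being fused (hidet.symbol and hidet.trace_from are the standard tracing entry points; the surrounding ops are arbitrary choices for illustration):

```python
import hidet

x = hidet.symbol([2, 8], dtype='float32', device='cpu')
y = hidet.ops.relu(x)
y = hidet.ops.barrier(y)            # fusion will not cross this point
z = hidet.ops.softmax(y, axis=-1)
graph = hidet.trace_from(z, inputs=[x])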
- hidet.ops.batch_matmul(a, b, mma='simt')
Batched matrix multiplication.
- Parameters:
a (Tensor) – The lhs operand with shape [batch_size, m_size, k_size].
b (Tensor) – The rhs operand with shape [batch_size, k_size, n_size].
mma (str) –
The matrix-multiplication-accumulate (mma) in warp level:
- 'simt':
Use CUDA cores to perform the warp-level mma (simt stands for single-instruction-multiple-threads).
- 'mma':
Use the mma instruction.
See also: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions
- Returns:
c – The result tensor of matrix multiplication.
- Return type:
Tensor
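A minimal sketch, assuming a CUDA device is available (the shapes follow the [batch, m, k] x [batch, k, n] convention above and are otherwise arbitrary):

```python
import hidet

a = hidet.randn([8, 64, 32], device='cuda')   # [batch_size, m_size, k_size]
b = hidet.randn([8, 32, 128], device='cuda')  # [batch_size, k_size, n_size]
c = hidet.ops.batch_matmul(a, b, mma='simt')
print(c.shape)                                # [8, 64, 128]
```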
- hidet.ops.batch_norm_infer(x, running_mean, running_var, epsilon=1e-05, axis=1)
- hidet.ops.clamp(x, min, max)
- hidet.ops.clip(x, min_val, max_val)
- hidet.ops.concat(tensors, axis)
- hidet.ops.conv1d(data, weight, stride=1, dilations=1, groups=1)
- hidet.ops.conv1d_gemm(data, weight, stride, dilation=1, groups=1)
- hidet.ops.conv1d_transpose(data, weight, stride=1, padding=0, groups=1, output_padding=0)
- hidet.ops.conv2d(data, weight, stride=(1, 1), dilations=(1, 1), groups=1, padding=(0, 0))
- hidet.ops.conv2d_channel_last(data, weight, stride=(1, 1), dilations=(1, 1), groups=1, padding=(0, 0))
- hidet.ops.conv2d_gemm(data, weight, stride, dilations, groups=1)
- hidet.ops.conv2d_gemm_fp16(img, weight, padding=0, stride=(1, 1), dilations=(1, 1), groups=1, parallel_k_parts=1, disable_cp_async=False)
- hidet.ops.conv2d_gemm_fp16_channel_last(img, weight, padding, stride, dilations, groups, parallel_k_parts=1, disable_cp_async=False)
- hidet.ops.conv2d_gemm_image_transform(x, kernel, stride, dilations, groups=1)
- hidet.ops.conv2d_transpose(data, weight, stride, padding, groups=1, output_padding=0)
- hidet.ops.conv2d_transpose_gemm(data, weight, stride, padding, groups=1, output_padding=0)
- hidet.ops.conv3d(data, weight, stride=(1, 1, 1), dilations=(1, 1, 1), groups=1)
- hidet.ops.conv3d_gemm(data, weight, stride, dilations, groups=1)
- hidet.ops.conv3d_transpose(data, weight, stride=(1, 1, 1), padding=(0, 0, 0), groups=1, output_padding=0)
- hidet.ops.conv_pad(data, pads, value=0.0)
- hidet.ops.cumsum(x, dim, exclusive=False, reverse=False)
- hidet.ops.divide(x, y)
- hidet.ops.full(shape, value, dtype=None, device='cpu')
- hidet.ops.fused_operator(*inputs, fused_graph, anchor=None)
- hidet.ops.gather(data, indices, axis=0)
- hidet.ops.gelu(x, approximate=False)
- hidet.ops.group_norm(x, num_groups, epsilon=1e-05, accumulate_dtype='float32')
Group normalization: the channels are divided into num_groups groups, and each group is normalized independently.
- hidet.ops.hardshrink(x, lambda_val)
- hidet.ops.hardtanh(x, min_val, max_val)
- hidet.ops.instance_norm(x, epsilon=1e-05, accumulate_dtype='float32')
Instance normalization: each channel of each instance is normalized independently over its spatial dimensions.
- hidet.ops.layer_norm(x, num_last_dims=1, epsilon=1e-05, accumulate_dtype='float32')
Layer normalization.
- Parameters:
x (Tensor) – The data to be normalized.
num_last_dims (int) – The number of dimensions to be normalized, starting from the end dimension of x.
epsilon (float) – The epsilon added to variance.
accumulate_dtype (str) – The precision used for accumulation during the reduction.
- Returns:
ret – The normalized tensor.
- Return type:
Tensor
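A minimal sketch normalizing the last dimension of a [batch, seq, hidden]-shaped tensor (shapes and device are illustrative, not required by the operator):

```python
import hidet

x = hidet.randn([2, 8, 16])                    # [batch, seq, hidden], on CPU by default
y = hidet.ops.layer_norm(x, num_last_dims=1)   # normalize over the last (hidden) dimension
print(y.shape)                                 # [2, 8, 16]; mean ~0, variance ~1 along the last axis
```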
- hidet.ops.linspace(start, stop, /, num, *, dtype=None, device='cpu', endpoint=True)
- Return type:
Tensor
- hidet.ops.lp_norm(x, p=2.0, dim=1, eps=1e-12)
Lp normalization: x is normalized along dim by its Lp norm, with eps for numerical stability.
- hidet.ops.matmul(a, b, require_prologue=False)
- hidet.ops.max(x, dims, keep_dim=False)
- hidet.ops.max_pool2d(x, kernel, stride, padding, ceil_mode=False)
- hidet.ops.maximum(a, b, *others)
- hidet.ops.mean(x, dims, keep_dim=False)
- hidet.ops.min(x, dims, keep_dim=False)
- hidet.ops.minimum(a, b, *others)
- hidet.ops.multiply(x, y)
- hidet.ops.pad(data, pads, mode='constant', value=0.0)
- hidet.ops.permute_dims(x, /, axes)
- hidet.ops.prod(x, dims, keep_dim=False)
- hidet.ops.rearrange(x, plan)
Rearrange a tensor. This operator generalizes squeeze, unsqueeze, flatten, and permute.
- Parameters:
x (Tensor) – The input tensor.
plan (List[List[int]]) – The rearrange plan: each inner list gives the input dimensions that are merged into one output dimension, and an empty inner list inserts a dimension of size 1.
- Returns:
ret – The rearranged tensor.
- Return type:
Tensor
Examples
squeeze([1, 1, 2, 3], dims=[0, 1]) = rearrange([1, 1, 2, 3], plan=[[2], [3]]) => Tensor([2, 3])
unsqueeze([2, 3], dims=[0, 1]) = rearrange([2, 3], plan=[[], [], [0], [1]]) => Tensor([1, 1, 2, 3])
flatten([2, 3, 4, 5], start_dim=1, end_dim=2) = rearrange([2, 3, 4, 5], plan=[[0], [1, 2], [3]]) => Tensor([2, 12, 5])
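The equivalences above as a runnable sketch (shapes are illustrative):

```python
import hidet

x = hidet.randn([1, 1, 2, 3])
print(hidet.ops.rearrange(x, plan=[[2], [3]]).shape)          # [2, 3], like squeeze(dims=[0, 1])
print(hidet.ops.rearrange(x, plan=[[0], [1], [2, 3]]).shape)  # [1, 1, 6], merging the last two dims

y = hidet.randn([2, 3, 4, 5])
print(hidet.ops.rearrange(y, plan=[[0], [1, 2], [3]]).shape)  # [2, 12, 5], like flatten(1, 2)
```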
- hidet.ops.reduce_scatter(x, op, comm_id=0)
- hidet.ops.resize2d(data, *, size=None, scale_factor=None, method='nearest', coordinate_transformation_mode='half_pixel', rounding_method='round_prefer_floor', roi=None, cubic_alpha=-0.75, cubic_exclude=False, extrapolation_value=None, recompute_scale_factor=None)
- Parameters:
data (Tensor) –
size (Sequence[int] | None) –
scale_factor (float | Sequence[float] | None) –
method (str) –
coordinate_transformation_mode (str) –
rounding_method (str) –
roi (Optional) –
cubic_alpha (float | None) –
cubic_exclude (bool | None) –
extrapolation_value (float | None) –
recompute_scale_factor (bool | None) –
- Return type:
Tensor
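A minimal sketch, assuming an NCHW image tensor (the input shape and target size are illustrative; the keyword defaults listed above govern the exact interpolation semantics):

```python
import hidet

img = hidet.randn([1, 3, 32, 32])                            # NCHW image
up = hidet.ops.resize2d(img, size=[64, 64], method='nearest')
print(up.shape)                                              # [1, 3, 64, 64]
```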
- hidet.ops.roll(x, shifts, dims=None)
- hidet.ops.set_strided_slice(data, starts, ends, strides=None, setvalue=0.0)
- hidet.ops.softplus(x, beta, threshold_val)
- hidet.ops.softshrink(x, lambda_val)
- hidet.ops.split(data, parts_or_sections, axis=0)
- hidet.ops.squeeze(x, dims)
- hidet.ops.std(x, dims, keep_dim=False)
- hidet.ops.strided_slice(data, starts, ends, axes=None, strides=None)
- hidet.ops.subtract(x, y)
- hidet.ops.sum(x, dims, keep_dim=False)
- hidet.ops.symmetric_dequantize(wq, scale, dims=-1)
- hidet.ops.symmetric_quantize(w, quant_type='int8', dims=-1)
- hidet.ops.take(data, indices, axis=0)
- hidet.ops.tile(data, repeats)
Tile a tensor. See https://numpy.org/doc/stable/reference/generated/numpy.tile.html.
- Parameters:
data (Tensor) – The input tensor to be tiled.
repeats (Sequence[int]) – A list of integers to represent the number of repeats for each dimension. Must have len(repeats) == len(data.shape).
- Returns:
ret – The tiled tensor, with shape [a * b for a, b in zip(data.shape, repeats)].
- Return type:
Tensor
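A minimal sketch matching the numpy.tile semantics referenced above:

```python
import hidet

x = hidet.randn([2, 3])
y = hidet.ops.tile(x, repeats=[2, 4])
print(y.shape)   # [4, 12] == [2 * 2, 3 * 4]
```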
- hidet.ops.transfer(x, dst_device)
- hidet.ops.transpose(x, axes=None)
- hidet.ops.tri(n, m=None, k=0, dtype=hidet.float32, device='cpu')
- hidet.ops.unsqueeze(x, dims)
- hidet.ops.var(x, dims, keep_dim=False)