CPU Specifics¶
Primitive functions¶
Hidet provides primitives to use the avx instructions in modern cpu. They includes
avx_f32x4_load(...): vectorized load 4 f32 values from memoryavx_f32x4_store(...): vectorized store 4 f32 values to memoryavx_f32x4_fmadd(...): vectorized fused multiply-add operationavx_f32x4_setzero(...): get the zero initialized vectoravx_f32x4_broadcast(...): broadcast a scalar to a vector
There are also corresponding f32x8 primitives.
Multi-threading¶
Hidet relies on the OpenMP to support multi-threading. To use the multi-threading, please specify the
p attribute of the hidet.lang.grid or hidet.lang.mapping.repeat functions.