rocFFT 1.0.18 for ROCm 5.3.0

@lawruble13 lawruble13 released this 30 Sep 19:25
11c649a

Changed

  • The runtime compilation cache now checks the environment variables XDG_CACHE_HOME (on Linux) and LOCALAPPDATA (on Windows) before falling back to HOME.
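
The lookup order described above can be sketched roughly as follows. This is an illustrative Python model, not rocFFT's actual code: the function name `cache_base_dir` and its parameters are hypothetical, and treating an empty variable as unset is an assumption. The sketch only resolves the base directory; the cache file name and any subdirectory rocFFT uses are not covered here.

```python
import os
import sys


def cache_base_dir(env=None, platform=None):
    """Return the base directory for a cache, or None if nothing is set.

    Hypothetical helper: models the order "platform-specific variable
    first, then HOME" described in the release note.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    # Windows prefers LOCALAPPDATA; other platforms prefer XDG_CACHE_HOME.
    preferred = "LOCALAPPDATA" if platform.startswith("win") else "XDG_CACHE_HOME"

    # Try the preferred variable first, then fall back to HOME.
    for name in (preferred, "HOME"):
        value = env.get(name)
        if value:  # assumption: an empty value counts as unset
            return value
    return None
```

Calling `cache_base_dir()` with no arguments reads the real environment; passing a dictionary and a platform string makes the behavior easy to check without touching the process environment.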

Optimizations

  • Optimized 2D R2C/C2R to use 2-kernel plans where possible.
  • Improved performance of the Bluestein algorithm.
  • Optimized sbcc-168 and sbcc-100 by using half-LDS.

Fixed

  • Fixed occasional failures to parallelize runtime compilation of kernels.
    Failed compilations were retried serially and ultimately succeeded, but this took extra time.
  • Fixed failures of some R2C 3D transforms that use the unsupported TILE_UNALIGNED SBRC kernels.
    An example is 98^3 R2C out-of-place.
  • Fixed bugs in SBRC_ERC type.