rocFFT 1.0.25 for ROCm 6.0.0

@rocm-ci rocm-ci released this 15 Dec 18:30
b9926b5

Added

  • Implemented experimental APIs to allow computing FFTs on data distributed across multiple devices in a single process.

    rocfft_field is a new type that can be added to a plan description to describe the layout of the FFT input or output. rocfft_field_add_brick can be called one or more times to describe a brick decomposition of an FFT field, where each brick can be assigned to a different device.

    These interfaces are still experimental and subject to change. We are interested to hear feedback on them. Questions and concerns may be raised by opening issues on the rocFFT issue tracker.

    Note that at this time, multi-device FFTs have several limitations:

    • Real-complex (forward or inverse) FFTs are not currently supported.
    • Planar format fields are not currently supported.
    • Batch (i.e. number_of_transforms provided to rocfft_plan_create) must be 1.
    • The FFT input is gathered to the current device at execute time, so all of the FFT data must fit on that device.

    We expect these limitations to be removed in future releases.
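
As a rough illustration of what a brick decomposition looks like, the sketch below splits a 3D field into contiguous slabs along its first dimension, one slab per device. It is plain host-side C++ and does not call rocFFT. The `Brick` struct, the `split_along_first_dim` helper, and the choice to treat `upper` as an exclusive bound are assumptions made for this example only; the actual arguments expected by rocfft_field_add_brick should be taken from the rocFFT documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one brick: a box of field indices plus the
// device that would own it. Not a rocFFT type.
struct Brick
{
    std::size_t lower[3]; // inclusive lower corner (convention of this sketch)
    std::size_t upper[3]; // exclusive upper corner (convention of this sketch)
    int         device;   // index of the device assigned to this brick
};

// Split a 3D field into up to num_devices slabs along dimension 0.
// Slab boundaries are computed with integer arithmetic so that the slabs
// cover the field exactly once and differ in thickness by at most one.
std::vector<Brick> split_along_first_dim(const std::size_t (&lengths)[3], int num_devices)
{
    assert(num_devices > 0);
    std::vector<Brick>  bricks;
    const std::size_t   n = lengths[0];
    const std::size_t   d = static_cast<std::size_t>(num_devices);
    for(std::size_t i = 0; i < d; ++i)
    {
        const std::size_t begin = n * i / d;
        const std::size_t end   = n * (i + 1) / d;
        if(begin == end)
            continue; // more devices than planes: skip empty slabs
        bricks.push_back(Brick{{begin, 0, 0}, {end, lengths[1], lengths[2]}, static_cast<int>(i)});
    }
    return bricks;
}
```

Each resulting brick carries the kind of information a brick description needs: where it sits in the field and which device holds it. Slab decomposition along one dimension is only one possible choice; any set of non-overlapping boxes that covers the field would serve the same purpose in this sketch.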

Optimizations

  • Improved performance of some small 2D/3D real FFTs supported by the 2D_SINGLE kernel. gfx90a gains
    additional optimization from offline tuning.
  • Removed an extra kernel launch from even-length real-complex FFTs that use callbacks.

Changed

  • Kernels in the solution map are now built into the library kernel cache.

  • Real forward transforms (real-to-complex) no longer overwrite input. rocFFT still may overwrite real inverse (complex-to-real) input, as this allows for faster performance.

  • rocfft-rider and dyna-rocfft-rider have been renamed to rocfft-bench and dyna-rocfft-bench, and are now
    controlled by the BUILD_CLIENTS_BENCH CMake option. Links for the old file names are installed, and the old
    BUILD_CLIENTS_RIDER CMake option is still accepted for compatibility, but both will be removed in a future release.

  • Binaries in debug builds no longer have a "-d" suffix.

Fixed

  • rocFFT now correctly handles load callbacks that convert data from a smaller data type (e.g. 16-bit integers -> 32-bit float).
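
To make the data-type conversion case concrete, here is a minimal host-side sketch of the idea behind such a load callback: the stored samples are 16-bit integers, and each one is widened to a 32-bit float as it is read. This is not rocFFT's device callback signature. The function names, the `const void*` buffer parameter, and the optional scale factor are all assumptions made for illustration, and a real callback would be a device function written to the interface given in the rocFFT documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side stand-in for a converting load callback.
// buffer    : storage that actually holds 16-bit integer samples
// offset    : element index, counted in 16-bit samples (not in floats)
// scale_ptr : optional user data; here a float scale factor, may be null
float load_int16_as_float(const void* buffer, std::size_t offset, const void* scale_ptr)
{
    assert(buffer != nullptr);
    const auto* raw   = static_cast<const std::int16_t*>(buffer);
    const float scale = scale_ptr ? *static_cast<const float*>(scale_ptr) : 1.0f;
    // Widen the smaller stored type to the type the transform works on.
    return static_cast<float>(raw[offset]) * scale;
}

// Apply the conversion to every sample, as a transform would do while
// reading its input one element at a time.
std::vector<float> load_all(const std::vector<std::int16_t>& raw, float scale)
{
    std::vector<float> out;
    out.reserve(raw.size());
    for(std::size_t i = 0; i < raw.size(); ++i)
        out.push_back(load_int16_as_float(raw.data(), i, &scale));
    return out;
}
```

The point of the sketch is that the offset indexes the smaller stored type, so the input buffer is half the size the 32-bit element type alone would suggest.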