Skip to content

Latest commit

 

History

History
57 lines (46 loc) · 1.75 KB

2024-03-20.md

File metadata and controls

57 lines (46 loc) · 1.75 KB

2024-03-20

Agenda

  1. Priority issues and pull requests
  2. Proposal for a Kernel Language WG
  3. Expanding runtime API (new features)

Notes

Proposal for a Kernel Language WG

  • A dedicated time to discuss topics will make sure this happens
  • Include new transpiler developers
  • Initial meeting to map out discussions, meeting frequency
  • Noel has tentatively volunteered to lead this WG

Potential Topics

  • Constant memory?
  • Collectives
  • Advanced instructions (e.g., fma's, tensor inst.)

Expanding Runtime API

  • Q: Should backend-specific functionality be supported? How?
  • Copy-to-symbol? (global/constant module scope variables)
    • Is this supported by most backend programming models?
  • Query kernel properties
  • Fill/memset (potentially outside of core scope, OCCA BLAS?)
  • Aligned alloc
    • Q: What is the default alignment for each backend?
  • Rectangular (2D/3D) memory allocation, copy
    • Potentially limited use cases?
  • Stream-ordered allocation
    • Convenience for programmer vs. performance improvement?
    • Potential use case: non-blocking streams
  • Suggested/auto @inner size (e.g., occupancy-based)
    • Would potentially lose benefit from launch-bounds
    • Maybe limited use cases--target is simple kernels where block size is less critical
  • Modules/kernel-bundles
  • Host-side callbacks
    • E.g., Coordinate with MPI calls?
  • Execution graphs
    • Need to gather use cases
  • Dynamic @shared sizes
    • E.g., to overcome limitation on size of static size shared memory
    • Potential pitfall for performance
  • Cache-configuration (L1-to-SHARED ratio)
    • Binary option
    • Causes device flush?
    • Potential for users to hurt performance if used unwisely
    • Highly likely to be backend specific

Next Meeting