Basic support for Custom Kernels. #28
```python
from pyccel.decorators import kernel

@kernel
def increment_by_one(an_array):
    # Thread id in a 1D block
    tx = cuda.threadIdx(0)
    # Block id in a 1D grid
    ty = cuda.blockIdx(0)
    # Block width, i.e. number of threads per block
    bw = cuda.blockDim(0)
    # Compute flattened index inside the array
    pos = tx + ty * bw
    if pos < an_array.size:  # Check array boundaries
        an_array[pos] += 1
```

Your code looks a little problematic to me. I would have expected code such as:

```python
from numba import cuda
from pyccel.decorators import kernel

@kernel
def increment_by_one(an_array):
    # Thread id in a 1D block
    tx = cuda.threadIdx.x
    # Block id in a 1D grid
    ty = cuda.blockIdx.x
    # Block width, i.e. number of threads per block
    bw = cuda.blockDim.x
    # Compute flattened index inside the array
    pos = tx + ty * bw
    if pos < an_array.size:  # Check array boundaries
        an_array[pos] += 1
```

Would the latter run in pure Python at all?
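To explore that question, here is a minimal sketch of how the Numba-style kernel could be executed sequentially in pure Python. The `_Dim3`, `_FakeCuda`, and `run_kernel` names and the single-axis grid are illustrative assumptions, not part of Pyccel or Numba:

```python
# Hypothetical pure-Python stand-in for the `cuda` module, so the
# Numba-style kernel above can run sequentially without a GPU.
import numpy as np

class _Dim3:
    """Holds the current x index; y/z omitted in this 1D sketch."""
    def __init__(self):
        self.x = 0

class _FakeCuda:
    def __init__(self):
        self.threadIdx = _Dim3()
        self.blockIdx = _Dim3()
        self.blockDim = _Dim3()

cuda = _FakeCuda()

def increment_by_one(an_array):
    tx = cuda.threadIdx.x      # Thread id in a 1D block
    ty = cuda.blockIdx.x       # Block id in a 1D grid
    bw = cuda.blockDim.x       # Threads per block
    pos = tx + ty * bw         # Flattened index
    if pos < an_array.size:    # Check array boundaries
        an_array[pos] += 1

def run_kernel(kernel, blocks, threads_per_block, *args):
    """Emulate a 1D grid launch by looping over every (block, thread)."""
    cuda.blockDim.x = threads_per_block
    for b in range(blocks):
        for t in range(threads_per_block):
            cuda.blockIdx.x = b
            cuda.threadIdx.x = t
            kernel(*args)

arr = np.zeros(10)
run_kernel(increment_by_one, 3, 4, arr)  # 12 threads cover 10 elements
print(arr)  # every element incremented exactly once
```

This only works because `threadIdx`/`blockIdx` are module-level objects that the emulated launch mutates between calls; real Numba instead compiles the kernel and supplies these values per thread on the device.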
```python
from pyccel import cuda
from pyccel.decorators import kernel

@kernel
def increment_by_one(an_array):
    # Thread id in a 1D block
    tx = cuda.threadIdx(0)
    # Block id in a 1D grid
    ty = cuda.blockIdx(0)
    # Block width, i.e. number of threads per block
    bw = cuda.blockDim(0)
    # Compute flattened index inside the array
    pos = tx + ty * bw
    if pos < an_array.size:  # Check array boundaries
        an_array[pos] += 1
```

Sorry, I missed a
In that case, shouldn't it be
Yeah, that would be better 👍 will change it.
This pull request addresses issue #28 by implementing a new feature in Pyccel that allows users to define custom GPU kernels. The syntax for creating these kernels is inspired by Numba. I also need to fix issue #45 for testing purposes.

**Commit Summary**

- Introduced the `KernelCall` class
- Added CUDA printer methods `_print_KernelCall` and `_print_FunctionDef` to generate the corresponding CUDA representation for both kernel calls and definitions
- Added `IndexedFunctionCall`, which represents an indexed function call
- Added the CUDA module and `cuda.synchronize()`
- Fixed a bug found in the header: it did not import the necessary header for the used function

---------

Co-authored-by: EmilyBourne <louise.bourne@gmail.com>
Co-authored-by: bauom <40796259+bauom@users.noreply.github.com>
Co-authored-by: Emily Bourne <emily.bourne@epfl.ch>
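To illustrate what a printer method for kernel calls could produce, here is a heavily simplified sketch. The `KernelCall` fields and the `CudaCodePrinter` class shown are hypothetical stand-ins for the classes named in the summary, not Pyccel's actual implementation:

```python
# Simplified sketch: a printer method that renders a kernel call as
# CUDA's triple-chevron launch syntax. All class shapes are assumptions.
class KernelCall:
    def __init__(self, func_name, num_blocks, threads_per_block, args):
        self.func_name = func_name
        self.num_blocks = num_blocks
        self.threads_per_block = threads_per_block
        self.args = args

class CudaCodePrinter:
    def _print_KernelCall(self, node):
        # Join printed arguments and wrap the call in <<<blocks, threads>>>
        args = ', '.join(node.args)
        return (f"{node.func_name}<<<{node.num_blocks}, "
                f"{node.threads_per_block}>>>({args});")

printer = CudaCodePrinter()
call = KernelCall('increment_by_one', 'BN', 'TPB', ['an_array'])
print(printer._print_KernelCall(call))
# increment_by_one<<<BN, TPB>>>(an_array);
```

The real printer would of course recurse into its argument nodes rather than joining pre-printed strings, but the output shape is the standard CUDA execution-configuration syntax.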
This issue aims to add the feature of creating custom kernels in the Numba style. Below you can find an example of a kernel definition, which can be called in the code in the following format: `increment_by_one[BN, TPB](args)`

- `BN`: the number of blocks to be dispatched on the GPU.
- `TPB`: the number of threads on each block.

This can be implemented by checking if an `IndexedElement` in the semantic stage is a `FunctionCall`, and replacing it in the AST with a `KernelCall` node. A `KernelCall` can be detected if an `IndexedElement` contains a `FunctionCall` which is decorated by the `kernel` decorator.

Numba code:

Pyccel code:
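The detection rule described above can be sketched as follows. The `IndexedElement`, `FunctionCall`, and `KernelCall` classes here are illustrative stand-ins for the real AST nodes, not Pyccel's actual implementation:

```python
# Sketch of the semantic-stage rule: an IndexedElement whose base is a
# FunctionCall decorated with `kernel` is replaced by a KernelCall node.
# All node classes below are simplified assumptions for illustration.
class FunctionCall:
    def __init__(self, name, decorators, args):
        self.name = name
        self.decorators = decorators
        self.args = args

class IndexedElement:
    def __init__(self, base, indices):
        self.base = base        # the object being indexed
        self.indices = indices  # e.g. [BN, TPB]

class KernelCall:
    def __init__(self, func, num_blocks, threads_per_block, args):
        self.func = func
        self.num_blocks = num_blocks
        self.threads_per_block = threads_per_block
        self.args = args

def annotate(node):
    """Replace IndexedElement(kernel-decorated FunctionCall) in the AST."""
    if (isinstance(node, IndexedElement)
            and isinstance(node.base, FunctionCall)
            and 'kernel' in node.base.decorators):
        bn, tpb = node.indices
        return KernelCall(node.base, bn, tpb, node.base.args)
    return node  # leave every other node unchanged

call = FunctionCall('increment_by_one', {'kernel'}, ['an_array'])
node = annotate(IndexedElement(call, [3, 4]))
print(type(node).__name__)  # KernelCall
```

An `IndexedElement` over a non-decorated function (or over an array) falls through unchanged, so ordinary indexing is unaffected.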