### Description

### Env

- WSL 2
- NVIDIA driver installed
- CUDA support installed via `pip install torch torchvision torchaudio`, which pulls in the `nvidia-cuda-xxx` packages as well.
- llama-cpp-python build command (a check of the resulting build is sketched right after this list):

  ```
  CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python[server] --force-reinstall --upgrade --no-cache-dir
  ```
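As a quick check that the `LLAMA_CUBLAS=on` flag actually took effect, the shared library shipped inside the installed `llama_cpp` package can be inspected with `ldd`. This is only a minimal sketch; the exact library name and location inside the package are assumptions, so the search is deliberately loose:

```python
import importlib.util
import pathlib
import subprocess

# Locate the installed llama_cpp package directory.
spec = importlib.util.find_spec("llama_cpp")
pkg_dir = pathlib.Path(spec.origin).parent

# The compiled llama.cpp library is assumed to sit somewhere inside the
# package (e.g. libllama.so); glob loosely rather than hard-coding a name.
for lib in pkg_dir.rglob("*.so*"):
    out = subprocess.run(["ldd", str(lib)], capture_output=True, text=True).stdout
    cuda_deps = [line.strip() for line in out.splitlines()
                 if "cuda" in line or "cublas" in line]
    print(lib.name, cuda_deps or "no CUDA libraries linked")
```

If none of the libraries reference CUDA or cuBLAS, the wheel was likely rebuilt without the flag (for example because a cached build was reused).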
### Steps to reproduce
Running

```
python -m llama_cpp.server --model yarn-mistral-7b-128k.Q5_K_M.gguf
```

fails with:

```
CUDA error 100 at /tmp/pip-install-hjlvezud/llama-cpp-python_b986d017976f49d0bf4e93e3963398af/vendor/llama.cpp/ggml-cuda.cu:5823: no CUDA-capable device is detected
current device: 0
```
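The same failure should be reproducible without the server wrapper by loading the model through the Python API. A minimal sketch, where `n_gpu_layers=35` is just an example value large enough to offload a 7B model:

```python
from llama_cpp import Llama

# Load the same GGUF with GPU offload requested.
# n_gpu_layers=35 is an example value (enough to cover a 7B model);
# verbose=True prints the backend initialisation where the CUDA error surfaces.
llm = Llama(
    model_path="yarn-mistral-7b-128k.Q5_K_M.gguf",
    n_gpu_layers=35,
    verbose=True,
)
print(llm("Hello", max_tokens=8))
```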
### `nvidia-smi` output

```
Mon Nov  6 23:21:17 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 536.23       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:07:00.0  On |                  N/A |
| 30%   40C    P8              13W / 170W |   1860MiB / 12288MiB |     11%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
```
### Verifying CUDA is visible

With torch:

```
Python 3.8.18 (default, Sep 11 2023, 13:40:15)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.12.3 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import torch

In [2]: torch.cuda.is_available()
Out[2]: True

In [3]: torch.cuda.get_device_properties(0)
Out[3]: _CudaDeviceProperties(name='NVIDIA GeForce RTX 3060', major=8, minor=6, total_memory=12287MB, multi_processor_count=28)
```
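As one more sanity check that bypasses both torch and llama.cpp, the CUDA driver API can be queried directly via ctypes. A sketch, assuming `libcuda.so.1` is resolvable (on WSL 2 it normally lives under `/usr/lib/wsl/lib`):

```python
import ctypes

# Query the CUDA driver API directly, independent of torch and llama.cpp.
cuda = ctypes.CDLL("libcuda.so.1")

status = cuda.cuInit(0)          # 0 means CUDA_SUCCESS
count = ctypes.c_int(0)
cuda.cuDeviceGetCount(ctypes.byref(count))
print("cuInit status:", status, "- device count:", count.value)
```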
With pip:

```
# pip list | grep cublas
nvidia-cublas-cu12       12.1.3.1
```