Description
- By using the command `CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 python setup.py bdist_wheel`, I can build a wheel and have it installed as:
```
llama_cpp:
total 3.8M
-rwxrwxr-x 1 lvision lvision   46 Aug  7 22:59 __init__.py
-rwxrwxr-x 1 lvision lvision 3.7M Aug  7 22:59 libllama.so
-rwxrwxr-x 1 lvision lvision  43K Aug  7 22:59 llama_cpp.py
-rwxrwxr-x 1 lvision lvision  62K Aug  7 22:59 llama.py
-rwxrwxr-x 1 lvision lvision 2.1K Aug  7 22:59 llama_types.py
drwxrwxr-x 2 lvision lvision 4.0K Aug  7 22:59 __pycache__
drwxrwxr-x 3 lvision lvision 4.0K Aug  7 22:59 server

llama_cpp_python-0.1.77.dist-info:
total 36K
-rw-rw-r-- 1 lvision lvision  321 Aug  7 22:59 direct_url.json
-rw-rw-r-- 1 lvision lvision    4 Aug  7 22:59 INSTALLER
-rwxrwxr-x 1 lvision lvision 1.1K Aug  7 22:59 LICENSE.md
-rwxrwxr-x 1 lvision lvision 9.7K Aug  7 22:59 METADATA
-rw-rw-r-- 1 lvision lvision 2.2K Aug  7 22:59 RECORD
-rw-rw-r-- 1 lvision lvision    0 Aug  7 22:59 REQUESTED
-rwxrwxr-x 1 lvision lvision   10 Aug  7 22:59 top_level.txt
-rwxrwxr-x 1 lvision lvision   99 Aug  7 22:59 WHEEL
```
- By using the default command `CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install -e .`, I end up with only the following, without the `llama_cpp` package directory (a quick way to check what actually got installed is sketched further below):
```
llama_cpp_python-0.1.77.dist-info:
total 36K
-rw-rw-r-- 1 lvision lvision  104 Aug  7 23:02 direct_url.json
-rw-rw-r-- 1 lvision lvision    4 Aug  7 23:02 INSTALLER
-rwxrwxr-x 1 lvision lvision 1.1K Aug  7 23:02 LICENSE.md
-rw-rw-r-- 1 lvision lvision 9.7K Aug  7 23:02 METADATA
-rw-rw-r-- 1 lvision lvision 1022 Aug  7 23:02 RECORD
-rw-rw-r-- 1 lvision lvision    0 Aug  7 23:02 REQUESTED
-rw-rw-r-- 1 lvision lvision   10 Aug  7 23:02 top_level.txt
-rw-rw-r-- 1 lvision lvision   86 Aug  7 23:02 WHEEL
```
- When I tried to run localGPT, I got:
```
➜ localGPT git:(main) ✗ python run_localGPT.py --device_type cuda
......
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
Enter a query: Hello... How are you?
LLAMA_ASSERT: ....../llama-cpp-python/vendor/llama.cpp/llama.cpp:1800: !!kv_self.ctx
[3]    398690 IOT instruction (core dumped)  python run_localGPT.py --device_type cuda
```
Unbelievably, the assertion points back to llama.cpp under the vendor folder from which the package was built: `LLAMA_ASSERT: ....../llama-cpp-python/vendor/llama.cpp/llama.cpp:1800: !!kv_self.ctx`.
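For reference, this is a minimal sketch of how I compare the two installs. The wheel filename pattern under `dist/` is my guess based on the 0.1.77 version shown above; the rest are standard pip/Python commands:

```bash
# Build the wheel with cuBLAS and install it (wheel name assumed from version 0.1.77).
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 python setup.py bdist_wheel
pip install --force-reinstall dist/llama_cpp_python-0.1.77-*.whl

# Confirm where the package is imported from and that libllama.so was actually installed.
python -c "import llama_cpp, os; print(os.path.dirname(llama_cpp.__file__))"
pip show -f llama_cpp_python | grep -E 'Location|libllama'
```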
Okay... Can anybody please tell me how to build llama-cpp-python from source and have it successfully installed in Release mode?
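In other words, I'm after something like the sketch below. `CMAKE_BUILD_TYPE=Release` is a standard CMake variable and `--force-reinstall --no-cache-dir` are standard pip flags, but whether this is the intended way to get a Release build of this project is exactly what I'm unsure about:

```bash
# Clean out previous build artifacts, then build and install from source in Release mode
# (a sketch of what I think should work, not a confirmed recipe).
rm -rf build dist *.egg-info
CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE=Release" FORCE_CMAKE=1 \
  pip install . --force-reinstall --no-cache-dir
```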