
Add ninja to dependency #21

Merged: 1 commit merged into main from the ninja branch on Apr 2, 2023

Conversation

WoosukKwon (Collaborator) commented:

The compilation time of flash-attn can be drastically reduced if ninja is installed. Related issue: Dao-AILab/flash-attention#150
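For context on why this helps (the actual file touched by this commit is not shown on this page): PyTorch's C++/CUDA extension builder (`torch.utils.cpp_extension`) uses ninja when it is available, which parallelizes kernel compilation instead of building sequentially. A minimal, hypothetical sketch of declaring ninja alongside flash-attn; the project name and file layout are assumptions, not the PR's actual diff:

```python
# Hypothetical setup.py sketch -- package name and dependency layout are
# illustrative assumptions, not the actual change made in this PR.
from setuptools import setup

setup(
    name="my_project",  # placeholder project name
    install_requires=[
        "ninja",       # torch.utils.cpp_extension picks up ninja when present
                       # and parallelizes C++/CUDA extension builds
        "flash-attn",  # compiles its CUDA kernels at install time; the build
                       # is far faster when ninja is already installed
    ],
)
```

In practice the key point is that ninja is installed before (or alongside) flash-attn; without it, the flash-attn kernels fall back to a sequential distutils build, which is the slow path described in the linked issue.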

WoosukKwon merged commit 2c5cd0d into main on Apr 2, 2023
WoosukKwon deleted the ninja branch on Apr 2, 2023 at 02:00
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request on Feb 13, 2024
slyalin pushed a commit to slyalin/vllm that referenced this pull request on Apr 3, 2024: [CPU] Support for larger block_size
tdg5 pushed a commit to tdg5/vllm that referenced this pull request on Apr 25, 2024
z103cb referenced this pull request in z103cb/opendatahub_vllm on May 7, 2024
Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Daniel Clark <daniel.clark@ibm.com>
alixiaodi mentioned this pull request on Aug 2, 2024