
OpenBLAS Docker build crashes when tokens exceed 50 #451

Open
@abc20220327

Description


I used the openblas_simple Dockerfile to build the Docker image llamapython-cpu, and ran it with:

sudo docker run --rm -it -p 8000:8000 -v /home/cd/ai/baichuan/baichuan-ggml:/models -e MODEL=/models/ggml-model-q4_0.bin llamapython-cpu


I use Postman to post a request. It runs fine, but when a message exceeds 50 tokens, the server throws an error and stops.
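For anyone trying to reproduce this without Postman, the request can be sketched as a short Python script. This is a minimal sketch under some assumptions: the llama-cpp-python server exposes an OpenAI-compatible `/v1/completions` endpoint on port 8000 (as mapped in the docker run command above), and the prompt text here is hypothetical; any prompt whose completion runs well past 50 tokens should do.

```python
import json

# Hypothetical payload; max_tokens is set well above 50, where the
# crash reportedly occurs with the OpenBLAS build.
payload = {
    "prompt": "Describe the history of Docker in detail.",
    "max_tokens": 128,
    "temperature": 0.7,
}
body = json.dumps(payload)
print(body)

# To actually send it (requires the server started by the docker run
# command above to be listening on localhost:8000):
#
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/completions",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

With the non-OpenBLAS image, the same request reportedly completes without error, which points at the BLAS code path rather than the request itself.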

I have 64 GB of RAM and confirmed there is plenty free, so it is definitely not an out-of-memory problem; when I build without OpenBLAS, the crash does not occur.

I don't know why this happens. Thanks for your great work!

Metadata

Assignees: no one assigned
Labels: llama.cpp (Problem with llama.cpp shared lib)
Projects: none
Milestone: none
Development: no branches or pull requests