OpenBLAS Docker image crashes when the prompt exceeds 50 tokens #451

I used the Dockerfile from openblas_simple to build the Docker image llamapython-cpu, and then ran it.
The command is:
sudo docker run --rm -it -p 8000:8000 -v /home/cd/ai/baichuan/baichuan-ggml:/models -e MODEL=/models/ggml-model-q4_0.bin llamapython-cpu

![image](https://github.com/abetlen/llama-cpp-python/assets/102452590/d3db7222-936e-4252-b1b7-86b6658fdc16)


I use Postman to post a request, and it works fine. But when I send a message longer than 50 tokens, the server throws an error and stops:
![image](https://github.com/abetlen/llama-cpp-python/assets/102452590/96a60f85-71c8-4fd6-b2fd-119592a86103)
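
The same request can be reproduced without Postman. Below is a minimal sketch, assuming the container serves the OpenAI-compatible `/v1/completions` route on `localhost:8000` (the port published by the command above). The helper names, the filler prompt, and the `max_tokens` value are hypothetical, and the word count is only a rough stand-in for the token count.

```python
import json
import urllib.request

# Assumption: the server is reachable on the port published by `-p 8000:8000`
# and exposes the OpenAI-compatible completions route.
URL = "http://localhost:8000/v1/completions"


def build_payload(n_words: int) -> dict:
    """Build a completion request whose prompt has n_words words (hypothetical helper)."""
    prompt = " ".join(["hello"] * n_words)
    return {"prompt": prompt, "max_tokens": 16}


def send(payload: dict) -> str:
    """POST the payload as JSON and return the raw response body."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return resp.read().decode("utf-8")
```

Calling `send(build_payload(10))` should correspond to the short request that works, while `send(build_payload(100))` should correspond to the long request that triggers the crash described here.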

I have 64 GB of RAM and have confirmed that enough of it is free, so this is definitely not an out-of-memory problem. The problem also does not occur when I don't use OpenBLAS.

I don't know why this happens. Thanks for your great work!

