Raise error if encountered in chat completion SSE stream #2558

Wauplin · 2024-09-20T15:22:19Z

When stream=True is passed, the server can respond with HTTP 200 and still send an error as a stream event. We were already parsing these errors for the text_generation task, but not for chat_completion. This PR fixes that.

Example

from huggingface_hub import InferenceClient


client = InferenceClient("microsoft/Phi-3-mini-4k-instruct")

for message in client.chat_completion(
    messages=[{"role": "user", "content": "Hello there !"}],
    stream=True,
    max_tokens=4091,  # values lower than or equal to 4090 work
):
    print(message.choices[0].delta.content, end="")

Traceback (most recent call last):
  File "/home/wauplin/projects/huggingface_hub/idefi.py", line 6, in <module>
    for message in client.chat_completion(
  File "/home/wauplin/projects/huggingface_hub/src/huggingface_hub/inference/_common.py", line 321, in _stream_chat_completion_response
    output = _format_chat_completion_stream_output(item)
  File "/home/wauplin/projects/huggingface_hub/src/huggingface_hub/inference/_common.py", line 356, in _format_chat_completion_stream_output
    raise _parse_text_generation_error(json_payload["error"], json_payload.get("error_type"))
huggingface_hub.errors.ValidationError: Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4096. Given: 6 `inputs` tokens and 4091 `max_new_tokens`
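
For readers unfamiliar with the failure mode, here is a minimal, standalone sketch of the kind of check described above. It is not the code from this PR: `StreamError` and `iter_chat_chunks` are hypothetical names, and the `data:` prefix and `[DONE]` sentinel are assumptions about the stream format. Only the `error` and `error_type` payload keys come from the traceback above.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Hypothetical exception type, used only in this sketch."""


def iter_chat_chunks(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield decoded chat-completion chunks, raising if an event carries an error."""
    for line in lines:
        # Assumption: events arrive as b"data: {...}" lines; skip anything else.
        if not line.startswith(b"data:"):
            continue
        data = line[len(b"data:"):].strip()
        # Assumption: the stream ends with a "[DONE]" sentinel.
        if data == b"[DONE]":
            return
        payload = json.loads(data)
        # The HTTP status was already 200, so the only sign of failure is
        # an "error" key inside the event payload itself.
        if "error" in payload:
            raise StreamError(payload["error"], payload.get("error_type"))
        yield payload
```

Because the check runs per event, chunks received before the error are still yielded, and the exception surfaces inside the caller's for loop rather than when the request is sent.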

HuggingFaceDocBuilderDev · 2024-09-20T15:25:56Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

hanouticelina

LGTM 👍

Wauplin · 2024-09-20T16:04:41Z

Thanks for the review!

Raise error if encountered in chat completion SSE stream

e070a26

Wauplin requested review from LysandreJik and hanouticelina September 20, 2024 15:22

hanouticelina approved these changes Sep 20, 2024

Wauplin merged commit c0fd4e0 into main Sep 20, 2024
19 checks passed

Wauplin deleted the 2514-raise-error-if-SSE-error branch September 20, 2024 16:04

hanouticelina pushed a commit that referenced this pull request Sep 23, 2024

Raise error if encountered in chat completion SSE stream (#2558)

efafba2
