
Added mistral instruct chat format as "mistral-instruct" #799

Merged · 4 commits · Jan 29, 2024

Conversation

Rafaelblsilva
Contributor

@abetlen Thanks for your work!

Here is my small contribution.

@alex4321

Did you check whether it (the current llama.cpp tokenizer) even tokenizes this text correctly?

Because from what I found at https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF, the prompt format is:

<s>[INST] {prompt} [/INST]

And inside Llama.create_chat_completion the prompt is converted to a string before being passed to the tokenizer in Llama._create_completion:

        prompt_tokens: List[int] = (
            self.tokenize(prompt.encode("utf-8"))
            if prompt != ""
            else [self.token_bos()]
        )

Meanwhile, when I tried the following code:

for token in model.tokenize("<s>[INST] Test prompt [/INST]".encode(), add_bos=False):
    print(model.detokenize([token]))

And the output was:

b' <'
b's'
b'>'
b'['
b'INST'
b']'
b' Test'
b' prompt'
b' ['
b'/'
b'INST'
b']'

I guess <s>, [INST] and [/INST] should be special tokens?
However, I have not dug into the issue much yet; once the tokenization works correctly, this format will be useful.

So I am asking just in case I missed something and this already works as intended.
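
As a concrete check, a minimal sketch comparing plain vs. special-aware tokenization (this assumes a llama-cpp-python version whose Llama.tokenize accepts a special flag; the model path is a placeholder):

from llama_cpp import Llama

# Sketch only: `special` on Llama.tokenize and `vocab_only` on the constructor
# are assumptions about the installed llama-cpp-python version.
llm = Llama(model_path="./mistral-7b-instruct-v0.1.Q4_K_M.gguf", vocab_only=True)

text = "<s>[INST] Test prompt [/INST]"

# Without special handling, "<s>" is split into ordinary text pieces.
plain = llm.tokenize(text.encode("utf-8"), add_bos=False)

# With special handling, "<s>" should collapse to the single BOS token id,
# while [INST] / [/INST] stay ordinary text in the v0.1 tokenizer.
special = llm.tokenize(text.encode("utf-8"), add_bos=False, special=True)

print(plain)
print(special)
print(llm.token_bos() in special)

If the special-aware call maps <s> to the BOS id while [INST] still splits into plain pieces, that would suggest only <s>/</s> are special tokens here.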

@Rafaelblsilva

@Rafaelblsilva
Copy link
Contributor Author

@alex4321

I have not checked that deeply.
If I understood correctly, your observation suggests there might be underlying issues even with llama-2 tokenization?

Is this model dependent, or something that would need to be implemented in llama.cpp?

I quickly looked through the open issues there and found these:

ggerganov/llama.cpp#3475
ggerganov/llama.cpp#2820

But at least on the Mistral model, with my changes it works a bit better than just using the llama-2 prompt.
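
For reference, a rough sketch of the two single-turn prompt shapes being compared, following the model cards linked earlier (the exact whitespace is an assumption):

def llama2_chat_prompt(system: str, user: str) -> str:
    # llama-2 chat style, per the common model-card description.
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

def mistral_instruct_prompt(user: str) -> str:
    # Mistral Instruct style: no system block, just [INST] ... [/INST].
    return f"<s>[INST] {user} [/INST]"

print(mistral_instruct_prompt("Test prompt"))
# <s>[INST] Test prompt [/INST]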

@fakerybakery
Contributor

Hi, can we merge this?

@zhangp365

zhangp365 commented Nov 26, 2023

Based on this document, I believe this pull request is accurate for a single round. However, according to the document, for multiple rounds of history each round should conclude with '</s>'.
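
A minimal sketch of that multi-round assembly, assuming the format described there (this helper is illustrative and is not the code in this PR):

from typing import Dict, List

def build_mistral_history_prompt(messages: List[Dict[str, str]]) -> str:
    # Illustrative only: alternate user/assistant turns and close each
    # completed assistant turn with </s>, as the document describes.
    prompt = "<s>"
    for message in messages:
        if message["role"] == "user":
            prompt += f"[INST] {message['content']} [/INST]"
        elif message["role"] == "assistant":
            prompt += f" {message['content']}</s>"
    return prompt

history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
    {"role": "user", "content": "Tell me a joke."},
]
print(build_mistral_history_prompt(history))
# <s>[INST] Hi [/INST] Hello! How can I help?</s>[INST] Tell me a joke. [/INST]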

@fakerybakery
Contributor

Hi @abetlen, might it be possible to merge this, now that Mistral Instruct v0.2 has been released?

@handshape

handshape commented Dec 14, 2023

I'm not certain that the models are consistently emitting </s> as a single token. I'm hand-cranking the sampling/detokenization loop and seeing that about 80% of the time the eos token comes out as a single token, while the other 20% of the time it comes out as a sequence of four single-character tokens. I'm having to catch all permutations.
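
For illustration, here is a hedged sketch of a generation loop that guards against both cases: the token-id comparison is the reliable check, and trimming a literal </s> from the accumulated text is the fallback for the behaviour described above (the model path is a placeholder):

from llama_cpp import Llama

llm = Llama(model_path="./mistral-7b-instruct-v0.1.Q4_K_M.gguf")  # placeholder path

prompt_tokens = llm.tokenize(b"[INST] Tell me a joke. [/INST]")
output = b""
for token in llm.generate(prompt_tokens, top_k=40, top_p=0.95, temp=0.8):
    if token == llm.token_eos():
        break  # eos emitted as a single token (the ~80% case)
    output += llm.detokenize([token])
    if output.endswith(b"</s>"):
        output = output[: -len(b"</s>")]
        break  # eos leaked out as literal text (the ~20% case)
print(output.decode("utf-8", errors="ignore"))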

@abetlen
Owner

abetlen commented Jan 29, 2024

Hey @Rafaelblsilva, sorry for missing this; for some reason I thought it overlapped with another format that was already supported. Thank you for the contribution!

Note: I renamed the chat format to mistral-instruct so it's consistent with the finetuned model name.
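
With the rename, selecting the format looks roughly like this (model path and generation parameters are placeholders):

from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct-v0.2.Q4_K_M.gguf",  # placeholder path
    chat_format="mistral-instruct",
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize this pull request."}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])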

@abetlen abetlen changed the title Added mistral instruct chat format as "mistral" Added mistral instruct chat format as "mistral-instruct" Jan 29, 2024
@abetlen abetlen merged commit ce38dbd into abetlen:main Jan 29, 2024
16 checks passed