
Allow continuation in Instruct and Interact executors; fix a minor leak #852

Merged
merged 2 commits on Jul 15, 2024

Conversation

dpmm99
Contributor

@dpmm99 dpmm99 commented Jul 12, 2024

The LLaVA path still needs to be tested; I haven't used LLaVA in any capacity yet, myself. I tested both InstructExecutor and InteractiveExecutor in my actual applications by repeatedly pausing and continuing. I found no cases where the model cut off a word or switched trains of thought when continuing, nor any cases where it clearly misunderstood the next prompt after a continuation.

@SignalRT SignalRT self-assigned this Jul 14, 2024
@SignalRT
Collaborator

@dpmm99, I checked the PR with llava, and it seems to work. My only question is that the behavior changes when I switch the image.

Comparing the behavior with ollama when there is no prompt:

  • When tested with the first image, the behavior is the same as ollama's.
  • When tested with the second image, the behavior seems to change and the last answer is repeated; in ollama, the behavior is the same as with the first image.

Would we need to make some changes in the example LLavaInteractiveModeExecute.cs when we switch the images?

@dpmm99
Contributor Author

dpmm99 commented Jul 14, 2024

Thanks for looking!

If you just interrupted generation in the InferAsync loop and then called InferAsync again with text: null, I'd expect it to continue the cut-off generation as if it hadn't been interrupted. For example, the inference loop around line 107 could be changed to this to trivially test cancelling and continuing once:

var stopAfterWords = 5;
var cts = new CancellationTokenSource();
await foreach (var text in ex.InferAsync(prompt, inferenceParams, cts.Token))
{
    Console.Write(text);
    if (--stopAfterWords <= 0) cts.Cancel();
}
Console.Write("|"); // Just something to indicate that my call to Cancel happened
await foreach (var text in ex.InferAsync(null, inferenceParams))
{
    Console.Write(text);
}

It looks like that example just resets the KV cache for each new prompt when you give it a new image. I'm not sure if there's other state data that needs to be reset, but the example should probably call ex.GetStateData before the first prompt and ex.LoadState before each prompt, and maybe even ex.Context.GetState and ex.Context.LoadState, to reset it properly. If you just leave the prompt blank and hit enter again in this example, it won't trigger the new continuation behavior, because I maintained the original behavior for empty strings.
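A minimal sketch of that reset pattern, using the state methods named above (the imageChanged flag and surrounding loop are hypothetical stand-ins for the example's input handling, not the actual LLavaInteractiveModeExecute.cs code):

```csharp
// Capture a clean baseline before the first prompt is submitted.
var executorState = ex.GetStateData();
var contextState = ex.Context.GetState();

while (true) // read prompts from the user, as the example does
{
    if (imageChanged)
    {
        // Roll the executor and its context back to the pre-prompt
        // baseline instead of only clearing the KV cache.
        ex.LoadState(executorState);
        ex.Context.LoadState(contextState);
    }

    await foreach (var text in ex.InferAsync(prompt, inferenceParams))
        Console.Write(text);
}
```

This restores both the executor's bookkeeping and the context's KV state, so a new image starts from the same point as the very first prompt.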

I just downloaded the LLaVA models and tested (5x) the above modification to that example to make it cancel and continue on its own, and it finishes the sentence after the display-only | like I'd expect.

Collaborator

@SignalRT SignalRT left a comment


Looks good to me.

@martindevans
Member

I resolved the merge conflict, once the CI has finished I'll merge this.

@martindevans martindevans merged commit 5025ad9 into SciSharp:master Jul 15, 2024
6 checks passed
@dpmm99 dpmm99 deleted the feat/continuation branch July 16, 2024 22:08