Bulk repetition #471
Hi, thanks for the detailed steps - this helps a lot. After debugging with WHISPER_DEBUG enabled, I can see immediately that in this case the entropy-based check for repetition didn't trigger: the cost function was just slightly above the default entropy threshold.
This means that the decoder didn't "detect" that there is a repetition and therefore didn't use the fallback strategy to correct it. Rerunning the transcription with a slightly increased entropy threshold produces the correct result. Obviously, this is not a very nice approach, since there is normally no way to see this debug information. But that is the general problem with having these kinds of free parameters: the default values are not always going to work and might need a little tuning in some cases. I'll try to think of a more robust way to detect the repetitions.
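For anyone who wants to try the same workaround, here is a minimal sketch using the C API. The `entropy_thold` field comes from `whisper_full_params` in `whisper.h`; the value `2.8f` and the model path are placeholders rather than recommendations, and `whisper_init` is the v1.2.0-era entry point (newer releases rename it).

```cpp
// Sketch only: raise the entropy threshold slightly so the repetition
// check fires. Field names are from whisper_full_params in whisper.h.
#include "whisper.h"

#include <cstdio>
#include <vector>

int main() {
    // v1.2.0-era init call; the model path is a placeholder
    struct whisper_context * ctx = whisper_init("models/ggml-base.en.bin");
    if (ctx == nullptr) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    whisper_full_params wparams = whisper_full_default_params(WHISPER_SAMPLING_GREEDY);
    wparams.entropy_thold = 2.8f; // placeholder: slightly above the default,
                                  // so low-entropy (repetitive) segments
                                  // now trigger the temperature fallback

    std::vector<float> pcmf32; // 16 kHz mono float samples, filled elsewhere

    if (whisper_full(ctx, wparams, pcmf32.data(), (int) pcmf32.size()) != 0) {
        fprintf(stderr, "whisper_full failed\n");
    }

    whisper_free(ctx);
    return 0;
}
```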
That entropy threshold did the trick for that episode. Thanks!
I disabled this because there were many complaints about slow decoding. The current implementation does not allow batching the decoders when using the "best of" or "beam size" parameters, so the decoding time is proportional to the number of decoders, which is obviously not great.

However, there are now even more complaints about wrong decodings and repetition. So, as a compromise, the fallbacks are re-enabled, but defaulting to just 2 "best of" / "beam size" decoders. Also, the temperature step is increased from 0.2 to 0.4 - i.e. from a maximum of 5 fallbacks to a maximum of 2. The stream example now also has fallbacks enabled by default.

close ggerganov#471 ggerganov#477 ggerganov#508 ggerganov#612 ggerganov#719 ggerganov#731
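For context, this is roughly what those new defaults look like when set explicitly. A sketch only - the field names are assumed from `whisper_full_params` in `whisper.h`, and the comments just restate the arithmetic from the commit message above:

```cpp
// Compromise defaults described above, set explicitly (sketch; field
// names assumed from whisper_full_params in whisper.h).
whisper_full_params wparams = whisper_full_default_params(WHISPER_SAMPLING_GREEDY);

wparams.greedy.best_of        = 2;    // just 2 decoders by default now
wparams.beam_search.beam_size = 2;    // (decoding time scales with this)
wparams.temperature_inc       = 0.4f; // was 0.2: starting from t = 0.0, the
                                      // fallback temperatures are 0.4 and 0.8,
                                      // i.e. at most 2 retries, instead of
                                      // 0.2, 0.4, ..., 1.0, i.e. 5 retries
```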
I'm not sure if this is a variant of #412, but check out this partial output:
Reproduction:
Standard error:
I'm on the main branch at v1.2.0.