Fix: adapt function to translate cached prompts in place. #889

Merged

HerrIvan
Contributor

Addresses issue #888.

This PR enables the following workflow for generating a testset in a language other than English:

```python
from ragas.testset.prompts import testset_prompts
from ragas.testset.generator import TestsetGenerator

...
# some other imports
...

# adapt all prompts to a given language, here German
for p in testset_prompts:
    p.adapt("german", translator_llm)

# run the test set generation
generator = TestsetGenerator(
    docstore=docstore,
    generator_llm=generator_llm,
    critic_llm=critic_llm,
    embeddings=embeddings,
)

dataset = generator.generate(
    test_size=TEST_SIZE,
    distributions=DISTRIBUTION,
)
```

Without this fix, the code above only works the first time. On the first run, when the prompts are actually translated, the `adapt` method modifies the prompt instances in place, so the subsequent `generate` call uses the translated prompts. Once the translations are in the cache, however, calling `adapt` on the prompts no longer changes them in place, and `generate` then uses the original English prompts. This PR makes the approach work even when the translated prompts are already cached.

I also added the `testset_prompts` variable for convenience.

Ivan Herreros and others added 2 commits April 21, 2024 16:12
Member

jjmachan commented Aug 2, 2024

Thanks a lot for the PR @HerrIvan ❤️

I know you pushed this a while back, and I'm really sorry for not merging it earlier. I would love to catch up with you sometime if you are interested, just to meet you 🙂

@jjmachan jjmachan merged commit 7db13d8 into explodinggradients:main Aug 2, 2024
1 check passed
Gwenn-LR pushed a commit to Gwenn-LR/ragas that referenced this pull request Aug 5, 2024
…inggradients#889)

Co-authored-by: Ivan Herreros <ivan.herreros-alonso@inovex.de>