Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix Summarization Score #1164

Merged
merged 9 commits into from
Aug 10, 2024

Conversation

emreds
Copy link
Contributor

@emreds emreds commented Aug 4, 2024

Hi!

I was experimenting with Summarization Score and found an issue that when there are no questions generated by the LLM, it throws division by zero error due to 0 generated answers.

Notable Changes

  • Added an assertion to keyphrases, questions and answers generation functions.
  • Changed the llm response variable name answer to response to prevent possible confusion with score related answer.

Please let me know if there is something I can add.

Cheers!

@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Aug 4, 2024
Copy link
Member

@jjmachan jjmachan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emreds thanks a lot for pushing this PR and helping us Improve Ragas. I was however not sure if an assertion here is the best flow.

it will break the user code right if the assertions fail? Ideally the behaviour we want is to use logger.error() to print out an explainable error and then return a Nan value (or throw and exception since executor should catch it).

what do you think?

@emreds
Copy link
Contributor Author

emreds commented Aug 6, 2024

@jjmachan sure, makes more sense. I will update it.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Aug 6, 2024
Copy link
Member

@jjmachan jjmachan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

math.nan to np.nan.

I've changed one, do change the others too?

just to keep things consistent throughout the codebase 🙂

src/ragas/metrics/_summarization.py Outdated Show resolved Hide resolved
Copy link
Member

@jjmachan jjmachan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sweet! thanks a lot @emreds ❤️

@emreds emreds requested a review from jjmachan August 6, 2024 08:35
@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Aug 7, 2024
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Aug 7, 2024
@jjmachan
Copy link
Member

jjmachan commented Aug 7, 2024

@emreds I made some small changes for fixing the type error but now it still returns an empty list. This is so that the types make sense. but I'm not sure if it logically correct

@emreds
Copy link
Contributor Author

emreds commented Aug 8, 2024

I think it looks good, as you have mentioned it doesn't break the code but throws the error to understand the context.

@jjmachan jjmachan merged commit b7c1bdc into explodinggradients:main Aug 10, 2024
16 checks passed
shahules786 pushed a commit to shahules786/ragas that referenced this pull request Aug 13, 2024
Hi! 

I was experimenting with Summarization Score and found an issue that
when there are no questions generated by the LLM, it throws division by
zero error due to 0 generated answers.

## Notable Changes
- Added an assertion to keyphrases, questions and answers generation
functions.
- Changed the llm response variable name `answer` to `response` to
prevent possible confusion with score related `answer`.

Please let me know if there is something I can add. 

Cheers!

---------

Co-authored-by: jjmachan <jamesjithin97@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants