Fix deleted Authors persisting in Solr index #861

Merged 4 commits on Mar 15, 2018

Collaborator

@hornc hornc commented Mar 14, 2018

Closes #628
And may help with other Author search issues...

Description:

It turns out there was a bug where update_author() built its own string-format delete messages for author deletes in Solr (for redirects, deletes, and blank author names). Those messages sent an incorrect key, OL...A, instead of the key actually indexed in Solr, /authors/OL...A.
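To illustrate the key mismatch described above, here is a minimal sketch that uses a plain dict as a stand-in for the Solr index. The names `delete_by_key` and `full_author_key` and the sample key `OL1A` are hypothetical and are not part of the Open Library code; the sketch only assumes that a delete for a key matching nothing is treated as a no-op rather than an error.

```python
# Toy stand-in for a Solr index: maps the indexed key to a document.
index = {'/authors/OL1A': {'name': 'Example Author'}}


def delete_by_key(index, key):
    """Remove `key` if present and return how many documents were removed.

    A key that matches nothing is not an error, which is why a delete
    sent with the wrong key format goes unnoticed.
    """
    return 1 if index.pop(key, None) is not None else 0


def full_author_key(olid):
    """Hypothetical helper: turn 'OL1A' into '/authors/OL1A'."""
    prefix = '/authors/'
    return olid if olid.startswith(prefix) else prefix + olid


# The short form matches nothing, so the author stays in the index.
removed_short = delete_by_key(index, 'OL1A')

# The full key matches the indexed document and removes it.
removed_full = delete_by_key(index, full_author_key('OL1A'))
```

The first call removes nothing and raises no error, which mirrors how the faulty deletes could pass unnoticed.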

This means that redirected or deleted authors were not being removed from search, and it explains why the deletes appeared to fail silently, with nothing showing up in the logs. There seems to be a way for authors to be removed correctly if they are reindexed as a result of an edition or work change, which explains why some updates appeared to work and why the cause was hard to pin down. More investigation is needed. We should add some logging to alert us when a Solr update does not affect any documents in Solr.
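The logging idea above could look roughly like the following sketch. It is not the project's implementation: `delete_with_check`, `exists`, and `delete` are hypothetical names, with the two callables standing in for a Solr lookup and a Solr delete request.

```python
import logging

logger = logging.getLogger("solr_updater_sketch")


def delete_with_check(key, exists, delete):
    """Delete `key`, logging a warning when it matches no document.

    `exists` and `delete` are hypothetical callables standing in for a
    Solr lookup and a Solr delete request. Returns True if a document
    was deleted, False if the key was not found.
    """
    if not exists(key):
        logger.warning("Solr delete for %r matched no documents", key)
        return False
    delete(key)
    return True
```

With a Python set as the stand-in index, `delete_with_check('OL1A', idx.__contains__, idx.discard)` would log a warning and return False, while the same call with `'/authors/OL1A'` would delete the entry and return True.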

This was a whole lot easier to debug after #804: I was able to reproduce the issue in dev and use tests to ensure the correct behaviour. Thanks @cdrini for the refactoring and docstrings, and @mekarpeles for the production logs! I have been chasing this issue for ages; it's great to have it finally pinned down!

- :param bool handle_redirects: If true, remove from SOLR all authors that redirect to this one
- :rtype: list[string or UpdateRequest or DeleteRequest]
+ :param bool handle_redirects: If true, remove from Solr all authors that redirect to this one
+ :rtype: list[UpdateRequest or DeleteRequest]
  """
  if akey == '/authors/':
Member

Shouldn't we return in any case if not akey.startswith('/authors/OL')?
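A rough sketch of what that guard might look like, assuming the function returns a list of requests as the docstring's :rtype: suggests. `is_author_key` and `update_author_sketch` are hypothetical names, and the body is a placeholder, not the real update_author():

```python
def is_author_key(akey):
    """True only for keys shaped like '/authors/OL...'."""
    return isinstance(akey, str) and akey.startswith('/authors/OL')


def update_author_sketch(akey):
    """Hypothetical skeleton, not the real update_author().

    Returns an empty list of requests for anything that is not a
    well-formed author key, instead of special-casing '/authors/'.
    """
    if not is_author_key(akey):
        return []
    # Placeholder for building the real update/delete requests.
    return ['would update %s' % akey]
```

This would cover the bare '/authors/' case from the diff as well as short-form keys such as 'OL1A'.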

@mekarpeles mekarpeles left a comment

No deal breakers. I did have one suggestion, which we can address as a follow-up (if you agree it's a good change).
