Add Support for tabular data entry in Bulk Search #9709

@benbdeitch benbdeitch commented Aug 7, 2024

Closes #9699

This PR adds another child of the AbstractExtractor class, TableExtractor. It accepts data copy-pasted from Google Sheets or other spreadsheet services and extracts the contents of the 'title' and 'author' columns, if they exist.
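
As a rough illustration of the extraction described above, here is a minimal sketch. It assumes the pasted text is tab-separated with one row per line (which is how Google Sheets typically places a copied range on the clipboard as plain text) and that the first row holds the column headers. The function name `extractTitleAuthorRows` is hypothetical and is not the TableExtractor code from this PR; the AbstractExtractor interface is not shown here, so the sketch is a standalone function.

```javascript
// Hypothetical helper, not the actual TableExtractor implementation.
// Assumes tab-separated cells, newline-separated rows, and a header row.
function extractTitleAuthorRows(pastedText) {
    // Drop blank lines so trailing newlines from the clipboard are ignored.
    const lines = pastedText.split(/\r?\n/).filter((line) => line.trim() !== '');
    if (lines.length === 0) return [];

    // Locate the 'title' and 'author' columns, ignoring case and padding.
    const headers = lines[0].split('\t').map((h) => h.trim().toLowerCase());
    const titleIdx = headers.indexOf('title');
    const authorIdx = headers.indexOf('author');
    if (titleIdx === -1 && authorIdx === -1) return [];

    // A missing column or a missing cell yields an empty string.
    const cell = (cells, idx) => (idx === -1 ? '' : (cells[idx] || '').trim());

    return lines.slice(1)
        .map((line) => line.split('\t'))
        .map((cells) => ({ title: cell(cells, titleIdx), author: cell(cells, authorIdx) }))
        // Skip rows where both fields are empty.
        .filter((row) => row.title !== '' || row.author !== '');
}
```

This sketch does not handle cells that themselves contain tabs or line breaks; a real implementation would need to decide how to treat those.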

Technical

Currently, the implementation searches by title alone when the author column is empty. This has led to an odd interaction: searching for 'Mark Twain' as a title incorrectly pulls in other books that he wrote. I'll be debugging that over the next day or so, but I figured I might as well put the PR up for review in the meantime.
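
To make the title-only case concrete, here is a small sketch of how a per-row query might be built. It assumes a Solr-style fielded query syntax (`title:"…"`, `author:"…"`), and the function name `buildRowQuery` is hypothetical. It is not taken from this PR, and it is not a diagnosis of the 'Mark Twain' behaviour described above. It only shows one way a row with an empty author can still be scoped to the title field.

```javascript
// Hypothetical helper; the fielded query syntax is an assumption.
function buildRowQuery({ title = '', author = '' }) {
    // Quote the value and escape embedded quotes and backslashes.
    const quote = (s) => `"${s.replace(/(["\\])/g, '\\$1')}"`;

    const parts = [];
    if (title.trim()) parts.push(`title:${quote(title.trim())}`);
    // The author clause is only added when the author cell is non-empty.
    if (author.trim()) parts.push(`author:${quote(author.trim())}`);
    return parts.join(' ');
}

// Title-only row: the term stays scoped to the title field.
console.log(buildRowQuery({ title: 'Mark Twain', author: '' }));
```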

Testing

Create a table in Google Sheets, then copy it into the input area of the /search/bulk page. Select the bottom extraction option and click 'extract books'. Make sure the relevant columns are headed 'title' and 'author'.

Screenshot

Stakeholders

@cdrini

@benbdeitch benbdeitch added the "Type: Subtask of Epic" label Aug 7, 2024
@benbdeitch benbdeitch marked this pull request as draft August 7, 2024 22:02
@benbdeitch benbdeitch requested a review from cdrini August 7, 2024 22:02
@benbdeitch benbdeitch force-pushed the 9699/feature/add-tabular-support-to-bulk-search branch from f8affdd to 1bb951a Compare August 7, 2024 22:12
codecov-commenter commented Aug 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 16.45%. Comparing base (ce16a79) to head (b7538d8).
Report is 339 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9709      +/-   ##
==========================================
+ Coverage   16.06%   16.45%   +0.39%     
==========================================
  Files          90       91       +1     
  Lines        4769     4899     +130     
  Branches      832      853      +21     
==========================================
+ Hits          766      806      +40     
- Misses       3480     3559      +79     
- Partials      523      534      +11     


@cdrini cdrini marked this pull request as ready for review August 14, 2024 16:56
@benbdeitch benbdeitch force-pushed the 9699/feature/add-tabular-support-to-bulk-search branch from 1bb951a to 000d8a3 Compare August 14, 2024 17:26
@mekarpeles mekarpeles added the "Priority: 1" label Aug 26, 2024
@cdrini cdrini left a comment

Looking great! A few small code tweaks!

openlibrary/components/BulkSearch/utils/classes.js (4 review threads, outdated and resolved)
@cdrini cdrini added the "Needs: Submitter Input" label Aug 28, 2024
@github-actions github-actions bot removed the "Needs: Submitter Input" label Aug 28, 2024
@cdrini cdrini added the "Needs: Submitter Input" and "On testing.openlibrary.org" labels Aug 30, 2024
@benbdeitch benbdeitch force-pushed the 9699/feature/add-tabular-support-to-bulk-search branch from 6f8d11e to d1066ac Compare August 30, 2024 21:13
@github-actions github-actions bot removed the "Needs: Submitter Input" label Aug 30, 2024
@cdrini cdrini left a comment

Tested and works like a charm!

@cdrini cdrini merged commit fa22a2e into internetarchive:master Sep 4, 2024
4 checks passed
Development

Successfully merging this pull request may close these issues.

Add support for tabular input to BulkSearch
5 participants