Search in PDF Files #2838

Merged: 122 commits, Jul 14, 2021
Changes from 17 commits
Commits
321a0f0
Add lucene to gradle.build dependencies
Braunch Aug 2, 2016
2580757
Implement first attempt on a lucene powerded indexed search machine f…
Braunch Aug 2, 2016
a9d85dc
Ignore test and improve search handler and content reader to map bibt…
Braunch Aug 4, 2016
753c25d
Create search result wrapper classes for better readability of future…
Braunch Aug 4, 2016
6d1cb44
Add lucene score and collector for search meta data retrieval in the …
Braunch Aug 4, 2016
1af3f1a
Add example pdf and create test skeleton
Braunch Aug 4, 2016
40609ad
Merge branch 'PDFSearchFeature' of github.com:Braunch/jabref into lucene
LinusDietz Apr 20, 2017
8787bcd
import fulltext search from Braunch:PDFSearchFeature
LinusDietz Apr 20, 2017
d51919e
Merge branch 'master' of github.com:JabRef/jabref into lucene
LinusDietz May 10, 2017
692316b
Refactored the PDFContentReader
LinusDietz May 10, 2017
dacbc3c
Put DocumentReader under test
LinusDietz May 10, 2017
2663798
Use current Lucene Release, Indexing is working now.
LinusDietz May 10, 2017
7086a8f
Revised package structure
LinusDietz May 12, 2017
444bd39
added stem analysis
LinusDietz May 12, 2017
8547873
Implemented Searcher Test
LinusDietz May 12, 2017
c9cf893
Improve Naming
LinusDietz May 12, 2017
9eaa390
fix build
LinusDietz May 12, 2017
c83255b
First round of Feedback
LinusDietz May 14, 2017
25cdaff
Second round of Feedback
LinusDietz May 14, 2017
b7bfc9a
Removed PDF Creator from the Search index, add the UID to the search …
LinusDietz May 14, 2017
9a71cac
fixed failing test
LinusDietz May 15, 2017
aa76b72
Merge branch 'master' of https://github.com/JabRef/jabref into lucene
LinusDietz Aug 29, 2017
aed867d
update Lucene from 6.5.1 -> 6.6.0
LinusDietz Aug 29, 2017
a9a9486
integrate indexing into Jabref
LinusDietz Aug 29, 2017
b32af3c
Merge branch 'master' of https://github.com/JabRef/jabref into lucene
LinusDietz Oct 4, 2017
e64c3da
Merge branch 'master' of https://github.com/JabRef/jabref into lucene
LinusDietz Dec 20, 2017
9cd3df3
Update lucene from 6.6.0 -> 7.1.0
LinusDietz Dec 20, 2017
a12555a
Merge branch 'master' into lucene
LinusDietz Jan 23, 2018
e2662cd
Merge branch 'master' of https://github.com/JabRef/jabref into lucene
LinusDietz Jan 23, 2018
650c050
Merge branch 'lucene' of https://github.com/JabRef/jabref into lucene
LinusDietz Jan 23, 2018
a6fad30
Update lucene from 7.1.0 -> 7.2.1
LinusDietz Jan 23, 2018
74c90bd
Merge branch lucene of https://github.com/JabRef/jabref into lucene
btut Jun 16, 2021
6d9ba0b
Lucene dependencies
btut Jun 16, 2021
1a53c47
First shot for integrating lucene
koppor Jun 16, 2021
71a9cb0
Fix and document dependencies
btut Jun 18, 2021
4125ee8
Update to lucene 8.8.2
btut Jun 18, 2021
dd698ff
Checkstyle
btut Jun 18, 2021
e247540
Added fulltext-search button to GlobalSearchBar
btut Jun 18, 2021
63c2abf
Fixed typo
btut Jun 18, 2021
f6e2058
Update to lucene 8.9.0
btut Jun 21, 2021
3ba53be
Added update/remove from index for individual entries/files
btut Jun 22, 2021
4f5c8ed
Started integrating indexer
btut Jun 23, 2021
596525f
Ignore lucene index in git
btut Jun 23, 2021
516d1ca
Fixed tests
btut Jun 23, 2021
4de4dbe
Checkstyle
btut Jun 23, 2021
d83ae36
First draft of search-results tab
btut Jun 23, 2021
a32bec7
Merge branch 'main' of github.com:JabRef/jabref into lucene
btut Jun 28, 2021
addf51e
No tabs in build.gradle
btut Jun 29, 2021
4a3d3c8
Code cleanup
btut Jun 29, 2021
cb30b3b
Added highlighting dependency
btut Jun 29, 2021
8c9bf3c
Highlighted search results
btut Jun 29, 2021
651ef23
Added lib to access local app-data path
btut Jul 5, 2021
7804060
SearchRules only for single DatabaseContext
btut Jul 5, 2021
2c7dc25
Before reverting
btut Jul 5, 2021
148b4f6
Revert "SearchRules only for single DatabaseContext"
btut Jul 5, 2021
edc6fe4
Better SearchRules only for single DatabaseContext
btut Jul 5, 2021
2557fee
Fixed document duplication on update
btut Jul 5, 2021
b4a9c7a
Only write files if index is out of date
btut Jul 5, 2021
786828d
Use globals instead of passing BibDatabaseContext
btut Jul 6, 2021
25c3a40
Fixed tests
btut Jul 7, 2021
4646ab4
Checkstyle
btut Jul 7, 2021
67a7d0a
Ignore index created during tests in git
btut Jul 7, 2021
90ee454
Removed unused localization key
btut Jul 7, 2021
181e382
Listen for changes that concern the index
btut Jul 7, 2021
86d73d0
Added menu-item to rebuild index
btut Jul 7, 2021
a8781ae
Allow SearchRules to access Globals
btut Jul 8, 2021
85beb5c
Moved access to Globals to constructor
btut Jul 8, 2021
c7f3bc8
Removed most metadata from index
btut Jul 8, 2021
e511cbd
Do indexing per Library-tab
btut Jul 8, 2021
b61e14f
Removed tests for Metadata in DocumentReader
btut Jul 8, 2021
ebd9313
Consider cases where there is no open database
btut Jul 8, 2021
cffe5b5
Actually consider fulltext results in search predicate
btut Jul 8, 2021
4f546d1
Applied theme to search-results tab
btut Jul 8, 2021
8c3e76c
Files can be opened from the search-results tab
btut Jul 8, 2021
85a048b
Merge branch 'main' of github.com:JabRef/jabref into lucene
btut Jul 8, 2021
030d5b3
Fixed merge-artifact
btut Jul 8, 2021
acea590
Fixed file-type filter in indexer
btut Jul 8, 2021
18aac31
Changelog entry
btut Jul 8, 2021
6d40f7a
Fixed file types in tests
btut Jul 8, 2021
da3ed1e
Checkstyle
btut Jul 8, 2021
99decc1
Fixed benchmarks
btut Jul 8, 2021
5bde996
Removed spaces in CHANGELOG
btut Jul 8, 2021
a4cd102
Removed unecessary code in shadow-jar for lucene
btut Jul 9, 2021
856e20f
Removed weird endless recursion
btut Jul 9, 2021
33e6c4a
Fixed Typo in comment
btut Jul 9, 2021
27f201d
Use parseLong instead of valueOf
btut Jul 9, 2021
cfc5869
Rescoped BibDatabase variable
btut Jul 9, 2021
effbfdc
Use method instance
btut Jul 9, 2021
2e5818b
Remove unecessary return
btut Jul 9, 2021
adb195e
Remove unnecessary throws declaration
btut Jul 9, 2021
9e10107
Cleaner sort-predicate
btut Jul 9, 2021
8a8ac9c
Limit number of search result to 5
btut Jul 10, 2021
1bf636a
Apply suggestions from code review
btut Jul 10, 2021
989e266
Fixed logger formating
btut Jul 10, 2021
5f093d7
Add log for index-location
btut Jul 10, 2021
298a0d9
Log IO exception when rebuilding index from menu
btut Jul 10, 2021
2d711d4
Localized search results tab
btut Jul 10, 2021
2b879fa
Moved logging to slf4j
btut Jul 10, 2021
8afa013
Log search-result exception
btut Jul 10, 2021
e968483
Better naming for task queue pointer
btut Jul 10, 2021
63eb1a1
Replaced deprecated classes and methods
btut Jul 10, 2021
b500ba2
Removed unnecessary wrapping of unmodifyable list
btut Jul 10, 2021
a554a36
Simplify iteration over search results
btut Jul 10, 2021
36099a8
Use TempDir to store the index during tests
btut Jul 10, 2021
44c975f
Use EnumSet for flags instead of booleans
btut Jul 10, 2021
cb1c393
Delete out-of-date indices
btut Jul 10, 2021
99e723e
Fixed import order
btut Jul 10, 2021
82b9d69
Task-queue and better message for indexing task
btut Jul 10, 2021
cec2089
Remove empty line
koppor Jul 13, 2021
e69219b
Update src/main/java/org/jabref/gui/JabRefMain.java
koppor Jul 13, 2021
533331a
Merge branch 'main' into lucene
koppor Jul 13, 2021
bfd2582
Use literal JabRef for index path
btut Jul 14, 2021
46517ff
Apply suggestions from @koppor
btut Jul 14, 2021
9c887a3
Checkstyle
btut Jul 14, 2021
dc9f53d
Removed dead code
btut Jul 14, 2021
7601ce6
Removed more dead code
btut Jul 14, 2021
5dc3852
Add annotations to index
btut Jul 14, 2021
110601c
Checkstyle
btut Jul 14, 2021
964aa1f
Error handling when reading annotations
btut Jul 14, 2021
25d5733
Remove hardcoded appdata path
btut Jul 14, 2021
79ae32b
Refine toString
koppor Jul 14, 2021
e6055fc
Rename method
koppor Jul 14, 2021
18 changes: 12 additions & 6 deletions build.gradle
Original file line number Diff line number Diff line change
@@ -141,6 +141,12 @@ dependencies {
compile group: 'com.microsoft.azure', name: 'applicationinsights-core', version: '1.0.+'
compile group: 'com.microsoft.azure', name: 'applicationinsights-logging-log4j2', version: '1.0.+'

+ compile 'org.apache.lucene:lucene-core:6.5.1'
+ compile 'org.apache.lucene:lucene-queryparser:6.5.1'
+ compile 'org.apache.lucene:lucene-queries:6.5.1'
+ compile 'org.apache.lucene:lucene-analyzers-common:6.5.1'


testCompile 'junit:junit:4.12'
testCompile 'org.mockito:mockito-core:2.7.22'
testCompile 'com.github.tomakehurst:wiremock:2.6.0'
@@ -308,26 +314,26 @@ shadowJar {

// this is an adapter required for generating a fat jar with correct log4j2 output

- com.github.edwgiz.mavenShadePlugin.log4j2CacheTransformer.PluginsCacheFileTransformer target = new com.github.edwgiz.mavenShadePlugin.log4j2CacheTransformer.PluginsCacheFileTransformer();
+ com.github.edwgiz.mavenShadePlugin.log4j2CacheTransformer.PluginsCacheFileTransformer target = new com.github.edwgiz.mavenShadePlugin.log4j2CacheTransformer.PluginsCacheFileTransformer()

@Override
boolean canTransformResource(FileTreeElement element) {
- return target.canTransformResource(element.getPath());
+ return target.canTransformResource(element.getPath())
}

@Override
void transform(String path, InputStream is, List<com.github.jengelman.gradle.plugins.shadow.relocation.Relocator> relocators) {
- target.processResource(path, is, relocators);
+ target.processResource(path, is, relocators)
}

@Override
boolean hasTransformedResource() {
- return target.hasTransformedResource();
+ return target.hasTransformedResource()
}

@Override
void modifyOutputStream(org.apache.tools.zip.ZipOutputStream jos) {
- target.modifyOutputStream(jos);
+ target.modifyOutputStream(jos)
}
})
}
@@ -356,7 +362,7 @@ if (hasProperty('dev')) {
// In the context of github, the branch name could be something like "pull/277"
// "/" is an illegal character. To be safe, all illegal filename characters are replaced by "_"
// http://stackoverflow.com/a/15075907/873282 describes the used pattern.
- branchName = branchName.trim().replaceAll("[^a-zA-Z0-9.-]", "_");
+ branchName = branchName.trim().replaceAll("[^a-zA-Z0-9.-]", "_")

// hack string
// first the date (%cd), then the branch name, and finally the commit id (%h)
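The branch-name cleanup described in the comments of the last build.gradle hunk is a single regular-expression replacement. A minimal Java sketch of the same pattern follows; the helper class `BranchNameSanitizer` is hypothetical and not part of the build script.

```java
// Hypothetical helper, not part of build.gradle: it mirrors the replaceAll call in the hunk.
final class BranchNameSanitizer {

    private BranchNameSanitizer() {
    }

    // Replaces every character that is not an ASCII letter, digit, dot, or hyphen with "_".
    static String sanitize(String branchName) {
        return branchName.trim().replaceAll("[^a-zA-Z0-9.-]", "_");
    }

    public static void main(String[] args) {
        // A GitHub pull-request ref such as "pull/277" contains "/", which is illegal in file names.
        System.out.println(sanitize("pull/277")); // prints "pull_277"
    }
}
```

Inside the character class, the dot and the trailing hyphen are literals, so version-like names such as `v3.0-beta` pass through unchanged.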
4 changes: 2 additions & 2 deletions gradle/wrapper/gradle-wrapper.properties
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
- #Thu Apr 13 18:56:36 CEST 2017
+ #Fri May 12 13:26:57 CEST 2017
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
- distributionUrl=https\://services.gradle.org/distributions/gradle-3.5-bin.zip
+ distributionUrl=https\://services.gradle.org/distributions/gradle-3.5-all.zip
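94 changes: 94 additions & 0 deletions src/main/java/org/jabref/logic/pdf/search/indexing/DocumentReader.java
Original file line number Diff line number Diff line change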
@@ -0,0 +1,94 @@
package org.jabref.logic.pdf.search.indexing;

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.FieldName;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;
import org.apache.pdfbox.util.PDFTextStripper;

import static org.jabref.model.pdf.search.SearchFieldConstants.AUTHOR;
import static org.jabref.model.pdf.search.SearchFieldConstants.CONTENT;
import static org.jabref.model.pdf.search.SearchFieldConstants.CREATOR;
import static org.jabref.model.pdf.search.SearchFieldConstants.KEY;
import static org.jabref.model.pdf.search.SearchFieldConstants.KEYWORDS;
import static org.jabref.model.pdf.search.SearchFieldConstants.SUBJECT;

public final class DocumentReader {

private final BibEntry entry;
private final PDFTextStripper pdfTextStripper = new PDFTextStripper();

public DocumentReader(BibEntry bibEntry) throws IOException {
if (!bibEntry.getField(FieldName.FILE).isPresent()) {
throw new IllegalArgumentException("The file field must not be absent when trying to read the document!");
}

this.entry = bibEntry;
pdfTextStripper.setLineSeparator("\n");
}

/**
* Reads the content and metadata from a pdf file
*/
public Document readPdfContents() throws IOException {
Path pdfPath = Paths.get(this.entry.getField(FieldName.FILE).get());

try (PDDocument pdfDocument = PDDocument.load(pdfPath.toFile())) {
Document newDocument = new Document();
addKeyIfPresent(newDocument);
addContentIfNotEmpty(pdfDocument, newDocument);
addMetaData(pdfDocument, newDocument);
return newDocument;
} catch (IOException e) {
throw new IOException("Could not read pdf file: " + pdfPath + "!", e);
}
}

private void addMetaData(PDDocument pdfDocument, Document newDocument) {
PDDocumentInformation info = pdfDocument.getDocumentInformation();
addStringField(newDocument, AUTHOR, info.getAuthor());
addStringField(newDocument, CREATOR, info.getCreator());
addStringField(newDocument, SUBJECT, info.getSubject());
addTextField(newDocument, KEYWORDS, info.getKeywords());
}

private void addTextField(Document newDocument, String field, String value) {
if (!isValidField(value)) {
return;
}
newDocument.add(new TextField(field, value, Field.Store.YES));
}

private void addStringField(Document newDocument, String field, String value) {
if (!isValidField(value)) {
return;
}
newDocument.add(new StringField(field, value, Field.Store.YES));
}

private boolean isValidField(String value) {
return !(value == null || value.trim().isEmpty());
}

private void addContentIfNotEmpty(PDDocument pdfDocument, Document newDocument) throws IOException {
String pdfContent = pdfTextStripper.getText(pdfDocument);
if (!pdfContent.trim().isEmpty()) {
newDocument.add(new TextField(CONTENT, pdfContent, Field.Store.YES));
}
}

private void addKeyIfPresent(Document newDocument) {
if (this.entry.getCiteKeyOptional().isPresent()) {
newDocument.add(new StringField(KEY, this.entry.getCiteKeyOptional().get(), Field.Store.YES));
}
}
}
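DocumentReader adds a field only when its value is non-null and non-blank, and adds the key only when the entry has one. The sketch below reproduces these guards in plain Java. It is an illustration under assumptions: a `Map<String, String>` stands in for the Lucene `Document`, the class name `FieldGuards` and the field name `"key"` are hypothetical, and neither Lucene nor PDFBox is called.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in: a Map plays the role of the Lucene Document.
final class FieldGuards {

    private FieldGuards() {
    }

    // Same check as isValidField in the diff: rejects null and whitespace-only values.
    static boolean isValidField(String value) {
        return !(value == null || value.trim().isEmpty());
    }

    // Stores the field only if the value passes the check.
    static void addIfValid(Map<String, String> document, String field, String value) {
        if (isValidField(value)) {
            document.put(field, value);
        }
    }

    // Stores the cite key if there is one, without calling Optional.get() after isPresent().
    static void addKeyIfPresent(Map<String, String> document, Optional<String> citeKey) {
        citeKey.ifPresent(key -> document.put("key", key)); // "key" is an assumed field name
    }

    public static void main(String[] args) {
        Map<String, String> document = new LinkedHashMap<>();
        addIfValid(document, "author", "Jane Doe");
        addIfValid(document, "subject", "   "); // skipped: blank
        addIfValid(document, "keywords", null); // skipped: null
        addKeyIfPresent(document, Optional.of("Doe2017"));
        System.out.println(document);
    }
}
```

Using `Optional.ifPresent` keeps the presence check and the value access in one expression, which avoids the repeated `getCiteKeyOptional()` call seen in `addKeyIfPresent`.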
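27 changes: 27 additions & 0 deletions src/main/java/org/jabref/logic/pdf/search/indexing/EnglishStemAnalyzer.java
Original file line number Diff line number Diff line change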
@@ -0,0 +1,27 @@
package org.jabref.logic.pdf.search.indexing;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.DecimalDigitFilter;
import org.apache.lucene.analysis.core.StopAnalyzer;
import org.apache.lucene.analysis.en.PorterStemFilter;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class EnglishStemAnalyzer extends Analyzer {

@Override
protected TokenStreamComponents createComponents(String fieldName) {
Tokenizer source = new StandardTokenizer();
TokenStream filter = new StandardFilter(source);
filter = new LowerCaseFilter(filter);
filter = new StopFilter(filter, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
filter = new DecimalDigitFilter(filter);
filter = new PorterStemFilter(filter);
return new TokenStreamComponents(source, filter);
}
}
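EnglishStemAnalyzer chains a tokenizer with lowercase, stop-word, digit, and stemming filters, each wrapping the previous stage. As a rough plain-Java illustration of that staged design (this is not Lucene code), the sketch below applies only tokenizing, lowercasing, and stop-word removal. The class name `SimpleAnalysisChain`, the regex-based tokenizer, and the abbreviated stop-word set are assumptions; stemming and digit normalization are left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration of a staged analysis chain; it does not use Lucene.
final class SimpleAnalysisChain {

    // Assumed, abbreviated stop-word set; the set used by Lucene is larger.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "of", "the", "to"));

    private SimpleAnalysisChain() {
    }

    static List<String> analyze(String text) {
        return Arrays.stream(text.split("[^\\p{L}\\p{N}]+"))       // stage 1: split on non-alphanumerics
                .filter(token -> !token.isEmpty())                  // drop the empty token a leading separator yields
                .map(token -> token.toLowerCase(Locale.ROOT))       // stage 2: lowercase
                .filter(token -> !STOP_WORDS.contains(token))       // stage 3: remove stop words
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Indexing of PDF files")); // prints "[indexing, pdf, files]"
    }
}
```

The order of the stages matters in the same way as in the analyzer: lowercasing runs before stop-word removal, so "The" is matched against the lowercase stop-word set.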

90 changes: 90 additions & 0 deletions src/main/java/org/jabref/logic/pdf/search/indexing/PdfIndexer.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
package org.jabref.logic.pdf.search.indexing;

import java.io.IOException;
import java.nio.file.Paths;

import org.jabref.model.database.BibDatabase;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.FieldName;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.SimpleFSDirectory;


/**
* Indexes the text of pdf files and adds it into the lucene index.
*/
public class PdfIndexer {
private static final Log LOGGER = LogFactory.getLog(PdfIndexer.class);

private final Directory directoryToIndex;

public PdfIndexer() throws IOException {
this.directoryToIndex = new SimpleFSDirectory(Paths.get("src/main/resources/luceneIndex"));
}

public Directory getIndexDirectory() {
return this.directoryToIndex;
}

/**
* Adds all PDF files linked to entries in the database to a new Lucene search index
*
* @param database a bibtex database to link the pdf files to
*/
public void createIndex(BibDatabase database) {
try (IndexWriter indexWriter = new IndexWriter(directoryToIndex,
new IndexWriterConfig(new EnglishStemAnalyzer()).setOpenMode(IndexWriterConfig.OpenMode.CREATE))) {
database.getEntries().stream().
filter(entry -> entry.hasField(FieldName.FILE)).
filter(entry -> entry.getCiteKeyOptional().isPresent()).
forEach(entry -> writeToIndex(entry, indexWriter));
} catch (IOException e) {
LOGGER.warn(e.getMessage());
}
}

/**
* Adds all the pdf files linked to one entry in the database to an existing (or new) Lucene search index
*
* @param entry a bibtex entry to link the pdf files to
*/
public void addToIndex(BibEntry entry) {
try (IndexWriter indexWriter = new IndexWriter(directoryToIndex,
new IndexWriterConfig(new EnglishStemAnalyzer()).setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND))) {

if (entry.hasField(FieldName.FILE) && entry.getCiteKeyOptional().isPresent()) {
writeToIndex(entry, indexWriter);
}
} catch (IOException e) {
LOGGER.warn(e.getMessage());
}
}

/**
* Deletes all entries from the Lucene search index.
*/
public void flushIndex() {

IndexWriterConfig config = new IndexWriterConfig();
config.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
try (IndexWriter deleter = new IndexWriter(directoryToIndex, config)) {
// Nothing to do: opening the writer in CREATE mode replaces any existing index.
} catch (IOException e) {
LOGGER.warn(e.getMessage());
}
}

private void writeToIndex(BibEntry entry, IndexWriter indexWriter) {
try {
indexWriter.addDocument(new DocumentReader(entry).readPdfContents());
} catch (IOException e) {
LOGGER.debug("Document could not be added to the index.", e);
}
}
}
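`createIndex` only passes an entry to the index writer when the entry has both a file field and a cite key. The sketch below shows that selection step in isolation. The `Entry` class is a hypothetical, simplified stand-in for `BibEntry`, and `IndexableEntries` is an illustrative name, not part of the pull request.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical illustration of the filter step in createIndex; no Lucene involved.
final class IndexableEntries {

    // Simplified stand-in for BibEntry with only the two fields the filter looks at.
    static final class Entry {
        final Optional<String> file;
        final Optional<String> citeKey;

        Entry(String file, String citeKey) {
            this.file = Optional.ofNullable(file);
            this.citeKey = Optional.ofNullable(citeKey);
        }
    }

    private IndexableEntries() {
    }

    // Keeps only the entries that have a linked file and a cite key.
    static List<Entry> selectIndexable(List<Entry> entries) {
        return entries.stream()
                .filter(entry -> entry.file.isPresent())
                .filter(entry -> entry.citeKey.isPresent())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("paper.pdf", "Doe2017"),
                new Entry(null, "NoFile2017"),
                new Entry("orphan.pdf", null));
        System.out.println(selectIndexable(entries).size()); // prints "1"
    }
}
```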
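64 changes: 64 additions & 0 deletions src/main/java/org/jabref/logic/pdf/search/retrieval/PdfSearcher.java
Original file line number Diff line number Diff line change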
@@ -0,0 +1,64 @@
package org.jabref.logic.pdf.search.retrieval;

import java.io.IOException;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.List;
import java.util.Objects;

import org.jabref.logic.pdf.search.indexing.EnglishStemAnalyzer;
import org.jabref.model.pdf.search.PdfSearchResults;
import org.jabref.model.pdf.search.SearchResult;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.SimpleFSDirectory;

import static org.jabref.model.pdf.search.SearchFieldConstants.PDF_FIELDS;

public final class PdfSearcher {
private static final Log LOGGER = LogFactory.getLog(PdfSearcher.class);

private final Directory indexDirectory;

public PdfSearcher() throws IOException {
this.indexDirectory = new SimpleFSDirectory(Paths.get("src/main/resources/luceneIndex"));
}

/**
* Search for results matching a query in the Lucene search index
*
* @param searchString a pattern to search for matching entries in the index, must not be null
* @param maxHits maximum number of search results, must be positive
* @return a result set of all documents that have matches in any fields
*/
public PdfSearchResults search(String searchString, int maxHits) throws IOException {
if (Objects.requireNonNull(searchString, "The search string was null!").isEmpty()) {
return new PdfSearchResults();
}
if (maxHits <= 0) {
throw new IllegalArgumentException("Must be called with at least 1 maxHits, was " + maxHits);
}

try {
List<SearchResult> resultDocs = new LinkedList<>();

IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(indexDirectory));
Query query = new MultiFieldQueryParser(PDF_FIELDS, new EnglishStemAnalyzer()).parse(searchString);
for (ScoreDoc scoreDoc : searcher.search(query, maxHits).scoreDocs) {
resultDocs.add(new SearchResult(searcher, scoreDoc));
}
return new PdfSearchResults(resultDocs);
} catch (ParseException e) {
LOGGER.warn("Could not parse query: '" + searchString + "'! \n" + e.getMessage());
return new PdfSearchResults();
}
}
}
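The argument checks at the top of `search` run in a fixed order: a null query throws, an empty query short-circuits to an empty result, and only then is `maxHits` validated. The sketch below isolates that order in plain Java. `SearchArguments` and `shouldSearch` are hypothetical names, and the boolean return value stands in for "run the Lucene query" versus "return an empty result".

```java
import java.util.Objects;

// Hypothetical helper that mirrors the guard clauses of PdfSearcher.search.
final class SearchArguments {

    private SearchArguments() {
    }

    // Returns false for an empty query (nothing to search), true if the search should run.
    static boolean shouldSearch(String searchString, int maxHits) {
        if (Objects.requireNonNull(searchString, "The search string was null!").isEmpty()) {
            return false; // an empty query never reaches the maxHits check
        }
        if (maxHits <= 0) {
            throw new IllegalArgumentException("Must be called with at least 1 maxHits, was " + maxHits);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldSearch("", 0));       // prints "false"
        System.out.println(shouldSearch("lucene", 5)); // prints "true"
    }
}
```

One consequence of this order is that an empty query combined with an invalid `maxHits` does not raise an exception.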
45 changes: 45 additions & 0 deletions src/main/java/org/jabref/model/pdf/search/PdfSearchResults.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
package org.jabref.model.pdf.search;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public final class PdfSearchResults {

private final List<SearchResult> searchResults;

public PdfSearchResults(List<SearchResult> search) {
this.searchResults = Collections.unmodifiableList(search);
}

public PdfSearchResults() {
this.searchResults = Collections.emptyList();
}

public List<SearchResult> getSortedByScore() {
List<SearchResult> sortedList = new ArrayList<>(searchResults);
// Ascending order: the result with the lowest Lucene score comes first.
sortedList.sort((first, second) -> Double.compare(first.getLuceneScore(), second.getLuceneScore()));
return Collections.unmodifiableList(sortedList);
}

private List<SearchResult> getSortedByAlphabet() {
//TODO implement sorting
throw new RuntimeException("Not implemented");
}

public List<SearchResult> getSearchResults() {
return this.searchResults;
}

public int numSearchResults() {
return this.searchResults.size();
}
}
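The comparator in `getSortedByScore` orders results ascending, so the lowest score comes first. If callers want the best match first instead (an assumption, not something this diff states), a descending sort can be sketched with `Comparator`. The `Hit` class is a hypothetical stand-in for `SearchResult`, and `ScoreSorting` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of sorting search hits by score, highest first.
final class ScoreSorting {

    // Simplified stand-in for SearchResult: a key plus a relevance score.
    static final class Hit {
        final String key;
        final float score;

        Hit(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    private ScoreSorting() {
    }

    // Returns a sorted copy; the input list is left untouched.
    static List<Hit> sortedByScoreDescending(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit hit) -> hit.score).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(new Hit("a", 0.4f), new Hit("b", 1.7f), new Hit("c", 0.9f));
        for (Hit hit : sortedByScoreDescending(hits)) {
            System.out.println(hit.key + " " + hit.score);
        }
    }
}
```

Sorting a copy keeps the stored result list unchanged, matching the defensive copy made in `getSortedByScore`.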