OCR integration #13313

Draft
wants to merge 4 commits into
base: main
from

Conversation

Kaan0029
Contributor

Closes #13267

In about one to three sentences, describe the changes you have made: what, where, why, ...

Steps to test

Describe how reviewers can test this fix/feature. Ideally, think of how you would guide a beginner user of JabRef to try out your change.
You can add screenshots or videos (using Loom or by just adding .mp4 files).

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • [.] Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • [.] Tests created for changes (if applicable)
  • [.] Manually tested changed features in running JabRef (always required)
  • [.] Screenshots added in PR description (if change is visible to the user)
  • [.] Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • [.] Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

@calixtus calixtus changed the title Initial implementation using tess4j OCR integration Jun 12, 2025
@jabref-machine
Collaborator

Note that your PR will not be reviewed/accepted until you have gone through the mandatory checks in the description and marked each of them exactly in the format of [x] (done), [ ] (not done yet) or [/] (not applicable).

Comment on lines 53 to 56
File pdfFile = pdfPath.toFile();
if (!pdfFile.exists()) {
    throw new OcrException("PDF file does not exist: " + pdfPath);
}
Member

@jabref-machine
Collaborator

JUnit tests of jabsrv are failing. You can see which checks are failing by locating the box "Some checks were not successful" on the pull request page. To see the test output, locate "Tests / Unit tests (pull_request)" and click on it.

You can then run these tests in IntelliJ to reproduce the failing tests locally. We offer a quick test running howto in the section Final build system checks in our setup guide.

 * Currently uses Tesseract with English language support.
 */
public OcrService() {
    if (Platform.isMac()) {
Member

 * This exception wraps lower-level OCR engine exceptions to provide
 * a consistent interface for error handling throughout JabRef.
 */
public class OcrException extends Exception {
Member

@subhramit subhramit Jun 12, 2025

I like the idea; the only drawback I see is that this may be hard to maintain or stay consistent with as the project grows, or if external contributors wish to add something...

Member

Best is to avoid exceptions being thrown at all, especially for expected possible stupid user behaviour. And if they are being thrown, keep them as informative as possible. The consistent interface is Exception at the end either way.

Member

@subhramit subhramit Jun 13, 2025

Backstory: This was based on my experience in OO - all the JStyle classes and the OO GUI class had a custom error wrapper, OOError. There was also OOResult and OOVoidResult. I still don't know much about them, but OOError, apart from just wrapping exceptions, also acted as an interface for localizing and displaying error messages. Obviously, when I was new, I found these hard to use. So I started by using native exceptions as thrown by the library methods during the CSL project.
It was convenient, and it worked, and people were fine with the inconsistency, so now 50% of OO uses that and 50% doesn't. I mentioned this as some free-time refactoring in #11829, but never had the energy to change it.

Comment on lines +16 to +18
public static OcrResult success(String text) {
    return new OcrResult(true, text, null);
}

Factory method passes null explicitly as a parameter. This violates the principle of not passing null to methods. Consider using Optional or restructuring to avoid null parameter.

Comment on lines +20 to +22
public static OcrResult failure(String errorMessage) {
    return new OcrResult(false, null, errorMessage);
}

Factory method passes null explicitly as a parameter. This violates the principle of not passing null to methods. Consider using Optional or restructuring to avoid null parameter.

Member

Optional should only be a return type
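
A minimal sketch of what "Optional only as a return type" could look like for OcrResult, assuming the fields stay plain nullable Strings internally and only the getters expose Optional. The getter names and the way isSuccess() is derived are illustrative assumptions, not code from this PR:

```java
import java.util.Optional;

// Hypothetical variant of OcrResult for illustration; not the class from this PR.
final class OcrResult {
    private final String text;          // null when OCR failed
    private final String errorMessage;  // null when OCR succeeded

    private OcrResult(String text, String errorMessage) {
        this.text = text;
        this.errorMessage = errorMessage;
    }

    static OcrResult success(String text) {
        return new OcrResult(text, null);
    }

    static OcrResult failure(String errorMessage) {
        return new OcrResult(null, errorMessage);
    }

    boolean isSuccess() {
        return errorMessage == null;
    }

    // Optional appears only in return types, never as a field or parameter.
    Optional<String> getText() {
        return Optional.ofNullable(text);
    }

    Optional<String> getErrorMessage() {
        return Optional.ofNullable(errorMessage);
    }
}
```

Note that the private constructor still receives null here, so this only addresses the Optional remark, not the null-passing one.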

Comment on lines +24 to +25
// The OCR engine instance
private final Tesseract tesseract;

The comment is trivial and can be derived directly from the code. It doesn't add any new information about the implementation or reasoning behind using Tesseract.

// For now, we'll use a relative path that works during development
tesseract.setDatapath("tessdata");
for (String path : possiblePaths) {
    File tessdata = new File(path, "tessdata");
Member

Use java.nio with Path and Files.exists(...): https://docs.oracle.com/javase/tutorial/essential/io/legacy.html
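
A rough sketch of the java.nio version of this lookup, assuming the candidate locations are available as a list of Path objects. The class name TessdataLocator and the method findTessdata are hypothetical and not part of this PR:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; names are not taken from the PR.
final class TessdataLocator {
    private TessdataLocator() {
    }

    /** Returns the first "tessdata" directory found under the candidates, if any. */
    static Optional<Path> findTessdata(List<Path> candidates) {
        return candidates.stream()
                         .map(candidate -> candidate.resolve("tessdata"))
                         .filter(Files::isDirectory) // true only if it exists and is a directory
                         .findFirst();
    }
}
```

Returning an Optional<Path> leaves it to the caller to decide how a missing directory is reported.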

private final String text;
private final String errorMessage;

private OcrResult(boolean success, String text, String errorMessage) {
Member

        Localization.lang("OCR failed"),
        exception.getMessage()
    );
})
Member

Add .showToUser(true) before this; then it will be shown in the UI task list.

Comment on lines +10 to +17
private OcrResult(boolean success, String text, String errorMessage) {
    this.success = success;
    this.text = text;
    this.errorMessage = errorMessage;
}

public static OcrResult success(String text) {
    return new OcrResult(true, text, null);
Member

There should be a better way to think about the constructor parameters. Maybe have different constructors to avoid passing null (as both success and failure are very expected cases)?

Member

@Siedlerchr Siedlerchr Jun 19, 2025

The other idea is to model a sealed class hierarchy with success/failure as simple records, and then you can do pattern matching with switch ...
https://medium.com/@sandip.v.salunkhe/java-record-and-sealed-classes-features-to-enhance-modelling-data-35705c571f70
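
For illustration, the sealed-hierarchy idea might look roughly like this. It assumes Java 21 or later for pattern matching in switch; the record names Success and Failure and the helper OcrResultDemo.describe are hypothetical, not code from this PR:

```java
// Hypothetical modelling of the OCR outcome; names are assumptions for this sketch.
sealed interface OcrResult permits OcrResult.Success, OcrResult.Failure {
    record Success(String text) implements OcrResult {
    }

    record Failure(String errorMessage) implements OcrResult {
    }
}

final class OcrResultDemo {
    private OcrResultDemo() {
    }

    static String describe(OcrResult result) {
        // The switch is exhaustive because OcrResult is sealed, so no default branch is needed.
        return switch (result) {
            case OcrResult.Success success -> "Recognized text: " + success.text();
            case OcrResult.Failure failure -> "OCR failed: " + failure.errorMessage();
        };
    }
}
```

With this shape, neither variant carries a field that does not apply to it, so no null has to be passed to a constructor.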

Member

Comment on lines +82 to +83
} else {
    throw new OcrException("Could not find tessdata directory. Please set TESSDATA_PREFIX environment variable.");
Member

@subhramit subhramit Jun 19, 2025

Avoid logical branching to throw exceptions.
Can this be modeled with try-catch?
If not, and if this is an expected case, there should just be an error dialog shown.

Development

Successfully merging this pull request may close these issues.

GSOC meta issue: OCR Integration in JabRef
5 participants