8360037: Refactor ImageReader in preparation for Valhalla support #26054

david-beaumont · 2025-06-30T23:27:49Z

Refactoring ImageReader to make it easy to add preview mode functionality for Valhalla.

This PR is a large change to ImageReader (effectively a rewrite) but reduces the surface area of the API significantly, reduces code complexity and increases performance/memory efficiency. The need for this sort of change comes from the need to support "preview mode" in Valhalla for classes loaded from core modules.

Preview Mode

In the proposed preview mode support for Valhalla, a resource with the name /modules/<module>/<path> may come from one of two locations in the underlying jimage file; /<module>/<path> or /<module>/META-INF/preview/<path>. Furthermore, directories (represented as directory nodes via the API) will have a potentially different number of child nodes depending on whether preview mode is in effect, and some directories may only exist at all in preview mode.

Furthermore, the directories and symbolic link nodes in /packages/... will also be affected by the existence of new package directories. To retain consistency and avoid "surprises" later, all of this needs to be taken into account.

Below is a summary of the main changes for mainline JDK to better support preview mode later:

1: Improved encapsulation for preview mode

The current ImageReader code exposes the data from the jimage file in several ways; via the Node abstraction, but also with methods which return an ImageLocation instance directly. In preview mode it would not be acceptable to return the ImageLocation, since that would leak the existence of resources in the META-INF/preview namespace and lets users see resource nodes with names that don't match the underlying ImageLocation from which their contents come.

As such, the PR removes all methods which can leak ImageLocation or other "underlying" semantics about the jimage file. Luckily most of these are either used minimally and easily migrated to using nodes, or they were not being used at all. This PR also removes any methods which were already unused across the OpenJDK codebase (if I discover any issues with over-trimming of the API during full CI testing, it will be easy to address).

2. Simplification of directory child node handling

The current ImageReader code attempts to create parent directories "on demand" for any child nodes it creates. This results in parent directories having a non-empty but incomplete set of child nodes present. This is referred to as the directory being "incomplete" and requires users of the API to understand this and potentially "complete" these nodes by calling back into the ImageReader API.

This approach is an attempt to provide lazy node creation (which is a necessary trait since there are ~31,000 nodes in a typical jimage hierarchy) but isn't actually necessary since a child node has no "get parent" functionality and simply not creating parent nodes at the point is functionally equivalent.

The partially complete directory nodes also pose an issue for preview mode support, making it more complex to reason about where and when preview child nodes should be applied.

This PR simplifies directory node handling in two ways:

Directory nodes are either empty (incomplete) or have all child nodes present.
Directory nodes no longer have an API to return their (possibly incomplete) child nodes to users.

This results in simpler to reason about code for directory management and simpler changes later to support preview mode. Removing the on-demand parent directory creation and the partial child list management also has a noticeable performance improvement (based on the new ImageReaderBenchmark in pr/26044).

3. More comments and code assertions

Since ImageReader is a critical bit of internal tooling for OpenJDK, I made sure to add significant comments explaining the behaviour, as well as adding many new in-code assert statements. The version of ImageReader in this PR is 70 lines shorter than the current version, but if you account for new comments, it's really a reduction of almost 40% (640 -> 400 lines of code) and has over 200 more comment lines.

4. New tests

I added a new unit test for ImageReader since, until now, its API was not really being tested directly. These tests are useful now, but will really help when preview mode is added in Valhalla, since then there will need to be significant edge-case testing. I'm happy to improve or change tests in this PR, but they definitely cover the main cases.

5. Adding TODO comments for review discussion.

There are obviously some open questions about the exact design of the APIs, and some questions around behaviour. To make reviewing easier, I've added inline TODOs which are there to illicit feedback. I will account for an remove all of these before the PR is integrated, but I want reviewers to read and comment on them (even if there is no change expected).

Performance Improvements

On my laptop (not objectively interesting, but good for comparison) I am seeing a significant performance improvement over all benchmarks and a reduction in timing variability. These benchmarks are awkward because of the need to test things in a "cold start" state, so timings will have larger variability than expected, but the performance improvements are consistent and non trivial:

Run with make test TEST="micro:jdk.internal.jrtfs.ImageReaderBenchmark" using the benchmark currently being added in pr/26044:

Current version:

Benchmark                                            Mode  Cnt   Score    Error  Units
ImageReaderBenchmark.warmCache_CountAllNodes         avgt    5   0.784 ±  0.023  ms/op
ImageReaderBenchmark.coldStart_CountOnly               ss    5  37.910 ± 28.554  ms/op
ImageReaderBenchmark.coldStart_InitAndCount            ss    5  37.471 ± 18.775  ms/op
ImageReaderBenchmark.coldStart_LoadJavacInitClasses    ss    5   4.279 ±  7.959  ms/op

This PR:

Benchmark                                            Mode  Cnt   Score   Error  Units
ImageReaderBenchmark.warmCache_CountAllNodes         avgt   10   0.880 ± 0.086  ms/op
ImageReaderBenchmark.coldStart_CountOnly               ss   10  14.303 ± 3.975  ms/op
ImageReaderBenchmark.coldStart_InitAndCount            ss   10  14.032 ± 3.068  ms/op
ImageReaderBenchmark.coldStart_LoadJavacInitClasses    ss   10   2.530 ± 0.348  ms/op

Shows a >2.5x speedup for traversing all nodes in a "cold start" state, and ~1.7x speedup for loading the core set of classes needed to start javac.

Progress

Change must not contain extraneous whitespace
Commit message must refer to an issue
Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Integration blocker

⚠️ Dependency #26044 must be integrated first

Issue

JDK-8360037: Refactor ImageReader in preparation for Valhalla support (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26054/head:pull/26054
$ git checkout pull/26054

Update a local copy of the PR:
$ git checkout pull/26054
$ git pull https://git.openjdk.org/jdk.git pull/26054/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 26054

View PR using the GUI difftool:
$ git pr show -t 26054

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26054.diff

Using Webrev

Link to Webrev Comment

* Stacking JDK-8359808 (dependent PR) changes on top temporarily. * Removing the last misleading/incorrect API from ImageReader. * WIP - looking at adding tests. * Removed all APIs which leak ImageLocation to ImageReader users.

* Tidying comments, removing dead methods and adding more todos for review discussion. * Moving and merging methods for better ordering, and tidying comments. * Tidying tests. * Reworking to fix symbolic link node creation. * Minor tidy up of comments. * Updating tests for new API. * Stacking JDK-8359808 (dependent PR) changes on top temporarily.

bridgekeeper · 2025-06-30T23:28:31Z

👋 Welcome back david-beaumont! A progress list of the required criteria for merging this PR into pr/26044 will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-06-30T23:28:38Z

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

openjdk · 2025-06-30T23:28:58Z

@david-beaumont this pull request can not be integrated into pr/26044 due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout jdk_8360037_reader
git fetch https://git.openjdk.org/jdk.git pr/26044
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge pr/26044"
git push

openjdk · 2025-06-30T23:29:18Z

@david-beaumont The following labels will be automatically applied to this pull request:

build
core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

david-beaumont

Anything marked "TODO" is for discussion during the coming review. I don't intend to leave any of these in the code after, but they raise specific issues I'd like address.

david-beaumont · 2025-06-30T23:34:23Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

@@ -92,143 +131,93 @@ private void requireOpen() {
        }
    }

-    // directory management interface


A large number of the methods below were either:

Unused or effectively no-ops.

Breaking encapsulation and leaking underlying types such as ImageLocation (not a big problem now, but prevents cleanly implementing the preview mode logic for Valhalla).

The new API has essentially 3 top level methods here:

findNode()

getResource(Node)

getResourceBuffer(Node)

Any navigation of the node hierarchy is done via getChildNames() and unlike now, no user can obtain a reference to an "incomplete" node.

david-beaumont

With these pre-preview comments, I think this is finally ready for review.

david-beaumont · 2025-07-01T13:47:29Z

make/test/BuildMicrobenchmark.gmk

Ignore this file, it's part of the PR to add the benchmark. I'll merge and sort everything out once that's in.

david-beaumont · 2025-07-01T13:49:21Z

src/java.base/share/classes/jdk/internal/jrtfs/ExplodedImage.java

@@ -119,9 +119,9 @@ byte[] getContent() throws IOException {
        }

        @Override
-        public List<Node> getChildren() {


ExplodedImage makes a custom subclass of Node to "fake" the ImageReader behaviour. While this will need to change for preview mode, for now it's enough to just modify the one affected API.

david-beaumont · 2025-07-01T13:50:20Z

src/java.base/share/classes/jdk/internal/jrtfs/JrtFileAttributes.java

@@ -49,7 +49,7 @@ final class JrtFileAttributes  implements BasicFileAttributes {
    //-------- basic attributes --------
    @Override
    public FileTime creationTime() {
-        return node.creationTime();


There's no benefit to having all these one-line getter methods duplicated in ImageReader since it already provides the file attributes object directly.

david-beaumont · 2025-07-01T13:51:49Z

src/java.base/share/classes/jdk/internal/jrtfs/JrtFileSystem.java

@@ -225,19 +223,19 @@ Iterator<Path> iteratorOf(JrtPath path, DirectoryStream.Filter<? super Path> fil
            throw new NotDirectoryException(path.getName());
        }
        if (filter == null) {
-            return node.getChildren()


Same logic, just based on the child names instead of the nodes.

david-beaumont · 2025-07-01T13:54:09Z

src/java.base/share/classes/jdk/internal/jrtfs/SystemImage.java

@@ -58,7 +58,6 @@ static SystemImage open() throws IOException {
        if (modulesImageExists) {
            // open a .jimage and build directory structure
            final ImageReader image = ImageReader.open(moduleImageFile);
-            image.getRootDirectory();


This method no longer serves any purpose (it used to initialize the root directory entries, but it's not needed now.

david-beaumont · 2025-07-01T14:05:16Z

test/jdk/tools/jimage/ImageReaderDuplicateChildNodesTest.java

Just uses child names rather than nodes.

david-beaumont · 2025-07-01T14:07:36Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

I think that this class is best reviewed as if it were a new implementation, rather than trying to reason about the specific changes between the versions. Hopefully the comments will make the design clear and the reduced complexity/lines-of-code will help it be understood in a more stand-alone way.

david-beaumont · 2025-07-01T14:08:51Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

-        return reader.getHeader();
-    }
-
+    /**


This method and the getResourceBuffer() method below are, I think, only called in one place and could, in theory, be factored out of this class altogether.

david-beaumont · 2025-07-01T14:09:49Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

@@ -263,8 +252,8 @@ public void close(ImageReader image) throws IOException {

                if (openers.isEmpty()) {
                    close();
+                    // TODO (review note): Should this be synchronized by "this" ??


I genuinely this this might be an existing issue.

david-beaumont · 2025-07-01T14:11:15Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

        protected Node(String name, BasicFileAttributes fileAttrs) {
            this.name = Objects.requireNonNull(name);
            this.fileAttrs = Objects.requireNonNull(fileAttrs);
        }

-        /**


Most of these were either never called or no longer needed in the new implementation.

mlbridge · 2025-07-01T14:18:18Z

Webrevs

01: Full - Incremental (29633a12)
00: Full (d8efaa09)

AlanBateman · 2025-07-01T14:22:13Z

/reviewers 2

openjdk · 2025-07-01T14:22:39Z

@AlanBateman
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

AlanBateman · 2025-07-01T14:29:28Z

src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java

 import jdk.internal.jimage.ImageReader;
 import jdk.internal.jimage.ImageReaderFactory;
 import jdk.internal.access.JavaNetUriAccess;
 import jdk.internal.access.SharedSecrets;
 import jdk.internal.util.StaticProperty;
 import jdk.internal.module.ModuleHashes.HashSupplier;

+import static java.util.Objects.requireNonNull;


The existing Objects.requireNonNull are okay, no need to change them all as it is nothing to do with the changes.

AlanBateman · 2025-07-01T14:29:51Z

test/micro/org/openjdk/bench/jdk/internal/jrtfs/ImageReaderBenchmark.java

+@State(Scope.Benchmark)
+@OutputTimeUnit(TimeUnit.MILLISECONDS)
+@Fork(value = 1, jvmArgs = {"--add-exports", "java.base/jdk.internal.jimage=ALL-UNNAMED"})
+public class ImageReaderBenchmark {


This is the benchmark from the other PR, did you mean to include it?

AlanBateman · 2025-07-01T14:32:27Z

src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java

@@ -253,6 +248,26 @@ private static ModuleFinder ofModuleInfos() {
        return new SystemModuleFinder(mrefs, nameToModule);
    }

+    private static Stream<ModuleInfo.Attributes> getModuleAttributes() {


Can you rename this to allModuleAttributes and add a method description to match the other methods.

AlanBateman · 2025-07-01T14:33:31Z

src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java

+        return getNode(reader, "/modules")
+                .getChildNames()
+                .map(mn -> getNode(reader, mn + "/module-info.class"))
+                // This fails with ISE if the node isn't a resource (corrupt JImage).


Can you drop this comment and check getNode to thrown an Error (ISE isn't right when we have a broken image).

getNode() is called twice and while it can throw the error, there's no nice way for getResourceBuffer() to be that strict (it's a separate API).
After fiddling a bit I reworked the module attribute loading into a separate helper and caught & escalated any exceptions relating to that into Error.
The only question now is that ModuleInfo.read() has at least one other runtime exception (InvalidModuleDescriptorException) it can throw which we could catch here. Should I add that to the list?

AlanBateman · 2025-07-01T14:37:50Z

src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java

+            String nodeName = "/modules/" + module + "/" + name;
+            Optional<ImageReader.Node> node = Optional.ofNullable(reader.findNode(nodeName));
+            if (node.isPresent() && !node.get().isResource()) {
+                throw new IllegalStateException("Not a resource node: " + node.get());


The IllegalStateException is problematic here, maybe you mean to add an assert?

This is called externally so cannot trust the given name really references a resource.
I can't therefore make it an assertion, but I can just return empty if it didn't point at a resource (even if it did point at a directory etc.). Or I can make it an Error.

I made it return empty for now and documented it as such.

ExE-Boss · 2025-07-01T14:32:29Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

+            this.modulesStringOffset = findLocation("/modules/java.base").getModuleOffset();
+            this.packagesStringOffset = findLocation("/packages/java.lang").getModuleOffset();
+
+            // Node creation is very lazy, se can just make the top-level directories


Typo:

Suggested change

// Node creation is very lazy, se can just make the top-level directories

// Node creation is very lazy, so can just make the top-level directories

ExE-Boss · 2025-07-01T14:32:41Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

            this.nodes = new HashMap<>();
+            // TODO (review note): These should exist under all circumstances, but there's
+            //  probably a more robust way of getting it these offsets.


Suggested change

// probably a more robust way of getting it these offsets.

// probably a more robust way of getting these offsets.

ExE-Boss · 2025-07-01T14:36:23Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

+        // A node is completed when all its direct children have been built. As
+        // such, non-directory nodes are always complete.


I think this might flow better:

Suggested change

// A node is completed when all its direct children have been built. As

// such, non-directory nodes are always complete.

// A node is completed when all its direct children have been built.

// As such, non-directory nodes are always complete.

RogerRiggs · 2025-07-01T18:30:56Z

src/java.base/share/classes/jdk/internal/jimage/ImageReader.java

        }

+        /**
+         * Returns the JRY file system compatible name of this node (e.g.


Typo: "JRY"

RogerRiggs · 2025-07-01T18:38:26Z

src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java

@@ -274,7 +289,7 @@ private static class SystemModuleFinder implements ModuleFinder {

        @Override
        public Optional<ModuleReference> find(String name) {
-            Objects.requireNonNull(name);
+            requireNonNull(name);


Personal preference is avoid static imports making the code readable by retaining the "Objects." qualifier otherwise the non-local method call appears out of thin air. YMMV.

Yeah, given a couple of qualified ones existed here I'll qualify them all.
Personally I've evolved my opinion on this from preferring qualified use to thinking that the static import is more concise (and no less clear given the very unambiguous name).

RogerRiggs · 2025-07-01T18:45:49Z

test/jdk/jdk/internal/jimage/ImageReaderTest.java

+    private final Path image = buildJImage(IMAGE_ENTRIES);
+
+    @Test
+    public void testModuleDirectories() throws IOException {


yes, please, it will make failures easier to identity.

magicus · 2025-07-01T18:58:42Z

/label -build

openjdk · 2025-07-01T18:59:42Z

@magicus
The build label was successfully removed.

david-beaumont added 8 commits June 12, 2025 18:03

Reapply diffs.

6fb20c8

WIP - more tests needed.

37642fe

* Stacking JDK-8359808 (dependent PR) changes on top temporarily. * Removing the last misleading/incorrect API from ImageReader. * WIP - looking at adding tests. * Removed all APIs which leak ImageLocation to ImageReader users.

Newline at EOF.

2f7e175

syncing to main branch

cef0a62

undo dependent changes

e551c3e

Merge branch 'master' into jdk_8355953_reader/squashed

1ba3370

Integrating with ImageReaderBenchmark.

d8efaa0

openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Jun 30, 2025

openjdk bot added build build-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Jun 30, 2025

david-beaumont commented Jun 30, 2025

View reviewed changes

david-beaumont commented Jul 1, 2025

View reviewed changes

david-beaumont marked this pull request as ready for review July 1, 2025 14:13

openjdk bot added the rfr Pull request is ready for review label Jul 1, 2025

AlanBateman reviewed Jul 1, 2025

View reviewed changes

ExE-Boss reviewed Jul 1, 2025

View reviewed changes

RogerRiggs reviewed Jul 1, 2025

View reviewed changes

openjdk bot removed the build build-dev@openjdk.org label Jul 1, 2025

Updates based on feedback.

29633a1

	// Node creation is very lazy, se can just make the top-level directories
	// Node creation is very lazy, so can just make the top-level directories

	// probably a more robust way of getting it these offsets.
	// probably a more robust way of getting these offsets.

		// A node is completed when all its direct children have been built. As
		// such, non-directory nodes are always complete.

8360037: Refactor ImageReader in preparation for Valhalla support #26054

Are you sure you want to change the base?

8360037: Refactor ImageReader in preparation for Valhalla support #26054

Conversation

david-beaumont commented Jun 30, 2025 • edited by openjdk bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Preview Mode

1: Improved encapsulation for preview mode

2. Simplification of directory child node handling

3. More comments and code assertions

4. New tests

5. Adding TODO comments for review discussion.

Performance Improvements

Progress

Integration blocker

Issue

Reviewing

Uh oh!

bridgekeeper bot commented Jun 30, 2025

Uh oh!

openjdk bot commented Jun 30, 2025

Uh oh!

openjdk bot commented Jun 30, 2025

Uh oh!

openjdk bot commented Jun 30, 2025

Uh oh!

david-beaumont left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

david-beaumont left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mlbridge bot commented Jul 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Webrevs

Uh oh!

AlanBateman commented Jul 1, 2025

Uh oh!

openjdk bot commented Jul 1, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

david-beaumont Jul 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

david-beaumont commented Jun 30, 2025 •

edited by openjdk bot

Loading

mlbridge bot commented Jul 1, 2025 •

edited

Loading

david-beaumont Jul 1, 2025 •

edited

Loading

david-beaumont Jul 1, 2025 •

edited

Loading