Skip to content
This repository has been archived by the owner on Sep 11, 2020. It is now read-only.

storage: keep a certain amount of pack files open by default #1109

Closed
wants to merge 1 commit into from

Conversation

saracen
Copy link
Contributor

@saracen saracen commented Apr 14, 2019

The MaxOpenDescriptors option provides a middle ground solution between keeping all pack files open (as offered by the KeepDescriptors option) and keeping none open.

By default, 100 pack files can be kept open. Files are closed in a FIFO fashion, so the oldest files opened are the first closed.

In addition to this, this change uses the packfile's hash as the map key, instead of the path name. This results in a nice performance boost when iterating over many files, as objectPackPath (using path.Join) isn't as fast as you'd imagine. If this PR isn't accepted, that modification will probably still be worth doing.

The MaxOpenDescriptors being set to 100 is pretty arbitrary, but it feels like lots of people get a large performance penalty when they don't enable KeepDescriptors, and I guess enabling that by default is going to lead to max file open errors.

@saracen saracen force-pushed the pack-max-open-descriptors branch 4 times, most recently from cfff565 to 64f2105 Compare April 17, 2019 14:38
The MaxOpenDescriptors option provides a middle ground solution between keeping
all pack files open (as offered by the KeepDescriptors option) and keeping none
open.

By default, 100 pack files can be kept open.

Signed-off-by: Arran Walker <arran.walker@fiveturns.org>
@mcuadros mcuadros requested a review from jfontan April 18, 2019 07:17
Copy link
Contributor

@jfontan jfontan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a regression in windows:

----------------------------------------------------------------------
FAIL: submodule_test.go:43: SubmoduleSuite.TearDownTest
submodule_test.go:45:
    c.Assert(err, IsNil)
... value *os.PathError = &os.PathError{Op:"remove", Path:"C:\\Users\\appveyor\\AppData\\Local\\Temp\\1\\submodule878391303\\worktree\\.git\\objects\\pack\\pack-378c30970ead9057749c21a80e4d410fac9250e4.pack", Err:0x20} ("remove C:\\Users\\appveyor\\AppData\\Local\\Temp\\1\\submodule878391303\\worktree\\.git\\objects\\pack\\pack-378c30970ead9057749c21a80e4d410fac9250e4.pack: The process cannot access the file because it is being used by another process.")
----------------------------------------------------------------------
PANIC: submodule_test.go:199: SubmoduleSuite.TestGitSubmodulesSymlink
... Panic: Fixture has panicked (see related PANIC)
OOPS: 241 passed, 2 skipped, 1 FAILED, 11 MISSED

Setting this value by default can cause problems as now it changes the behavior. In POSIX this is not a problem as you can delete opened files but Windows does not let you to do that.

I believe we cannot enable it by default.

@saracen
Copy link
Contributor Author

saracen commented Apr 18, 2019

Yeah, I see the problem. It's a shame there's no repository.Close(), but even adding that now would modify current behavior (as it would close the storage passed in via Init()).

I'll modify the PR shortly to remove enabling by default.

@saracen
Copy link
Contributor Author

saracen commented Apr 19, 2019

I'm wondering now if it makes more sense to move this sort of temporary caching functionality into ObjectStorage. With it there, we could keep a list of Packfile objects, rather than just the underlying files. Is there any reason not to do this?

KeepDescriptors as an option in DotGit is maybe not the best fit? DotGit isn't actually responsible for closing the files. The only time it closes them, is on DotGit.Close(), and that happens even when KeepDescriptors is true. It might work better if it was the callers responsibility to maintain a list of open files returned from ObjectPack if they wish to.

I'll start to work on another PR that keeps KeepDescriptors in DotGit (for backwards compatibility), but marks as deprecated, and moves that functionality, and the functionality of this PR into ObjectStorage, unless anybody thinks that's a bad idea?

@saracen
Copy link
Contributor Author

saracen commented Apr 21, 2019

@jfontan I've moved this to #1123, where the whole packfile object is kept around, rather than just the file based on the reasoning in my last comment here.

@saracen saracen closed this Apr 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants