[iso] Improve ISO extraction performance by preallocating files prior to writing #1170

Closed
wants to merge 1 commit

Conversation

Mattiwatti
Contributor

The problem here seems to be the pattern of issuing many synchronous WriteFile calls on the same file handle, with small buffers limited to the block size of the UDF/ISO 9660 reads. This leads to fragmentation, because the target filesystem cannot always keep a file contiguous when its size keeps growing in small increments like this.
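To make that pattern concrete, here is a rough sketch of the block-by-block write sequence described above. It is illustrative only, not the actual Rufus extraction loop, and read_iso_block() is a hypothetical stand-in for the real UDF/ISO 9660 read path:

```c
// Illustrative only: the write pattern described above, paraphrased.
// Each WriteFile call extends the file by one ISO/UDF block, so the target
// filesystem only learns about the final size one small step at a time.
#include <windows.h>
#include <stdint.h>

// Hypothetical stand-in for the UDF/ISO 9660 reader.
extern size_t read_iso_block(void* iso, void* buffer, size_t max_size);

static BOOL ExtractFileBlockwise(void* iso, HANDLE hFile, uint64_t file_size, size_t block_size)
{
    uint8_t buffer[64 * 1024];   // assumes block_size <= sizeof(buffer)
    DWORD written;

    for (uint64_t offset = 0; offset < file_size; offset += block_size) {
        size_t chunk = read_iso_block(iso, buffer, block_size);
        if (!WriteFile(hFile, buffer, (DWORD)chunk, &written, NULL) || written != chunk)
            return FALSE;        // error handling kept minimal for the sketch
    }
    return TRUE;
}
```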

This commit basically does what SetEndOfFile() would do if the file pointer were not at 0: it forces an allocation of the full file size up front, so that the filesystem can make a better decision about where to place the file and doesn't have to keep moving it around later. A different way to achieve the same thing would be to read the entire file into memory and then issue a single WriteFile. But doing that would overcomplicate things for what this is trying to solve, IMO (do you really want to allocate 4GB for install.wim? Can you? Not to mention keeping track of whether that buffer has already been freed along the various gotos, etc.).
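For reference, a minimal sketch of the trick being described here (not the exact patch code, and the helper name is mine): extend the file to its final size so the filesystem allocates it in one decision, then rewind before the normal block-by-block writes.

```c
// Minimal sketch of the preallocation trick, assuming hFile was opened
// with GENERIC_WRITE: set EOF at the final size to force the allocation,
// then rewind so the subsequent WriteFile calls start at offset 0.
#include <windows.h>

static BOOL PreallocateFile(HANDLE hFile, LONGLONG file_size)
{
    LARGE_INTEGER li;

    li.QuadPart = file_size;
    if (!SetFilePointerEx(hFile, li, NULL, FILE_BEGIN))
        return FALSE;
    if (!SetEndOfFile(hFile))      // forces the allocation at the target size
        return FALSE;

    li.QuadPart = 0;               // rewind for the regular block writes
    return SetFilePointerEx(hFile, li, NULL, FILE_BEGIN);
}
```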

I tested this patch on a Windows 7 machine with a 64GB Kingston HyperX USB 3.1 drive, and a Windows 10 machine with a 32GB SanDisk Extreme USB 3.0 drive. I tried a variety of ISOs, ranging from modern, well-behaved UEFI ones to pathological cases like the XP setup ISO, which contains 8000 files. The only thing I learned is that it's impossible to benchmark any kind of I/O reliably on a modern OS. Even repeating something you just did, without changing anything, can give a completely different runtime due to caching alone. If anyone knows how this magic can be performed properly, please let me know. All I can say is that I tried really hard to make the original version beat the patched one, and the worst result I managed was still a 5% improvement. On Windows 10 I'd guesstimate this to be somewhere in the ballpark of 10% faster on average, and a bit more than that on Windows 7. A lot more if you use plainly malicious input like that XP setup disc.

…g the file size of each file on the target filesystem before writing to it

The actual benefit gained from this will vary: it depends on the input ISO layout (a large number of small files gives a bigger improvement), the target filesystem, and also the host OS, because of driver differences between versions. Windows 7 seems to benefit more than Windows 10, and NTFS more than FAT32. All scenarios should see at least a minor speed gain from this, however. (No actual percentage claims are made, upon urgent request of my lawyer.)
@pbatard
Owner

pbatard commented Jun 24, 2018

Very nice!

I wasn't aware of this trick and it makes a lot of sense to have it in Rufus indeed.

When I get a chance (might be in a day or two), I'll validate it and integrate your commit into the codebase.

With regard to speedups, the other trick I had thought of was, if there's enough free RAM available (and the user agrees to using that feature, which would of course be optional), to start copying the whole ISO into memory as soon as the user has selected it, and then work with the in-memory copy. It's a bit similar to your other suggestion: since most ISO file content is going to be sequential on disc (though UDF can have sparse files, which complicates matters), the idea would be to write each whole file in one go by simply pointing the write buffer at the RAM-cached sequential ISO data.

Of course, this presents some new issues (for instance, what should we do if the user clicked START before the file blocks we are interested in have been cached in RAM? If we're dealing with a 4GB install.wim, and the user was quick to launch the operation, or the disk they have the image on is slow, it makes little sense to wait for the last install.wim block to be available in RAM before starting to write that file to the USB, as that would likely make the whole operation much slower), so I didn't look into it that seriously.
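For what it's worth, here is a purely hypothetical sketch of how the write side of that idea might look; nothing like this exists in Rufus, and iso_cache, cache_valid_bytes and extract_file_blockwise are invented names. If a file's extent is already cached, write it in one go; otherwise fall back to the regular block-by-block path rather than waiting.

```c
// Hypothetical sketch of the RAM-cache idea discussed above; not Rufus code.
// iso_cache would be filled by a background thread as soon as the ISO is
// selected, and cache_valid_bytes tracks how much of it is available so far.
#include <windows.h>
#include <stdint.h>

extern uint8_t* iso_cache;
extern volatile uint64_t cache_valid_bytes;
// Hypothetical fallback: the regular block-by-block copy path.
extern BOOL extract_file_blockwise(HANDLE hFile, uint64_t iso_offset, uint64_t length);

static BOOL WriteFileFromCache(HANDLE hFile, uint64_t iso_offset, uint64_t length)
{
    DWORD written;

    if (iso_offset + length <= cache_valid_bytes) {
        // The whole (sequential) extent is already in RAM: write it in one go.
        // A real version would loop, since WriteFile takes a 32-bit length.
        return WriteFile(hFile, iso_cache + iso_offset, (DWORD)length, &written, NULL)
               && written == (DWORD)length;
    }
    // Not cached yet: don't wait for it, fall back to block-by-block copying.
    return extract_file_blockwise(hFile, iso_offset, length);
}
```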

Eventually, the best approach to speeding things up, since we are reformatting the drive anyway, would be to both "manually" create the file system (allocation tables, directory entries and so on) and copy data at the same time, by having Rufus provide its own low-level FAT32 or NTFS handling. But of course, that would be a whole different and super time-consuming endeavour...

At any rate, thank you very much for submitting this commit. I'll make sure it gets integrated in Rufus 3.2!

@pbatard pbatard self-assigned this Jun 24, 2018
@pbatard pbatard added this to the 3.2 milestone Jun 24, 2018
@pbatard
Owner

pbatard commented Jun 25, 2018

Applied. Thanks!

@pbatard pbatard closed this in d4a4506 Jun 25, 2018
@Mattiwatti Mattiwatti deleted the defraggler branch June 26, 2018 12:03
@Mattiwatti
Contributor Author

Mattiwatti commented Jun 27, 2018

Yes, I thought the same about the first approach you mentioned. It is very much a gamble depending on the layout of the ISO. In general I don't think there would be that much benefit anyway because by definition the input ISO is always going to be on a different physical disk than the target disk (since the first thing Rufus does is wipe the target disk).

Your last suggestion is indeed very interesting, because ideally you would be able to construct the target file system completely in memory and then write it to the USB disk in one go, dd style. But this would involve either writing a small filesystem driver (<-- this is an oxymoron), or making some kind of ramdisk, doing file operations on it like on a normal disk, and then afterwards writing to the target disk by copying the ramdisk as raw bytes. I think in both cases you would probably need a driver for this to work, unless I'm missing some other, more obvious approach.

What puzzles me about this particular fix is that NTFS was visibly being fragmented by Rufus. Not much (1-2% in most cases), but 25% for the XP disc; after the patch, 0% fragmentation. For FAT32, however, I get 0% fragmentation either way. So my thinking was: both filesystems are aware of the write fragmentation occurring, but NTFS has a more lax policy towards it than FAT32, which defragments after itself (this can be seen in the CPU time spent in CloseHandle, which is where the FS does its cleanup).

Possibly FAT32 read performance suffers so much from fragmentation that it is simply not allowed to happen, at the cost of write speed, whereas NTFS is OK with fragmentation to some degree. I don't know. But the funny thing is that it is actually NTFS that benefits more from this patch, which is the opposite of what I would expect, because FAT32 is the one that no longer has to defragment itself on the fly all the time.

@lock

lock bot commented Apr 6, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue if you think you have a related problem or query.

@lock lock bot locked and limited conversation to collaborators Apr 6, 2019