Feature: SHA256 support (#1032)
emmercm committed Mar 23, 2024
1 parent 4aebb8d commit 49d1e7e
Showing 22 changed files with 391 additions and 28 deletions.
2 changes: 1 addition & 1 deletion docs/alternatives.md
@@ -19,7 +19,7 @@ There are a few different popular ROM managers that have similar features:
| DATs: combine multiple |||||
| Archives: extraction formats | ✅ many formats ([reading archives docs](input/reading-archives.md)) |`.zip`, `.7z`, `.rar` | ⚠️ `.zip`, `.7z` | ⚠️ `.zip`, `.7z` |
| Archives: creation formats |`.zip` only by design ([writing archives docs](output/writing-archives.md)) |`.zip`, `.7z`, `.rar` | ⚠️ `.zip` (TorrentZip), `.7z` | ⚠️ `.zip`, `.7z` |
- | ROMs: DAT matching strategies | ✅ CRC32+size, MD5, SHA1 | CRC32+size, MD5, SHA1 | CRC32+size, MD5, SHA1 ||
+ | ROMs: DAT matching strategies | ✅ CRC32+size, MD5, SHA1, SHA256 | ⚠️ CRC32+size, MD5, SHA1 | ⚠️ CRC32+size, MD5, SHA1 ||
| ROMs: CHD scanning || ⚠️ via chdman | ✅ v1-5 natively | ⚠️ v1-4 natively |
| ROMs: scan/checksum caching |||||
| ROMs: header parsing |||| ⚠️ via plugins |
2 changes: 1 addition & 1 deletion docs/input/reading-archives.md
@@ -32,6 +32,6 @@ This is why `igir` uses `.zip` as its output archive of choice, `.zip` files are

## Checksum cache

- It can be expensive to calculate checksums of files within archives, especially MD5 and SHA1. If `igir` needs to calculate a checksum that is not easily read from the archive (see above), it will cache the result in a file named `igir.cache`. This cached result will then be used as long as the input file's size and modified timestamp remain the same.
+ It can be expensive to calculate checksums of files within archives, especially MD5, SHA1, and SHA256. If `igir` needs to calculate a checksum that is not easily read from the archive (see above), it will cache the result in a file named `igir.cache`. This cached result will then be used as long as the input file's size and modified timestamp remain the same.

Caching can be disabled with the `--disable-cache` option, or you can safely delete `igir.cache` if it becomes too large.
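
The invalidation rule described above (reuse a cached checksum while the input file's size and modified timestamp are unchanged) can be sketched roughly as follows. This is an illustration only, not `igir`'s implementation: the `ChecksumCache` class, its in-memory `Map`, the `FileStat` shape, and the `readContents` callback are all hypothetical, and the real cache is persisted to `igir.cache`.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical shapes for this sketch; not igir's real types.
interface FileStat { size: number; mtimeMs: number; }
interface CacheEntry extends FileStat { sha256: string; }

class ChecksumCache {
  private readonly entries = new Map<string, CacheEntry>();

  // Counts how often the (expensive) checksum had to be recalculated.
  misses = 0;

  getSha256(key: string, stat: FileStat, readContents: () => Uint8Array): string {
    const cached = this.entries.get(key);
    // Reuse the cached value only while size and modified timestamp are unchanged.
    if (cached !== undefined
      && cached.size === stat.size
      && cached.mtimeMs === stat.mtimeMs
    ) {
      return cached.sha256;
    }
    this.misses += 1;
    const sha256 = createHash('sha256').update(readContents()).digest('hex');
    this.entries.set(key, { size: stat.size, mtimeMs: stat.mtimeMs, sha256 });
    return sha256;
  }
}
```

A second lookup with the same size and timestamp never invokes `readContents`, which is the saving the cache is meant to provide; changing either value forces a recalculation.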
17 changes: 8 additions & 9 deletions docs/roms/matching.md
@@ -28,7 +28,7 @@ For example, if you provide all of these DATs at once with the [`--dat <path>` o

!!! note

- When generating a [dir2dat](../dats/dir2dat.md) with the `igir dir2dat` command, `igir` will calculate CRC32, MD5, and SHA1 information for every file. This helps ensure that the generated DAT has the most complete information it can.
+ When generating a [dir2dat](../dats/dir2dat.md) with the `igir dir2dat` command, `igir` will calculate CRC32, MD5, and SHA1 information for every file. This helps ensure that the generated DAT has the most complete information it can. You can additionally add SHA256 information with the option `igir [commands..] [options] --input-min-checksum SHA256` (below).

## Manually using other checksum algorithms

@@ -40,10 +40,8 @@ You can specify higher checksum algorithms with the `--input-min-checksum <algor

```shell
igir [commands..] [options] --input-min-checksum MD5
- ```
-
- ```shell
igir [commands..] [options] --input-min-checksum SHA1
+ igir [commands..] [options] --input-min-checksum SHA256
```

This option defines the _minimum_ checksum that will be used based on digest size (below). If not every ROM in every DAT provides the checksum you specify, `igir` may automatically calculate and match files based on a higher checksum (see above).
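
One way to read the paragraph above is as a per-ROM choice of the smallest checksum at or above the requested minimum, with the results combined across all ROMs. The sketch below illustrates that reading only. The bit values in `ChecksumBitmask`, the `RomChecksums` shape, and both helper functions are assumptions made for the example and are not taken from `igir`'s source.

```typescript
// Hypothetical bit values, ordered by digest size; igir's real enum may differ.
enum ChecksumBitmask {
  CRC32 = 0x1,
  MD5 = 0x2,
  SHA1 = 0x4,
  SHA256 = 0x8,
}

// Hypothetical view of the checksums a DAT provides for one ROM.
interface RomChecksums {
  crc32?: string;
  md5?: string;
  sha1?: string;
  sha256?: string;
}

// Algorithms from smallest to largest digest size.
const BY_DIGEST_SIZE: ReadonlyArray<[keyof RomChecksums, ChecksumBitmask]> = [
  ['crc32', ChecksumBitmask.CRC32],
  ['md5', ChecksumBitmask.MD5],
  ['sha1', ChecksumBitmask.SHA1],
  ['sha256', ChecksumBitmask.SHA256],
];

// The smallest checksum at or above the minimum that this ROM actually provides.
function checksumForRom(
  rom: RomChecksums,
  minimum: ChecksumBitmask,
): ChecksumBitmask | undefined {
  const found = BY_DIGEST_SIZE
    .find(([key, bit]) => bit >= minimum && rom[key] !== undefined);
  return found === undefined ? undefined : found[1];
}

// The union of checksums that would need calculating for input files.
function checksumsToCalculate(roms: RomChecksums[], minimum: ChecksumBitmask): number {
  return roms.reduce((bitmask, rom) => bitmask | (checksumForRom(rom, minimum) ?? 0), 0);
}
```

With a minimum of MD5, a ROM that lists only CRC32 and SHA1 would be matched on SHA1 in this sketch, which is the "higher checksum" case the paragraph mentions.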
@@ -52,10 +50,11 @@ The reason you might want to do this is to have a higher confidence that found f

Here is a table that shows the keyspace for each checksum algorithm, where the higher number of bits reduces the chances of collisions:

- | Algorithm | Digest size | Unique values | Example value |
- |-----------|-------------|----------------------------|--------------------------------------------|
- | CRC32 | 32 bits | 2^32 = 4.29 billion | `30a184a7` |
- | MD5 | 128 bits | 2^128 = 340.28 undecillion | `52bb8f12b27cebd672b1fd8a06145b1c` |
- | SHA1 | 160 bits | 2^160 = 1.46 quindecillion | `666d29a15d92f62750dd665a06ce01fbd09eb98a` |
+ | Algorithm | Digest size | Unique values | Example value |
+ |-----------|-------------|-------------------------------------|--------------------------------------------------------------------|
+ | CRC32 | 32 bits | 2^32 = 4.29 billion | `30a184a7` |
+ | MD5 | 128 bits | 2^128 = 340.28 undecillion | `52bb8f12b27cebd672b1fd8a06145b1c` |
+ | SHA1 | 160 bits | 2^160 = 1.46 quindecillion | `666d29a15d92f62750dd665a06ce01fbd09eb98a` |
+ | SHA256 | 256 bits | 2^256 = 115.79 quattuorvigintillion | `1934e26cf69aa49978baac893ad5a890af35bdfb2c7a9393745f14dc89459137` |

When files are [tested](../commands.md#test) after being written, `igir` will use the highest checksum available from the scanned file to check the written file. This lets you have equal confidence that a file was written correctly as well as matched correctly.
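
The "highest checksum available" rule for testing written files can be sketched as below. This is a simplified illustration, not the code from `candidateWriter.ts`: the `Digests` type, `STRONGEST_FIRST`, `hexDigest`, and `testWritten` are hypothetical names, CRC32 is left out for brevity, and only the error message format is borrowed from the diff.

```typescript
import { createHash } from 'node:crypto';

type Algorithm = 'sha256' | 'sha1' | 'md5';
type Digests = Partial<Record<Algorithm, string>>;

// Largest digest first. CRC32 is omitted from this sketch.
const STRONGEST_FIRST: readonly Algorithm[] = ['sha256', 'sha1', 'md5'];

function hexDigest(algorithm: Algorithm, contents: Uint8Array): string {
  return createHash(algorithm).update(contents).digest('hex');
}

// Returns an error message on mismatch, or undefined when the written bytes
// check out (or when the scanned file carried no digest to compare against).
function testWritten(scanned: Digests, written: Uint8Array): string | undefined {
  const algorithm = STRONGEST_FIRST.find((candidate) => scanned[candidate] !== undefined);
  if (algorithm === undefined) {
    return undefined;
  }
  const actual = hexDigest(algorithm, written);
  const expected = scanned[algorithm];
  return actual === expected
    ? undefined
    : `has the ${algorithm.toUpperCase()} ${actual}, expected ${expected}`;
}
```

Checking only the strongest available digest keeps the test to a single pass over the written data while giving the same confidence as the match itself.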
2 changes: 1 addition & 1 deletion src/constants.ts
@@ -88,7 +88,7 @@ export default class Constants {
static readonly FILE_READER_DEFAULT_THREADS = 10;

/**
- * Max number of archive entries to process (possibly extract & MD5/SHA1 checksum) at once.
+ * Max number of archive entries to process (possibly extract & MD5/SHA1/SHA256 checksum) at once.
*/
static readonly ARCHIVE_ENTRY_SCANNER_THREADS_PER_ARCHIVE = 5;

3 changes: 2 additions & 1 deletion src/igir.ts
@@ -238,7 +238,8 @@ export default class Igir {
Object.keys(ChecksumBitmask)
.filter((bitmask): bitmask is keyof typeof ChecksumBitmask => Number.isNaN(Number(bitmask)))
// Has not been enabled yet
- .filter((bitmask) => ChecksumBitmask[bitmask] > minimumChecksum)
+ .filter((bitmask) => ChecksumBitmask[bitmask] >= ChecksumBitmask.CRC32)
+ .filter((bitmask) => ChecksumBitmask[bitmask] <= ChecksumBitmask.SHA1)
.filter((bitmask) => !(matchChecksum & ChecksumBitmask[bitmask]))
.forEach((bitmask) => {
matchChecksum |= ChecksumBitmask[bitmask];
18 changes: 12 additions & 6 deletions src/modules/candidateGenerator.ts
@@ -132,9 +132,10 @@ export default class CandidateGenerator extends Module {
// If the input file is headered...
if (inputFile.getFileHeader()
// ...and we want a headered ROM
- && (inputFile.getCrc32() === rom.getCrc32()
- || inputFile.getMd5() === rom.getMd5()
- || inputFile.getSha1() === rom.getSha1())
+ && ((inputFile.getCrc32() !== undefined && inputFile.getCrc32() === rom.getCrc32())
+ || (inputFile.getMd5() !== undefined && inputFile.getMd5() === rom.getMd5())
+ || (inputFile.getSha1() !== undefined && inputFile.getSha1() === rom.getSha1())
+ || (inputFile.getSha256() !== undefined && inputFile.getSha256() === rom.getSha256()))
// ...and we shouldn't remove the header
&& !this.options.canRemoveHeader(
dat,
@@ -149,9 +150,10 @@ export default class CandidateGenerator extends Module {
// If the input file is headered...
if (inputFile.getFileHeader()
// ...and we DON'T want a headered ROM
- && !(inputFile.getCrc32() === rom.getCrc32()
- || inputFile.getMd5() === rom.getMd5()
- || inputFile.getSha1() === rom.getSha1())
+ && !((inputFile.getCrc32() !== undefined && inputFile.getCrc32() === rom.getCrc32())
+ || (inputFile.getMd5() !== undefined && inputFile.getMd5() === rom.getMd5())
+ || (inputFile.getSha1() !== undefined && inputFile.getSha1() === rom.getSha1())
+ || (inputFile.getSha256() !== undefined && inputFile.getSha256() === rom.getSha256()))
// ...and we're writing file links
&& this.options.shouldLink()
) {
@@ -308,11 +310,13 @@ export default class CandidateGenerator extends Module {
let outputFileCrc32 = inputFile.getCrc32();
let outputFileMd5 = inputFile.getMd5();
let outputFileSha1 = inputFile.getSha1();
+ let outputFileSha256 = inputFile.getSha256();
let outputFileSize = inputFile.getSize();
if (inputFile.getFileHeader()) {
outputFileCrc32 = inputFile.getCrc32WithoutHeader();
outputFileMd5 = inputFile.getMd5WithoutHeader();
outputFileSha1 = inputFile.getSha1WithoutHeader();
+ outputFileSha256 = inputFile.getSha256WithoutHeader();
outputFileSize = inputFile.getSizeWithoutHeader();
}

@@ -326,6 +330,7 @@ export default class CandidateGenerator extends Module {
crc32: outputFileCrc32,
md5: outputFileMd5,
sha1: outputFileSha1,
+ sha256: outputFileSha256,
});
}
// Otherwise, return a raw file
@@ -335,6 +340,7 @@ export default class CandidateGenerator extends Module {
crc32: outputFileCrc32,
md5: outputFileMd5,
sha1: outputFileSha1,
+ sha256: outputFileSha256,
});
}

12 changes: 12 additions & 0 deletions src/modules/candidateWriter.ts
@@ -275,6 +275,12 @@ export default class CandidateWriter extends Module {
continue;
}
const actualFile = actualEntriesByPath.get(entryPath) as ArchiveEntry<Zip>;
+ if (actualFile.getSha256()
+ && expectedFile.getSha256()
+ && actualFile.getSha256() !== expectedFile.getSha256()
+ ) {
+ return `has the SHA256 ${actualFile.getSha256()}, expected ${expectedFile.getSha256()}`;
+ }
if (actualFile.getSha1()
&& expectedFile.getSha1()
&& actualFile.getSha1() !== expectedFile.getSha1()
@@ -457,6 +463,12 @@ export default class CandidateWriter extends Module {
{ filePath: outputFilePath },
expectedFile.getChecksumBitmask(),
);
+ if (actualFile.getSha256()
+ && expectedFile.getSha256()
+ && actualFile.getSha256() !== expectedFile.getSha256()
+ ) {
+ return `has the SHA256 ${actualFile.getSha256()}, expected ${expectedFile.getSha256()}`;
+ }
if (actualFile.getSha1()
&& expectedFile.getSha1()
&& actualFile.getSha1() !== expectedFile.getSha1()
1 change: 1 addition & 0 deletions src/modules/datGameInferrer.ts
@@ -97,6 +97,7 @@ export default class DATGameInferrer extends Module {
crc32: romFile.getCrc32(),
md5: romFile.getMd5(),
sha1: romFile.getSha1(),
+ sha256: romFile.getSha256(),
}))
.filter(ArrayPoly.filterUniqueMapped((rom) => rom.getName()));
return new Game({
3 changes: 3 additions & 0 deletions src/modules/datScanner.ts
@@ -374,6 +374,7 @@ export default class DATScanner extends Scanner {
crc32: row.crc,
md5: row.md5,
sha1: row.sha1,
+ sha256: row.sha256,
});
const gameName = row.name.replace(/\.[^\\/]+$/, '');
return new Game({
@@ -403,6 +404,7 @@ export default class DATScanner extends Scanner {
row.crc.match(/^[0-9a-f]{8}$/) !== null
|| row.md5.match(/^[0-9a-f]{32}$/) !== null
|| row.sha1.match(/^[0-9a-f]{40}$/) !== null
+ || row.sha256.match(/^[0-9a-f]{64}$/) !== null
))
.on('error', reject)
.on('data', (row) => {
@@ -456,6 +458,7 @@ export default class DATScanner extends Scanner {
&& (rom.getCrc32() === undefined || rom.getCrc32() !== '00000000')
&& (rom.getMd5() === undefined || rom.getMd5() !== 'd41d8cd98f00b204e9800998ecf8427e')
&& (rom.getSha1() === undefined || rom.getSha1() !== 'da39a3ee5e6b4b0d3255bfef95601890afd80709')
+ && (rom.getSha256() === undefined || rom.getSha256() !== 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855')
));
return game.withProps({ rom: roms });
});
2 changes: 2 additions & 0 deletions src/types/dats/dat.ts
@@ -153,6 +153,8 @@ export default abstract class DAT {
checksumBitmask |= ChecksumBitmask.MD5;
} else if (rom.getSha1()) {
checksumBitmask |= ChecksumBitmask.SHA1;
+ } else if (rom.getSha256()) {
+ checksumBitmask |= ChecksumBitmask.SHA256;
}
}));
return checksumBitmask;
12 changes: 11 additions & 1 deletion src/types/dats/rom.ts
@@ -34,6 +34,9 @@ export default class ROM implements ROMProps {
@Expose()
readonly sha1?: string;

+ @Expose()
+ readonly sha256?: string;
+
@Expose()
readonly status?: ROMStatus;

@@ -49,6 +52,7 @@ export default class ROM implements ROMProps {
this.crc32 = props?.crc32?.toLowerCase().replace(/^0x/, '').padStart(8, '0');
this.md5 = props?.md5?.toLowerCase().replace(/^0x/, '').padStart(32, '0');
this.sha1 = props?.sha1?.toLowerCase().replace(/^0x/, '').padStart(40, '0');
+ this.sha256 = props?.sha256?.toLowerCase().replace(/^0x/, '').padStart(64, '0');
this.status = props?.status;
this.merge = props?.merge;
this.bios = props?.bios;
@@ -65,6 +69,7 @@ export default class ROM implements ROMProps {
crc: this.getCrc32(),
md5: this.getMd5(),
sha1: this.getSha1(),
+ sha256: this.getSha256(),
status: this.getStatus(),
},
};
@@ -92,6 +97,10 @@ export default class ROM implements ROMProps {
return this.sha1;
}

+ getSha256(): string | undefined {
+ return this.sha256;
+ }
+
getStatus(): ROMStatus | undefined {
return this.status;
}
@@ -139,7 +148,8 @@ export default class ROM implements ROMProps {
* A string hash code to uniquely identify this {@link ROM}.
*/
hashCode(): string {
- return this.getSha1()
+ return this.getSha256()
+ ?? this.getSha1()
?? this.getMd5()
?? `${this.getCrc32()}|${this.getSize()}`;
}
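
Taken together, the constructor normalization and the updated `hashCode()` fallback chain in this file behave roughly like the standalone sketch below. `RomLike`, `normalize`, and the free function `hashCode` are hypothetical stand-ins for the `ROM` class members, written only to show the order of preference (SHA256, then SHA1, then MD5, then CRC32 plus size).

```typescript
// Hypothetical stand-in for the checksum-related ROM properties.
interface RomLike {
  size: number;
  crc32?: string;
  md5?: string;
  sha1?: string;
  sha256?: string;
}

// Lowercase, strip a leading "0x", and left-pad with zeros to the full hex length.
function normalize(value: string | undefined, hexLength: number): string | undefined {
  return value?.toLowerCase().replace(/^0x/, '').padStart(hexLength, '0');
}

// Prefer the largest digest available, falling back to CRC32 plus size.
function hashCode(rom: RomLike): string {
  return normalize(rom.sha256, 64)
    ?? normalize(rom.sha1, 40)
    ?? normalize(rom.md5, 32)
    ?? `${normalize(rom.crc32, 8)}|${rom.size}`;
}

console.log(hashCode({ size: 4, crc32: '0xA184A7' }));
// prints "00a184a7|4"
```

CRC32 is paired with the size because a 32-bit value alone collides far more easily than the larger digests.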
11 changes: 11 additions & 0 deletions src/types/files/archives/archiveEntry.ts
@@ -51,6 +51,10 @@ export default class ArchiveEntry<A extends Archive> extends File implements Arc
let finalSha1WithoutHeader = archiveEntryProps.fileHeader
? archiveEntryProps.sha1WithoutHeader
: archiveEntryProps.sha1;
+ let finalSha256WithHeader = archiveEntryProps.sha256;
+ let finalSha256WithoutHeader = archiveEntryProps.fileHeader
+ ? archiveEntryProps.sha256WithoutHeader
+ : archiveEntryProps.sha256;
let finalSymlinkSource = archiveEntryProps.symlinkSource;

if (await fsPoly.exists(archiveEntryProps.archive.getFilePath())) {
@@ -61,6 +65,7 @@ export default class ArchiveEntry<A extends Archive> extends File implements Arc
if ((!finalCrcWithHeader && (checksumBitmask & ChecksumBitmask.CRC32))
|| (!finalMd5WithHeader && (checksumBitmask & ChecksumBitmask.MD5))
|| (!finalSha1WithHeader && (checksumBitmask & ChecksumBitmask.SHA1))
+ || (!finalSha256WithHeader && (checksumBitmask & ChecksumBitmask.SHA256))
) {
// If any additional checksum needs to be calculated, then prefer those calculated ones
// over any that were supplied in {@link archiveEntryProps} that probably came from the
@@ -73,6 +78,7 @@ export default class ArchiveEntry<A extends Archive> extends File implements Arc
finalCrcWithHeader = headeredChecksums.crc32 ?? finalCrcWithHeader;
finalMd5WithHeader = headeredChecksums.md5 ?? finalMd5WithHeader;
finalSha1WithHeader = headeredChecksums.sha1 ?? finalSha1WithHeader;
+ finalSha256WithHeader = headeredChecksums.sha256 ?? finalSha256WithHeader;
}
if (archiveEntryProps.fileHeader && checksumBitmask) {
const headerlessChecksums = await this.calculateEntryChecksums(
@@ -84,6 +90,7 @@ export default class ArchiveEntry<A extends Archive> extends File implements Arc
finalCrcWithoutHeader = headerlessChecksums.crc32;
finalMd5WithoutHeader = headerlessChecksums.md5;
finalSha1WithoutHeader = headerlessChecksums.sha1;
+ finalSha256WithoutHeader = headerlessChecksums.sha256;
}

if (await fsPoly.isSymlink(archiveEntryProps.archive.getFilePath())) {
@@ -96,6 +103,7 @@ export default class ArchiveEntry<A extends Archive> extends File implements Arc
finalCrcWithoutHeader = finalCrcWithoutHeader ?? finalCrcWithHeader;
finalMd5WithoutHeader = finalMd5WithoutHeader ?? finalMd5WithHeader;
finalSha1WithoutHeader = finalSha1WithoutHeader ?? finalSha1WithHeader;
+ finalSha256WithoutHeader = finalSha256WithoutHeader ?? finalSha256WithHeader;

return new ArchiveEntry<A>({
size: finalSize,
@@ -105,6 +113,8 @@ export default class ArchiveEntry<A extends Archive> extends File implements Arc
md5WithoutHeader: finalMd5WithoutHeader,
sha1: finalSha1WithHeader,
sha1WithoutHeader: finalSha1WithoutHeader,
+ sha256: finalSha256WithHeader,
+ sha256WithoutHeader: finalSha256WithoutHeader,
symlinkSource: finalSymlinkSource,
fileHeader: archiveEntryProps.fileHeader,
patch: archiveEntryProps.patch,
@@ -245,6 +255,7 @@ export default class ArchiveEntry<A extends Archive> extends File implements Arc
crc32WithoutHeader: this.getCrc32(),
md5WithoutHeader: this.getMd5(),
sha1WithoutHeader: this.getSha1(),
+ sha256WithoutHeader: this.getSha256(),
});
}

2 changes: 1 addition & 1 deletion src/types/files/archives/rar.ts
@@ -34,7 +34,7 @@ export default class Rar extends Archive {
entryPath: fileHeader.name,
size: fileHeader.unpSize,
crc32: fileHeader.crc.toString(16),
- // If MD5 or SHA1 is desired, this file will need to be extracted to calculate
+ // If MD5, SHA1, or SHA256 is desired, this file will need to be extracted to calculate
}, checksumBitmask);
callback(undefined, archiveEntry);
},
2 changes: 1 addition & 1 deletion src/types/files/archives/sevenZip.ts
@@ -94,7 +94,7 @@ export default class SevenZip extends Archive {
entryPath: result.name,
size: Number.parseInt(result.size, 10),
crc32: result.crc,
- // If MD5 or SHA1 is desired, this file will need to be extracted to calculate
+ // If MD5, SHA1, or SHA256 is desired, this file will need to be extracted to calculate
}, checksumBitmask);
callback(undefined, archiveEntry);
},