diff --git a/LICENSE b/LICENSE index 91775a0..c045274 100644 --- a/LICENSE +++ b/LICENSE @@ -1,6 +1,6 @@ MIT License -Copyright (c) 2022 nukemiko +Copyright (c) 2023 nukemiko Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal diff --git a/MANIFEST.in b/MANIFEST.in index 1275a9d..2cdbc0a 100644 --- a/MANIFEST.in +++ b/MANIFEST.in @@ -1 +1,2 @@ -include src/libtakiyasha/qmc/binaries/* +recursive-include src/libtakiyasha/binaries/* +include MANIFEST.in src/libtakiyasha/VERSION diff --git a/README.md b/README.md index f90578d..be2ed11 100644 --- a/README.md +++ b/README.md @@ -1,141 +1,79 @@ -# libtakiyasha ![](https://img.shields.io/badge/Version-2.0.1-green) ![](https://img.shields.io/badge/Python-3.8%2B-blue) +# LibTakiyasha ![](https://img.shields.io/badge/Python-3.8%2B-blue) -`libtakiyasha` 是一个 Python 音频加密/解密工具库(当然也可用于加密非音频数据),支持多种加密文件格式。 +LibTakiyasha 是一个 Python 音频加密/解密工具库(当然也可用于加密非音频数据),支持多种加密文件格式。**不提供任何命令行或图形界面支持。** -`libtakiyasha` 只是一个工具库,不提供任何命令行或图形界面支持。 - ---- +## 使用前必读 **本项目是以学习和技术研究的初衷创建的,修改、再分发时请遵循 [License](LICENSE)。** 本项目的设计灵感,以及部分解密方案,来源于: -- [Unlock Music - Web Edition](https://git.unlock-music.dev/um/web) -- [jixunmoe/qmc2](https://github.com/jixunmoe/qmc2) +- [Unlock Music Project - CLI Edition](https://git.unlock-music.dev/um/cli) +- [parakeet-rs/libparakeet](https://github.com/parakeet-rs/libparakeet) -**本项目不会内置任何解密所需的密钥。你需要自行寻找解密所需密钥或加密参数,在调用时作为参数传入。** +**本项目没有所谓的“默认密钥”或“内置密钥”,打开/保存任何类型的加密文件都需要你提供对应的密钥。你需要自行寻找解密所需密钥或加密参数,在调用时作为参数传入。** -你可以在内容提供商的应用程序中查找这些必需参数,或寻求同类项目以及他人的帮助。**但请不要在 Issues/讨论区向作者索要所谓“缺失”的“内置密钥”,你的此类想法不会被满足。** +你可以在内容提供商的应用程序中查找这些必需参数,或寻求同类项目以及他人的帮助,**但请不要在 Issues/讨论区直接向作者索要所谓“缺失”的“内置密钥”。** -**`libtakiyasha` 对输出数据的可用性(是否可以识别、播放等)不做任何保证。** +**LibTakiyasha 对输出数据的可用性(是否可以识别、播放等)不做任何保证。** --- ## 特性 -- 纯 Python 实现(包括所有依赖关系),无 C/C++ 扩展模块,跨平台可用 -- 支持多种加密文件格式的加密和解密 - -## 
当前版本:[2.0.1](https://github.com/nukemiko/libtakiyasha/releases/tag/2.0.1) - -此版本为正式版,但仍有不完美之处。如果发现任何 `libtakiyasha` 自身的问题,欢迎[提交 Issue](https://github.com/nukemiko/libtakiyasha/issues)。 - -**`libtakiyasha` 2.x 版本和 1.x 版本之间的接口并不兼容,使用 1.x 版本的应用程序需要进行大量改造,才能使用 2.x 版本。** - -另外,`libtakiyasha` 3.x 版本正在开发,有兴趣者可以切换分支查看。 - -### 支持的格式 - -请在[此处](https://github.com/nukemiko/libtakiyasha/wiki/%E6%94%AF%E6%8C%81%E7%9A%84%E6%A0%BC%E5%BC%8F%E5%92%8C%E6%89%80%E9%9C%80%E5%AF%86%E9%92%A5%E5%8F%82%E6%95%B0)查看。 - -### 兼容性 - -到目前为止(版本 2.0.1),`libtakiyasha` 已在以下 Python 实现中通过了测试: - -- [CPython(官方实现)](https://www.python.org) 3.8 至 3.10,可能支持 3.11 -- [Pyston](https://github.com/pyston/pyston) [2.3.5](https://github.com/pyston/pyston/releases/tag/pyston_2.3.5)(基于 CPython 3.8.12),其他版本或许也可用 -- [PyPy](https://www.pypy.org/) 7.3.9([CPython 3.8 兼容版本、CPython 3.9 兼容版本](https://downloads.python.org/pypy/)) - -**注意:`libtakiyasha` 所需的最低 Python 版本为 3.8,因为它使用的很多 Python 特性从 Python 3.8 开始才出现,这意味着使用更低的 Python 版本会出现大量不可预知的错误。** - -提示:在作者运行的测试中,CPython 实现是速度最慢的;PyPy 比 Pyston 快了大约两倍,比 CPython 快了接近五倍。 - -### 安装 - -- 运行命令:`pip install -U libtakiyasha==2.0.1` -- 或者前往 [GitHub 发布页](https://github.com/nukemiko/libtakiyasha/releases/tag/2.0.1) 下载安装 - -#### 所需依赖关系 - -- `pyaes` - AES 加解密支持 -- `setuptools` - 安装依赖 - -如果你是通过[上文提到的方式](#安装)安装的 `libtakiyasha`,这些依赖会被自动安装。 - -### 基本使用方法 - -提取加密文件里的音频内容: - -```python -from libtakiyasha.ncm import NCM -from libtakiyasha.qmc import QMCv2 - -... # 定义你提供的核心密钥 your_core_key、your_simple_key、your_mix_key1 和 your_mix_key2 - -# 打开 NCM 文件 -ncmfile = NCM.from_file('source.ncm', core_key=your_core_key) -target_file_format = ncm.ncm_tag.format +- 使用纯 Python 代码编写 + - **兼容 Python 3.8 及后续版本**,兼容多种 Python 解释器实现(见下文 [#性能测试](#性能测试)) + - 易于阅读,方便 Python 爱好者学习 + - (包括依赖库)无任何 C/C++ 扩展模块,跨平台性强 -with open('target_from_ncm.' 
+ target_file_format, mode='wb') as fd: - # libtakiyasha 的所有透明加密文件对象(NCM、QMCv1、QMCv2、KGMorVPR、KWM 等)默认以固定大小的块(io.DEFAULT_BUFFER_SIZE)为单位进行迭代 - # 通过修改对象的 iter_mode 属性为 'line',可以使其以一行为单位进行迭代 - # 不过按行迭代会导致性能大幅下降,不推荐使用 - for block in ncmfile: - fd.write(block) +### 性能测试 -# 打开 QMCv2 文件 -qmcv2file = QMCv2.from_file('source.mflac', simple_key=your_simple_key) -target_file_format = 'flac' +由于 Python 语言自身原因,LibTakiyasha 相较于同类项目,运行速度较慢。因此我们使用不同解释器实现,对常用操作做了一些性能测试: -with open('target_from_mflac.' + target_file_format, mode='wb') as fd: - for block in qmcv2file: - fd.write(block) +| 操作 | 测试大小 | Python 3.10.9 (CPython) | Python 3.8.12 (Pyston 2.3.5) | Python 3.9.16 (PyPy 7.3.11) | +| :------------: | :------: | :---------------------: | :--------------------------: | :-------------------------: | +| NCM 加密 | 36.8 MiB | 4.159s | 2.159s | 1.366s | +| NCM 解密 | 36.8 MiB | 4.393s | 2.360s | 1.480s | +| QMCv1 加密 | 36.8 MiB | 3.841s | 2.116s | 1.594s | +| QMCv1 解密 | 36.8 MiB | 3.813s | 2.331s | 1.406s | +| QMCv2 掩码加密 | 36.8 MiB | 4.065s | 2.201s | 1.727s | +| QMCv2 掩码解密 | 36.8 MiB | 3.990s | 2.200s | 1.848s | +| QMCv2 RC4 加密 | 36.8 MiB | 12.820s | 5.596s | 2.717s | +| QMCv2 RC4 解密 | 36.8 MiB | 12.588s | 5.913s | 2.552s | +| KGM 解密 | 64.4 MiB | 49.014s | 22.053s | 8.376s | +| VPR 解密 | 87.9 MiB | 70.030s | 32.252s | 11.902s | -# 也可以打开来自 QQ 音乐 PC 客户端 18.57 及更新版本的 QMCv2 文件, -# 但需要正确的 mix_key1 和 mix_key2 参数 -qmcv2file_keyencv2 = QMCv2.from_file('source.mflac', simple_key=your_simple_key, mix_key1=your_mix_key1, mix_key2=your_mix_key2) -target_file_format = 'flac' +仅在你对速度有要求时,可以考虑在调用 LibTakiyasha 时使用 PyPy/Pyston 解释器。 -with open('target_from_mflac.' 
+ target_file_format, mode='wb') as fd: - for block in qmcv2file_keyencv2: - fd.write(block) -``` +一般情况下,建议使用官方解释器实现(CPython)。 -- 打开加密文件时,如果不提供核心密钥,会报错而无法继续: +## 安装 - ```pycon +可用的最新版本:2.1.0rc1,可前往[发布页面](https://github.com/nukemiko/libtakiyasha/releases/tag/2.1.0rc1)或 [PyPI](https://pypi.org/project/libtakiyasha/2.1.0rc1/) 下载。 - >>> from libtakiyasha import QMCv2 - >>> qmcv2file = QMCv2.from_file('source.mflac') - Traceback (most recent call last): - File "", line 1, in - <...> - ValueError: 'simple_key' is required for QMCv2 file master key decryption - >>> - ``` +如果你要下载其他版本: - 你需要向 `QMCv2.from_file()` 传入正确的 `simple_key` 参数才能打开文件。 +- PyPI:https://pypi.org/project/libtakiyasha/#history ,挑选自己所需的版本,下载安装包,手动安装。 + - 或者使用 pip 安装:`python -m pip install -U libtakiyasha==<你所需的版本>` +- 前往[发布页面](https://github.com/nukemiko/libtakiyasha/releases)挑选自己所需的版本,下载安装包,手动安装。 - 同样,你需要向 `NCM.from_file()` 传入正确的 `core_key` 参数才能打开 NCM 文件。 +### 依赖项 -生成加密文件(以 QMCv2 为例): +LibTakiyasha 依赖以下包,均可从 PyPI 获取: -```python -from libtakiyasha.qmc import QMCv2 +- [pyaes](https://pypi.org/project/pyaes/) +- [mutagen](https://pypi.org/project/mutagen/) -... # 定义你的 your_simple_key、your_mix_key1 和 your_mix_key2 +## 常见问题 -new_qmcv2 = QMCv2.new() +> 为什么 2.x 打开文件需要密钥,而 1.x 版本不需要? -new_qmcv2.simple_key = your_simple_key # 可选,但如果跳过此步骤,在保存到文件时需要填写参数 simple_key +这是出于以下考虑: -with open('plain.flac', 'rb') as fd: - for line in fd: - new_qmcv2.write(line) +- LibTakiyasha 是一个加解密库,当然需要为用户提供自定义密钥的权利 +- 为了保护本项目不受美国数字千年版权法Digital Millennium Copyright Act(DMCA)影响,避免仓库被误杀 + - 因此,本仓库所有 1.x 及更早版本的提交和发布版本都已删除。 -# 保存为 QMCv2 KeyEncV1 -new_qmcv2.to_file('encrypted.mflac') +> 如何使用? 
-# 也可以保存为 QMCv2 KeyEncV2 - QQ 音乐 PC 端 18.57 及更高版本的格式 -new_qmcv2.to_file('encrypted-keyencv2.mflac', master_key_enc_ver=2, mix_key1=your_mix_key1, mix_key2=your_mix_key2) -``` +LibTakiyasha 的文档(DocStrings)写得非常清晰,你可以在导入后,使用 Python 内置函数 `help(<...>)` 查看用法。 diff --git a/examples/dump-ncm-file-with-key.py b/examples/dump-ncm-file-with-key.py deleted file mode 100755 index e8423ab..0000000 --- a/examples/dump-ncm-file-with-key.py +++ /dev/null @@ -1,286 +0,0 @@ -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -from __future__ import annotations - -import argparse -import os -import re -import shutil -import sys -from datetime import datetime -from pathlib import Path - -from libtakiyasha.exceptions import LibTakiyashaException -from libtakiyasha.ncm import NCM - -try: - from mutagen import flac, mp3, id3 -except ImportError: - mutagen_available = False -else: - mutagen_available = True - -try: - from tqdm import tqdm -except ImportError: - tqdm_available = False -else: - tqdm_available = True - -progname = Path(sys.argv[0]).name - -template_string_pattern = re.compile('{title}|{artist}|{album}|{time}') - -hexnumstr_array_sep_pattern = re.compile(', ?|,? 
') -hexnumstr_pattern = re.compile('^0x[0-9a-z]{,2}$', flags=re.IGNORECASE) - -if sys.platform.startswith('win'): - illegal_filename_chars_pattern = re.compile(r'[\x00-\x31~"#%&*:<>?/\\|]+') -else: - illegal_filename_chars_pattern = re.compile(r'[\x00/]') - - -def hexstring2bytes(hexstring: str, paramname: str) -> bytes: - try: - return bytes.fromhex(hexstring) - except ValueError: - hexnum_strings = hexnumstr_array_sep_pattern.split(hexstring) - hexnums: list[int] = [] - for item in hexnum_strings: - try: - hexnums.append(int(item, base=16)) - except ValueError: - print(f"错误:在参数 '{paramname}' 中发现无效的十六进制数字 '{item}'") - sys.exit(1) - return bytes(hexnums) - - -def str_shorten(s, maxlen: int = 30, lr_maxkeeplen: int = 10) -> str: - string = str(s) - maxlen = int(maxlen) - lr_maxkeeplen = int(lr_maxkeeplen) - - if len(string) > maxlen: - return string[:lr_maxkeeplen] + '...' + string[-lr_maxkeeplen:] - return string - - -ap = argparse.ArgumentParser(prog=progname, - add_help=False, - formatter_class=argparse.RawTextHelpFormatter, - usage='%(prog)s [-h] [-t 模板| --template 模板] ' - '(-k 核心密钥 | --core-key 核心密钥) ' - 'NCM 文件... ' - '[输出目录]' - ) -required_optargs = ap.add_argument_group('必需选项和参数') -required_optargs.add_argument('-k', '--core-key', - dest='core_key_str', - metavar='核心密钥', - required=True, - help="解密文件所需的密钥,使用十六进制表示法。\n" - "可以接受以下形式:\n" - " -k a1b1c4d5e1f4\n" - " -k 0x11,0x45,0x14,0Xc1,0x9F,0x19,0xab\n" - "不区分大小写,包含的空格会被去除。" - ) -required_optargs.add_argument('sources_or_target', - metavar='NCM 文件... 
[输出目录]', - nargs='+', - type=Path, - help='所有输入文件的路径。如果最后一个路径指向一个目录,那么它会被用作\n' - '所有输入文件的输出目录;否则,输出目录为当前目录。\n' - '除最后一个参数外,所有路径必须指向一个文件。' - ) - -optional_optargs = ap.add_argument_group('可选选项和参数') -optional_optargs.add_argument('-h', '--help', - action='help', - help='显示帮助信息并退出' - ) -optional_optargs.add_argument('-t', '--template', - dest='target_filename_template', - metavar='模板', - default='', - help='以 <模板> 规定的格式设定输出文件的名称。\n' - '模板字符串中的可用字段:\n' - ' {title} - 标题\n' - ' {artist} - 歌手(艺术家)\n' - ' {album} - 专辑\n' - ' {time} - 文件生成的时间,使用 ISO 8601 表示法\n' - "例如,将歌曲标题作为输出文件名:-t '{title}'\n" - '如果未指定此选项,那么将会根据源文件名决定输出文件名。' - ) -optional_optargs.add_argument('-n', '--no-tag', - action='store_false', - dest='with_tag', - help='不要向输出文件中写入标签信息' - ) - - -def main(): - optargs = ap.parse_intermixed_args() - - core_key_str: str = optargs.core_key_str - core_key = hexstring2bytes(core_key_str, '-k/--core-key') - - sources_or_target: list[Path] = optargs.sources_or_target - targetdir = sources_or_target.pop(-1) - if not targetdir.is_dir(): - sources_or_target.append(targetdir) - targetdir = Path.cwd() - - target_filename_template = optargs.target_filename_template - if target_filename_template != '' and not template_string_pattern.search(target_filename_template): - print(f"错误:文件名模板字符串 '{target_filename_template}' 不包含任何字段") - sys.exit(1) - target_filename_template = illegal_filename_chars_pattern.sub( - '%%', target_filename_template - ) - - with_tag: bool = optargs.with_tag - - total = len(sources_or_target) - succeeds: list[tuple[Path, Path]] = [] - fails: list[tuple[Path, str]] = [] - - for current, sourcepath in enumerate(sources_or_target, start=1): - sourcepath_dirname = sourcepath.parent - sourcepath_filename = sourcepath.name - sourcepath_display = os.path.join(str_shorten(sourcepath_dirname), str_shorten(sourcepath_filename)) - - termcols, termls = shutil.get_terminal_size() - print('=' * termcols) - print(f"[{current}/{total}]输入文件:'{sourcepath_display}'") - - errmsg = 
'' - if not sourcepath.exists(): - errmsg = '路径不存在' - elif not sourcepath.is_file(): - errmsg = '路径不是一个文件' - if errmsg: - print(f"跳过 '{sourcepath_display}':{errmsg}") - fails.append((sourcepath, errmsg)) - continue - - try: - ncmfile = NCM.from_file(sourcepath, core_key=core_key) - ncmfile_size = ncmfile.seek(0, 2) - ncmfile.seek(0, 0) - except LibTakiyashaException as exc: - errmsg = f'{type(exc).__name__}: {exc}' - print(f"跳过 '{sourcepath}':{errmsg}") - fails.append((sourcepath, errmsg)) - continue - - ncm_tag = ncmfile.ncm_tag - targetfile_format = ncm_tag.format.upper() - if not targetfile_format: - header_4bytes = ncmfile.read(4) - if header_4bytes.startswith(b'fLaC'): - targetfile_format = 'FLAC' - elif header_4bytes.startswith((b'ID3', b'\xff\xfb', b'\xff\xf3', b'\xff\xf2')): - targetfile_format = 'MP3' - print(f"输出文件格式:{targetfile_format.strip() if targetfile_format.strip() else '未知'}") - - title = str(ncm_tag.musicName) if ncm_tag.musicName else None - if ncm_tag.artist: - artist_names_ids = list(ncm_tag.artist) - artist_names: list[str] = [] - for item in artist_names_ids: - if len(item) < 1: - continue - artist_names.append(str(item[0])) - artist = '、'.join(artist_names) - else: - artist = None - album = str(ncm_tag.album) if ncm_tag.album else None - - print(f"标题:{title if title else '无'}") - print(f"歌手:{artist if artist else '无'}") - print(f"专辑:{album if album else '无'}") - - targetpath_filename = sourcepath.stem + f'.{targetfile_format.lower() if targetfile_format.strip() else "unknown"}' - if target_filename_template: - targetpath_filename = target_filename_template.format( - title=title, - artist=artist, - album=album, - time=datetime.now().isoformat(timespec='seconds') - ) + f'.{targetfile_format.lower() if targetfile_format.strip() else "unknown"}' - targetpath_filename_display = str_shorten(targetpath_filename) - targetpath = targetdir / targetpath_filename - print(f"输出文件名:{targetpath_filename}") - print(f"输出文件所在目录:{targetdir}") - - if 
tqdm_available: - with tqdm(total=ncmfile_size, - unit='B', - unit_scale=True, - desc=targetpath_filename_display - ) as pbar: - with open(targetpath, 'wb') as targetfile: - for blk in ncmfile: - targetfile.write(blk) - pbar.update(len(blk)) - else: - with open(targetpath, 'wb') as targetfile: - for blk in ncmfile: - targetfile.write(blk) - print('音频数据已取出') - - if with_tag: - if mutagen_available: - try: - cover_data = ncmfile.cover_data - metadata = ncm_tag.to_mutagen_style_dict() - if targetfile_format.upper() == 'FLAC': - tag = flac.FLAC(targetpath) - for key, value in metadata.items(): - tag[key] = value - picture = flac.Picture() - picture.data = cover_data - picture.type = 3 - if cover_data.startswith(b'\x89PNG'): - picture.mime = 'image/png' - elif cover_data.startswith(b'\xff\xd8\xff'): - picture.mime = 'image/jpeg' - tag.add_picture(picture) - tag.save(targetpath) - elif targetfile_format.upper() == 'MP3': - tag = mp3.MP3(targetpath) - for key, value in metadata.items(): - id3frame_cls = getattr(id3, key[:4]) - id3frame = tag.get(key) - if id3frame is None: - tag[key] = id3frame_cls(text=value, desc='comment') - elif id3frame.text: - id3frame.text = value - tag[key] = id3frame - picture = id3.APIC() - picture.data = cover_data - picture.type = 3 - if cover_data.startswith(b'\x89PNG'): - picture.mime = 'image/png' - elif cover_data.startswith(b'\xff\xd8\xff'): - picture.mime = 'image/jpeg' - tag['APIC:'] = picture - tag.save(targetpath) - else: - print('未嵌入标签信息,因为输出文件格式未知') - except Exception as exc: - print(f'未能嵌入标签信息:{type(exc).__name__}: {exc}') - else: - print('已嵌入标签信息') - else: - print("未嵌入标签信息,因为缺少依赖关系“mutagen”") - - succeeds.append((sourcepath, targetpath)) - - if current == total: - termcols, termls = shutil.get_terminal_size() - print('=' * termcols) - - -if __name__ == '__main__': - main() diff --git a/pyproject.toml b/pyproject.toml index 8d5065e..082dc21 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,13 +4,15 @@ dynamic = 
["dependencies", "readme", "version"]
 authors = [
     { name = "nukemiko" },
 ]
-description = "Python 音乐加解密工具库"
+description = "多种加密方案的 Python 实现"
 license = { file = "LICENSE" }
 requires-python = ">=3.8"
 classifiers = [
     "License :: OSI Approved :: MIT License",
     "Operating System :: OS Independent",
     "Topic :: Multimedia :: Sound/Audio",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3 :: Only",
     "Programming Language :: Python :: 3.8",
     "Programming Language :: Python :: 3.9",
     "Programming Language :: Python :: 3.10"
@@ -24,8 +26,8 @@ keywords = ["unlock", "music", "audio", "qmc", "ncm", "mflac", "mgg", "netease",
 "Releases" = "https://github.com/nukemiko/libtakiyasha/releases"
 
 [build-system]
-requires = ["setuptools >= 46.4.0"]
 build-backend = "setuptools.build_meta"
+requires = ["setuptools >= 46.4.0"]
 
 [tool.setuptools]
 include-package-data = true
diff --git a/requirements.txt b/requirements.txt
index 62f988c..67db29a 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,2 +1,2 @@
 pyaes
-setuptools
+mutagen
diff --git a/src/libtakiyasha/VERSION b/src/libtakiyasha/VERSION
index 38f77a6..0c271bc 100644
--- a/src/libtakiyasha/VERSION
+++ b/src/libtakiyasha/VERSION
@@ -1 +1 @@
-2.0.1
+2.1.0rc1
diff --git a/src/libtakiyasha/__init__.py b/src/libtakiyasha/__init__.py
index 8eb391c..0577d97 100644
--- a/src/libtakiyasha/__init__.py
+++ b/src/libtakiyasha/__init__.py
@@ -1,4 +1,5 @@
 # -*- coding: utf-8 -*-
 from __future__ import annotations
 
+from .
import kgmvpr, kwm, ncm, qmc from .pkgver import progname, version, version_info diff --git a/src/libtakiyasha/binaries/ncm/empty.flac b/src/libtakiyasha/binaries/ncm/empty.flac new file mode 100644 index 0000000..14ae8f6 Binary files /dev/null and b/src/libtakiyasha/binaries/ncm/empty.flac differ diff --git a/src/libtakiyasha/qmc/binaries/Key256MappingData b/src/libtakiyasha/binaries/qmc/qmcconsts/Key256MappingData similarity index 100% rename from src/libtakiyasha/qmc/binaries/Key256MappingData rename to src/libtakiyasha/binaries/qmc/qmcconsts/Key256MappingData diff --git a/src/libtakiyasha/common.py b/src/libtakiyasha/common.py deleted file mode 100644 index bf6a620..0000000 --- a/src/libtakiyasha/common.py +++ /dev/null @@ -1,527 +0,0 @@ -# -*- coding: utf-8 -*- -from __future__ import annotations - -from abc import ABCMeta, abstractmethod -from functools import lru_cache -from typing import Generator, Iterable, Literal - -from .miscutils import bytestrxor - -try: - import io -except ImportError: - import _pyio as io - -from .typedefs import IntegerLike, BytesLike, WritableBuffer -from .typeutils import tobytes, toint_nofloat - -__all__ = [ - 'CipherSkel', - 'StreamCipherSkel', - 'CryptLayerWrappedIOSkel' -] - - -class CipherSkel(metaclass=ABCMeta): - """适用于一般加密算法的框架类。子类必须实现 ``encrypt()`` 和 ``decrypt()`` 方法。""" - - @abstractmethod - def encrypt(self, plaindata: BytesLike, /) -> bytes: - """加密明文 ``plaindata`` 并返回加密结果。 - - Args: - plaindata: 要加密的明文 - """ - raise NotImplementedError - - @abstractmethod - def decrypt(self, cipherdata: BytesLike, /) -> bytes: - """解密密文 ``cipherdata`` 并返回解密结果。 - - Args: - cipherdata: 要解密的密文 - """ - raise NotImplementedError - - -class StreamCipherSkel(metaclass=ABCMeta): - """适用于简单流式加密算法的框架类。子类必须实现 ``keystream()`` 方法。""" - - @abstractmethod - def keystream(self, offset: IntegerLike, length: IntegerLike, /) -> Generator[int, None, None]: - """返回一个生成器对象,对其进行迭代,即可得到从起始点 - ``offset`` 开始,持续一定长度 ``length`` 的密钥流。 - - Args: - offset: 
密钥流的起始点,不应为负数 - length: 密钥流的长度,不应为负数 - """ - raise NotImplementedError - - def encrypt(self, plaindata: BytesLike, offset: IntegerLike = 0, /) -> bytes: - """加密明文 ``plaindata`` 并返回加密结果。 - - Args: - plaindata: 要加密的明文 - offset: 明文在文件中的位置(偏移量),不应为负数 - """ - plaindata = tobytes(plaindata) - offset = toint_nofloat(offset) - - return bytestrxor(plaindata, self.keystream(offset, len(plaindata))) - - def decrypt(self, cipherdata: BytesLike, offset: IntegerLike = 0, /) -> bytes: - """解密密文 ``cipherdata`` 并返回解密结果。 - - Args: - cipherdata: 要解密的密文 - offset: 密文在文件中的位置(偏移量),不应为负数 - """ - cipherdata = tobytes(cipherdata) - offset = toint_nofloat(offset) - - return bytestrxor(cipherdata, self.keystream(offset, len(cipherdata))) - - -class CryptLayerWrappedIOSkel(io.BytesIO): - """基于 BytesIO 的透明加密二进制流。 - - 所有读写相关方法都会经过透明加密层处理: - 读取时,返回解密后的数据;写入时,向缓冲区写入加密后的数据。 - - 调用读写相关方法时,附加参数 ``nocryptlayer=True`` - 可绕过透明加密层,访问缓冲区内的原始加密数据。 - - ``__init__()`` 方法的第一个位置参数 ``cipher`` 必须拥有 - ``encrypt()``、``decrypt()`` 和 ``keystream()`` 方法,且这些方法必须能接受两个位置参数。 - 其中,``encrypt()`` 和 ``decrypt()`` 的第一个位置参数接受字节对象,第二个位置参数接受非负整数; - ``keystream()`` 的两个位置参数均只接受非负整数。 - - 如果 ``cipher`` 未实现这些方法中的任何一个,都需要明确抛出 ``NotImplementedError``。 - 未实现的 ``encrypt()``/``decrypt()`` 方法会导致创建的对象不可通过透明加密层读/写; - 未实现的 ``keystream()`` 方法不会影响对读写的支持,但可能会极大影响读取的速度。 - - ``__init__()`` 方法的第二个参数 ``initial_bytes`` - 会在转换为 ``bytes`` 后作为对象内置缓冲区的初始数据。 - - 基于本类的子类可能拥有自己的构造器方法或函数,而不是直接调用 - ``__init__()``;详情请参考该类的文档字符串。 - - 本类和基于本类的子类,同时兼容 ``IO[bytes]`` - 和 ``typedefs.StreamCipherBasedCryptedIOProto`` 类型。 - """ - - @property - def name(self) -> str | None: - """当前对象来源文件的路径。 - - 在此类的对象中,此属性总是 ``None``。 - - 如果是通过子类的构造器方法或函数创建的对象,此属性可能会为来源文件的路径字符串。 - """ - if hasattr(self, '_name'): - name: str = self._name - return name - - @property - def encryptable(self) -> bool: - """此对象的内置透明加密层是否支持加密(内置 ``Cipher`` 对象的 ``encrypt()`` 方法是否可用)。 - - 这会影响到写入相关方法在参数 ``nocryptlayer=False`` 时是否可用。 - """ - return self._encrypt_available - - @property - def decryptable(self) 
-> bool: - """此对象的内置透明加密层是否支持解密(内置 ``Cipher`` 对象的 ``decrypt()`` 方法是否可用)。 - - 这会影响到读取相关方法在参数 ``nocryptlayer=False`` 时是否可用,以及此对象是否可迭代。 - """ - return self._decrypt_available - - @property - def iter_nocryptlayer(self) -> bool: - """迭代当前对象时,是否需要绕过透明加密层。默认为 ``False``。""" - return self._iter_nocryptlayer - - @iter_nocryptlayer.setter - def iter_nocryptlayer(self, value: bool): - self._iter_nocryptlayer = bool(value) - - @property - def iter_mode(self) -> Literal['block', 'line']: - """迭代的模式,只能设置为 ``block`` 或 ``line``: - - - ``block``(默认值)- 以块为单位进行迭代:每次迭代时,返回等长的“一块”数据。 - - 每次迭代返回的数据长度由 ``self.iter_block_size`` 决定。 - - ``line`` - 以一行为单位进行迭代:每次迭代时,返回的数据都以 ``b'\\n'`` 结尾。 - - 此模式会极大降低迭代的速度,不推荐使用。 - - 尝试设置为其他值会触发 ``ValueError`` 或 ``TypeError``。 - """ - return self._iter_mode - - @iter_mode.setter - def iter_mode(self, value: Literal['block', 'line']) -> None: - if value in ('block', 'line'): - self._iter_mode = value - elif isinstance(value, str): - raise ValueError(f"attribute 'iter_mode' must be 'block' or 'line', not '{value}'") - else: - raise TypeError(f"attribute 'iter_mode' must be str, not {type(value).__name__}") - - @property - def iter_block_size(self) -> int: - """以块为单位进行迭代时,每次迭代返回的数据长度。 - - 如果尝试设置为负数,会触发 ``ValueError``。 - - 本属性不会影响以一行为单位进行的迭代。 - """ - return self._iter_block_size - - @iter_block_size.setter - def iter_block_size(self, value: IntegerLike) -> None: - size = toint_nofloat(value) - if size < 0: - raise ValueError("attribute 'iter_block_size' cannot be a negative integer") - self._iter_block_size = size - - def __init__(self, cipher, /, initial_bytes: BytesLike = b'') -> None: - """基于 BytesIO 的透明加密二进制流。 - - 所有读写相关方法都会经过透明加密层处理: - 读取时,返回解密后的数据;写入时,向缓冲区写入加密后的数据。 - - 调用读写相关方法时,附加参数 ``nocryptlayer=True`` - 可绕过透明加密层,访问缓冲区内的原始加密数据。 - - ``__init__()`` 方法的第一个位置参数 ``cipher`` 必须拥有 - ``encrypt()``、``decrypt()`` 和 ``keystream()`` 方法,且这些方法必须能接受两个位置参数。 - 其中,``encrypt()`` 和 ``decrypt()`` 的第一个位置参数接受字节对象,第二个位置参数接受非负整数; - ``keystream()`` 的两个位置参数均只接受非负整数。 - - 如果 
``cipher`` 未实现这些方法中的任何一个,都需要明确抛出 ``NotImplementedError``。 - 未实现的 ``encrypt()``/``decrypt()`` 方法会导致创建的对象不可通过透明加密层读/写; - 未实现的 ``keystream()`` 方法不会影响对读写的支持,但可能会极大影响读取的速度。 - - ``__init__()`` 方法的第二个参数 ``initial_bytes`` - 会在转换为 ``bytes`` 后作为对象内置缓冲区的初始数据。 - - 基于本类的子类可能拥有自己的构造器方法或函数,而不是直接调用 - ``__init__()``;详情请参考该类的文档字符串。 - - 本类和基于本类的子类,同时兼容 ``IO[bytes]`` - 和 ``typedefs.StreamCipherBasedCryptedIOProto`` 类型。 - """ - super().__init__(tobytes(initial_bytes)) - - for method_name in 'keystream', 'encrypt', 'decrypt': - try: - method = getattr(cipher, method_name) - except Exception as exc: - if hasattr(cipher, method_name): - raise exc - else: - raise TypeError(f"{repr(cipher)} is not a StreamCipher object: " - f"method '{method_name}' is missing" - ) - if not callable(method): - raise TypeError(f"{repr(cipher)} is not a StreamCipher object: " - f"method '{method_name}' is not callable" - ) - # 检测 keystream() 是否已实现(可用) - self._keystream_available = True - try: - cipher.keystream(0, 0) - except NotImplementedError: - self._keystream_available = False - # 检测 encrypt() 是否已实现(可用) - self._encrypt_available = True - try: - cipher.encrypt(b'', 0) - except NotImplementedError: - self._encrypt_available = False - # 检测 decrypt() 是否已实现(可用) - self._decrypt_available = True - try: - cipher.decrypt(b'', 0) - except NotImplementedError: - self._decrypt_available = False - - self._cipher = cipher - self._iter_nocryptlayer = False - self._iter_mode: Literal['block', 'line'] = 'block' - self._iter_block_size: int = io.DEFAULT_BUFFER_SIZE - - def __iter__(self): - return self - - def __next__(self) -> bytes: - if self._iter_mode == 'line': - if self._iter_nocryptlayer: - return super().__next__() - elif not self._decrypt_available: - raise io.UnsupportedOperation('iter with crypt layer') - else: - curpos = self.tell() - - target_data = super().getvalue()[curpos:] - if self._keystream_available: - result_data = bytes(self._xor_data_keystream(curpos, target_data, eof=b'\n')) - else: - result_data = 
bytearray() - start = curpos - while 1: - stop = start + self._iter_block_size - target_data_segment = target_data[start:stop] - if target_data_segment == b'': - break - d = self._cipher.decrypt(target_data_segment, start) - if b'\n' in d: - result_data.append(d[:d.index(b'\n')]) - break - else: - result_data.append(d) - start += self._iter_block_size - if result_data == b'': - raise StopIteration - - self.seek(curpos + len(result_data), 0) - - return result_data - elif self._iter_mode == 'block': - if not self._decrypt_available: - raise io.UnsupportedOperation('iter with crypt layer') - - curpos = self.tell() - - target_data = super().getvalue()[curpos:curpos + self._iter_block_size] - if self._iter_nocryptlayer: - result_data = target_data - elif self._keystream_available: - result_data = bytes(self._xor_data_keystream(curpos, target_data, eof=None)) - else: - result_data = bytes(self._cipher.decrypt(target_data, curpos)) - - if result_data == b'': - raise StopIteration - - self.seek(curpos + len(result_data), 0) - - return result_data - elif isinstance(self._iter_mode, str): - raise ValueError(f"attribute 'iter_mode' must be 'block' or 'line', not '{self._iter_mode}'") - else: - raise TypeError(f"attribute 'iter_mode' must be str, not {type(self._iter_mode).__name__}") - - @lru_cache - def __repr__(self) -> str: - repr_strings = [ - f'<{type(self).__module__}.{type(self).__name__} object', - f' at {hex(id(self))}', - f', cipher={repr(self._cipher)}' - ] - if self.name is not None: - repr_strings.append(f", from '{self.name}'") - repr_strings.append('>') - - return ''.join(repr_strings) - - def _xor_data_keystream(self, - offset: int, - data: bytes, - eof: bytes = None - ) -> Generator[int, None, None]: - if eof is None: - eoford = None - else: - eoford = ord(tobytes(eof)) - - if data == b'': - return - - keystream = self._cipher.keystream(offset, len(data)) - for databyteord, streambyteord in zip(data, keystream): - resultbyteord = databyteord ^ streambyteord - 
yield resultbyteord - if resultbyteord == eoford: - return - - def getvalue(self, nocryptlayer: bool = False) -> bytes: - if nocryptlayer: - return super().getvalue() - elif not self._decrypt_available: - raise io.UnsupportedOperation('getvalue with crypt layer') - else: - return self._cipher.decrypt(super().getvalue()) - - def getbuffer(self, nocryptlayer: bool = False) -> memoryview: - if nocryptlayer: - return super().getbuffer() - else: - raise NotImplementedError('memoryview with crypt layer is not supported') - - def read(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: - if nocryptlayer: - return super().read(size) - elif not self._decrypt_available: - raise io.UnsupportedOperation('read with crypt layer') - else: - curpos = self.tell() - if size is None: - size = -1 - size = toint_nofloat(size) - if size < 0: - target_data = super().getvalue()[curpos:] - else: - target_data = super().getvalue()[curpos:curpos + size] - - if self._keystream_available: - result_data = bytes(self._xor_data_keystream(curpos, target_data)) - else: - result_data = self._cipher.decrypt(target_data, curpos) - self.seek(curpos + len(result_data), 0) - - return result_data - - def readinto(self, buffer: WritableBuffer, /, nocryptlayer: bool = False) -> int: - if nocryptlayer: - return super().readinto(buffer) - elif not self._decrypt_available: - raise io.UnsupportedOperation('readinto with crypt layer') - else: - if isinstance(buffer, memoryview): - memview = buffer - else: - memview = memoryview(buffer) - memview = memview.cast('B') - - data = self.read(len(memview)) - data_len = len(data) - - memview[:data_len] = data - - return data_len - - def read1(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: - if nocryptlayer or self._decrypt_available: - return self.read(size, nocryptlayer) - - raise io.UnsupportedOperation('read1 with crypt layer') - - def readinto1(self, buffer: WritableBuffer, /, nocryptlayer: bool = False) -> 
int: - if nocryptlayer or self._decrypt_available: - return self.readinto(buffer, nocryptlayer) - - raise io.UnsupportedOperation('readinto1 with crypt layer') - - def readblock(self, - size: IntegerLike | None = -1, /, - nocryptlayer: bool = False, *, - block_size: IntegerLike | None = io.DEFAULT_BUFFER_SIZE - ) -> bytes: - if not self._decrypt_available: - raise io.UnsupportedOperation('readblock with crypt layer') - - curpos = self.tell() - if size is None: - size = -1 - size = toint_nofloat(size) - if block_size is None: - block_size = io.DEFAULT_BUFFER_SIZE - block_size = toint_nofloat(block_size) - if block_size < 0: - block_size = io.DEFAULT_BUFFER_SIZE - if size < 0: - target_data = super().getvalue()[curpos:curpos + block_size] - else: - target_data = super().getvalue()[curpos:curpos + min([size, block_size])] - - if nocryptlayer: - result_data = target_data - elif self._keystream_available: - result_data = bytes(self._xor_data_keystream(curpos, target_data, eof=None)) - else: - result_data = self._cipher.decrypt(target_data, curpos) - - self.seek(curpos + len(result_data), 0) - - return result_data - - def readline(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: - if nocryptlayer: - return super().readline(size) - else: - if not self._decrypt_available: - raise io.UnsupportedOperation('readline with crypt layer') - curpos = self.tell() - if size is None: - size = -1 - size = toint_nofloat(size) - if size < 0: - target_data = super().getvalue()[curpos:] - else: - target_data = super().getvalue()[curpos:curpos + size] - - if self._keystream_available: - result_data = bytes(self._xor_data_keystream(curpos, target_data, eof=b'\n')) - else: - result_data = bytearray() - start = curpos - while 1: - stop = start + self._iter_block_size - target_data_segment = target_data[start:stop] - if target_data_segment == b'': - break - d = self._cipher.decrypt(target_data_segment, start) - if b'\n' in d: - 
result_data.append(d[:d.index(b'\n')]) - break - else: - result_data.append(d) - start += self._iter_block_size - self.seek(curpos + len(result_data), 0) - - return bytes(result_data) - - def readlines(self, hint: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> list[bytes]: - if nocryptlayer: - return super().readlines(hint) - elif not self._decrypt_available: - raise io.UnsupportedOperation('readlines with crypt layer') - else: - results_lines = [] - if hint is None: - hint = -1 - hint = toint_nofloat(hint) - if hint < 0: - while 1: - line = self.readline() - if line == b'': - return results_lines - results_lines.append(line) - else: - for _ in range(hint): - line = self.readline() - if line == b'': - return results_lines - results_lines.append(line) - - def write(self, data: BytesLike, /, nocryptlayer: bool = False) -> int: - if nocryptlayer: - return super().write(data) - elif not self._encrypt_available: - raise io.UnsupportedOperation('write with crypt layer') - else: - curpos = self.tell() - return super().write(self._cipher.encrypt(data, curpos)) - - def writelines(self, lines: Iterable[BytesLike], /, nocryptlayer: bool = False) -> None: - if nocryptlayer: - return super().writelines(lines) - elif not self._encrypt_available: - raise io.UnsupportedOperation('writelines with crypt layer') - else: - for line in lines: - super().write(line) diff --git a/src/libtakiyasha/kgmvpr/__init__.py b/src/libtakiyasha/kgmvpr/__init__.py index 151118e..a8fc589 100644 --- a/src/libtakiyasha/kgmvpr/__init__.py +++ b/src/libtakiyasha/kgmvpr/__init__.py @@ -1,19 +1,96 @@ # -*- coding: utf-8 -*- from __future__ import annotations -from typing import IO, Literal +import warnings +from pathlib import Path +from typing import IO, NamedTuple -from .kgmvprdataciphers import KGMorVPREncryptAlgorithm -from ..common import CryptLayerWrappedIOSkel +from .kgmvprdataciphers import KGMCryptoLegacy from ..exceptions import CrypterCreatingError from ..keyutils import make_salt 
-from ..typedefs import BytesLike, FilePath -from ..typeutils import is_filepath, tobytes, verify_fileobj +from ..prototypes import EncryptedBytesIOSkel +from ..typedefs import BytesLike, FilePath, KeyStreamBasedStreamCipherProto, StreamCipherProto +from ..typeutils import isfilepath, tobytes, verify_fileobj -__all__ = ['KGMorVPR'] +warnings.filterwarnings(action='default', category=DeprecationWarning, module=__name__) +__all__ = ['KGMorVPR', 'probe_kgmvpr', 'KGMorVPRFileInfo'] -class KGMorVPR(CryptLayerWrappedIOSkel): + +class KGMorVPRFileInfo(NamedTuple): + cipher_data_offset: int + cipher_data_len: int + encryption_version: int + core_key_slot: int + core_key_test_data: bytes + master_key: bytes | None + is_vpr: bool + + +def probe_kgmvpr(filething: FilePath | IO[bytes], /) -> tuple[Path | IO[bytes], KGMorVPRFileInfo | None]: + """探测源文件 ``filething`` 是否为一个 KGM 或 VPR 文件。 + + 返回一个 2 个元素长度的元组:第一个元素为 ``filething``;如果 + ``filething`` 是 KGM 或 VPR 文件,那么第二个元素为一个 ``KGMorVPRFileInfo`` 对象;否则为 ``None``。 + + 如果 ``filething`` 是 VPR 文件,那么 ``KGMorVPRFileInfo`` + 对象的 ``is_vpr`` 属性为 ``True``;否则为 ``False``。 + + 本方法的返回值可以用于 ``KGMorVPR.open()`` 的第一个位置参数。 + + Args: + filething: 源文件的路径或文件对象 + Returns: + 一个 2 个元素长度的元组:第一个元素为 filething;如果 + filething 是 KGM 或 VPR 文件,那么第二个元素为一个 KGMorVPRFileInfo 对象;否则为 None。 + """ + + def operation(fd: IO[bytes]) -> KGMorVPRFileInfo | None: + total_size = fd.seek(0, 2) + if total_size < 60: + return + fd.seek(0, 0) + + header = fd.read(16) + if header == b'\x05\x28\xbc\x96\xe9\xe4\x5a\x43\x91\xaa\xbd\xd0\x7a\xf5\x36\x31': + is_vpr = True + elif header == b'\x7c\xd5\x32\xeb\x86\x02\x7f\x4b\xa8\xaf\xa6\x8e\x0f\xff\x99\x14': + is_vpr = False + else: + return + + cipher_data_offset = int.from_bytes(fd.read(4), 'little') + encryption_version = int.from_bytes(fd.read(4), 'little') + core_key_slot = int.from_bytes(fd.read(4), 'little') + core_key_test_data = fd.read(16) + master_key = fd.read(16) + + return KGMorVPRFileInfo( + 
cipher_data_offset=cipher_data_offset, + cipher_data_len=total_size - cipher_data_offset, + encryption_version=encryption_version, + core_key_slot=core_key_slot, + core_key_test_data=core_key_test_data, + master_key=master_key, + is_vpr=is_vpr + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + return Path(filething), operation(fileobj) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_origpos = fileobj.tell() + prs = operation(fileobj) + fileobj.seek(fileobj_origpos, 0) + + return fileobj, prs + + +class KGMorVPR(EncryptedBytesIOSkel): """基于 BytesIO 的 KGM/VPR 透明加密二进制流。 所有读写相关方法都会经过透明加密层处理: @@ -23,29 +100,18 @@ class KGMorVPR(CryptLayerWrappedIOSkel): 可绕过透明加密层,访问缓冲区内的原始加密数据。 如果你要新建一个 KGMorVPR 对象,不要直接调用 ``__init__()``,而是使用构造器方法 - ``KGMorVPR.new()`` 和 ``KGMorVPR.from_file()`` 新建或打开已有 KGM 或 VPR 文件。 - - 已有 KGMorVPR 对象的 ``self.to_file()`` 方法可用于将对象内数据保存到文件,但目前尚未实现。 - 尝试调用此方法会触发 ``NotImplementedError``。 + ``KGMorVPR.new()`` 和 ``KGMorVPR.open()`` 新建或打开已有 KGM/VPR 文件, + 使用已有 KGMorVPR 对象的 ``save()`` 方法将其保存到文件。 """ @property - def cipher(self) -> KGMorVPREncryptAlgorithm: - return self._cipher - - @property - def master_key(self) -> bytes: - return self.cipher.master_key - - @property - def vpr_key(self) -> bytes | None: - return self._cipher.vpr_key + def acceptable_ciphers(self): + return [KGMCryptoLegacy] - @property - def subtype(self): - return 'KGM' if self.vpr_key is None else 'VPR' - - def __init__(self, cipher: KGMorVPREncryptAlgorithm, /, initial_bytes: BytesLike = b'') -> None: + def __init__(self, + cipher: StreamCipherProto | KeyStreamBasedStreamCipherProto, /, + initial_bytes: BytesLike = b'' + ) -> None: """基于 BytesIO 的 KGM/VPR 透明加密二进制流。 所有读写相关方法都会经过透明加密层处理: @@ -55,56 +121,16 @@ def __init__(self, cipher: KGMorVPREncryptAlgorithm, /, initial_bytes: BytesLike 可绕过透明加密层,访问缓冲区内的原始加密数据。 如果你要新建一个 KGMorVPR 对象,不要直接调用 ``__init__()``,而是使用构造器方法 - ``KGMorVPR.new()`` 和 
``KGMorVPR.from_file()`` 新建或打开已有 KGM 或 VPR 文件。 + ``KGMorVPR.new()`` 和 ``KGMorVPR.open()`` 新建或打开已有 KGM/VPR 文件, + 使用已有 KGMorVPR 对象的 ``save()`` 方法将其保存到文件。 - 已有 KGMorVPR 对象的 ``self.to_file()`` 方法可用于将对象内数据保存到文件,但目前尚未实现。 - 尝试调用此方法会触发 ``NotImplementedError``。 + Args: + cipher: 要使用的 cipher,必须是一个 libtakiyasha.kgmvpr.kgmvprdataciphers.KGMCryptoLegacy 对象 + initial_bytes: 内置缓冲区的初始数据 """ super().__init__(cipher, initial_bytes) - if not isinstance(cipher, KGMorVPREncryptAlgorithm): - raise TypeError('unsupported Cipher: ' - f'supports {KGMorVPREncryptAlgorithm.__module__}.{KGMorVPREncryptAlgorithm.__name__}, ' - f'not {type(cipher).__name__}' - ) - - @classmethod - def new(cls, subtype: Literal['kgm', 'vpr'], /, - table1: BytesLike, - table2: BytesLike, - tablev2: BytesLike, - vpr_key: BytesLike = None - ) -> KGMorVPR: - """创建并返回一个全新的空 KGMorVPR 对象。 - - 第一个位置参数 ``subtype`` 决定此 KGMorVPR 对象的透明加密层使用哪种加密算法, - 仅支持 ``'kgm'`` 和 ``'vpr'``。 - 参数 ``table1``、``table2``、``tablev2`` 都是必选参数, - 因为它们会参与到内置透明加密层的创建过程中,并且在加密/解密过程中发挥关键作用。 - 这三个参数的都必须是类字节对象,且转换为 ``bytes`` 后,长度为 272 字节。 - - 如果你选择 ``subtype='vpr'``,那么参数 ``vpr_key`` 是必选的:必须是类字节对象,且转换为 ``bytes`` - 后的长度为 17 字节。 - """ - table1 = tobytes(table1) - table2 = tobytes(table2) - tablev2 = tobytes(tablev2) - if vpr_key is not None: - vpr_key = tobytes(vpr_key) - if subtype == 'vpr': - if vpr_key is None: - raise ValueError("argument 'vpr_key' is required for VPR subtype") - else: - vpr_key = tobytes(vpr_key) - elif subtype != 'kgm': - if isinstance(subtype, str): - raise ValueError(f"argument 'subtype' must be 'kgm' or 'vpr', not {subtype}") - else: - raise TypeError(f"argument 'subtype' must be str, not {type(subtype).__name__}") - - master_key = make_salt(16) + b'\x00' - - return cls(KGMorVPREncryptAlgorithm(table1, table2, tablev2, master_key, vpr_key)) + self._source_file_header_data: bytes | None = None @classmethod def from_file(cls, @@ -113,8 +139,10 @@ def from_file(cls, table2: BytesLike, tablev2: BytesLike, vpr_key: BytesLike = None - 
) -> KGMorVPR: - """打开一个 KGMorVPR 文件或文件对象 ``kgm_vpr_filething``。 + ): + """(已弃用,且将会在后续版本中删除。请尽快使用 ``KGMorVPR.open()`` 代替。) + + 打开一个 KGMorVPR 文件或文件对象 ``kgm_vpr_filething``。 第一个位置参数 ``kgm_vpr_filething`` 可以是文件路径(``str``、``bytes`` 或任何拥有方法 ``__fspath__()`` 的对象)。``kgm_vpr_filething`` @@ -129,59 +157,127 @@ def from_file(cls, 如果探测到 ``VPR`` 文件,那么参数 ``vpr_key`` 是必选的:必须是类字节对象,且转换为 ``bytes`` 后的长度为 17 字节。 """ + warnings.warn( + DeprecationWarning( + f'{cls.__name__}.from_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {cls.__name__}.open() instead.' + ) + ) + return cls.open(kgm_vpr_filething, + table1=table1, + table2=table2, + tablev2=tablev2, + vpr_key=vpr_key + ) - def operation(fileobj: IO[bytes]) -> KGMorVPR: - fileobj_endpos = fileobj.seek(0, 2) - fileobj.seek(0, 0) - magicheader = fileobj.read(16) - if magicheader == b'\x05\x28\xbc\x96\xe9\xe4\x5a\x43\x91\xaa\xbd\xd0\x7a\xf5\x36\x31': - subtype: Literal['kgm', 'vpr'] = 'vpr' - if vpr_key is None: - raise ValueError( - f"{repr(kgm_vpr_filething)} is a VPR file, but argument 'vpr_key' is missing" - ) - elif magicheader == b'\x7c\xd5\x32\xeb\x86\x02\x7f\x4b\xa8\xaf\xa6\x8e\x0f\xff\x99\x14': - subtype: Literal['kgm', 'vpr'] = 'kgm' - else: - raise ValueError(f"{repr(kgm_vpr_filething)} is not a KGM or VPR file") - header_len = int.from_bytes(fileobj.read(4), 'little') - if header_len > fileobj_endpos: - raise CrypterCreatingError( - f"{repr(kgm_vpr_filething)} is not a valid {subtype.upper()} file: " - f"header length ({header_len}) is greater than file size ({fileobj_endpos})" - ) - fileobj.seek(28, 0) - master_key = fileobj.read(16) + b'\x00' - fileobj.seek(header_len, 0) - - initial_bytes = fileobj.read() - - cipher = KGMorVPREncryptAlgorithm(table1, table2, tablev2, master_key, vpr_key) - return cls(cipher, initial_bytes) - + @classmethod + def open(cls, + filething_or_info: tuple[Path | IO[bytes]] | FilePath | IO[bytes], /, + table1: BytesLike, + table2: BytesLike, + 
tablev2: BytesLike, + vpr_key: BytesLike = None + ): + """Open a KGMorVPR file and return a ``KGMorVPR`` object. + + The first positional argument ``filething_or_info`` must be a file path or a file object. + Accepted file path types are: strings, byte strings, and any object that defines a ``__fspath__()`` method. + If it is a file object, it must be readable and seekable (its ``seekable()`` method returns ``True``). + + ``filething_or_info`` also accepts the return value of the ``probe_kgmvpr()`` function: + a two-element tuple whose first element is the path or file object of the source file, and whose second element is the information about the source file. + + The second, third and fourth arguments ``table1``, ``table2`` and ``tablev2`` + are required, and each must be a byte string 272 bytes long. + + If a VPR file is detected, the fifth argument ``vpr_key`` + is required. If provided, it must be a byte string 17 bytes long. In all other cases, this argument is ignored. + + Args: + filething_or_info: path or file object of the source file, or the return value of probe_kgmvpr() + table1: decoding table 1 + table2: decoding table 2 + tablev2: decoding table 3 + vpr_key: additional key required for VPR files + """ + # if table1 is not None: + # table1 = tobytes(table1) + # if table2 is not None: + # table2 = tobytes(table2) + # if tablev2 is not None: + # tablev2 = tobytes(tablev2) + # if vpr_key is not None: + # vpr_key = tobytes(vpr_key) table1 = tobytes(table1) table2 = tobytes(table2) tablev2 = tobytes(tablev2) if vpr_key is not None: vpr_key = tobytes(vpr_key) - if is_filepath(kgm_vpr_filething): - with open(kgm_vpr_filething, mode='rb') as kgm_vpr_fileobj: - instance = operation(kgm_vpr_fileobj) + def operation(fd: IO[bytes]) -> cls: + if fileinfo.encryption_version != 3: + raise CrypterCreatingError( + f'unsupported KGM encryption version {fileinfo.encryption_version} ' + f'(only version 3 is supported)' + ) + if fileinfo.is_vpr and vpr_key is None: + raise TypeError( + "argument 'vpr_key' is required to encrypt and decrypt VPR files" + ) + cipher = KGMCryptoLegacy(table1, + table2, + tablev2, + fileinfo.core_key_test_data + b'\x00', + vpr_key + ) + + fd.seek(fileinfo.cipher_data_offset, 0) + + inst = cls(cipher, fd.read(fileinfo.cipher_data_len)) + fd.seek(0, 0) + inst._source_file_header_data = fd.read(fileinfo.cipher_data_offset) + return inst + + if isinstance(filething_or_info, tuple): + filething_or_info: tuple[Path | IO[bytes], KGMorVPRFileInfo | None] + if len(filething_or_info) != 2: + raise TypeError( + 
"first argument 'filething_or_info' must be a file path, a file object, " + "or the tuple returned by probe_kgmvpr()" + ) + filething, fileinfo = filething_or_info + else: + filething, fileinfo = probe_kgmvpr(filething_or_info) + + if fileinfo is None: + raise CrypterCreatingError( + f"{repr(filething)} is not a KGM or VPR file" + ) + elif not isinstance(fileinfo, KGMorVPRFileInfo): + raise TypeError( + f"second element of the tuple must be KGMorVPRFileInfo or None, not {type(fileinfo).__name__}" + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + instance = operation(fileobj) + instance._name = Path(filething) else: - kgm_vpr_fileobj = verify_fileobj(kgm_vpr_filething, 'binary', - verify_readable=True, - verify_seekable=True - ) - instance = operation(kgm_vpr_fileobj) + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_sourcefile = getattr(fileobj, 'name', None) + instance = operation(fileobj) - instance._name = getattr(kgm_vpr_fileobj, 'name', None) + if fileobj_sourcefile is not None: + instance._name = Path(fileobj_sourcefile) return instance - def to_file(self, kgm_vpr_filething: FilePath | IO[bytes], /, **kwargs) -> None: - """Warning: the structure of KGM/VPR files is not yet fully understood, so this method is not implemented; calling it raises - ``NotImplementedError``. The expected parameters and behavior are as follows: + def to_file(self, kgm_vpr_filething: FilePath | IO[bytes] = None) -> None: + """(Deprecated, and will be removed in a later version. Switch to ``KGMorVPR.save()`` as soon as possible.) Save the contents of the current KGMorVPR object to the file ``kgm_vpr_filething``. @@ -192,5 +288,93 @@ def to_file(self, kgm_vpr_filething: FilePath | IO[bytes], /, **kwargs) -> None: This method first tries to write to the file that ``kgm_vpr_filething`` points to. If ``kgm_vpr_filething`` is not provided, it tries to write to the file that ``self.name`` points to. If both are empty or not provided, ``CrypterSavingError`` is raised. + + Generating KGM/VPR file header data is currently not possible, so this method cannot be used to save a ``KGMorVPR`` object + created via ``KGMorVPR.new()``. Attempting to do so raises ``NotImplementedError``. """ - raise NotImplementedError('coming soon') + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.to_file() is deprecated, no longer 
used, ' + f'and may be removed in subsequent versions. ' + f'Use {type(self).__name__}.save() instead.' + ) + ) + + return self.save(kgm_vpr_filething) + + def save(self, + filething: FilePath | IO[bytes] = None + ) -> None: + """(Experimental) Save the current object as a new KGM or VPR file. + + The first argument ``filething`` is optional; if provided, it must be a file path or a file object. + Accepted file path types are: strings, byte strings, and any object that defines a ``__fspath__()`` method. + If it is a file object, it must be writable and seekable (its ``seekable()`` method returns ``True``). + If this argument is not provided, the ``source`` attribute of the current object is used instead; if that is also unavailable, + ``TypeError`` is raised. + + Generating KGM/VPR file header data is currently not possible, so this method cannot be used to save a ``KGMorVPR`` object + created via ``KGMorVPR.new()``. Attempting to do so raises ``NotImplementedError``. + + Args: + filething: path or file object of the target file + """ + + def operation(fd: IO[bytes]) -> None: + if self._source_file_header_data is None: + raise NotImplementedError( + f"cannot save current {type(self).__name__} object to file '{str(filething)}', " + f"generating a KGM/VPR file header is not supported" + ) + fd.seek(0, 0) + fd.write(self._source_file_header_data) + while blk := self.read(self.DEFAULT_BUFFER_SIZE, nocryptlayer=True): + fd.write(blk) + + if filething is None: + if self.source is None: + raise TypeError( + "attribute 'self.source' and argument 'filething' are empty, " + "don't know which file to save to" + ) + filething = self.source + + if isfilepath(filething): + with open(filething, mode='wb') as fileobj: + return operation(fileobj) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_seekable=True, + verify_writable=True + ) + return operation(fileobj) + + @classmethod + def new(cls, + table1: BytesLike, + table2: BytesLike, + tablev2: BytesLike, + vpr_key: BytesLike = None + ): + """Return an empty KGMorVPR object. + + The first, second and third arguments ``table1``, ``table2`` and ``tablev2`` + are required, and each must be a byte string 272 bytes long. + + If the fourth argument ``vpr_key`` is provided, the VPR encryption method is used; in that case + ``vpr_key`` must be a byte string 17 bytes long. + + Note: a ``KGMorVPR`` object created by this method cannot be saved to a file with the ``save()`` + method. Attempting to do so raises ``NotImplementedError``. + """ + table1 = tobytes(table1) + table2 = tobytes(table2) + tablev2 = tobytes(tablev2) + if vpr_key is 
not None: + vpr_key = tobytes(vpr_key) + + core_key_test_data = make_salt(16) + b'\x00' + + cipher = KGMCryptoLegacy(table1, table2, tablev2, core_key_test_data, vpr_key) + + return cls(cipher) diff --git a/src/libtakiyasha/kgmvpr/kgmvprdataciphers.py b/src/libtakiyasha/kgmvpr/kgmvprdataciphers.py index 8de3799..c3c5550 100644 --- a/src/libtakiyasha/kgmvpr/kgmvprdataciphers.py +++ b/src/libtakiyasha/kgmvpr/kgmvprdataciphers.py @@ -1,23 +1,28 @@ # -*- coding: utf-8 -*- from __future__ import annotations -from typing import Generator, TypedDict +from hashlib import md5 +from typing import Generator, Literal -from .kgmvprmaskutils import make_maskstream, xor_half_lower_byte -from ..common import StreamCipherSkel +from ..prototypes import KeyStreamBasedStreamCipherSkel from ..typedefs import BytesLike, IntegerLike -from ..typeutils import CachedClassInstanceProperty, tobytes, toint_nofloat +from ..typeutils import CachedClassInstanceProperty, tobytes, toint -__all__ = ['KGMorVPRTables', 'KGMorVPREncryptAlgorithm'] +__all__ = ['KGMCryptoLegacy'] -class KGMorVPRTables(TypedDict): - table1: bytes - table2: bytes - tablev2: bytes +def kugou_md5sum(data: bytes, /) -> bytes: + md5sum = md5(data) + md5digest = md5sum.digest() + ret = bytearray(md5sum.digest_size) + for i in range(0, md5sum.digest_size, 2): + ret[i] = md5digest[14 - i] + ret[i + 1] = md5digest[14 + 1 - i] + return bytes(ret) -class KGMorVPREncryptAlgorithm(StreamCipherSkel): + +class KGMCryptoLegacy(KeyStreamBasedStreamCipherSkel): @CachedClassInstanceProperty def keysize(self) -> int: return 17 @@ -26,46 +31,34 @@ def keysize(self) -> int: def tablesize(self) -> int: return 17 * 16 - @property - def master_key(self) -> bytes: - return self._master_key - - @property - def vpr_key(self) -> bytes | None: - return self._vpr_key - - @property - def tables(self) -> KGMorVPRTables: - return { - 'table1' : self._table1, - 'table2' : self._table2, - 'tablev2': self._tablev2 - } - def __init__(self, table1: BytesLike, 
table2: BytesLike, tablev2: BytesLike, - master_key: BytesLike, /, + core_key_test_data: BytesLike, /, vpr_key: BytesLike = None, ) -> None: self._table1 = tobytes(table1) self._table2 = tobytes(table2) self._tablev2 = tobytes(tablev2) - for valname, val in [('table1', self._table1), - ('table2', self._table2), - ('tablev2', self._tablev2)]: + for idx, valname_val in enumerate([('table1', self._table1), + ('table2', self._table2), + ('tablev2', self._tablev2)], + start=1 + ): + valname, val = valname_val if len(val) != self.tablesize: raise ValueError( - f"invalid length of argument '{valname}': should be {self.tablesize}, not {len(val)}" + f"invalid length of position {idx} argument '{valname}': " + f"should be {self.tablesize}, not {len(val)}" ) - self._master_key = tobytes(master_key) - if len(self._master_key) != self.keysize: + self._core_key_test_data = tobytes(core_key_test_data) + if len(self._core_key_test_data) != self.keysize: raise ValueError( - f"invalid length of argument 'master_key': " - f"should be {self.keysize}, not {len(self._master_key)}" + f"invalid length of fourth argument 'core_key_test_data': " + f"should be {self.keysize}, not {len(self._core_key_test_data)}" ) if vpr_key is None: self._vpr_key = None @@ -77,51 +70,89 @@ def __init__(self, f"should be {self.keysize}, not {len(self._vpr_key)}" ) - def keystream(self, offset: IntegerLike, length: IntegerLike, /) -> Generator[int, None, None]: - raise NotImplementedError - - def encrypt(self, plaindata: BytesLike, offset: IntegerLike = 0, /) -> bytes: - master_key = self._master_key - vpr_key: bytes | None = self._vpr_key + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'master': + return self._core_key_test_data + elif keyname == 'table1': + return self._table1 + elif keyname == 'table2': + return self._table2 + elif keyname == 'tablev2': + return self._tablev2 + elif keyname == 'vprkey': + return self._vpr_key + + def prexor_encrypt(self, data: BytesLike, 
offset: IntegerLike, /) -> Generator[int, None, None]: + offset = toint(offset) + vpr_key = self._vpr_key keysize = self.keysize - - offset = toint_nofloat(offset) - if offset < 0: - ValueError("second argument 'offset' must be a non-negative integer") - plaindata = tobytes(plaindata) - cipherdata_buf = bytearray(len(plaindata)) - - maskstream_iterator = make_maskstream( - offset, len(plaindata), self._table1, self._table2, self._tablev2 - ) - for idx, peered_byte in enumerate(zip(plaindata, maskstream_iterator)): - pdb, msb = peered_byte + for idx, byte in enumerate(data, start=offset): if vpr_key is not None: - pdb ^= vpr_key[(idx + offset) % keysize] - cdb = xor_half_lower_byte(pdb) ^ msb ^ master_key[(idx + offset) % keysize] - cipherdata_buf[idx] = cdb - - return tobytes(cipherdata_buf) - - def decrypt(self, cipherdata: BytesLike, offset: IntegerLike = 0, /) -> bytes: - master_key = self._master_key - vpr_key: bytes | None = self._vpr_key + byte ^= vpr_key[idx % keysize] + yield byte ^ ((byte % 16) << 4) + + @staticmethod + def prexor_decrypt(data: BytesLike, offset: IntegerLike, /) -> Generator[int, None, None]: + offset = toint(offset) + for idx, byte in enumerate(data, start=offset): + yield byte ^ ((byte % 16) << 4) + + def postxor_decrypt(self, data: BytesLike, offset: IntegerLike, /) -> Generator[int, None, None]: + offset = toint(offset) + vpr_key = self._vpr_key keysize = self.keysize - - offset = toint_nofloat(offset) + for idx, byte in enumerate(data, start=offset): + if vpr_key is None: + yield byte + else: + yield byte ^ vpr_key[idx % keysize] + + def genmask(self, + nbytes: IntegerLike, + offset: IntegerLike, / + ) -> Generator[int, None, None]: + nbytes = toint(nbytes) + offset = toint(offset) if offset < 0: - ValueError("second argument 'offset' must be a non-negative integer") - cipherdata = tobytes(cipherdata) - plaindata_buf = bytearray(len(cipherdata)) - - maskstream_iterator = make_maskstream( - offset, len(cipherdata), self._table1, 
self._table2, self._tablev2 - ) - for idx, peered_byte in enumerate(zip(cipherdata, maskstream_iterator)): - cdb, msb = peered_byte - pdb = xor_half_lower_byte(cdb ^ msb ^ master_key[(idx + offset) % keysize]) - if vpr_key is not None: - pdb ^= vpr_key[(idx + offset) % keysize] - plaindata_buf[idx] = pdb - - return tobytes(plaindata_buf) + raise ValueError("second argument 'offset' must be a non-negative integer") + if nbytes < 0: + raise ValueError("first argument 'nbytes' must be a non-negative integer") + + tablesize: int = self.tablesize + table1 = self._table1 + table2 = self._table2 + tablev2 = self._tablev2 + for idx in range(offset, offset + nbytes): + idx_urs4 = idx >> 4 + value = 0 + while idx_urs4 >= 17: + value ^= table1[idx_urs4 % tablesize] + idx_urs4 >>= 4 + value ^= table2[idx_urs4 % tablesize] + idx_urs4 >>= 4 + yield value ^ tablev2[idx % tablesize] + + def keystream(self, + operation: Literal['encrypt', 'decrypt'], + nbytes: IntegerLike, + offset: IntegerLike, / + ) -> Generator[int, None, None]: + ck_test_data = self._core_key_test_data + keysize: int = self.keysize + + mask_strm = self.genmask(nbytes, offset) + if operation == 'encrypt': + for idx, msb in enumerate(mask_strm, start=offset): + yield msb ^ ck_test_data[idx % keysize] + elif operation == 'decrypt': + for idx, msb in enumerate(mask_strm, start=offset): + msb ^= ck_test_data[idx % keysize] + yield msb ^ ((msb % 16) << 4) + elif isinstance(operation, str): + raise ValueError( + f"first argument 'operation' must be 'encrypt' or 'decrypt', not {operation}" + ) + else: + raise TypeError( + f"first argument 'operation' must be str, not {type(operation).__name__}" + ) diff --git a/src/libtakiyasha/kgmvpr/kgmvprmaskutils.py b/src/libtakiyasha/kgmvpr/kgmvprmaskutils.py deleted file mode 100644 index ecd8c3e..0000000 --- a/src/libtakiyasha/kgmvpr/kgmvprmaskutils.py +++ /dev/null @@ -1,43 +0,0 @@ -# -*- coding: utf-8 -*- -from __future__ import annotations - -from typing import Generator - 
-from ..typedefs import BytesLike, IntegerLike -from ..typeutils import tobytes, toint_nofloat - -__all__ = ['make_maskstream', 'xor_half_lower_byte'] - - -def xor_half_lower_byte(byte: int) -> int: - return byte ^ ((byte % 16) << 4) - - -def make_maskstream(offset: IntegerLike, - length: IntegerLike, /, - table1: BytesLike, - table2: BytesLike, - tablev2: BytesLike - ) -> Generator[int, None, None]: - offset = toint_nofloat(offset) - length = toint_nofloat(length) - if offset < 0: - raise ValueError("first argument 'offset' must be a non-negative integer") - if length < 0: - raise ValueError("second argument 'length' must be a non-negative integer") - table1 = tobytes(table1) - table2 = tobytes(table2) - tablev2 = tobytes(tablev2) - if not (len(table1) == len(table2) == len(tablev2)): - raise ValueError("argument 'table1', 'table2', 'tablev2' must have the same length") - tablesize = len(tablev2) - - for idx in range(offset, offset + length): - idx_urs4 = idx >> 4 - value = 0 - while idx_urs4 >= 17: - value ^= table1[idx_urs4 % tablesize] - idx_urs4 >>= 4 - value ^= table2[idx_urs4 % tablesize] - idx_urs4 >>= 4 - yield value ^ tablev2[idx % tablesize] diff --git a/src/libtakiyasha/kwm/__init__.py b/src/libtakiyasha/kwm/__init__.py index 9c0bb0c..e6a2b70 100644 --- a/src/libtakiyasha/kwm/__init__.py +++ b/src/libtakiyasha/kwm/__init__.py @@ -1,16 +1,103 @@ # -*- coding: utf-8 -*- from __future__ import annotations -from typing import IO +import warnings +from math import log10 +from pathlib import Path +from typing import IO, NamedTuple -from .kwmdataciphers import Mask32 -from ..common import CryptLayerWrappedIOSkel +from .kwmdataciphers import Mask32, Mask32FromRecipe +from ..exceptions import CrypterCreatingError, CrypterSavingError from ..keyutils import make_salt -from ..typedefs import BytesLike, FilePath -from ..typeutils import is_filepath, tobytes, verify_fileobj +from ..prototypes import EncryptedBytesIOSkel +from ..typedefs import BytesLike, FilePath, 
IntegerLike, KeyStreamBasedStreamCipherProto, StreamCipherProto +from ..typeutils import isfilepath, tobytes, toint, verify_fileobj +warnings.filterwarnings(action='default', category=DeprecationWarning, module=__name__) -class KWM(CryptLayerWrappedIOSkel): +DIGIT_CHARS = b'0123456789' +ASCII_LETTER_CHARS = b'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' + +__all__ = ['KWM', 'probe_kwm', 'KWMFileInfo'] + + +class KWMFileInfo(NamedTuple): + mask_recipe: bytes + cipher_data_offset: int + cipher_data_len: int + bitrate: int | None + suffix: str + + +def probe_kwm(filething: FilePath | IO[bytes], /) -> tuple[Path | IO[bytes], KWMFileInfo | None]: + """探测源文件 ``filething`` 是否为一个 KWM 文件。 + + 返回一个 2 个元素长度的元组:第一个元素为 ``filething``;如果 + ``filething`` 是 KWM 文件,那么第二个元素为一个 ``KWMFileInfo`` 对象;否则为 ``None``。 + + 本方法的返回值可以用于 ``KWM.open()`` 的第一个位置参数。 + + Args: + filething: 源文件的路径或文件对象 + Returns: + 一个 2 个元素长度的元组:第一个元素为 filething;如果 + filething 是 KWM 文件,那么第二个元素为一个 KWMFileInfo 对象;否则为 None。 + """ + + def operation(fd: IO[bytes]) -> KWMFileInfo | None: + fd.seek(0, 0) + + header_data = fd.read(1024) + cipher_data_offset = fd.tell() + cipher_data_len = fd.seek(0, 2) - cipher_data_offset + + if not header_data.startswith(b'yeelion-kuwo'): + return + + mask_recipe = header_data[24:32] + + bitrate = None + bitrate_suffix_serialized = header_data[48:56].rstrip(b'\x00') + bitrate_serialized_len = 0 + for byte in bitrate_suffix_serialized: + if byte in DIGIT_CHARS: + bitrate_serialized_len += 1 + if bitrate_serialized_len > 0: + bitrate = int(bitrate_suffix_serialized[:bitrate_serialized_len]) * 1000 + + suffix = None + suffix_serialized = bitrate_suffix_serialized[bitrate_serialized_len:] + suffix_serialized_len = 0 + for byte in suffix_serialized: + if byte in ASCII_LETTER_CHARS: + suffix_serialized_len += 1 + if suffix_serialized_len > 0: + suffix = suffix_serialized[:suffix_serialized_len].decode('ascii') + + return KWMFileInfo( + mask_recipe=mask_recipe, + 
cipher_data_offset=cipher_data_offset, + cipher_data_len=cipher_data_len, + bitrate=bitrate, + suffix=suffix + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + return Path(filething), operation(fileobj) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_origpos = fileobj.tell() + prs = operation(fileobj) + fileobj.seek(fileobj_origpos, 0) + + return fileobj, prs + + +class KWM(EncryptedBytesIOSkel): """基于 BytesIO 的 KWM 透明加密二进制流。 所有读写相关方法都会经过透明加密层处理: @@ -20,25 +107,18 @@ class KWM(CryptLayerWrappedIOSkel): 可绕过透明加密层,访问缓冲区内的原始加密数据。 如果你要新建一个 KWM 对象,不要直接调用 ``__init__()``,而是使用构造器方法 - ``KWM.new()`` 和 ``KWM.from_file()`` 新建或打开已有 KWM 文件。 - - 已有 KWM 对象的 ``self.to_file()`` 方法可用于将对象内数据保存到文件,但目前尚未实现。 - 尝试调用此方法会触发 ``NotImplementedError``。 + ``KWM.new()`` 和 ``KWM.open()`` 新建或打开已有 KWM 文件, + 使用已有 KWM 对象的 ``save()`` 方法将其保存到文件。 """ @property - def cipher(self) -> Mask32: - return self._cipher + def acceptable_ciphers(self): + return [Mask32FromRecipe] - @property - def core_key(self) -> bytes: - return self.cipher.core_key - - @property - def master_key(self) -> bytes: - return self.cipher.master_key - - def __init__(self, cipher: Mask32, /, initial_bytes: BytesLike = b'') -> None: + def __init__(self, + cipher: StreamCipherProto | KeyStreamBasedStreamCipherProto, /, + initial_bytes: BytesLike = b'' + ) -> None: """基于 BytesIO 的 KWM 透明加密二进制流。 所有读写相关方法都会经过透明加密层处理: @@ -48,37 +128,112 @@ def __init__(self, cipher: Mask32, /, initial_bytes: BytesLike = b'') -> None: 可绕过透明加密层,访问缓冲区内的原始加密数据。 如果你要新建一个 KWM 对象,不要直接调用 ``__init__()``,而是使用构造器方法 - ``KWM.new()`` 和 ``KWM.from_file()`` 新建或打开已有 KWM 文件。 + ``KWM.new()`` 和 ``KWM.open()`` 新建或打开已有 KWM 文件, + 使用已有 KWM 对象的 ``save()`` 方法将其保存到文件。 - 已有 KWM 对象的 ``self.to_file()`` 方法可用于将对象内数据保存到文件,但目前尚未实现。 - 尝试调用此方法会触发 ``NotImplementedError``。 + Args: + cipher: 要使用的 cipher,必须是一个 libtakiyasha.kwm.kwmdataciphers.Mask32/Mask32FromRecipe 对象 + initial_bytes: 内置缓冲区的初始数据 """ 
- super().__init__(cipher, initial_bytes) - if not isinstance(cipher, Mask32): - raise TypeError('unsupported Cipher: ' - f'supports {Mask32.__module__}.{Mask32.__name__}, ' - f'not {type(cipher).__name__}' - ) + super().__init__(cipher, initial_bytes=initial_bytes) - @classmethod - def new(cls, core_key: BytesLike) -> KWM: - """Create and return a brand-new, empty KWM object. + self._bitrate: int | None = None + self._suffix: str | None = None + + @property + def bitrate(self) -> int | None: + """Bitrate of the audio. Divide by 1000 before using it for display. - The first argument ``core_key`` is required; it is used to restore and decrypt the master key. + Cannot be set to a negative number; if it is not ``None``, the number of digits of its value floor-divided by 1000, plus the length of the suffix ``self.suffix``, cannot exceed 8. """ - core_key = tobytes(core_key) + return self._bitrate - master_key = make_salt(8) - cipher = Mask32(core_key, master_key) + @bitrate.setter + def bitrate(self, value: IntegerLike) -> None: + """Bitrate of the audio. Divide by 1000 before using it for display. - return cls(cipher) + Cannot be set to a negative number; if it is not ``None``, the number of digits of its value floor-divided by 1000, plus the length of the suffix ``self.suffix``, cannot exceed 8. + """ + if value is None: + raise TypeError( + f"None cannot be assigned to attribute 'bitrate'. 
" + f"Use `del self.bitrate` instead" + ) + br = toint(value) + if br < 0: + raise ValueError(f"attribute 'bitrate' must be a non-negative integer, not {value}") + if self._suffix is None: + max_bitrate_len = 8 + else: + max_bitrate_len = 8 - len(self._suffix) + # len(str(...)) also handles bitrates below 1000, where log10(0) would raise ValueError + bitrate_len = len(str(br // 1000)) + if bitrate_len > max_bitrate_len: + raise ValueError(f"attribute 'bitrate' // 1000 must have at most {max_bitrate_len} digits, not {bitrate_len}") + + self._bitrate = br + + @bitrate.deleter + def bitrate(self) -> None: + """Bitrate of the audio. This attribute stores the value multiplied by 1000. + + Cannot be set to a negative number; if it is not ``None``, the number of digits of its value floor-divided by 1000, plus the length of the suffix + ``self.suffix``, cannot exceed 8. + """ + self._bitrate = None + + @property + def suffix(self) -> str | None: + """Suffix that the file corresponding to the encrypted data should use. It is not precise enough, so relying on it is not recommended. + + If it is not None, its length plus the number of digits of the bitrate ``self.bitrate`` floor-divided by 1000 cannot exceed 8. + """ + return self._suffix + + @suffix.setter + def suffix(self, value: str) -> None: + """Suffix that the file corresponding to the encrypted data should use. It is not precise enough, so relying on it is not recommended. + + If it is not None, its length plus the number of digits of the bitrate ``self.bitrate`` floor-divided by 1000 cannot exceed 8. + """ + if value is None: + raise TypeError( + f"None cannot be assigned to attribute 'suffix'. 
" + f"Use `del self.suffix` instead" + ) + if not isinstance(value, str): + raise TypeError(f"attribute 'suffix' must be str, not {type(value).__name__}") + value = str(value) + if self._bitrate is None: + max_suffix_len = 8 + else: + # len(str(...)) also handles bitrates below 1000, where log10(0) would raise ValueError + max_suffix_len = 8 - len(str(self._bitrate // 1000)) + if len(value) > max_suffix_len: + raise ValueError( + f"attribute 'suffix' must be at most {max_suffix_len} characters long, not {len(value)}" + ) + for char in (ord(_) for _ in value): + if char not in ASCII_LETTER_CHARS: + raise ValueError( + f"attribute 'suffix' can only contain ASCII letters, but '{chr(char)}' found" + ) + self._suffix = value + + @suffix.deleter + def suffix(self) -> None: + """Suffix that the file corresponding to the encrypted data should use. It is not precise enough, so relying on it is not recommended. + + If it is not None, its length plus the number of digits of the bitrate ``self.bitrate`` floor-divided by 1000 cannot exceed 8. + """ + self._suffix = None @classmethod def from_file(cls, kwm_filething: FilePath | IO[bytes], /, core_key: BytesLike ): - """Open a KWM file or file object ``kwm_filething``. + """(Deprecated, and will be removed in a later version. Switch to ``KWM.open()`` as soon as possible.) + + Open a KWM file or file object ``kwm_filething``. The first positional argument ``kwm_filething`` can be a file path (``str``, ``bytes``, or any object that has a ``__fspath__()`` method). ``kwm_filething`` @@ -86,46 +241,180 @@ def from_file(cls, The second argument ``core_key`` is required; it is used to restore and decrypt the master key. """ + warnings.warn( + DeprecationWarning( + f'{cls.__name__}.from_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {cls.__name__}.open() instead.' 
+ ) + ) - def operation(fileobj: IO[bytes]) -> cls: - if not fileobj.read(24).startswith(b'yeelion-kuwo-tme'): - raise ValueError(f"{repr(kwm_filething)} is not a KWM file") - - master_key = fileobj.read(8) - cipher = Mask32(core_key, master_key) - - fileobj.seek(1024, 0) - initial_bytes = fileobj.read() - - return cls(cipher, initial_bytes) - - core_key = tobytes(core_key) + return cls.open(kwm_filething, core_key=core_key) - if is_filepath(kwm_filething): - with open(kwm_filething, mode='rb') as kwm_fileobj: - instance = operation(kwm_fileobj) + @classmethod + def open(cls, + filething_or_info: tuple[Path | IO[bytes], KWMFileInfo | None] | FilePath | IO[bytes], /, + core_key: BytesLike = None, + master_key: BytesLike = None + ): + """Open a KWM file and return a ``KWM`` object. + + The first positional argument ``filething_or_info`` must be a file path or a file object. + Accepted file path types are: strings, byte strings, and any object that defines a ``__fspath__()`` method. + If it is a file object, it must be readable and seekable (its ``seekable()`` method returns ``True``). + + ``filething_or_info`` also accepts the return value of the ``probe_kwm()`` function: + a two-element tuple whose first element is the path or file object of the source file, and whose second element is the information about the source file. + + The second argument ``core_key`` is normally required; it is used to decrypt the master key embedded in the file. + Exception: if you provide the third argument ``master_key``, it becomes optional. + + The third argument ``master_key`` is optional; if provided, it is used as the master key, + the master key embedded in the file is ignored, and ``core_key`` is no longer required. + + Args: + filething_or_info: path or file object of the source file, or the return value of probe_kwm() + core_key: core key, used to generate the master key for the encrypted data in the file + master_key: if provided, used as the master key; the master key embedded in the file is ignored + """ + if core_key is not None: + core_key = tobytes(core_key) + if master_key is not None: + master_key = tobytes(master_key) + + def operation(fd: IO[bytes]) -> cls: + fd.seek(1024, 0) + initial_bytes = fd.read() + + if master_key is not None: + cipher = Mask32(master_key) + elif core_key is None: + raise TypeError( + "argument 'core_key' is required to " + "generate the master key" + ) + else: + cipher = Mask32FromRecipe(fileinfo.mask_recipe, core_key) + + inst = cls(cipher, initial_bytes) + inst._bitrate = fileinfo.bitrate + inst._suffix = fileinfo.suffix + + return inst + + if isinstance(filething_or_info, tuple): + filething_or_info: tuple[Path | 
IO[bytes], KWMFileInfo | None] + if len(filething_or_info) != 2: + raise TypeError( + "first argument 'filething_or_info' must be a file path, a file object, " + "or a tuple of probe_kwm() returns" + ) + filething, fileinfo = filething_or_info + else: + filething, fileinfo = probe_kwm(filething_or_info) + + if fileinfo is None: + raise CrypterCreatingError( + f"{repr(filething)} is not a KWM file" + ) + elif not isinstance(fileinfo, KWMFileInfo): + raise TypeError( + f"second element of the tuple must be KWMFileInfo or None, not {type(fileinfo).__name__}" + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + instance = operation(fileobj) + instance._name = Path(filething) else: - kwm_fileobj = verify_fileobj(kwm_filething, 'binary', - verify_readable=True, - verify_seekable=True - ) + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_sourcefile = getattr(fileobj, 'name', None) + instance = operation(fileobj) - instance._name = getattr(kwm_fileobj, 'name', None) + if fileobj_sourcefile is not None: + instance._name = Path(fileobj_sourcefile) return instance - def to_file(self, kwm_filething: FilePath | IO[bytes]) -> None: - """Warning: the structure of KWM files is not yet fully understood, so this method is not implemented; calling it raises - ``NotImplementedError``. The expected parameters and behavior are as follows: + def to_file(self, kwm_filething: FilePath | IO[bytes] = None) -> None: + """(Deprecated and will be removed in a later version. Switch to ``KWM.save()`` as soon as possible.)""" + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.to_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {type(self).__name__}.save() instead.' 
+ ) + ) + + return self.save(kwm_filething) + + def save(self, + filething: FilePath | IO[bytes] = None, + newer_magic_header: bool = False + ) -> None: + """(实验性功能)将当前对象保存为一个新 KWM 文件。 + + 第一个参数 ``filething`` 是可选的,如果提供此参数,需要是一个文件路径或文件对象。 + 可接受的文件路径类型包括:字符串、字节串、任何定义了 ``__fspath__()`` 方法的对象。 + 如果是文件对象,那么必须可读且可寻址(其 ``seekable()`` 方法返回 ``True``)。 + 如果未提供此参数,那么将会尝试使用当前对象的 ``source`` 属性;如果后者也不可用,则引发 + ``TypeError``。 + + 第二个参数 ``newer_magic_header`` 可选,如果为 ``True``,那么保存的文件会使用新版 KWM\ + 文件的文件头 ``b'yeelion-kuwo'``;否则使用 ``b'yeelion-kuwo-tme'``。 + + Args: + filething: 目标文件的路径或文件对象 + newer_magic_header: 是否使用新版 KWM 文件使用的文件头 + """ - 将当前 KWM 对象的内容保存到文件 ``kwm_filething``。 + def operation(fd: IO[bytes]) -> None: + recipe = self.cipher.getkey('original') + if not recipe: + raise CrypterSavingError('cannot store a non-standard recipe into a KWM file') + + fd.seek(0, 0) + if newer_magic_header: + fd.write(b'yeelion-kuwo') + else: + fd.write(b'yeelion-kuwo-tme') + fd.seek(24, 0) + fd.write(recipe) + fd.seek(48, 0) + if self._bitrate is not None: + fd.write(str(self._bitrate // 1000).encode('ascii')) + if self._suffix is not None: + fd.write(self._suffix.encode('ascii')) + fd.seek(1024, 0) + fd.write(self.getvalue(nocryptlayer=True)) + + if filething is None: + if self.source is None: + raise TypeError( + "attribute 'self.source' and argument 'filething' are empty, " + "don't know which file to save to" + ) + filething = self.source + + if isfilepath(filething): + with open(filething, mode='wb') as fileobj: + return operation(fileobj) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_seekable=True, + verify_writable=True + ) + return operation(fileobj) - 第一个位置参数 ``kwm_filething`` 可以是文件路径(``str``、``bytes`` - 或任何拥有方法 ``__fspath__()`` 的对象)。``kwm_filething`` - 也可以是一个文件对象,但必须可写。 + @classmethod + def new(cls, core_key: BytesLike): + """返回一个空 KWM 对象。""" + core_key = tobytes(core_key) - 本方法会首先尝试写入 ``kwm_filething`` 指向的文件。 - 如果未提供 ``kwm_filething``,则会尝试写入 ``self.name`` - 
指向的文件。如果两者都为空或未提供,则会触发 ``CrypterSavingError``。 - """ - raise NotImplementedError('coming soon') + recipe = make_salt(8) + cipher = Mask32FromRecipe(recipe, core_key) + + return cls(cipher) diff --git a/src/libtakiyasha/kwm/kwmdataciphers.py b/src/libtakiyasha/kwm/kwmdataciphers.py index fd07096..f646788 100644 --- a/src/libtakiyasha/kwm/kwmdataciphers.py +++ b/src/libtakiyasha/kwm/kwmdataciphers.py @@ -1,41 +1,74 @@ # -*- coding: utf-8 -*- from __future__ import annotations -from typing import Generator +from typing import Generator, Literal -from ..common import StreamCipherSkel from ..miscutils import bytestrxor +from ..prototypes import KeyStreamBasedStreamCipherSkel from ..typedefs import BytesLike, IntegerLike -from ..typeutils import tobytes, toint_nofloat +from ..typeutils import tobytes, toint -__all__ = ['Mask32'] +__all__ = ['Mask32', 'Mask32FromRecipe'] -class Mask32(StreamCipherSkel): - @property - def core_key(self) -> bytes: - return self._core_key +class Mask32(KeyStreamBasedStreamCipherSkel): + def __init__(self, mask32: BytesLike, /) -> None: + self._mask32 = tobytes(mask32) + if len(self._mask32) != 32: + raise ValueError(f"invalid mask length: should be 32, got {len(self._mask32)}") - @property - def master_key(self) -> bytes: - return self._master_key + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'master': + return self._mask32 - @property - def mask32(self) -> bytes: - return self._mask32 + @classmethod + def cls_keystream(cls, mask32: BytesLike, nbytes: IntegerLike, offset: IntegerLike, /) -> Generator[int, None, None]: + offset = toint(offset) + nbytes = toint(nbytes) + if offset < 0: + raise ValueError("third argument 'offset' must be a non-negative integer") + if nbytes < 0: + raise ValueError("second argument 'nbytes' must be a non-negative integer") + maskblk_data: bytes = tobytes(mask32) + maskblk_len = len(maskblk_data) + if maskblk_len != 32: + raise ValueError(f"invalid mask length: should be 32, not 
{maskblk_len}") + + target_in_maskblk_len = nbytes + target_offset_in_maskblk = offset % maskblk_len + if target_offset_in_maskblk == 0: + target_before_maskblk_area_len = 0 + else: + target_before_maskblk_area_len = maskblk_len - target_offset_in_maskblk + yield from maskblk_data[target_offset_in_maskblk:target_offset_in_maskblk + target_before_maskblk_area_len] + target_in_maskblk_len -= target_before_maskblk_area_len + + target_overrided_whole_maskblk_count = target_in_maskblk_len // maskblk_len + target_after_maskblk_area_len = target_in_maskblk_len % maskblk_len + + for _ in range(target_overrided_whole_maskblk_count): + yield from maskblk_data + yield from maskblk_data[:target_after_maskblk_area_len] - def __init__(self, core_key: BytesLike, master_key=BytesLike, /) -> None: + def keystream(self, + operation: Literal['encrypt', 'decrypt'], + nbytes: IntegerLike, + offset: IntegerLike, / + ) -> Generator[int, None, None]: + yield from self.cls_keystream(self._mask32, nbytes, offset) + + +class Mask32FromRecipe(Mask32): + def __init__(self, recipe: BytesLike, core_key: BytesLike, /) -> None: + recipe = tobytes(recipe) core_key = tobytes(core_key) - master_key = tobytes(master_key) - for varname, var, expectlen in ('core_key', core_key, 32), ('master_key', master_key, 8): + for varname, var, expectlen in ('core_key', core_key, 32), ('recipe', recipe, 8): if len(var) != expectlen: f"invalid length of argument '{varname}': should be {expectlen}, not {len(var)}" - self._core_key = core_key - self._master_key = master_key - - mask_stage1 = str(int.from_bytes(master_key, 'little')) + mask_recipe_unpacked = int.from_bytes(recipe, 'little') + mask_stage1 = str(mask_recipe_unpacked).encode('ascii') if len(mask_stage1) >= 32: mask_stage2 = mask_stage1[:32] else: @@ -48,43 +81,19 @@ def __init__(self, core_key: BytesLike, master_key=BytesLike, /) -> None: for _ in range(mask_stage2_stage1_fullpad_count): mask_stage2_composition.append(mask_stage1) 
mask_stage2_composition.append(mask_stage1[:mask_stage2_remain_len]) - mask_stage2 = ''.join(mask_stage2_composition).encode('utf-8') + mask_stage2 = b''.join(mask_stage2_composition) mask_final = bytestrxor(mask_stage2, core_key) - self._mask32 = mask_final - - @classmethod - def cls_keystream(cls, - offset: IntegerLike, - length: IntegerLike, /, - mask32: BytesLike - ) -> Generator[int, None, None]: - offset = toint_nofloat(offset) - length = toint_nofloat(length) - if offset < 0: - raise ValueError("first argument 'offset' must be a non-negative integer") - if length < 0: - raise ValueError("second argument 'length' must be a non-negative integer") - maskblk_data: bytes = tobytes(mask32) - maskblk_len = len(maskblk_data) - if maskblk_len != 32: - raise ValueError(f"invalid mask length: should be 32, not {maskblk_len}") - target_in_maskblk_len = length - target_offset_in_maskblk = offset % maskblk_len - if target_offset_in_maskblk == 0: - target_before_maskblk_area_len = 0 - else: - target_before_maskblk_area_len = maskblk_len - target_offset_in_maskblk - yield from maskblk_data[target_offset_in_maskblk:target_offset_in_maskblk + target_before_maskblk_area_len] - target_in_maskblk_len -= target_before_maskblk_area_len - - target_overrided_whole_maskblk_count = target_in_maskblk_len // maskblk_len - target_after_maskblk_area_len = target_in_maskblk_len % maskblk_len + self._source_recipe = recipe + self._core_key = core_key - for _ in range(target_overrided_whole_maskblk_count): - yield from maskblk_data - yield from maskblk_data[:target_after_maskblk_area_len] + super().__init__(mask_final) - def keystream(self, offset: IntegerLike, length: IntegerLike, /) -> Generator[int, None, None]: - yield from self.cls_keystream(offset, length, self._mask32) + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'original': + return self._source_recipe + elif keyname == 'core': + return self._core_key + else: + return super().getkey(keyname) diff --git 
a/src/libtakiyasha/miscutils.py b/src/libtakiyasha/miscutils.py index 0a3921f..05abc03 100644 --- a/src/libtakiyasha/miscutils.py +++ b/src/libtakiyasha/miscutils.py @@ -1,16 +1,20 @@ # -*- coding: utf-8 -*- from __future__ import annotations +from pathlib import Path from typing import Iterable, Mapping from .typedefs import BytesLike, KT, T, VT from .typeutils import tobytes __all__ = [ + 'BINARIES_ROOTDIR', 'bytestrxor', 'getattribute' ] +BINARIES_ROOTDIR = Path(__file__).parent / 'binaries' + def getattribute(obj: object, name: str, @@ -75,6 +79,9 @@ def bytestrxor(term1: BytesLike, term2: BytesLike, /) -> bytes: bytestring2 = tobytes(term2) if len(bytestring1) != len(bytestring2): - raise ValueError('only byte strings of equal length can be xored') + raise ValueError( + 'only byte strings of equal length can be xored: ' + f'term 1 ({len(bytestring1)}) != term 2 ({len(bytestring2)})' + ) return bytes(b1 ^ b2 for b1, b2 in zip(bytestring1, bytestring2)) diff --git a/src/libtakiyasha/ncm.py b/src/libtakiyasha/ncm.py index a6e0135..53b1afb 100644 --- a/src/libtakiyasha/ncm.py +++ b/src/libtakiyasha/ncm.py @@ -5,40 +5,37 @@ import warnings from base64 import b64decode, b64encode from dataclasses import asdict, dataclass, field as dcfield +from pathlib import Path from secrets import token_bytes -from typing import Any, IO, Iterable, Mapping, TypedDict +from typing import Callable, IO, Literal, NamedTuple, Type -from .common import CryptLayerWrappedIOSkel -from .exceptions import CrypterCreatingError, CrypterSavingError +from mutagen import flac, id3 + +from .exceptions import CrypterCreatingError from .keyutils import make_random_ascii_string, make_random_number_string -from .miscutils import bytestrxor +from .miscutils import BINARIES_ROOTDIR, bytestrxor +from .prototypes import EncryptedBytesIOSkel from .stdciphers import ARC4, StreamedAESWithModeECB from .typedefs import BytesLike, FilePath -from .typeutils import is_filepath, tobytes, verify_fileobj +from 
.typeutils import isfilepath, tobytes, verify_fileobj from .warns import CrypterCreatingWarning -__all__ = ['CloudMusicIdentifier', 'NCM'] +warnings.filterwarnings(action='default', category=DeprecationWarning, module=__name__) -_TAG_KEY = b'\x23\x31\x34\x6c\x6a\x6b\x5f\x21\x5c\x5d\x26\x30\x55\x3c\x27\x28' +__all__ = ['CloudMusicIdentifier', 'NCM', 'probe_ncm', 'NCMFileInfo'] -MutagenStyleDict = TypedDict( - 'MutagenStyleDict', { - 'TIT2' : list, - 'TPE1' : list, - 'TALB' : list, - 'TXXX::comment': list, - 'title' : list, - 'artist' : list, - 'album' : list, - 'comment' : list - }, - total=False -) +MODULE_BINARIES_ROOTDIR = BINARIES_ROOTDIR / Path(__file__).stem -@dataclass +@dataclass(init=True) class CloudMusicIdentifier: """解析、储存和重建网易云音乐 163key 。""" + + def __post_init__(self) -> None: + self._orig_ncm_tag: dict | None = None + self._orig_ncm_163key: bytes | None = None + self._orig_tag_key: bytes | None = None + format: str = '' musicId: str = '' musicName: str = '' @@ -56,50 +53,189 @@ class CloudMusicIdentifier: alias: list[str] = dcfield(default_factory=list) transNames: list[str] = dcfield(default_factory=list) + def to_mutagen_tag(self, + tag_type: Literal['FLAC', 'ID3'] = None, + with_ncm_163key: bool = True, + tag_key: BytesLike | None = None, + return_cached_first: bytes = True + ) -> flac.FLAC | id3.ID3: + """将 CloudMusicIdentifier 对象导出为 Mutagen 库使用的标签格式实例: + ``mutagen.flac.FLAC`` 和 ``mutagen.id3.ID3``。 + + ``tag_type`` 用于选择需要导出为何种格式的标签实例,仅支持 ``FLAC`` + 和 ``ID3``。如果留空,则根据 ``self.format`` 决定。如果两者都为空,则会触发 ``ValueError``。 + + Args: + tag_type: 需要导出为何种格式的标签实例,仅支持 'FLAC' 和 'ID3' + with_ncm_163key: 是否在导出的标签中嵌入 163key + tag_key: (仅当 with_163key=True)歌曲信息密钥,用于加密 163key,以便将其写入注释 + return_cached_first: (仅当 with_163key=True)在满足特定条件时,将缓存的 163key 写入注释,而不是重新生成一个 + + Examples: + >>> from mutagen import flac, mp3 + [...] 
+ >>> ncm_tag1: CloudMusicIdentifier + >>> ncm_tag1.format + 'flac' + >>> mutagen_tag1 = ncm_tag1.to_mutagen_tag() + >>> type(mutagen_tag1) + mutagen.flac.FLAC + >>> flactag = flac.FLAC('test.flac') + >>> flactag.update(mutagen_tag1) + >>> flactag.save() + >>> + [...] + >>> ncm_tag2: CloudMusicIdentifier + >>> ncm_tag2.format + 'mp3' + >>> mutagen_tag2 = ncm_tag2.to_mutagen_tag() + >>> type(mutagen_tag2) + mutagen.id3.ID3 + >>> mp3tag = mp3.MP3('test.mp3') + >>> mp3tag.update(mutagen_tag2) + >>> mp3tag.save() + >>> + """ + if tag_type is None: + if self.format.lower() == 'flac': + tag_type = 'FLAC' + elif self.format.lower() == 'mp3': + tag_type = 'ID3' + elif not self.format: + raise ValueError( + "don't know which type of tag is needed: " + "self.format and 'tag_type' are empty" + ) + else: + raise ValueError( + "don't know which type of tag is needed: " + "'tag_type' is empty, and the value of self.format is not supported" + ) + if tag_type == 'FLAC': + with open(MODULE_BINARIES_ROOTDIR / 'empty.flac', mode='rb') as _f: + # 受 mutagen 功能限制,编辑 FLAC 标签之前必须打开一个空 FLAC 文件 + tag: flac.FLAC | id3.ID3 = flac.FLAC(_f) + keymaps = { + 'musicName': ('title', lambda _: [_]), + 'artist' : ('artist', lambda _: list(str(list(__)[0]) for __ in list(_))), + 'album' : ('album', lambda _: [_]) + } + elif tag_type == 'ID3': + tag: flac.FLAC | id3.ID3 = id3.ID3() + keymaps = { + 'musicName': ('TIT2', lambda _: id3.TIT2(text=[_], encoding=3)), + 'artist' : ('TPE1', lambda _: id3.TPE1(text=list(str(list(__)[0]) for __ in list(_)), encoding=3)), + 'album' : ('TALB', lambda _: id3.TALB(text=[_], encoding=3)) + } + else: + raise ValueError( + f"'tag_type' must be 'FLAC', 'ID3', or None, not {repr(tag_type)}" + ) + + tagkey_constructor: tuple[str, Callable[[str], list[str]] | Callable[[str], id3.Frame]] + for attrname, tagkey_constructor in keymaps.items(): + tagkey, constructor = tagkey_constructor + attr = getattr(self, attrname) + if attr: + tag[tagkey] = constructor(attr) + + if 
with_ncm_163key: + ncm_163key = self.to_ncm_163key(tag_key=tag_key, + return_cached_first=return_cached_first + ) + if isinstance(tag, flac.FLAC): + tag['description'] = [ncm_163key.decode('ascii')] + elif isinstance(tag, id3.ID3): + tag['TXXX::comment'] = id3.TXXX(encoding=3, desc='comment', text=[ncm_163key.decode('ascii')]) + + return tag + @classmethod - def from_ncm_163key(cls, - ncm_163key_maybe_xored: BytesLike, /, - is_xored: bool = False - ) -> CloudMusicIdentifier: - """解析 163key,返回一个储存解析结果的 ``CloudMusicIdentifier`` 对象。 + def from_ncm_163key(cls, ncm_163key: str | BytesLike, /, tag_key: BytesLike = None): + """将一个 163key 字符串/字节对象转换为 CloudMusicIdentifier 对象。 - 第一个位置参数 ``ncm_163key_maybe_xored`` 是需要解析的 163key 字节串。 + 本方法会缓存给定的 163key,以及该 163key 解密后的结果,用于确保 ``self.to_ncm_163key()`` + 返回值的一致性。 - 如果 ``is_xored=True``,那么本方法会在解析之前,将 ``ncm_163key_maybe_xored`` - 的每一个字节都与 ``0x63`` 进行 XOR。一般情况下不需要提供此参数。 + Args: + ncm_163key: 以“163key”开头的字符串/字节对象 + tag_key: 歌曲信息密钥,用于解密 163key """ - ncm_163key_maybe_xored = tobytes(ncm_163key_maybe_xored) - - if is_xored: - ncm_163key = bytestrxor(b'c' * len(ncm_163key_maybe_xored), - ncm_163key_maybe_xored - ) + if isinstance(ncm_163key, str): + ncm_163key = bytes(ncm_163key, encoding='utf-8') else: - ncm_163key = ncm_163key_maybe_xored + ncm_163key = tobytes(ncm_163key) + if tag_key is None: + tag_key = b'\x23\x31\x34\x6c\x6a\x6b\x5f\x21\x5c\x5d\x26\x30\x55\x3c\x27\x28' + else: + tag_key = tobytes(tag_key) + + ncm_tag_serialized_encrypted_encoded = ncm_163key[22:] # 去除开头的 b"163 key(Don't modify):" + ncm_tag_serialized_encrypted = b64decode(ncm_tag_serialized_encrypted_encoded, validate=True) + ncm_tag_serialized = StreamedAESWithModeECB(tag_key).decrypt( + ncm_tag_serialized_encrypted + )[6:] # 去除字节串开头的 b'music:' + ncm_tag = json.loads(ncm_tag_serialized) + + instance = cls(**ncm_tag) + instance._orig_ncm_tag = ncm_tag + instance._orig_ncm_163key = ncm_163key + instance._orig_tag_key = tag_key + return instance - 
ncm_tag_bytestr_encrypted_encoded = ncm_163key[22:] # strip the leading b"163 key(Don't modify):" - ncm_tag_bytestr_encrypted = b64decode(ncm_tag_bytestr_encrypted_encoded, validate=True) - ncm_tag_bytestr = StreamedAESWithModeECB(_TAG_KEY).decrypt(ncm_tag_bytestr_encrypted)[6:] # strip the leading b'music:' from the byte string + def to_ncm_163key(self, + tag_key: BytesLike = None, + return_cached_first: bool = True + ) -> bytes: + """Export the CloudMusicIdentifier object as a 163key. - return cls(**json.loads(ncm_tag_bytestr)) + The first argument ``tag_key`` is used to encrypt the 163key. If omitted, the default value is used: + ``b'\x23\x31\x34\x6c\x6a\x6b\x5f\x21\x5c\x5d\x26\x30\x55\x3c\x27\x28'`` - def to_ncm_163key(self, with_xor: bool = False) -> bytes: - """Rebuild and return a 163key from the parsed result stored in the current object. + If the second argument ``return_cached_first`` is ``True``, + this method directly returns the 163key cached by ``self.from_ncm_163key()`` when the dict converted from the current object (the current dict) + and the dict decrypted from the cached 163key (the cached dict) satisfy the following conditions: - If ``with_xor=True``, then before returning the result, this method - XORs every byte of the resulting byte string with ``0x63``. This argument is normally not needed. + - The current dict contains every field of the cached dict, and the values of those keys are identical in both dicts + - Fields of the current dict that are absent from the cached dict hold their default (empty) values + + If any of the conditions above is not met, a 163key regenerated from the current object is returned instead. + Args: + tag_key: song information key, used to encrypt the 163key + return_cached_first: return the cached 163key when certain conditions are met, instead of regenerating one """ - ncm_tag_bytestr = json.dumps(asdict(self), ensure_ascii=False).encode('utf-8') - ncm_tag_bytestr_encrypted = StreamedAESWithModeECB(_TAG_KEY).encrypt(b'music:' + ncm_tag_bytestr) - target = b"163 key(Don't modify):" + b64encode(ncm_tag_bytestr_encrypted) + if tag_key is None: + tag_key = b'\x23\x31\x34\x6c\x6a\x6b\x5f\x21\x5c\x5d\x26\x30\x55\x3c\x27\x28' + else: + tag_key = tobytes(tag_key) - if with_xor: - return bytestrxor(b'c' * len(target), target) + if return_cached_first and self._orig_ncm_tag: + target_ncm_tag = {_ck: _cv for _ck, _cv in asdict(self).items() if _cv or _ck in self._orig_ncm_tag} else: - return target + target_ncm_tag = asdict(self) + + def operation() -> bytes: + ncm_tag_serialized = json.dumps(target_ncm_tag, ensure_ascii=False).encode('utf-8') + ncm_tag_serialized_encrypted = 
StreamedAESWithModeECB(tag_key).encrypt(b'music:' + ncm_tag_serialized) + ncm_163key = b"163 key(Don't modify):" + b64encode(ncm_tag_serialized_encrypted) + + return ncm_163key + + if return_cached_first and tag_key == self._orig_tag_key and self._orig_ncm_tag and self._orig_ncm_163key: + if len(target_ncm_tag) == len(self._orig_ncm_tag): + for k, v in self._orig_ncm_tag.items(): + if target_ncm_tag[k] != v: + break + else: + return tobytes(self._orig_ncm_163key) + + return operation() - def to_mutagen_style_dict(self) -> MutagenStyleDict: - """根据当前对象储存的解析结果,构建并返回一个 Mutagen VorbisComment/ID3 风格的字典。 + def to_mutagen_style_dict(self): + """(已弃用,且将会在后续版本中删除。请尽快使用 + ``CloudMusicIdentifier.to_mutagen_tag()`` 代替,以便极大简化步骤。) + + 根据当前对象储存的解析结果,构建并返回一个 Mutagen VorbisComment/ID3 风格的字典。 此方法需要当前对象的 ``format`` 属性来决定构建何种风格的字典, 并且只支持 ``'flac'`` (VorbisComment) 和 ``'mp3'`` (ID3)。 @@ -134,7 +270,15 @@ def to_mutagen_style_dict(self) -> MutagenStyleDict: >>> mutagen_mp3.save() >>> """ - comment = self.to_ncm_163key(with_xor=False).decode('utf-8') + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.to_mutagen_style_dict() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {type(self).__name__}.to_mutagen_tag() instead.' 
+ ) + ) + + comment = self.to_ncm_163key().decode('utf-8') if not isinstance(self.format, str): raise TypeError(f"'self.format' must be str, not {type(self.format)}") elif self.format.lower() == 'flac': @@ -159,7 +303,87 @@ def to_mutagen_style_dict(self) -> MutagenStyleDict: return ret -class NCM(CryptLayerWrappedIOSkel): +class NCMFileInfo(NamedTuple): + """用于储存 NCM 文件的信息。""" + master_key_encrypted: bytes + ncm_163key: bytes + cipher_ctor: Callable[[...], ARC4] + cipher_data_offset: int + cipher_data_len: int + cover_data_offset: int + cover_data_len: int + + +def probe_ncm(filething: FilePath | IO[bytes], /) -> tuple[Path | IO[bytes], NCMFileInfo | None]: + """探测源文件 ``filething`` 是否为一个 NCM 文件。 + + 返回一个 2 个元素长度的元组:第一个元素为 ``filething``;如果 + ``filething`` 是 NCM 文件,那么第二个元素为一个 ``NCMFileInfo`` 对象;否则为 ``None``。 + + 本方法的返回值可以用于 ``NCM.open()`` 的第一个位置参数。 + + Args: + filething: 源文件的路径或文件对象 + Returns: + 一个 2 个元素长度的元组:第一个元素为 filething;如果 + filething 是 NCM 文件,那么第二个元素为一个 NCMFileInfo 对象;否则为 None。 + """ + + def operation(fd: IO[bytes]) -> NCMFileInfo | None: + fd.seek(0, 0) + + if not fd.read(10).startswith(b'CTENFDAM'): + return + + master_key_encrypted_xored_len = int.from_bytes(fd.read(4), 'little') + master_key_encrypted_xored = fd.read(master_key_encrypted_xored_len) + master_key_encrypted = bytestrxor(b'd' * master_key_encrypted_xored_len, + master_key_encrypted_xored + ) + + ncm_163key_xored_len = int.from_bytes(fd.read(4), 'little') + ncm_163key_xored = fd.read(ncm_163key_xored_len) + ncm_163key = bytestrxor(b'c' * ncm_163key_xored_len, ncm_163key_xored) + + fd.seek(5, 1) + + cover_space_len = int.from_bytes(fd.read(4), 'little') + cover_data_len = int.from_bytes(fd.read(4), 'little') + if cover_space_len - cover_data_len < 0: + raise CrypterCreatingError(f'file structure error: ' + f'cover space length ({cover_space_len}) ' + f'< cover data length ({cover_data_len})' + ) + cover_data_offset = fd.tell() + cipher_data_offset = fd.seek(cover_space_len, 1) + cipher_data_len 
= fd.seek(0, 2) - cipher_data_offset + + return NCMFileInfo( + master_key_encrypted=master_key_encrypted, + ncm_163key=ncm_163key, + cipher_ctor=ARC4, + cipher_data_offset=cipher_data_offset, + cipher_data_len=cipher_data_len, + cover_data_offset=cover_data_offset, + cover_data_len=cover_data_len + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + return Path(filething), operation(fileobj) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_origpos = fileobj.tell() + prs = operation(fileobj) + fileobj.seek(fileobj_origpos, 0) + + return fileobj, prs + + +class NCM(EncryptedBytesIOSkel): """基于 BytesIO 的 NCM 透明加密二进制流。 所有读写相关方法都会经过透明加密层处理: @@ -169,103 +393,18 @@ class NCM(CryptLayerWrappedIOSkel): 可绕过透明加密层,访问缓冲区内的原始加密数据。 如果你要新建一个 NCM 对象,不要直接调用 ``__init__()``,而是使用构造器方法 - ``NCM.new()`` 和 ``NCM.from_file()`` 新建或打开已有 NCM 文件, - 使用已有 NCM 对象的 ``self.to_file()`` 方法将其保存到文件。 + ``NCM.new()`` 和 ``NCM.open()`` 新建或打开已有 NCM 文件, + 使用已有 NCM 对象的 ``save()`` 方法将其保存到文件。 """ - @property - def cipher(self) -> ARC4: - return self._cipher - - @property - def master_key(self) -> bytes: - return self.cipher.master_key - - @property - def core_key(self) -> bytes: - return self._core_key - - @core_key.setter - def core_key(self, value: BytesLike) -> None: - self._core_key = tobytes(value) - - @core_key.deleter - def core_key(self) -> None: - self._core_key = None - - @property - def ncm_tag(self) -> CloudMusicIdentifier: - return self._ncm_tag - - @property - def cover_data(self) -> bytes: - return self._cover_data - - @cover_data.setter - def cover_data(self, value: BytesLike) -> None: - self._cover_data = tobytes(value) - - @cover_data.deleter - def cover_data(self) -> None: - self._cover_data = b'' - - def __init__(self, - cipher: ARC4, /, - initial_bytes: BytesLike = b'', - core_key: BytesLike = None, *, - ncm_tag: CloudMusicIdentifier | Mapping[str, Any] | Iterable[tuple[str, Any]] = None, - 
cover_data: BytesLike = b'' - ) -> None: - """基于 BytesIO 的 NCM 透明加密二进制流。 - - 所有读写相关方法都会经过透明加密层处理: - 读取时,返回解密后的数据;写入时,向缓冲区写入加密后的数据。 - - 调用读写相关方法时,附加参数 ``nocryptlayer=True`` - 可绕过透明加密层,访问缓冲区内的原始加密数据。 - - 如果你要新建一个 NCM 对象,不要直接调用 ``__init__()``,而是使用构造器方法 - ``NCM.new()`` 和 ``NCM.from_file()`` 新建或打开已有 NCM 文件, - 使用已有 NCM 对象的 ``self.to_file()`` 方法将其保存到文件。 - """ - if core_key is None: - self._core_key = None - else: - self._core_key = tobytes(core_key) - if ncm_tag is None: - ncm_tag = CloudMusicIdentifier() - elif not isinstance(ncm_tag, CloudMusicIdentifier): - ncm_tag = CloudMusicIdentifier(**ncm_tag) - self._ncm_tag: CloudMusicIdentifier = ncm_tag - self._cover_data: bytes = tobytes(cover_data) - super().__init__(cipher, initial_bytes) - if not isinstance(self._cipher, ARC4): - raise TypeError(f"'{type(self).__name__}' " - f"only support cipher '{ARC4.__module__}.{ARC4.__name__}', " - f"got '{type(self._cipher).__name__}'" - ) - - @classmethod - def new(cls, - core_key: BytesLike = None, *, - ncm_tag: CloudMusicIdentifier | Mapping[str, Any] | Iterable[tuple[str, Any]] = None, - cover_data: BytesLike = b'' - ) -> NCM: - """创建一个空的 NCM 对象。""" - master_key = (make_random_number_string(29) + make_random_ascii_string(84)).encode('utf-8') - - return cls(ARC4(master_key), - core_key=core_key, - ncm_tag=ncm_tag, - cover_data=cover_data - ) - @classmethod def from_file(cls, ncm_filething: FilePath | IO[bytes], /, core_key: BytesLike, - ) -> NCM: - """打开一个已有的 NCM 文件 ``ncm_filething``。 + ): + """(已弃用,且将会在后续版本中删除。请尽快使用 ``NCM.open()`` 代替。) + + 打开一个已有的 NCM 文件 ``ncm_filething``。 第一个位置参数 ``ncm_filething`` 可以是 ``str``、``bytes`` 或任何拥有 ``__fspath__`` 属性的路径对象。``ncm_filething`` 也可以是文件对象,该对象必须可读和可跳转 @@ -275,61 +414,131 @@ def from_file(cls, 核心密钥 ``core_key`` 是第二个参数,用于解密找到的主密钥。 """ + warnings.warn( + DeprecationWarning( + f'{cls.__name__}.from_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {cls.__name__}.open() instead.' 
+ ) + ) + return cls.open(ncm_filething, core_key=core_key) - def operation(fileobj: IO[bytes]) -> NCM: - if not fileobj.read(10).startswith(b'CTENFDAM'): - raise ValueError(f"{fileobj} is not a NCM file") - - master_key_encrypted_xored_len = int.from_bytes(fileobj.read(4), 'little') - master_key_encrypted_xored = fileobj.read(master_key_encrypted_xored_len) - master_key_encrypted = bytestrxor(b'd' * master_key_encrypted_xored_len, - master_key_encrypted_xored - ) - master_key = StreamedAESWithModeECB(core_key).decrypt(master_key_encrypted)[17:] # 去除开头的 b'neteasecloudmusic' - cipher = ARC4(master_key) + @classmethod + def open(cls, + filething_or_info: tuple[Path | IO[bytes], NCMFileInfo | None] | FilePath | IO[bytes], /, + core_key: BytesLike = None, + tag_key: BytesLike = None, + master_key: BytesLike = None + ): + """打开一个 NCM 文件,并返回一个 ``NCM`` 对象。 + + 第一个位置参数 ``filething_or_info`` 需要是一个文件路径或文件对象。 + 可接受的文件路径类型包括:字符串、字节串、任何定义了 ``__fspath__()`` 方法的对象。 + 如果是文件对象,那么必须可读且可寻址(其 ``seekable()`` 方法返回 ``True``)。 + + ``filething_or_info`` 也可以接受 ``probe_ncm()`` 函数的返回值: + 一个包含两个元素的元组,第一个元素是源文件的路径或文件对象,第二个元素是源文件的信息。 + + 第二个参数 ``core_key`` 一般情况下是必需的,用于解密文件内嵌的主密钥。 + 例外:如果你提供了第四个参数 ``master_key``,那么它是可选的。 + + 第三个参数 ``tag_key`` 可选,用于解密文件内嵌的歌曲信息。如果留空,则使用默认值: + ``b'\x23\x31\x34\x6c\x6a\x6b\x5f\x21\x5c\x5d\x26\x30\x55\x3c\x27\x28'`` + + 第四个参数 ``master_key`` 可选,如果提供,将会被作为主密钥使用, + 而文件内置的主密钥会被忽略,``core_key`` 也不再是必需参数。 + 一般不需要填写此参数,因为 NCM 文件总是内嵌加密的主密钥,从而可以轻松地获得。 + + Args: + filething_or_info: 源文件的路径或文件对象,或者 probe_ncm() 的返回值 + core_key: 核心密钥,用于解密文件内嵌的主密钥 + tag_key: 歌曲信息密钥,用于解密文件内嵌的歌曲信息 + master_key: 如果提供,将会被作为主密钥使用,而文件内置的主密钥会被忽略 + Raises: + TypeError: 参数 core_key 和 master_key 都未提供 + """ + if core_key is not None: + core_key = tobytes(core_key) + if tag_key is not None: + tag_key = tobytes(tag_key) + if master_key is not None: + master_key = tobytes(master_key) + if master_key is None and core_key is None: + raise TypeError( + f"{cls.__name__}.open() missing 1 argument: 'core_key'" + ) + + 
def operation(fd: IO[bytes]) -> cls: + if master_key is None: + target_master_key = StreamedAESWithModeECB(core_key).decrypt( + fileinfo.master_key_encrypted + )[17:] # 去除开头的 b'neteasecloudmusic' + else: + target_master_key = master_key + cipher = fileinfo.cipher_ctor(target_master_key) - ncm_163key_xored_len = int.from_bytes(fileobj.read(4), 'little') - ncm_163key_xored = fileobj.read(ncm_163key_xored_len) try: - ncm_tag = CloudMusicIdentifier.from_ncm_163key(ncm_163key_xored, is_xored=True) + ncm_tag = CloudMusicIdentifier.from_ncm_163key( + fileinfo.ncm_163key, + tag_key=tag_key + ) except Exception as exc: warnings.warn(f'skip parsing 163key, because an exception was raised while parsing: ' f'{type(exc).__name__}: {exc}', CrypterCreatingWarning ) - warnings.warn(f"you may need to check if the file {repr(ncm_filething)} " + warnings.warn(f"you may need to check if the file {repr(filething)} " f"is corrupted.", CrypterCreatingWarning ) - ncm_tag = None + ncm_tag = CloudMusicIdentifier() - fileobj.seek(5, 1) + fd.seek(fileinfo.cover_data_offset, 0) + cover_data = fd.read(fileinfo.cover_data_len) - cover_space_len = int.from_bytes(fileobj.read(4), 'little') - cover_data_len = int.from_bytes(fileobj.read(4), 'little') - if cover_space_len - cover_data_len < 0: - raise CrypterCreatingError(f'file structure error: ' - f'cover space length ({cover_space_len}) ' - f'< cover data length ({cover_data_len})' - ) - cover_data = fileobj.read(cover_data_len) - fileobj.seek(cover_space_len - cover_data_len, 1) + fd.seek(fileinfo.cipher_data_offset, 0) + initial_bytes = fd.read(fileinfo.cipher_data_len) - audio_encrypted = fileobj.read() + inst = cls(cipher, initial_bytes) + inst._cover_data = cover_data + inst._ncm_tag = ncm_tag - return cls(cipher, audio_encrypted, ncm_tag=ncm_tag, cover_data=cover_data, core_key=core_key) + return inst - if is_filepath(ncm_filething): - with open(ncm_filething, mode='rb') as ncm_fileobj: - instance = operation(ncm_fileobj) + if 
isinstance(filething_or_info, tuple): + filething_or_info: tuple[Path | IO[bytes], NCMFileInfo | None] + if len(filething_or_info) != 2: + raise TypeError( + "first argument 'filething_or_info' must be a file path, a file object, " + "or a tuple of probe_ncm() returns" + ) + filething, fileinfo = filething_or_info else: - ncm_fileobj = verify_fileobj(ncm_filething, 'binary', - verify_readable=True, - verify_seekable=True - ) - instance = operation(ncm_fileobj) + filething, fileinfo = probe_ncm(filething_or_info) + + if fileinfo is None: + raise CrypterCreatingError( + f"{repr(filething)} is not an NCM file" + ) + elif not isinstance(fileinfo, NCMFileInfo): + raise TypeError( + f"second element of the tuple must be NCMFileInfo or None, not {type(fileinfo).__name__}" + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + instance = operation(fileobj) + instance._name = Path(filething) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_sourcefile = getattr(fileobj, 'name', None) + instance = operation(fileobj) - instance._name = getattr(ncm_fileobj, 'name', None) + if fileobj_sourcefile is not None: + instance._name = Path(fileobj_sourcefile) return instance @@ -337,7 +546,9 @@ def to_file(self, ncm_filething: FilePath | IO[bytes] = None, /, core_key: BytesLike = None ) -> None: - """Save the current NCM object to the file ``filething``. + """(Deprecated and will be removed in a later version. Switch to ``NCM.save()`` as soon as possible.) + + Save the current NCM object to the file ``filething``. This process writes the NCM file structure to ``ncm_filething``. The first positional argument ``ncm_filething`` can be ``str``, ``bytes``, or any path object that has the ``__fspath__`` @@ -352,47 +563,211 @@ def to_file(self, file object and writes the data to it. If both are empty or not provided, ``ValueError`` is raised. """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.to_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {type(self).__name__}.save() instead.' 
+ ) + ) + if not core_key: + core_key = self.core_key + return self.save(core_key, filething=ncm_filething) + + def save(self, + core_key: BytesLike, + filething: FilePath | IO[bytes] = None, + tag_key: BytesLike | None = None + ) -> None: + """将当前对象保存为一个新 NCM 文件。 + + 第一个参数 ``core_key`` 是必需的,用于加密主密钥,以便嵌入到文件。 + + 第二个参数 ``filething`` 是可选的,如果提供此参数,需要是一个文件路径或文件对象。 + 可接受的文件路径类型包括:字符串、字节串、任何定义了 ``__fspath__()`` 方法的对象。 + 如果是文件对象,那么必须可读且可寻址(其 ``seekable()`` 方法返回 ``True``)。 + 如果未提供此参数,那么将会尝试使用当前对象的 ``source`` 属性;如果后者也不可用,则引发 + ``TypeError``。 + + 第三个参数 ``tag_key`` 可选,用于加密歌曲信息,以便嵌入到文件。如果留空,则使用默认值: + ``b'\x23\x31\x34\x6c\x6a\x6b\x5f\x21\x5c\x5d\x26\x30\x55\x3c\x27\x28'`` + + Args: + core_key: 核心密钥,用于加密主密钥,以便嵌入到文件 + filething: 目标文件的路径或文件对象 + tag_key: 歌曲信息密钥,用于加密歌曲信息,以便嵌入到文件 + """ + core_key = tobytes(core_key) + if tag_key is not None: + tag_key = tobytes(tag_key) - def operation(fileobj: IO[bytes]) -> None: - fileobj.write(b'CTENFDAM') - fileobj.write(token_bytes(2)) + def operation(fd: IO[bytes]) -> None: + fd.seek(0, 0) - master_key_encrypted = StreamedAESWithModeECB(core_key).encrypt(b'neteasecloudmusic' + self.cipher.master_key) + fd.write(b'CTENFDAM') + fd.seek(2, 1) + + master_key = self.master_key + master_key_encrypted = StreamedAESWithModeECB(core_key).encrypt(b'neteasecloudmusic' + master_key) master_key_encrypted_xored = bytestrxor(b'd' * len(master_key_encrypted), master_key_encrypted) - master_key_encrypted_xored_len = len(master_key_encrypted_xored).to_bytes(4, 'little') - fileobj.write(master_key_encrypted_xored_len) - fileobj.write(master_key_encrypted_xored) - - ncm_163key_xored = self.ncm_tag.to_ncm_163key(with_xor=True) - ncm_163key_xored_len = len(ncm_163key_xored).to_bytes(4, 'little') - fileobj.write(ncm_163key_xored_len) - fileobj.write(ncm_163key_xored) - - fileobj.write(token_bytes(5)) - - cover_space_len = len(self.cover_data).to_bytes(4, 'little') - cover_data_len = cover_space_len - fileobj.write(cover_space_len) - fileobj.write(cover_data_len) - 
fileobj.write(self.cover_data) - - fileobj.write(self.getvalue(nocryptlayer=True)) - - if core_key is None: - if self.core_key is None: - raise CrypterSavingError('core key missing: ' - "argument 'core_key' and attribute 'self.core_key' " - "are None or unspecified" - ) - core_key = self.core_key + master_key_encrypted_xored_len = len(master_key_encrypted_xored) + fd.write(master_key_encrypted_xored_len.to_bytes(4, 'little')) + fd.write(master_key_encrypted_xored) + + ncm_163key = self.ncm_tag.to_ncm_163key(tag_key) + ncm_163key_xored = bytestrxor(b'c' * len(ncm_163key), ncm_163key) + ncm_163key_xored_len = len(ncm_163key_xored) + fd.write(ncm_163key_xored_len.to_bytes(4, 'little')) + fd.write(ncm_163key_xored) + + fd.write(token_bytes(5)) + + cover_data = self.cover_data if self.cover_data else b'' + cover_data_len = len(cover_data) + fd.write(cover_data_len.to_bytes(4, 'little')) # cover_space length + fd.write(cover_data_len.to_bytes(4, 'little')) # cover_data length + fd.write(cover_data) + + fd.write(self.getvalue(nocryptlayer=True)) + + if filething is None: + if self.source is None: + raise TypeError( + "attribute 'self.source' and argument 'filething' are empty, " + "don't know which file to save to" + ) + filething = self.source + + if isfilepath(filething): + with open(filething, mode='wb') as fileobj: + return operation(fileobj) else: - core_key = tobytes(core_key) + fileobj = verify_fileobj(filething, 'binary', + verify_seekable=True, + verify_writable=True + ) + return operation(fileobj) - if is_filepath(ncm_filething): - with open(ncm_filething, mode='wb') as ncm_fileobj: - operation(ncm_fileobj) - else: - ncm_fileobj = verify_fileobj(ncm_filething, 'binary', - verify_writable=True - ) - operation(ncm_fileobj) + @classmethod + def new(cls): + """返回一个空 NCM 对象。""" + master_key = (make_random_number_string(29) + make_random_ascii_string(84)).encode('utf-8') + + return cls(ARC4(master_key)) + + @property + def acceptable_ciphers(self) -> list[Type[ARC4]]: 
+ return [ARC4] + + def __init__(self, cipher: ARC4, /, initial_bytes=b''): + """A BytesIO-based NCM binary stream with transparent encryption. + + All read and write methods go through the transparent encryption layer: + reads return decrypted data; writes store encrypted data in the buffer. + + Passing the extra argument ``nocryptlayer=True`` to a read or write method + bypasses the transparent encryption layer and accesses the raw encrypted data in the buffer. + + To create an NCM object, do not call ``__init__()`` directly; use the constructor methods + ``NCM.new()`` and ``NCM.open()`` to create a new one or open an existing NCM file, + and use the ``save()`` method of an existing NCM object to save it to a file. + + Args: + cipher: the cipher to use; must be a libtakiyasha.stdciphers.ARC4 object + initial_bytes: initial data for the internal buffer + """ + super().__init__(cipher, initial_bytes=initial_bytes) + + self._cover_data: bytes | None = None + self._ncm_tag: CloudMusicIdentifier = CloudMusicIdentifier() + self._sourcefile: Path | None = None + self._core_key_deprecated: bytes | None = None + + @property + def core_key(self) -> bytes | None: + """(Deprecated, and will be removed in a later version.) + + The core key, used to encrypt/decrypt the master key. + + ``NCM.from_file()`` sets this attribute when the object is created; ``NCM.open()`` does not. + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.core_key is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage the core key yourself.' + ) + ) + return self._core_key_deprecated + + @core_key.setter + def core_key(self, value: BytesLike) -> None: + """(Deprecated, and will be removed in a later version.) + + The core key, used to encrypt/decrypt the master key. + + ``NCM.from_file()`` sets this attribute when the object is created; ``NCM.open()`` does not. + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.core_key is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage the core key yourself.' + ) + ) + if value is None: + raise TypeError( + f"None cannot be assigned to attribute 'core_key'. 
" + f"Use `del self.core_key` instead" + ) + self._core_key_deprecated = tobytes(value) + + @core_key.deleter + def core_key(self) -> None: + """(已弃用,且将会在后续版本中删除。) + + 核心密钥,用于加/解密主密钥。 + + ``NCM.from_file()`` 会在当前对象被创建时设置此属性;而 ``NCM.open()`` 则不会。 + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.core_key is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage the core key by your self.' + ) + ) + self._core_key_deprecated = None + + @property + def cover_data(self) -> bytes | None: + """封面图像数据。""" + return self._cover_data + + @cover_data.setter + def cover_data(self, value: BytesLike) -> None: + """封面图像数据。""" + if value is None: + raise TypeError( + f"None cannot be assigned to attribute 'cover_data'. " + f"Use `del self.cover_data` instead" + ) + self._cover_data = tobytes(value) + + @cover_data.deleter + def cover_data(self) -> None: + """封面图像数据。""" + self._cover_data = None + + @property + def ncm_tag(self) -> CloudMusicIdentifier: + """163key 的解析结果。""" + return self._ncm_tag + + @ncm_tag.setter + def ncm_tag(self, value: CloudMusicIdentifier) -> None: + """163key 的解析结果。""" + if not isinstance(value, CloudMusicIdentifier): + raise TypeError( + f"attribute 'ncm_tag' must be CloudMusicIdentifier, not {type(value).__name__}" + ) diff --git a/src/libtakiyasha/prototypes.py b/src/libtakiyasha/prototypes.py new file mode 100644 index 0000000..1aadb61 --- /dev/null +++ b/src/libtakiyasha/prototypes.py @@ -0,0 +1,993 @@ +# -*- coding: utf-8 -*- +from __future__ import annotations + +from abc import ABCMeta, abstractmethod +from functools import lru_cache +from pathlib import Path +from random import randint +from typing import Callable, Generator, Iterable, Iterator, Literal, Type + +try: + import io +except ImportError: + import _pyio as io + +from .typedefs import IntegerLike, BytesLike, KeyStreamBasedStreamCipherProto, StreamCipherProto, WritableBuffer +from .typeutils import tobytes, toint + 
+__all__ = [ + 'CipherSkel', + 'KeyStreamBasedStreamCipherSkel', + 'CryptLayerWrappedIOSkel', + 'EncryptedBytesIOSkel' +] + + +class CipherSkel(metaclass=ABCMeta): + """适用于一般加密算法的框架类。子类必须实现 ``encrypt()``、``decrypt()`` + 和 ``getkey()`` 方法。 + + 如果以上方法中的任何一个无法实现,应当在被调用时抛出 ``NotImplementedError``。 + """ + + @abstractmethod + def getkey(self, keyname: str = 'master') -> bytes | None: + raise NotImplementedError + + @abstractmethod + def encrypt(self, plaindata: BytesLike, /) -> bytes: + """加密明文 ``plaindata`` 并返回加密结果。 + + Args: + plaindata: 要加密的明文 + """ + raise NotImplementedError + + @abstractmethod + def decrypt(self, cipherdata: BytesLike, /) -> bytes: + """解密密文 ``cipherdata`` 并返回解密结果。 + + Args: + cipherdata: 要解密的密文 + """ + raise NotImplementedError + + +class KeyStreamBasedStreamCipherSkel(metaclass=ABCMeta): + """适用于简单流式加密算法的框架类。子类必须实现 ``keystream()`` 和 ``getkey()`` 方法。 + + 以下方法的实现是可选的,但如果实现了,就会被 ``encrypt()`` 和 ``decrypt()`` 使用: + + - ``prexor_encrypt()`` - 加密前对明文的预处理,被 ``encrypt()`` 使用 + - ``postxor_encrypt()`` - 加密后对密文的后处理,被 ``encrypt()`` 使用 + - ``prexor_decrypt()`` - 解密前对密文的预处理,被 ``decrypt()`` 使用 + - ``postxor_decrypt()`` - 解密后对明文的后处理,被 ``decrypt()`` 使用 + + 以上可选方法的实现必须接受一个类字节对象和一个整数,并返回一个由整数组成的可迭代对象。 + """ + + @abstractmethod + def getkey(self, keyname: str = 'master') -> bytes | None: + raise NotImplementedError + + @abstractmethod + def keystream(self, + operation: Literal['encrypt', 'decrypt'], + nbytes: IntegerLike, + offset: IntegerLike, / + ) -> Generator[int, None, None]: + """返回一个生成器对象,对其进行迭代,即可得到从起始点 + ``offset`` 开始,持续一定长度 ``nbytes`` 的密钥流。 + + Args: + operation: 针对特定的操作生成密钥流(加密 encrypt 和解密 decrypt) + offset: 密钥流的起始点,不应为负数 + nbytes: 密钥流的长度,不应为负数 + """ + raise NotImplementedError + + def encrypt(self, plaindata: BytesLike, offset: IntegerLike = 0, /) -> bytes: + """加密明文 ``plaindata`` 并返回加密结果。 + + Args: + plaindata: 要加密的明文 + offset: 明文在文件中的位置(偏移量),不应为负数 + """ + plaindata = tobytes(plaindata) + offset = toint(offset) + + prexor: Callable[[BytesLike, 
IntegerLike], Iterator[int]] | None = getattr(self, 'prexor_encrypt', None) + postxor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self, 'postxor_encrypt', None) + keystream = self.keystream('encrypt', len(plaindata), offset) + + if prexor: + pd_strm = prexor(plaindata, offset) + else: + pd_strm = plaindata + cd_noxor_strm = (pd_byte ^ ks_byte for pd_byte, ks_byte in zip(pd_strm, keystream)) + if postxor: + cd_strm = postxor(cd_noxor_strm, offset) + else: + cd_strm = cd_noxor_strm + + return bytes(cd_strm) + + def decrypt(self, cipherdata: BytesLike, offset: IntegerLike = 0, /) -> bytes: + """解密密文 ``cipherdata`` 并返回解密结果。 + + Args: + cipherdata: 要解密的密文 + offset: 密文在文件中的位置(偏移量),不应为负数 + """ + cipherdata = tobytes(cipherdata) + offset = toint(offset) + + prexor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self, 'prexor_decrypt', None) + postxor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self, 'postxor_decrypt', None) + keystream = self.keystream('decrypt', len(cipherdata), offset) + + if prexor: + cd_strm = prexor(cipherdata, offset) + else: + cd_strm = cipherdata + pd_noxor_strm = (cd_byte ^ ks_byte for cd_byte, ks_byte in zip(cd_strm, keystream)) + if postxor: + pd_strm = postxor(pd_noxor_strm, offset) + else: + pd_strm = pd_noxor_strm + + return bytes(pd_strm) + + +class CryptLayerWrappedIOSkel(io.BytesIO): + """基于 BytesIO 的透明加密二进制流。 + + 所有读写相关方法都会经过透明加密层处理: + 读取时,返回解密后的数据;写入时,向缓冲区写入加密后的数据。 + + 调用读写相关方法时,附加参数 ``nocryptlayer=True`` + 可绕过透明加密层,访问缓冲区内的原始加密数据。 + + ``__init__()`` 方法的第一个位置参数 ``cipher`` 必须拥有 + ``encrypt()``、``decrypt()`` 和 ``keystream()`` 方法,且这些方法必须能接受两个位置参数。 + 其中,``encrypt()`` 和 ``decrypt()`` 的第一个位置参数接受字节对象,第二个位置参数接受非负整数; + ``keystream()`` 的两个位置参数均只接受非负整数。 + + 如果 ``cipher`` 未实现这些方法中的任何一个,都需要明确抛出 ``NotImplementedError``。 + 未实现的 ``encrypt()``/``decrypt()`` 方法会导致创建的对象不可通过透明加密层读/写; + 未实现的 ``keystream()`` 方法不会影响对读写的支持,但可能会极大影响读取的速度。 + + ``__init__()`` 方法的第二个参数 ``initial_bytes`` + 会在转换为 
``bytes`` 后作为对象内置缓冲区的初始数据。 + + 基于本类的子类可能拥有自己的构造器方法或函数,而不是直接调用 + ``__init__()``;详情请参考该类的文档字符串。 + + 本类和基于本类的子类,同时兼容 ``IO[bytes]`` + 和 ``typedefs.StreamCipherBasedCryptedIOProto`` 类型。 + """ + + @property + def name(self) -> str | None: + """当前对象来源文件的路径。 + + 在此类的对象中,此属性总是 ``None``。 + + 如果是通过子类的构造器方法或函数创建的对象,此属性可能会为来源文件的路径字符串。 + """ + if hasattr(self, '_name'): + name: str = self._name + return name + + @property + def encryptable(self) -> bool: + """此对象的内置透明加密层是否支持加密(内置 ``Cipher`` 对象的 ``encrypt()`` 方法是否可用)。 + + 这会影响到写入相关方法在参数 ``nocryptlayer=False`` 时是否可用。 + """ + return self._encrypt_available + + @property + def decryptable(self) -> bool: + """此对象的内置透明加密层是否支持解密(内置 ``Cipher`` 对象的 ``decrypt()`` 方法是否可用)。 + + 这会影响到读取相关方法在参数 ``nocryptlayer=False`` 时是否可用,以及此对象是否可迭代。 + """ + return self._decrypt_available + + @property + def iter_nocryptlayer(self) -> bool: + """迭代当前对象时,是否需要绕过透明加密层。默认为 ``False``。""" + return self._iter_nocryptlayer + + @iter_nocryptlayer.setter + def iter_nocryptlayer(self, value: bool): + self._iter_nocryptlayer = bool(value) + + @property + def iter_mode(self) -> Literal['block', 'line']: + """迭代的模式,只能设置为 ``block`` 或 ``line``: + + - ``block``(默认值)- 以块为单位进行迭代:每次迭代时,返回等长的“一块”数据。 + - 每次迭代返回的数据长度由 ``self.iter_block_size`` 决定。 + - ``line`` - 以一行为单位进行迭代:每次迭代时,返回的数据都以 ``b'\\n'`` 结尾。 + - 此模式会极大降低迭代的速度,不推荐使用。 + + 尝试设置为其他值会触发 ``ValueError`` 或 ``TypeError``。 + """ + return self._iter_mode + + @iter_mode.setter + def iter_mode(self, value: Literal['block', 'line']) -> None: + if value in ('block', 'line'): + self._iter_mode = value + elif isinstance(value, str): + raise ValueError(f"attribute 'iter_mode' must be 'block' or 'line', not '{value}'") + else: + raise TypeError(f"attribute 'iter_mode' must be str, not {type(value).__name__}") + + @property + def iter_block_size(self) -> int: + """以块为单位进行迭代时,每次迭代返回的数据长度。 + + 如果尝试设置为负数,会触发 ``ValueError``。 + + 本属性不会影响以一行为单位进行的迭代。 + """ + return self._iter_block_size + + @iter_block_size.setter + def iter_block_size(self, 
value: IntegerLike) -> None: + size = toint(value) + if size < 0: + raise ValueError("attribute 'iter_block_size' cannot be a negative integer") + self._iter_block_size = size + + def __init__(self, cipher, /, initial_bytes: BytesLike = b'') -> None: + """基于 BytesIO 的透明加密二进制流。 + + 所有读写相关方法都会经过透明加密层处理: + 读取时,返回解密后的数据;写入时,向缓冲区写入加密后的数据。 + + 调用读写相关方法时,附加参数 ``nocryptlayer=True`` + 可绕过透明加密层,访问缓冲区内的原始加密数据。 + + ``__init__()`` 方法的第一个位置参数 ``cipher`` 必须拥有 + ``encrypt()``、``decrypt()`` 和 ``keystream()`` 方法,且这些方法必须能接受两个位置参数。 + 其中,``encrypt()`` 和 ``decrypt()`` 的第一个位置参数接受字节对象,第二个位置参数接受非负整数; + ``keystream()`` 的两个位置参数均只接受非负整数。 + + 如果 ``cipher`` 未实现这些方法中的任何一个,都需要明确抛出 ``NotImplementedError``。 + 未实现的 ``encrypt()``/``decrypt()`` 方法会导致创建的对象不可通过透明加密层读/写; + 未实现的 ``keystream()`` 方法不会影响对读写的支持,但可能会极大影响读取的速度。 + + ``__init__()`` 方法的第二个参数 ``initial_bytes`` + 会在转换为 ``bytes`` 后作为对象内置缓冲区的初始数据。 + + 基于本类的子类可能拥有自己的构造器方法或函数,而不是直接调用 + ``__init__()``;详情请参考该类的文档字符串。 + + 本类和基于本类的子类,同时兼容 ``IO[bytes]`` + 和 ``typedefs.StreamCipherBasedCryptedIOProto`` 类型。 + """ + super().__init__(tobytes(initial_bytes)) + + for method_name in 'keystream', 'encrypt', 'decrypt': + try: + method = getattr(cipher, method_name) + except Exception as exc: + if hasattr(cipher, method_name): + raise exc + else: + raise TypeError(f"{repr(cipher)} is not a StreamCipher object: " + f"method '{method_name}' is missing" + ) + if not callable(method): + raise TypeError(f"{repr(cipher)} is not a StreamCipher object: " + f"method '{method_name}' is not callable" + ) + # 检测 keystream() 是否已实现(可用) + self._keystream_available = True + try: + cipher.keystream(0, 0) + except NotImplementedError: + self._keystream_available = False + # 检测 encrypt() 是否已实现(可用) + self._encrypt_available = True + try: + cipher.encrypt(b'', 0) + except NotImplementedError: + self._encrypt_available = False + # 检测 decrypt() 是否已实现(可用) + self._decrypt_available = True + try: + cipher.decrypt(b'', 0) + except NotImplementedError: + self._decrypt_available = False + + 
self._cipher = cipher + self._iter_nocryptlayer = False + self._iter_mode: Literal['block', 'line'] = 'block' + self._iter_block_size: int = io.DEFAULT_BUFFER_SIZE + + def __iter__(self): + return self + + def __next__(self) -> bytes: + if self._iter_mode == 'line': + if self._iter_nocryptlayer: + return super().__next__() + elif not self._decrypt_available: + raise io.UnsupportedOperation('iter with crypt layer') + else: + curpos = self.tell() + + target_data = super().getvalue()[curpos:] + if self._keystream_available: + result_data = bytes(self._xor_data_keystream(curpos, target_data, eof=b'\n')) + else: + result_data = bytearray() + # target_data already starts at curpos, so slice it with offsets relative to curpos + start = 0 + while 1: + stop = start + self._iter_block_size + target_data_segment = target_data[start:stop] + if target_data_segment == b'': + break + d = self._cipher.decrypt(target_data_segment, curpos + start) + if b'\n' in d: + # keep the trailing newline, as BytesIO.readline() does + result_data.extend(d[:d.index(b'\n') + 1]) + break + else: + result_data.extend(d) + start += self._iter_block_size + if result_data == b'': + raise StopIteration + + self.seek(curpos + len(result_data), 0) + + return bytes(result_data) + elif self._iter_mode == 'block': + if not self._decrypt_available: + raise io.UnsupportedOperation('iter with crypt layer') + + curpos = self.tell() + + target_data = super().getvalue()[curpos:curpos + self._iter_block_size] + if self._iter_nocryptlayer: + result_data = target_data + elif self._keystream_available: + result_data = bytes(self._xor_data_keystream(curpos, target_data, eof=None)) + else: + result_data = bytes(self._cipher.decrypt(target_data, curpos)) + + if result_data == b'': + raise StopIteration + + self.seek(curpos + len(result_data), 0) + + return result_data + elif isinstance(self._iter_mode, str): + raise ValueError(f"attribute 'iter_mode' must be 'block' or 'line', not '{self._iter_mode}'") + else: + raise TypeError(f"attribute 'iter_mode' must be str, not {type(self._iter_mode).__name__}") + + @lru_cache + def __repr__(self) -> str: + repr_strings = [ + 
f'<{type(self).__module__}.{type(self).__name__} object', + f' at {hex(id(self))}', + f', cipher={repr(self._cipher)}' + ] + if self.name is not None: + repr_strings.append(f", from '{self.name}'") + repr_strings.append('>') + + return ''.join(repr_strings) + + def _xor_data_keystream(self, + offset: int, + data: bytes, + eof: bytes = None + ) -> Generator[int, None, None]: + if eof is None: + eoford = None + else: + eoford = ord(tobytes(eof)) + + if data == b'': + return + + keystream = self._cipher.keystream(len(data), offset) + for databyteord, streambyteord in zip(data, keystream): + resultbyteord = databyteord ^ streambyteord + yield resultbyteord + if resultbyteord == eoford: + return + + def getvalue(self, nocryptlayer: bool = False) -> bytes: + if nocryptlayer: + return super().getvalue() + elif not self._decrypt_available: + raise io.UnsupportedOperation('getvalue with crypt layer') + else: + return self._cipher.decrypt(super().getvalue()) + + def getbuffer(self, nocryptlayer: bool = False) -> memoryview: + if nocryptlayer: + return super().getbuffer() + else: + raise NotImplementedError('memoryview with crypt layer is not supported') + + def read(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: + if nocryptlayer: + return super().read(size) + elif not self._decrypt_available: + raise io.UnsupportedOperation('read with crypt layer') + else: + curpos = self.tell() + if size is None: + size = -1 + size = toint(size) + if size < 0: + target_data = super().getvalue()[curpos:] + else: + target_data = super().getvalue()[curpos:curpos + size] + + if self._keystream_available: + result_data = bytes(self._xor_data_keystream(curpos, target_data)) + else: + result_data = self._cipher.decrypt(target_data, curpos) + self.seek(curpos + len(result_data), 0) + + return result_data + + def readinto(self, buffer: WritableBuffer, /, nocryptlayer: bool = False) -> int: + if nocryptlayer: + return super().readinto(buffer) + elif not 
self._decrypt_available: + raise io.UnsupportedOperation('readinto with crypt layer') + else: + if isinstance(buffer, memoryview): + memview = buffer + else: + memview = memoryview(buffer) + memview = memview.cast('B') + + data = self.read(len(memview)) + data_len = len(data) + + memview[:data_len] = data + + return data_len + + def read1(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: + if nocryptlayer or self._decrypt_available: + return self.read(size, nocryptlayer) + + raise io.UnsupportedOperation('read1 with crypt layer') + + def readinto1(self, buffer: WritableBuffer, /, nocryptlayer: bool = False) -> int: + if nocryptlayer or self._decrypt_available: + return self.readinto(buffer, nocryptlayer) + + raise io.UnsupportedOperation('readinto1 with crypt layer') + + def readblock(self, + size: IntegerLike | None = -1, /, + nocryptlayer: bool = False, *, + block_size: IntegerLike | None = io.DEFAULT_BUFFER_SIZE + ) -> bytes: + if not self._decrypt_available: + raise io.UnsupportedOperation('readblock with crypt layer') + + curpos = self.tell() + if size is None: + size = -1 + size = toint(size) + if block_size is None: + block_size = io.DEFAULT_BUFFER_SIZE + block_size = toint(block_size) + if block_size < 0: + block_size = io.DEFAULT_BUFFER_SIZE + if size < 0: + target_data = super().getvalue()[curpos:curpos + block_size] + else: + target_data = super().getvalue()[curpos:curpos + min([size, block_size])] + + if nocryptlayer: + result_data = target_data + elif self._keystream_available: + result_data = bytes(self._xor_data_keystream(curpos, target_data, eof=None)) + else: + result_data = self._cipher.decrypt(target_data, curpos) + + self.seek(curpos + len(result_data), 0) + + return result_data + + def readline(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: + if nocryptlayer: + return super().readline(size) + else: + if not self._decrypt_available: + raise io.UnsupportedOperation('readline with crypt 
layer') + curpos = self.tell() + if size is None: + size = -1 + size = toint(size) + if size < 0: + target_data = super().getvalue()[curpos:] + else: + target_data = super().getvalue()[curpos:curpos + size] + + if self._keystream_available: + result_data = bytes(self._xor_data_keystream(curpos, target_data, eof=b'\n')) + else: + result_data = bytearray() + # target_data already starts at curpos, so slice it with offsets relative to curpos + start = 0 + while 1: + stop = start + self._iter_block_size + target_data_segment = target_data[start:stop] + if target_data_segment == b'': + break + d = self._cipher.decrypt(target_data_segment, curpos + start) + if b'\n' in d: + # keep the trailing newline, as BytesIO.readline() does + result_data.extend(d[:d.index(b'\n') + 1]) + break + else: + result_data.extend(d) + start += self._iter_block_size + self.seek(curpos + len(result_data), 0) + + return bytes(result_data) + + def readlines(self, hint: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> list[bytes]: + if nocryptlayer: + return super().readlines(hint) + elif not self._decrypt_available: + raise io.UnsupportedOperation('readlines with crypt layer') + else: + results_lines = [] + if hint is None: + hint = -1 + hint = toint(hint) + if hint < 0: + while 1: + line = self.readline() + if line == b'': + return results_lines + results_lines.append(line) + else: + for _ in range(hint): + line = self.readline() + if line == b'': + return results_lines + results_lines.append(line) + # the loop ended without reaching EOF: still return the collected lines + return results_lines + + def write(self, data: BytesLike, /, nocryptlayer: bool = False) -> int: + if nocryptlayer: + return super().write(data) + elif not self._encrypt_available: + raise io.UnsupportedOperation('write with crypt layer') + else: + curpos = self.tell() + return super().write(self._cipher.encrypt(data, curpos)) + + def writelines(self, lines: Iterable[BytesLike], /, nocryptlayer: bool = False) -> None: + if nocryptlayer: + return super().writelines(lines) + elif not self._encrypt_available: + raise io.UnsupportedOperation('writelines with crypt layer') + else: + for line in lines: + # go through self.write() so each line is encrypted at its own offset + self.write(line) + + +class
EncryptedBytesIOSkel(io.BytesIO): + def __repr__(self) -> str: + reprstr_seq = [ + f'<{self.__module__}.{self.__class__.__name__} at {hex(id(self))}, ' + f'cipher {repr(self.cipher)}' + ] + if self.source is not None: + reprstr_seq.append(f", source '{str(self.source)}'") + reprstr_seq.append('>') + + return ''.join(reprstr_seq) + + @property + def source(self) -> Path | None: + """Path of the file this object was created from. + + For objects of this class itself, this attribute is always ``None``. + + For objects created through a subclass's constructor method or function, it may be the source file's path, represented as a ``Path`` object. + """ + if hasattr(self, '_name'): + return Path(self._name) + + @property + def cipher(self) -> StreamCipherProto | KeyStreamBasedStreamCipherProto: + """The Cipher instance used to encrypt and decrypt data in the internal buffer.""" + return self._cipher + + @property + def master_key(self) -> bytes | None: + """The master key.""" + return self._cipher.getkey('master') + + @property + def DEFAULT_BUFFER_SIZE(self) -> int: + """Size of the data returned by each iteration when iterating over this object. + + The value must be a positive, non-zero integer. Any other value raises ``ValueError``. + """ + return self._DEFAULT_BUFFER_SIZE + + @DEFAULT_BUFFER_SIZE.setter + def DEFAULT_BUFFER_SIZE(self, value: IntegerLike) -> None: + bufsize = toint(value) + if bufsize <= 0: + raise ValueError(f"attribute 'DEFAULT_BUFFER_SIZE' must be greater than 0, got {bufsize}") + self._DEFAULT_BUFFER_SIZE = bufsize + + @property + def ITER_METHOD(self) -> Literal['block', 'line']: + """Iteration mode used when iterating over this object: by line (``line``) or in fixed-size blocks (``block``). + + Only these two values are accepted; any other value raises ``ValueError``. + """ + return self._ITER_METHOD + + @ITER_METHOD.setter + def ITER_METHOD(self, value: Literal['block', 'line']) -> None: + if value not in ('block', 'line'): + raise ValueError(f"attribute 'ITER_METHOD' must be 'block' or 'line', not {repr(value)}") + self._ITER_METHOD = value + + @property + def ITER_WITHOUT_CRYPTLAYER(self) -> bool: + """Whether iteration returns the encrypted data directly, without decrypting it.""" + return self._ITER_WITHOUT_CRYPTLAYER + + @ITER_WITHOUT_CRYPTLAYER.setter + def ITER_WITHOUT_CRYPTLAYER(self, value: bool) -> None: + self._ITER_WITHOUT_CRYPTLAYER = bool(value) + + @classmethod + def verify_stream_cipher(cls, + cipher: 
KeyStreamBasedStreamCipherProto | StreamCipherProto + ) -> tuple[bool, bool, bool, bool]: + keystream_encrypt_available = True + keystream_decrypt_available = True + encrypt_available = True + decrypt_available = True + # 验证 keystream() + if isinstance(cipher, KeyStreamBasedStreamCipherProto): + try: + ks = cipher.keystream('encrypt', randint(0, 255), randint(0, 8191)) + except NotImplementedError: + keystream_encrypt_available = False + except Exception as exc: + raise TypeError(f"{repr(cipher)} is not a valid cipher object") from exc + else: + if not all(map(lambda _: isinstance(_, int), ks)): + raise TypeError( + f"result of {repr(cipher)}.keystream() returns non-int during iterating" + ) + try: + ks = cipher.keystream('decrypt', randint(0, 255), randint(0, 8191)) + except NotImplementedError: + keystream_decrypt_available = False + except Exception as exc: + raise TypeError(f"{repr(cipher)} is not a valid cipher object") from exc + else: + if not all(map(lambda _: isinstance(_, int), ks)): + raise TypeError( + f"result of {repr(cipher)}.keystream() returns non-int during iterating" + ) + elif isinstance(cipher, StreamCipherProto): + keystream_encrypt_available = False + keystream_decrypt_available = False + else: + raise TypeError(f"{repr(cipher)} is not a cipher object") + + try: + encrypt_result = cipher.encrypt(bytes([randint(0, 255)]), randint(0, 8191)) + except NotImplementedError: + encrypt_available = False + except Exception as exc: + raise TypeError(f"{repr(cipher)} is not a valid cipher object") from exc + else: + if not isinstance(encrypt_result, (bytes, bytearray)): + raise TypeError( + f"{repr(cipher)}.encrypt() returns non-bytes (type {type(encrypt_result).__name__})" + ) + + try: + decrypt_result = cipher.decrypt(bytes([randint(0, 255)]), randint(0, 8191)) + except NotImplementedError: + decrypt_available = False + except Exception as exc: + raise TypeError(f"{repr(cipher)} is not a valid cipher object") from exc + else: + if not 
isinstance(decrypt_result, (bytes, bytearray)): + raise TypeError( + f"{repr(cipher)}.decrypt() returns non-bytes (type {type(decrypt_result).__name__})" + ) + + return encrypt_available, decrypt_available, keystream_encrypt_available, keystream_decrypt_available + + @property + def acceptable_ciphers(self) -> list[Type[StreamCipherProto] | Type[KeyStreamBasedStreamCipherProto]]: + """Object types accepted for ``cipher``, the first argument of ``__init__()``. + Types outside this list cause ``__init__()`` to raise ``TypeError``. + """ + return [] + + def __init__(self, + cipher: StreamCipherProto | KeyStreamBasedStreamCipherProto, /, + initial_bytes: BytesLike = b'' + ) -> None: + super().__init__(tobytes(initial_bytes)) + self._encrypt_available, self._decrypt_available, self._keystream_encrypt_available, self._keystream_decrypt_available = self.verify_stream_cipher(cipher) + self._cipher = cipher + if self.acceptable_ciphers: + if not isinstance(self._cipher, tuple(self.acceptable_ciphers)): + if len(self.acceptable_ciphers) == 1: + raise TypeError( + f"first argument 'cipher' must be {self.acceptable_ciphers[0].__name__}, " + f"not {type(self._cipher).__name__}" + ) + else: + supported_ciphers_str = ', '.join( + _.__name__ for _ in self.acceptable_ciphers[:-1] + ) + f' or {self.acceptable_ciphers[-1].__name__}' + raise TypeError( + f"first argument 'cipher' must be one of {supported_ciphers_str}, " + f"not {type(self._cipher).__name__}" + ) + + self._DEFAULT_BUFFER_SIZE = io.DEFAULT_BUFFER_SIZE + self._ITER_METHOD: Literal['block', 'line'] = 'block' + self._ITER_WITHOUT_CRYPTLAYER = False + + def _iterencrypt(self, plaindata: bytes, offset: int, /) -> Generator[int, None, None]: + if self._keystream_encrypt_available: + prexor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self._cipher, 'prexor_encrypt', None) + postxor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self._cipher, 'postxor_encrypt', None) + keystream = self._cipher.keystream('encrypt', len(plaindata), offset) + + if 
prexor: + pd_strm = prexor(plaindata, offset) + else: + pd_strm = plaindata + cd_noxor_strm = (pd_byte ^ ks_byte for pd_byte, ks_byte in zip(pd_strm, keystream)) + if postxor: + cd_strm = postxor(cd_noxor_strm, offset) + else: + cd_strm = cd_noxor_strm + else: + cd_strm = self._cipher.encrypt(plaindata, offset) + + yield from cd_strm + + def _iterdecrypt(self, cipherdata: bytes, offset: int, /) -> Generator[int, None, None]: + if not cipherdata: + return + + if self._keystream_decrypt_available: + prexor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self._cipher, 'prexor_decrypt', None) + postxor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self._cipher, 'postxor_decrypt', None) + keystream = self._cipher.keystream('decrypt', len(cipherdata), offset) + + if prexor: + cd_strm = prexor(cipherdata, offset) + else: + cd_strm = cipherdata + pd_noxor_strm = (cd_byte ^ ks_byte for cd_byte, ks_byte in zip(cd_strm, keystream)) + if postxor: + pd_strm = postxor(pd_noxor_strm, offset) + else: + pd_strm = pd_noxor_strm + else: + pd_strm = self._cipher.decrypt(cipherdata, offset) + + yield from pd_strm + + def _iterdecrypt_untilnl(self, cipherdata: bytes, offset: int, /) -> Generator[int, None, None]: + if not cipherdata: + return + + if self._keystream_decrypt_available: + prexor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self._cipher, 'prexor_decrypt', None) + postxor: Callable[[BytesLike, IntegerLike], Iterator[int]] | None = getattr(self._cipher, 'postxor_decrypt', None) + keystream = self._cipher.keystream('decrypt', len(cipherdata), offset) + + if prexor: + cd_strm = prexor(cipherdata, offset) + else: + cd_strm = cipherdata + pd_noxor_strm = (cd_byte ^ ks_byte for cd_byte, ks_byte in zip(cd_strm, keystream)) + if postxor: + pd_strm = postxor(pd_noxor_strm, offset) + else: + pd_strm = pd_noxor_strm + else: + pd_strm = self._cipher.decrypt(cipherdata, offset) + + for pd_byte in pd_strm: + yield pd_byte + 
if pd_byte == 10: + return + + def can_encrypt(self) -> bool: + return self._encrypt_available + + def can_decrypt(self) -> bool: + return self._decrypt_available + + def getbuffer(self, nocryptlayer: bool = False) -> memoryview: + if nocryptlayer: + return super().getbuffer() + else: + raise io.UnsupportedOperation('memoryview with crypt layer is not supported') + + def getvalue(self, nocryptlayer: bool = False) -> bytes: + if nocryptlayer: + return super().getvalue() + elif not self.can_decrypt(): + raise io.UnsupportedOperation('getvalue with crypt layer') + else: + return bytes(self._iterdecrypt(super().getvalue(), 0)) + + def read(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: + if nocryptlayer: + return super().read(size) + elif not self.can_decrypt(): + raise io.UnsupportedOperation('read with crypt layer') + else: + curpos = self.tell() + if size is None: + size = -1 + else: + size = toint(size) + if size < 0: + target_data = super().getvalue()[curpos:] + else: + target_data = super().getvalue()[curpos:curpos + size] + + result = bytes(self._iterdecrypt(target_data, curpos)) + self.seek(curpos + len(result), 0) + + return result + + def read1(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: + if not (nocryptlayer or self.can_decrypt()): + raise io.UnsupportedOperation('read1 with crypt layer') + return self.read(size, nocryptlayer) + + def readline(self, size: IntegerLike | None = -1, /, nocryptlayer: bool = False) -> bytes: + if nocryptlayer: + return super().readline(size) + elif not self.can_decrypt(): + raise io.UnsupportedOperation('readline with crypt layer') + else: + curpos = self.tell() + if size is None: + size = -1 + else: + size = toint(size) + if size < 0: + blksize = self._DEFAULT_BUFFER_SIZE + results = [] + while 1: + target_data = super().getvalue()[curpos:curpos + blksize] + if not target_data: + # EOF: return whatever earlier blocks produced instead of dropping it + return b''.join(results) + result_data = bytes(self._iterdecrypt_untilnl(target_data, curpos)) +
self.seek(curpos + len(result_data), 0) + curpos += len(result_data) + results.append(result_data) + if len(result_data) != len(target_data) or result_data.endswith(b'\n'): + return b''.join(results) + else: + target_data = super().getvalue()[curpos:curpos + size] + + result = bytes( + self._iterdecrypt_untilnl(target_data, curpos) + ) + self.seek(curpos + len(result), 0) + + return result + + def readlines(self, + hint: IntegerLike | None = None, /, + nocryptlayer: bool = False + ) -> list[bytes]: + if nocryptlayer: + return super().readlines(hint) + elif not self.can_decrypt(): + raise io.UnsupportedOperation('readlines with crypt layer') + else: + curpos = self.tell() + max_read_size = len(super().getvalue()[curpos:]) + if hint is None: + hint = -1 + else: + hint = toint(hint) + if hint < 0: + read_size = max_read_size + else: + read_size = min(hint, max_read_size) + + nbytes = 0 + lines = [] + while 1: + line = self.readline() + if not line: + break + lines.append(line) + nbytes += len(line) + if nbytes >= read_size: + break + + return lines + + def readinto(self, buffer: WritableBuffer, /, nocryptlayer: bool = False) -> int: + if nocryptlayer: + return super().readinto(buffer) + elif not self.can_decrypt(): + raise io.UnsupportedOperation('readinto with crypt layer') + else: + if not isinstance(buffer, memoryview): + buffer = memoryview(buffer) + buffer = buffer.cast('B') + + data = self.read(len(buffer)) + data_len = len(data) + + buffer[:data_len] = data + + return data_len + + def readinto1(self, buffer: WritableBuffer, /, nocryptlayer: bool = False) -> int: + if not (nocryptlayer or self.can_decrypt()): + raise io.UnsupportedOperation('readinto1 with crypt layer') + return self.readinto(buffer, nocryptlayer) + + def write(self, data: BytesLike, /, nocryptlayer: bool = False) -> int: + if nocryptlayer: + return super().write(data) + elif not self.can_encrypt(): + raise io.UnsupportedOperation('write with crypt layer') + else: + curpos = self.tell() + return super().write( + bytes( +
self._iterencrypt(tobytes(data), curpos) + ) + ) + + def writelines(self, lines: Iterable[BytesLike], /, nocryptlayer: bool = False) -> None: + if not (nocryptlayer or self.can_encrypt()): + raise io.UnsupportedOperation('writelines with crypt layer') + for data in lines: + self.write(data, nocryptlayer) + + def __iter__(self): + return self + + def __next__(self) -> bytes: + if self._ITER_METHOD == 'line': + ret = self.readline(nocryptlayer=self._ITER_WITHOUT_CRYPTLAYER) + if not ret: + raise StopIteration + return ret + elif self._ITER_METHOD == 'block': + ret = self.read(self._DEFAULT_BUFFER_SIZE, nocryptlayer=self._ITER_WITHOUT_CRYPTLAYER) + if not ret: + raise StopIteration + return ret + else: + raise ValueError( + f"attribute 'ITER_METHOD' must be 'block' or 'line', not {repr(self._ITER_METHOD)}" + ) diff --git a/src/libtakiyasha/qmc/__init__.py b/src/libtakiyasha/qmc/__init__.py index 2738332..41d0622 100644 --- a/src/libtakiyasha/qmc/__init__.py +++ b/src/libtakiyasha/qmc/__init__.py @@ -1,110 +1,308 @@ # -*- coding: utf-8 -*- from __future__ import annotations +import re import warnings from base64 import b64decode, b64encode from dataclasses import dataclass -from secrets import token_bytes -from typing import IO, Literal +from pathlib import Path +from typing import Callable, IO, Literal, NamedTuple from .qmcdataciphers import HardenedRC4, Mask128 from .qmckeyciphers import QMCv2KeyEncryptV1, QMCv2KeyEncryptV2 -from ..common import CryptLayerWrappedIOSkel -from ..exceptions import CrypterCreatingError, CrypterSavingError -from ..keyutils import make_random_ascii_string +from ..exceptions import CrypterCreatingError +from ..keyutils import make_random_ascii_string, make_salt +from ..prototypes import EncryptedBytesIOSkel from ..typedefs import BytesLike, FilePath, IntegerLike -from ..typeutils import is_filepath, tobytes, toint_nofloat, verify_fileobj -from ..warns import CrypterCreatingWarning, CrypterSavingWarning +from ..typeutils import isfilepath,
tobytes, verify_fileobj +from ..warns import CrypterSavingWarning + +warnings.filterwarnings(action='default', category=CrypterSavingWarning, module=__name__) +warnings.filterwarnings(action='default', category=DeprecationWarning, module=__name__) __all__ = [ + 'probe_qmc', + 'probe_qmcv1', + 'probe_qmcv2', + 'QMCv1', + 'QMCv2', 'QMCv2QTag', 'QMCv2STag', - 'QMCv1', - 'QMCv2' + 'QMCv1FileInfo', + 'QMCv2FileInfo' ] +QMCV1_SUFFIX_PATTERN = re.compile('\\.qmc[a-zA-Z0-9]{1,4}$', flags=re.IGNORECASE) + @dataclass class QMCv2QTag: - """Parse, store, and rebuild the QTag data at the end of a QMCv2 file.""" - master_key_encrypted_b64encoded: bytes - song_id: int - unknown_value1: bytes + """Parse, store, and rebuild the QTag data at the end of a QMCv2 file. The master key is not included.""" + song_id: int = 0 + unknown: int = 2 @classmethod - def from_bytes(cls, bytestring: BytesLike) -> QMCv2QTag: - segments = tobytes(bytestring).split(b',') - if len(segments) != 3: + def load(cls, qtag_serialized: BytesLike, /): + qtag_serialized = tobytes(qtag_serialized) + qtag_serialized_splitted = qtag_serialized.split(b',') + if len(qtag_serialized_splitted) != 3: raise ValueError('invalid QMCv2 QTag data: the counts of splitted segments ' - f'should be equal to 3, not {len(segments)}' + f'should be equal to 3, not {len(qtag_serialized_splitted)}' ) - - master_key_encrypted_b64encoded = segments[0] - song_id = int(segments[1]) - unknown_value1 = segments[2] - - return cls(master_key_encrypted_b64encoded, song_id, unknown_value1) - - def to_bytes(self, with_tail: bool = False) -> bytes: - ret = b','.join([ - self.master_key_encrypted_b64encoded, - str(self.song_id).encode('utf-8'), - self.unknown_value1 - ] + master_key_encrypted_b64encoded = qtag_serialized_splitted[0] + song_id: int = int(qtag_serialized_splitted[1]) + unknown: int = int(qtag_serialized_splitted[2]) + + return master_key_encrypted_b64encoded, cls(song_id=song_id, unknown=unknown) + + def dump(self, master_key_encrypted_b64encoded: BytesLike, /) -> bytes: + return b','.join( + [ +
tobytes(master_key_encrypted_b64encoded), + str(self.song_id).encode('ascii'), + str(self.unknown).encode('ascii') + ] ) - if with_tail: - return ret + len(ret).to_bytes(4, 'big') + b'QTag' - else: - return ret - - @classmethod - def new(cls, master_key: BytesLike, simple_key: BytesLike, song_id: IntegerLike, unknown_value1: BytesLike) -> QMCv2QTag: - master_key_encrypted = QMCv2KeyEncryptV1(simple_key).encrypt(master_key) - master_key_encrypted_b64encoded = b64encode(master_key_encrypted) - - return cls(master_key_encrypted_b64encoded, - toint_nofloat(song_id), - tobytes(unknown_value1) - ) @dataclass class QMCv2STag: """Parse, store, and rebuild the STag data at the end of a QMCv2 file.""" - song_id: int - unknown_value1: bytes - song_mid: str + song_id: int = 0 + unknown: int = 2 + song_mid: str = '0' * 14 @classmethod - def from_bytes(cls, bytestring: BytesLike) -> QMCv2STag: - segments = tobytes(bytestring).split(b',') - if len(segments) != 3: + def load(cls, stag_serialized: BytesLike, /): + stag_serialized = tobytes(stag_serialized) + stag_serialized_splitted = stag_serialized.split(b',') + if len(stag_serialized_splitted) != 3: raise ValueError('invalid QMCv2 STag data: the counts of splitted segments ' - f'should be equal to 3, not {len(segments)}' + f'should be equal to 3, not {len(stag_serialized_splitted)}' ) + song_id: int = int(stag_serialized_splitted[0]) + unknown: int = int(stag_serialized_splitted[1]) + song_mid: str = str(stag_serialized_splitted[2], encoding='ascii') + + return cls(song_id=song_id, unknown=unknown, song_mid=song_mid) + + def dump(self) -> bytes: + return b','.join( + [ + str(self.song_id).encode('ascii'), + str(self.unknown).encode('ascii'), + str(self.song_mid).encode('ascii') + ] + ) + + +class QMCv1FileInfo(NamedTuple): + """Stores information about a QMCv1 file.""" + cipher_data_offset: int + cipher_data_len: int + + +class QMCv2FileInfo(NamedTuple): + """Stores information about a QMCv2 file.""" + cipher_ctor: Callable[..., HardenedRC4] | Callable[..., Mask128] | None + cipher_data_offset: int +
cipher_data_len: int + master_key_encrypted: bytes | None + master_key_encryption_ver: int | None + extra_info: QMCv2QTag | QMCv2STag | None - song_id = int(segments[0]) - unknown_value1 = segments[1] - song_mid = segments[2].decode('utf-8') - return cls(song_id, unknown_value1, song_mid) +def _guess_cipher_ctor(master_key: BytesLike, /, + is_encrypted: bool = True + ) -> Callable[..., HardenedRC4] | Callable[..., Mask128] | None: + if is_encrypted: + expected_keylen_mask128 = (272, 392) + expected_keylen_hardened_rc4 = (528, 736) + else: + expected_keylen_mask128 = (256, 256) + expected_keylen_hardened_rc4 = (512, 512) - def to_bytes(self, with_tail: bool = False) -> bytes: - raw_song_id = str(self.song_id).encode('utf-8') - raw_song_mid = self.song_mid.encode('utf-8') + master_key = tobytes(master_key) + if len(master_key) in expected_keylen_mask128: + return Mask128.from_qmcv2_key256 + elif len(master_key) in expected_keylen_hardened_rc4: + return HardenedRC4 + elif len(master_key) == 128 and not is_encrypted: + return Mask128 + + +def probe_qmcv1(filething: FilePath | IO[bytes], /, is_qmcv1: bool = False) -> tuple[Path | IO[bytes], QMCv1FileInfo | None]: + """Probe whether the source file ``filething`` is a QMCv1 file. + + Returns a 2-element tuple: + + - the first element is ``filething``; + - if ``filething`` is a QMCv1 file, the second element is a ``QMCv1FileInfo`` object; + - otherwise it is ``None``. + + QMCv1 files are currently hard to identify from their file structure, so this function decides by the file extension. + If the file extension matches the following regular expression (case-insensitive), the file is treated as a QMCv1 file: + + ``^\\.qmc[a-zA-Z0-9]{1,4}$`` + + For an extension that does not match the pattern above (or when no extension can be obtained), passing + ``is_qmcv1=True`` skips probing, treats the file as a QMCv1 file, and returns the result directly. + + The return value can be passed as the first positional argument of ``QMCv1.open()``. + + This function is not suitable for probing QMCv2 files. + + Args: + filething: path or file object of the source file + is_qmcv1: skip probing and treat the source file as a QMCv1 file + Returns: + A 2-element tuple: the first element is filething; if + filething is a QMCv1 file, the second element is a QMCv1FileInfo object; otherwise it is None. + """ + + def operation(fd: IO[bytes]) -> QMCv1FileInfo | None: + filename = getattr(fd, 'name', None) + if filename is None and not is_qmcv1: + return +
filepath = Path(filename) if filename is not None else None + if is_qmcv1 or (filepath is not None and QMCV1_SUFFIX_PATTERN.search(filepath.suffix)): + return QMCv1FileInfo( + cipher_data_offset=0, + cipher_data_len=fd.seek(0, 2) + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + return Path(filething), operation(fileobj) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_origpos = fileobj.tell() + prs = operation(fileobj) + fileobj.seek(fileobj_origpos, 0) - ret = b','.join([raw_song_id, self.unknown_value1, raw_song_mid]) - if with_tail: - return ret + len(ret).to_bytes(4, 'big') + b'STag' + return fileobj, prs + + +def probe_qmcv2(filething: FilePath | IO[bytes], /) -> tuple[Path | IO[bytes], QMCv2FileInfo | None]: + """Probe whether the source file ``filething`` is a QMCv2 file. + + Returns a 2-element tuple: + + - the first element is ``filething``; + - if ``filething`` is a QMCv2 file, the second element is a ``QMCv2FileInfo`` object; + - otherwise it is ``None``. + + The return value can be passed as the first positional argument of ``QMCv2.open()``. + + This function is not suitable for probing QMCv1 files. + + Args: + filething: path or file object of the source file + Returns: + A 2-element tuple: the first element is filething; if + filething is a QMCv2 file, the second element is a QMCv2FileInfo object; otherwise it is None. + """ + + def operation(fd: IO[bytes]) -> QMCv2FileInfo | None: + total_size = fd.seek(-4, 2) + 4 + tail_data = fd.read(4) + + if tail_data == b'STag': + fd.seek(-8, 2) + tag_serialized_len = int.from_bytes(fd.read(4), 'big') + if tag_serialized_len > (total_size - 8): + return + cipher_data_len = fd.seek(-(tag_serialized_len + 8), 2) + extra_info = QMCv2STag.load(fd.read(tag_serialized_len)) + + cipher_ctor = None + master_key_encrypted = None + master_key_encryption_ver = None + elif tail_data == b'QTag': + fd.seek(-8, 2) + tag_serialized_len = int.from_bytes(fd.read(4), 'big') + if tag_serialized_len > (total_size - 8): + return + cipher_data_len = fd.seek(-(tag_serialized_len + 8), 2) + master_key_encrypted_b64encoded, extra_info = QMCv2QTag.load(fd.read(tag_serialized_len)) + master_key_encrypted =
b64decode(master_key_encrypted_b64encoded) + + cipher_ctor = _guess_cipher_ctor(master_key_encrypted) + master_key_encryption_ver = 1 else: - return ret + extra_info = None + master_key_encrypted_b64encoded_len = int.from_bytes(tail_data, 'little') + if master_key_encrypted_b64encoded_len > total_size - 4: + return + cipher_data_len = fd.seek(-(master_key_encrypted_b64encoded_len + 4), 2) + master_key_encrypted_b64encoded = fd.read(master_key_encrypted_b64encoded_len) + try: + master_key_encrypted_b64encoded.decode('ascii') + except UnicodeDecodeError: + return + master_key_encrypted_b64decoded = b64decode(master_key_encrypted_b64encoded) + if master_key_encrypted_b64decoded.startswith(b'QQMusic EncV2,Key:'): + master_key_encrypted = master_key_encrypted_b64decoded[18:] + master_key_encryption_ver = 2 + else: + master_key_encrypted = master_key_encrypted_b64decoded + master_key_encryption_ver = 1 + cipher_ctor = _guess_cipher_ctor(master_key_encrypted) + + return QMCv2FileInfo(cipher_ctor=cipher_ctor, + cipher_data_offset=0, + cipher_data_len=cipher_data_len, + master_key_encrypted=master_key_encrypted, + master_key_encryption_ver=master_key_encryption_ver, + extra_info=extra_info + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + return Path(filething), operation(fileobj) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_origpos = fileobj.tell() + prs = operation(fileobj) + fileobj.seek(fileobj_origpos, 0) + + return fileobj, prs - @classmethod - def new(cls, song_id: IntegerLike, unknown_value1: BytesLike, song_mid: str) -> QMCv2STag: - return cls(toint_nofloat(song_id), tobytes(unknown_value1), str(song_mid)) +def probe_qmc( + filething: FilePath | IO[bytes], / +) -> tuple[Path | IO[bytes], QMCv1FileInfo | None] | tuple[Path | IO[bytes], QMCv2FileInfo | None]: + """Probe whether the source file ``filething`` is a QMCv1 or QMCv2 file. -class QMCv1(CryptLayerWrappedIOSkel): + Returns a 2-element tuple:
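+ + - the first element is ``filething``; + - if ``filething`` is a QMCv1 file, the second element is a ``QMCv1FileInfo`` object; + - if ``filething`` is a QMCv2 file, the second element is a ``QMCv2FileInfo`` object; + - if it is neither, the second element is ``None``. + + The return value can be passed as the first positional argument of ``QMCv1.open()`` and ``QMCv2.open()``. + + Args: + filething: path or file object of the source file + Returns: + A 2-element tuple: the first element is filething; if + filething is a QMCv1 file, the second element is a QMCv1FileInfo object; if + filething is a QMCv2 file, the second element is a QMCv2FileInfo object; otherwise it is None. + """ + fthing, fileinfo = probe_qmcv2(filething) + if fileinfo: + return fthing, fileinfo + return probe_qmcv1(filething) + + +class QMCv1(EncryptedBytesIOSkel): """A BytesIO-based QMCv1 binary stream with a transparent encryption layer. All read/write methods pass through the transparent encryption layer: @@ -114,51 +312,22 @@ class QMCv1(CryptLayerWrappedIOSkel): bypasses the transparent encryption layer, giving access to the raw encrypted data in the buffer. To create a QMCv1 object, do not call ``__init__()`` directly; instead use the constructor methods - ``QMCv1.new()`` and ``QMCv1.from_file()`` to create a new QMCv1 file or open an existing one, - and the ``self.to_file()`` method of an existing QMCv1 object to save it to a file. + ``QMCv1.new()`` and ``QMCv1.open()`` to create a new QMCv1 file or open an existing one, + and the ``save()`` method of an existing QMCv1 object to save it to a file. """ @property - def cipher(self) -> Mask128: - return self._cipher - - @property - def master_key(self): - return self.cipher.mask128 - - def __init__(self, cipher: Mask128, /, initial_bytes: BytesLike = b'') -> None: - """A BytesIO-based QMCv1 binary stream with a transparent encryption layer. - - All read/write methods pass through the transparent encryption layer: - reads return decrypted data; writes store encrypted data in the buffer. - - When calling read/write methods, the extra argument ``nocryptlayer=True`` - bypasses the transparent encryption layer, giving access to the raw encrypted data in the buffer. - - To create a QMCv1 object, do not call ``__init__()`` directly; instead use the constructor methods - ``QMCv1.new()`` and ``QMCv1.from_file()`` to create a new QMCv1 file or open an existing one, - and use ``self.to_file()`` to save an existing QMCv1 object to a file. - """ - super().__init__(cipher, initial_bytes) - if not isinstance(cipher, Mask128): - raise TypeError(f"'{type(self).__name__}' " - f"only support cipher '{Mask128.__module__}.{Mask128.__name__}', " - f"not '{type(self._cipher).__name__}'" - ) - - @classmethod - def new(cls) -> QMCv1: - """Create and return a brand-new empty QMCv1 object.""" - master_key = token_bytes(128) - - return cls(Mask128(master_key)) + def acceptable_ciphers(self): + return [Mask128] @classmethod def from_file(cls,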
qmcv1_filething: FilePath | IO[bytes], /, master_key: BytesLike - ) -> QMCv1: - """Open a QMCv1 file or file object ``qmcv1_filething``. + ): + """(Deprecated and will be removed in a later version. Switch to ``QMCv1.open()`` as soon as possible.) + + Open a QMCv1 file or file object ``qmcv1_filething``. The first positional argument ``qmcv1_filething`` may be a file path (``str``, ``bytes``, or any object with a ``__fspath__()`` method). ``qmcv1_filething`` @@ -167,33 +336,92 @@ def from_file(cls, The second positional argument ``master_key`` is used to decrypt the audio data and must be 44, 128, or 256 bytes long. If the length requirement is not met, ``ValueError`` is raised. """ - master_key = tobytes(master_key) - if len(master_key) == 44: - cipher = Mask128.from_qmcv1_mask44(master_key) - elif len(master_key) == 128: - cipher = Mask128(master_key) - elif len(master_key) == 256: - cipher = Mask128.from_qmcv1_mask256(master_key) + warnings.warn( + DeprecationWarning( + f'{cls.__name__}.from_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {cls.__name__}.open() instead.' + ) + ) + return cls.open(qmcv1_filething, mask=master_key) + + @classmethod + def open(cls, + filething_or_info: tuple[Path | IO[bytes], QMCv1FileInfo | None] | FilePath | IO[bytes], /, + mask: BytesLike + ): + """Open a QMCv1 file and return a ``QMCv1`` object. + + The first positional argument ``filething_or_info`` must be a file path or file object, or the tuple returned by ``probe_qmc()`` or ``probe_qmcv1()``. + Accepted path types are strings, byte strings, and any object that defines a ``__fspath__()`` method. + A file object must be readable and seekable (its ``seekable()`` method returns ``True``). + + The second argument ``mask`` is required and serves as the master key. It must be 44, 128, or 256 bytes long. + + Args: + filething_or_info: path or file object of the source file, or the return value of probe_qmc() or probe_qmcv1() + mask: master key of the file; must be 44, 128, or 256 bytes long + Raises: + ValueError: the length of mask does not meet the requirement above + """ + mask = tobytes(mask) + + def operation(fd: IO[bytes]) -> cls: + if len(mask) == 44: + cipher = Mask128.from_qmcv1_mask44(mask) + elif len(mask) == 128: + cipher = Mask128(mask) + elif len(mask) == 256: + cipher = Mask128.from_qmcv1_mask256(mask) + else: + raise ValueError( + f"the length of argument 'mask' must be 44, 128, or 256, not {len(mask)}" + ) + + fd.seek(fileinfo.cipher_data_offset, 0) + return cls(cipher, fd.read(fileinfo.cipher_data_len)) + + if isinstance(filething_or_info, tuple): + filething_or_info:
tuple[Path | IO[bytes], QMCv1FileInfo | None] + if len(filething_or_info) != 2: + raise TypeError( + "first argument 'filething_or_info' must be a file path, a file object, " + "or a tuple of probe_qmc(), probe_qmcv1() returns" + ) + filething, fileinfo = filething_or_info else: - raise ValueError("the length of second argument 'master_key' " - f"must be 44, 128 or 256, not {len(master_key)}" - ) + filething, fileinfo = probe_qmcv1(filething_or_info) - if is_filepath(qmcv1_filething): - with open(qmcv1_filething, mode='rb') as qmcv1_fileobj: - instance = cls(cipher, qmcv1_fileobj.read()) + if fileinfo is None: + raise CrypterCreatingError( + f"{repr(filething)} is not a QMCv1 file" + ) + elif not isinstance(fileinfo, QMCv1FileInfo): + raise TypeError( + f"second element of the tuple must be QMCv1FileInfo or None, not {type(fileinfo).__name__}" + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + instance = operation(fileobj) + instance._name = Path(filething) else: - qmcv1_fileobj = verify_fileobj(qmcv1_filething, 'binary', - verify_readable=True - ) - instance = cls(cipher, qmcv1_fileobj.read()) + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_sourcefile = getattr(fileobj, 'name', None) + instance = operation(fileobj) - instance._name = getattr(qmcv1_fileobj, 'name', None) + if fileobj_sourcefile is not None: + instance._name = Path(fileobj_sourcefile) return instance - def to_file(self, qmcv1_filething: FilePath | IO[bytes] = None, /) -> None: - """Save the contents of the current QMCv1 object to the file ``qmcv1_filething``. + def to_file(self, qmcv1_filething: FilePath | IO[bytes] = None) -> None: + """(Deprecated and will be removed in a later version. Switch to ``QMCv1.save()`` as soon as possible.) + + Save the contents of the current QMCv1 object to the file ``qmcv1_filething``. The first positional argument ``qmcv1_filething`` may be a file path (``str``, ``bytes``, or any object with a ``__fspath__()`` method). ``qmcv1_filething`` @@ -203,25 +431,64 @@ def to_file(self, qmcv1_filething: FilePath | IO[bytes] = None, /) -> None: If
``qmcv1_filething`` is not provided, an attempt is made to write to the file that ``self.name`` points to. If both are empty or not provided, ``CrypterSavingError`` is raised. """ - if qmcv1_filething is None: - if self.name is None: - raise CrypterSavingError( - "cannot determine the target file: " - "argument 'qmcv1_filething' and attribute self.name are None or unspecified" + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.to_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {type(self).__name__}.save() instead.' + ) + ) + return self.save(qmcv1_filething) + + def save(self, filething: FilePath | IO[bytes] = None) -> None: + """Save the current object as a new QMCv1 file. + + The first argument ``filething`` is optional; if provided, it must be a file path or file object. + Accepted path types are strings, byte strings, and any object that defines a ``__fspath__()`` method. + A file object must be writable and seekable (its ``seekable()`` method returns ``True``). + + Args: + filething: path or file object of the target file + + Raises: + TypeError: both the attribute source of the current object and the argument filething are empty, so the file cannot be saved + """ + + def operation(fd: IO[bytes]): + fd.write(self.getvalue(nocryptlayer=True)) + + if filething is None: + if self.source is None: + raise TypeError( + "attribute 'self.source' and argument 'filething' are empty, " + "don't know which file to save to" ) - qmcv1_filething = self.name + filething = self.source - if is_filepath(qmcv1_filething): - with open(qmcv1_filething, mode='wb') as qmcv1_fileobj: - qmcv1_fileobj.write(self.getvalue(nocryptlayer=True)) + if isfilepath(filething): + with open(filething, mode='wb') as fileobj: + return operation(fileobj) else: - qmcv1_fileobj = verify_fileobj(qmcv1_filething, 'binary', - verify_writable=True - ) - qmcv1_fileobj.write(self.getvalue(nocryptlayer=True)) + fileobj = verify_fileobj(filething, 'binary', + verify_seekable=True, + verify_writable=True + ) + return operation(fileobj) + @classmethod + def new(cls, mask: BytesLike = None): + """Return an empty QMCv1 object. + + The first argument ``mask`` is optional; if provided, it is used as the master key. + """ + if mask is None: + mask = make_salt(128) + else: + mask = tobytes(mask) + return cls(Mask128(mask)) -class
QMCv2(CryptLayerWrappedIOSkel): + +class QMCv2(EncryptedBytesIOSkel): """基于 BytesIO 的 QMCv2 透明加密二进制流。 所有读写相关方法都会经过透明加密层处理: @@ -231,200 +498,255 @@ class QMCv2(CryptLayerWrappedIOSkel): 可绕过透明加密层,访问缓冲区内的原始加密数据。 如果你要新建一个 QMCv2 对象,不要直接调用 ``__init__()``,而是使用构造器方法 - ``QMCv2.new()`` 和 ``QMCv2.from_file()`` 新建或打开已有 QMCv2 文件, - 使用已有 QMCv2 对象的 ``self.to_file()`` 方法将其保存到文件。 + ``QMCv2.new()`` 和 ``QMCv2.open()`` 新建或打开已有 QMCv2 文件, + 使用已有 QMCv2 对象的 ``save()`` 方法将其保存到文件。 """ @property - def cipher(self) -> HardenedRC4 | Mask128: - return self._cipher + def acceptable_ciphers(self): + return [HardenedRC4, Mask128] - @property - def master_key(self) -> bytes: - if isinstance(self.cipher, HardenedRC4): - return self.cipher.master_key - elif isinstance(self.cipher, Mask128): - if self.cipher.original_master_key is None: - return self.cipher.mask128 - return self.cipher.original_master_key + def __init__(self, cipher: HardenedRC4 | Mask128, /, initial_bytes: BytesLike = b'') -> None: + """基于 BytesIO 的 QMCv2 透明加密二进制流。 - @property - def simple_key(self) -> bytes | None: - return self._simple_key + 所有读写相关方法都会经过透明加密层处理: + 读取时,返回解密后的数据;写入时,向缓冲区写入加密后的数据。 - @simple_key.setter - def simple_key(self, value: BytesLike) -> None: - self._simple_key = tobytes(value) + 调用读写相关方法时,附加参数 ``nocryptlayer=True`` + 可绕过透明加密层,访问缓冲区内的原始加密数据。 - @simple_key.deleter - def simple_key(self) -> None: - self._simple_key = None + 如果你要新建一个 QMCv2 对象,不要直接调用 ``__init__()``,而是使用构造器方法 + ``QMCv2.new()`` 和 ``QMCv2.open()`` 新建或打开已有 QMCv2 文件, + 使用已有 QMCv2 对象的 ``save()`` 方法将其保存到文件。 - @property - def mix_key1(self) -> bytes | None: - return self._mix_key1 + Args: + cipher: 要使用的 cipher,必须是一个 libtakiyasha.qmc.qmcdataciphers.Mask128/HardenedRC4 对象 + initial_bytes: 内置缓冲区的初始数据 + """ + super().__init__(cipher, initial_bytes) - @mix_key1.setter - def mix_key1(self, value: BytesLike) -> None: - self._mix_key1 = tobytes(value) + self._extra_info: QMCv2QTag | QMCv2STag | None = None - @mix_key1.deleter - def mix_key1(self) -> None: - 
self._mix_key1 = None + self._core_key_deprecated: bytes | None = None + self._garble_key1_deprecated: bytes | None = None + self._garble_key2_deprecated: bytes | None = None @property - def mix_key2(self) -> bytes | None: - return self._mix_key2 + def extra_info(self) -> QMCv2QTag | QMCv2STag | None: + """Extra information at the end of the source file (if any); depending on its type, either a QTag or an STag.""" + return self._extra_info + + @extra_info.setter + def extra_info(self, value: QMCv2QTag | QMCv2STag) -> None: + """Extra information at the end of the source file (if any); depending on its type, either a QTag or an STag.""" + if isinstance(value, (QMCv2QTag, QMCv2STag)): + self._extra_info = value + elif value is None: + raise TypeError( + f"None cannot be assigned to attribute 'extra_info'. " + f"Use `del self.extra_info` instead" + ) + else: + raise TypeError( + f"attribute 'extra_info' must be QMCv2QTag or QMCv2STag, not {repr(value)}" + ) - @mix_key2.setter - def mix_key2(self, value: BytesLike) -> None: - self._mix_key2 = tobytes(value) + @extra_info.deleter + def extra_info(self) -> None: + """Extra information at the end of the source file (if any); depending on its type, either a QTag or an STag.""" + self._extra_info = None - @mix_key2.deleter - def mix_key2(self) -> None: - self._mix_key2 = None + @property + def master_key(self) -> bytes | None: + if isinstance(self.cipher, Mask128): + ret = self.cipher.getkey('original') + if ret: + return ret + + return super().master_key @property - def song_id(self) -> int: - return self._song_id + def core_key(self) -> bytes | None: + """(Deprecated and will be removed in a later version.) - @song_id.setter - def song_id(self, value: IntegerLike) -> None: - self._song_id = toint_nofloat(value) + Core key, used to encrypt/decrypt the master key. - @song_id.deleter - def song_id(self) -> None: - self._song_id = 0 + ``QMCv2.from_file()`` sets this attribute when the object is created; ``QMCv2.open()`` does not. + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.core_key or {type(self).__name__}.simple_key ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage the core key yourself.'
+ ) + ) + return self._core_key_deprecated - @property - def song_mid(self) -> str: - return self._song_mid + @core_key.setter + def core_key(self, value: BytesLike) -> None: + """(Deprecated and will be removed in a later version.) - @song_mid.setter - def song_mid(self, value: str) -> None: - self._song_mid = str(value) + Core key, used to encrypt/decrypt the master key. - @song_mid.deleter - def song_mid(self) -> None: - self._song_mid = '0' * 14 + ``QMCv2.from_file()`` sets this attribute when the object is created; ``QMCv2.open()`` does not. + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.core_key or {type(self).__name__}.simple_key ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage the core key yourself.' + ) + ) + if value is None: + raise TypeError( + f"None cannot be assigned to attribute 'core_key'. " + f"Use `del self.core_key` instead" + ) + self._core_key_deprecated = tobytes(value) - @property - def unknown_value1(self) -> bytes: - return self._unknown_value1 + @core_key.deleter + def core_key(self) -> None: + """(Deprecated and will be removed in a later version.) - @unknown_value1.setter - def unknown_value1(self, value: BytesLike) -> None: - self._unknown_value1 = tobytes(value) + Core key, used to encrypt/decrypt the master key. + + ``QMCv2.from_file()`` sets this attribute when the object is created; ``QMCv2.open()`` does not. + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.core_key or {type(self).__name__}.simple_key ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage the core key yourself.'
+ ) + ) + self._core_key_deprecated = None - @unknown_value1.deleter - def unknown_value1(self) -> None: - self._unknown_value1 = b'2' + simple_key = core_key @property - def qtag(self) -> QMCv2QTag | None: - if self.simple_key is not None and len(self.master_key) in (256, 512): - return QMCv2QTag.new( - master_key=self.master_key, - simple_key=self.simple_key, - song_id=self.song_id, - unknown_value1=self.unknown_value1 + def garble_key1(self) -> bytes | None: + """(Deprecated and will be removed in a later version.) + + Garble key 1, used to encrypt/decrypt the master key. + + ``QMCv2.from_file()`` sets this attribute when the object is created; ``QMCv2.open()`` does not. + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.garble_key1 or {type(self).__name__}.mix_key1 ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage garble keys yourself.' ) + ) + return self._garble_key1_deprecated - @property - def stag(self) -> QMCv2STag: - return QMCv2STag( - song_id=self.song_id, - unknown_value1=self.unknown_value1, - song_mid=self.song_mid + @garble_key1.setter + def garble_key1(self, value: BytesLike) -> None: + """(Deprecated and will be removed in a later version.) + + Garble key 1, used to encrypt/decrypt the master key. + + ``QMCv2.from_file()`` sets this attribute when the object is created; ``QMCv2.open()`` does not. + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.garble_key1 or {type(self).__name__}.mix_key1 ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage garble keys yourself.' + ) ) + if value is None: + raise TypeError( + f"None cannot be assigned to attribute 'garble_key1'. 
" + f"Use `del self.garble_key1` instead" + ) + self._garble_key1_deprecated = tobytes(value) - def __init__(self, - cipher: HardenedRC4 | Mask128, /, - initial_bytes: BytesLike = b'', - simple_key: BytesLike = None, - mix_key1: BytesLike = None, - mix_key2: BytesLike = None, *, - song_id: IntegerLike = 0, - song_mid: str = '0' * 14, - unknown_value1: BytesLike = b'2' - ) -> None: - """A BytesIO-based QMCv2 binary stream with a transparent encryption layer. + @garble_key1.deleter + def garble_key1(self): + """(Deprecated and will be removed in a later version.) - All read/write methods pass through the transparent encryption layer: - reads return decrypted data; writes store encrypted data in the buffer. + Garble key 1, used to encrypt/decrypt the master key. - When calling read/write methods, the extra argument ``nocryptlayer=True`` - bypasses the transparent encryption layer, giving access to the raw encrypted data in the buffer. + ``QMCv2.from_file()`` sets this attribute when the object is created; ``QMCv2.open()`` does not. + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.garble_key1 or {type(self).__name__}.mix_key1 ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage garble keys yourself.' + ) + ) + self._garble_key1_deprecated = None - To create a QMCv2 object, do not call ``__init__()`` directly; instead use the constructor methods - ``QMCv2.new()`` and ``QMCv2.from_file()`` to create a new QMCv2 file or open an existing one, - and the ``self.to_file()`` method of an existing QMCv2 object to save it to a file. + mix_key1 = garble_key1 + + @property + def garble_key2(self) -> bytes | None: + """(Deprecated and will be removed in a later version.) + + Garble key 2, used to encrypt/decrypt the master key. + + ``QMCv2.from_file()`` sets this attribute when the object is created; ``QMCv2.open()`` does not. """ - super().__init__(cipher, initial_bytes) - if not isinstance(cipher, (HardenedRC4, Mask128)): - raise TypeError(f'unsupported Cipher: ' - f'supports ' - f'{Mask128.__module__}.{Mask128.__name__} and ' - f'{HardenedRC4.__module__}.{HardenedRC4.__name__}, ' - f'not {type(cipher).__name__}' - ) + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.garble_key2 or {type(self).__name__}.mix_key2 ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage garble keys yourself.'
+ ) + ) + return self._garble_key2_deprecated - if simple_key is None: - self._simple_key = None - else: - self._simple_key = tobytes(simple_key) - if mix_key1 is None: - self._mix_key1 = None - else: - self._mix_key1 = tobytes(mix_key1) - if mix_key2 is None: - self._mix_key2 = None - else: - self._mix_key2 = tobytes(mix_key2) - self._song_id = toint_nofloat(song_id) - self._song_mid = str(song_mid) - self._unknown_value1 = tobytes(unknown_value1) + @garble_key2.setter + def garble_key2(self, value: BytesLike) -> None: + """(Deprecated and will be removed in a later version.) - @classmethod - def new(cls, - cipher_type: Literal['mask', 'rc4'], /, - simple_key: BytesLike = None, - mix_key1: BytesLike = None, - mix_key2: BytesLike = None, *, - song_id: IntegerLike = 0, - song_mid: str = '0' * 14, - unknown_value1: BytesLike = b'2' - ) -> QMCv2: - """Create and return a brand-new empty QMCv2 object. - - The first positional argument ``cipher_type`` selects the encryption algorithm used by the transparent encryption layer of this QMCv2 object; - only ``'mask'`` and ``'rc4'`` are supported. - - The positional arguments ``simple_key``, ``mix_key1``, and ``mix_key2`` - are all optional, but any of them supplied here need not be supplied again when saving this QMCv2 object to a file. - - The keyword arguments ``song_id``, ``song_mid``, and ``unknown_value1`` are also optional. - These arguments are not important. + Garble key 2, used to encrypt/decrypt the master key. + + ``QMCv2.from_file()`` sets this attribute when the object is created; ``QMCv2.open()`` does not. """ - if cipher_type == 'mask': - cipher = Mask128.from_qmcv2_key256(make_random_ascii_string(256).encode('utf-8')) - elif cipher_type == 'rc4': - cipher = HardenedRC4(make_random_ascii_string(512).encode('utf-8')) - elif isinstance(cipher_type, str): - raise ValueError(f"first argument 'cipher_type' must be 'mask' or 'rc4', not {cipher_type}") - else: - raise TypeError(f"first argument 'cipher_type' must be str, " - f"not {type(cipher_type).__name__}" - ) + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.garble_key2 or {type(self).__name__}.mix_key2 ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage garble keys yourself.'
+ ) + ) + if value is None: + raise TypeError( + f"None cannot be assigned to attribute 'garble_key2'. " + f"Use `del self.garble_key2` instead" + ) + self._garble_key2_deprecated = tobytes(value) + + @garble_key2.deleter + def garble_key2(self): + """(已弃用,且将会在后续版本中删除。) + + 混淆密钥 2,用于加/解密主密钥。 + + ``QMCv2.from_file()`` 会在当前对象被创建时设置此属性;而 ``QMCv2.open()`` 则不会。 + """ + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.garble_key2 or {type(self).__name__}.mix_key2 ' + f'is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'You need to manage garble keys yourself.' + ) + ) + self._garble_key2_deprecated = None - return cls(cipher, - simple_key=simple_key, - mix_key1=mix_key1, - mix_key2=mix_key2, - song_id=song_id, - song_mid=song_mid, - unknown_value1=unknown_value1 - ) + mix_key2 = garble_key2 @classmethod def from_file(cls, @@ -433,8 +755,10 @@ def from_file(cls, mix_key1: BytesLike = None, mix_key2: BytesLike = None, *, master_key: BytesLike = None, - ) -> QMCv2: - """打开一个 QMCv2 文件或文件对象 ``qmcv2_filething``。 + ): + """(已弃用,且将会在后续版本中删除。请尽快使用 ``QMCv2.open()`` 代替。) + + 打开一个 QMCv2 文件或文件对象 ``qmcv2_filething``。 第一个位置参数 ``qmcv2_filething`` 可以是文件路径(``str``、``bytes`` 或任何拥有方法 ``__fspath__()`` 的对象)。``qmcv2_filething`` @@ -453,145 +777,195 @@ def from_file(cls, 以上特定条件中的必需参数,如果缺失,则会触发 ``ValueError``。 """ + warnings.warn( + DeprecationWarning( + f'{cls.__name__}.from_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {cls.__name__}.open() instead.' 
+ ) + ) + instance = cls.open(qmcv2_filething, + core_key=simple_key, + garble_key1=mix_key1, + garble_key2=mix_key2, + master_key=master_key + ) + instance._core_key_deprecated = tobytes(simple_key) + instance._garble_key1_deprecated = tobytes(mix_key1) + instance._garble_key2_deprecated = tobytes(mix_key2) - def operation(fileobj: IO[bytes]) -> QMCv2: - fileobj_endpos = fileobj.seek(0, 2) - fileobj.seek(-4, 2) - tail_data = fileobj.read(4) - - song_id = 0 - song_mid = '0' * 14 - unknown_value1 = b'2' + @classmethod + def open(cls, + filething_or_info: tuple[Path | IO[bytes], QMCv2FileInfo | None] | FilePath | IO[bytes], /, + core_key: BytesLike = None, + garble_key1: BytesLike = None, + garble_key2: BytesLike = None, + master_key: BytesLike = None, + encrypt_method: Literal['map', 'mask', 'rc4'] = None + ): + """打开一个 QMCv2 文件,并返回一个 ``QMCv2`` 对象。 + + 第一个位置参数 ``filething_or_info`` 需要是一个文件路径或文件对象。 + 可接受的文件路径类型包括:字符串、字节串、任何定义了 ``__fspath__()`` 方法的对象。 + 如果是文件对象,那么必须可读且可寻址(其 ``seekable()`` 方法返回 ``True``)。 + + ``filething_or_info`` 也可以接受 ``probe_qmc()`` 和 ``probe_qmcv2()`` 函数的返回值: + 一个包含两个元素的元组,第一个元素是源文件的路径或文件对象,第二个元素是源文件的信息。 + + 第二个参数 ``core_key`` 一般情况下是必需的,用于解密文件内嵌的主密钥。 + 例外:如果你提供了第五个参数 ``master_key``,那么它是可选的。 + + 第三、第四个参数 ``garble_key1`` 和 ``garble_key2``,仅在探测到文件内嵌的主密钥使用了 + V2 加密时是必需的。在其他情况下,它们的值会被忽略。 + + 第五个参数 ``master_key`` 可选,如果提供,将会被作为主密钥使用, + 而文件内置的主密钥会被忽略,``core_key``、``garble_key1`` 和 ``garble_key2`` + 也不再是必需参数。 + 例外:如果探测到文件未嵌入任何形式的密钥,那么此参数是必需的。 + + 第六个参数 ``encrypt_method`` 用于指定文件数据使用的加密方式,支持以下值: + + - ``'map'`` 或 ``'mask'`` - 掩码表(Mask128) + - ``'rc4'`` - 强化版 RC4(HardenedRC4) + - ``None`` - 不指定,由 ``probe_qmcv2()`` 自行探测 + + 此参数的设置会覆盖 ``probe_qmc()`` 或 ``probe_qmcv2()`` 的探测结果。 + + Args: + filething_or_info: 源文件的路径或文件对象,或者 probe_qmc() 和 probe_qmcv2() 的返回值 + core_key: 核心密钥,用于解密文件内嵌的主密钥 + garble_key1: 混淆密钥 1,用于解密使用 V2 加密的主密钥 + garble_key2: 混淆密钥 2,用于解密使用 V2 加密的主密钥 + master_key: 如果提供,将会被作为主密钥使用,而文件内置的主密钥会被忽略 + encrypt_method: 用于指定文件数据使用的加密方式,支持 'map'、'mask'、'rc4' 或 
None + Raises: + TypeError: 参数 core_key 和 master_key 都未提供,或者缺少 garble_key1 或 garble_key2 用于解密 V2 加密的主密钥 + ValueError: encrypt_method 的值不符合上述条件 + CrypterCreatingError: probe_qmcv2() 返回的文件信息中,master_key_encryption_ver 的值是当前不支持的 + """ + if core_key is not None: + core_key = tobytes(core_key) + if garble_key1 is not None: + garble_key1 = tobytes(garble_key1) + if garble_key2 is not None: + garble_key2 = tobytes(garble_key2) + if master_key is not None: + master_key = tobytes(master_key) + if encrypt_method is not None: + if encrypt_method not in ('map', 'mask', 'rc4'): + if isinstance(encrypt_method, str): + raise ValueError( + f"argument 'encrypt_method' must be 'map', 'mask', or 'rc4', " + f"not {repr(encrypt_method)}" + ) + else: + raise TypeError( + f"argument 'encrypt_method' must be str, " + f"not {type(encrypt_method).__name__}" + ) - if tail_data == b'STag': - if master_key is None: - raise ValueError("'master_key' is required for QMCv2 file with STag " - "audio data encryption/decryption" - ) - fileobj.seek(-8, 2) - stag_len = int.from_bytes(fileobj.read(4), 'big') - if stag_len + 8 > fileobj_endpos: - raise CrypterCreatingError( - f'{repr(qmcv2_filething)} is not a valid QMCv2 file: ' - f'QMCv2 STag data length ({stag_len + 8}) ' - f'is greater than file length ({fileobj_endpos})' + def operation(fd: IO[bytes]) -> cls: + cipher_data_len = fileinfo.cipher_data_len + extra_info = fileinfo.extra_info + master_key_encrypted = fileinfo.master_key_encrypted + master_key_encryption_ver = fileinfo.master_key_encryption_ver + cipher_ctor = fileinfo.cipher_ctor + + if master_key is None: + if isinstance(extra_info, QMCv2STag): + raise TypeError( + "argument 'master_key' is required to " + "QMCv2 file ends with STag" ) - audio_encrypted_len = fileobj.seek(-(stag_len + 8), 2) - stag = QMCv2STag.from_bytes(fileobj.read(stag_len)) - song_id = stag.song_id - song_mid = stag.song_mid - unknown_value1 = stag.unknown_value1 - target_master_key = master_key - fileobj.seek(0, 0) 
- initial_bytes = fileobj.read(audio_encrypted_len) - else: - if simple_key is None and master_key is None: - raise ValueError("'simple_key' is required for QMCv2 file master key decryption") - if tail_data == b'QTag': - fileobj.seek(-8, 2) - qtag_len = int.from_bytes(fileobj.read(4), 'big') - if qtag_len + 8 > fileobj_endpos: - raise CrypterCreatingError( - f'{repr(qmcv2_filething)} is not a valid QMCv2 file: ' - f'QMCv2 QTag data length ({qtag_len + 8}) ' - f'is greater than file length ({fileobj_endpos})' + if core_key is None: + raise TypeError( + "argument 'core_key' is required to " + "decrypt the protected master key" + ) + if master_key_encryption_ver == 1: + target_master_key = QMCv2KeyEncryptV1(core_key).decrypt( + master_key_encrypted + ) + elif master_key_encryption_ver == 2: + if garble_key1 is None and garble_key2 is None: + raise TypeError( + "arguments 'garble_key1' and 'garble_key2' are required to " + "decrypt the QMCv2 Key Encryption V2 protected master key" ) - audio_encrypted_len = fileobj.seek(-(qtag_len + 8), 2) - qtag = QMCv2QTag.from_bytes(fileobj.read(qtag_len)) - master_key_encrypted_b64encoded = qtag.master_key_encrypted_b64encoded - song_id = qtag.song_id - unknown_value1 = qtag.unknown_value1 - target_master_key = master_key - if target_master_key is None: - target_master_key = QMCv2KeyEncryptV1(simple_key).decrypt( - b64decode(master_key_encrypted_b64encoded) + elif garble_key1 is None: + raise TypeError( + "argument 'garble_key1' is required to " + "decrypt the QMCv2 Key Encryption V2 protected master key" ) - fileobj.seek(0, 0) - initial_bytes = fileobj.read(audio_encrypted_len) - else: - master_key_encrypted_b64encoded_len = int.from_bytes(tail_data, 'little') - if master_key_encrypted_b64encoded_len + 4 > fileobj_endpos: - raise CrypterCreatingError( - f'{repr(qmcv2_filething)} is not a valid QMCv2 file: ' - f'QMCv2 QTag data length ({master_key_encrypted_b64encoded_len + 4}) ' - f'is greater than file length 
({fileobj_endpos})' + 
elif garble_key2 is None: + raise TypeError( + "argument 'garble_key2' is required to " + "decrypt the QMCv2 Key Encryption V2 protected master key" ) - audio_encrypted_len = fileobj.seek(-(master_key_encrypted_b64encoded_len + 4), 2) - master_key_encrypted_b64encoded = fileobj.read(master_key_encrypted_b64encoded_len) - target_master_key = master_key - if target_master_key is None: - master_key_encrypted = b64decode(master_key_encrypted_b64encoded, validate=False) - if master_key_encrypted.startswith(b'QQMusic EncV2,Key:'): - missing_mix_key_msg = '{} is required for QMCv2 file ' \ - 'with master key encryption V2 decryption' - missed_mix_keys = None - if mix_key1 is None and mix_key2 is None: - missed_mix_keys = "'mix_key1' and 'mix_key2'" - elif mix_key1 is None: - missed_mix_keys = "'mix_key1'" - elif mix_key2 is None: - missed_mix_keys = "'mix_key2'" - if missed_mix_keys: - raise ValueError(missing_mix_key_msg.format(missed_mix_keys)) - target_master_key = QMCv2KeyEncryptV2( - simple_key, - mix_key1, - mix_key2 - ).decrypt(master_key_encrypted[18:]) - else: - target_master_key = QMCv2KeyEncryptV1(simple_key).decrypt(master_key_encrypted) - - fileobj.seek(0, 0) - initial_bytes = fileobj.read(audio_encrypted_len) - - if len(target_master_key) == 128: - cipher = Mask128(target_master_key) - warnings.warn(CrypterCreatingWarning( - 'maskey length is 128, most likely obtained by other means, ' - 'such as known plaintext attack. ' - 'Unable to recover and save the original master key from this key.' 
- ) - ) - elif len(target_master_key) == 256: - cipher = Mask128.from_qmcv2_key256(target_master_key) - elif len(target_master_key) == 512: - cipher = HardenedRC4(target_master_key) + target_master_key = QMCv2KeyEncryptV2( + core_key, garble_key1, garble_key2 + ).decrypt(master_key_encrypted) + else: + raise CrypterCreatingError( + f"unsupported master key encryption version {master_key_encryption_ver}" + ) else: - raise CrypterCreatingError( - 'invalid master key length: should be 128 (unrecommend), 256 or 512, ' - f'not {len(target_master_key)}' + target_master_key = master_key + cipher_ctor = _guess_cipher_ctor(target_master_key, is_encrypted=False) + + if encrypt_method in ('map', 'mask'): + cipher_ctor = Mask128 + elif encrypt_method == 'rc4': + cipher_ctor = HardenedRC4 + + if cipher_ctor is None: + raise TypeError( + "don't know which cipher to use, " + f"please try {cls.__name__}.open() again " + f"with argument 'encrypt_method'" ) - return cls(cipher, - initial_bytes, - simple_key=simple_key, - mix_key1=mix_key1, - mix_key2=mix_key2, - song_id=song_id, - song_mid=song_mid, - unknown_value1=unknown_value1 - ) - - if simple_key is not None: - simple_key = tobytes(simple_key) - if mix_key1 is not None: - mix_key1 = tobytes(mix_key1) - if mix_key2 is not None: - mix_key2 = tobytes(mix_key2) - if master_key is not None: - master_key = tobytes(master_key) + cipher = cipher_ctor(target_master_key) + fd.seek(0, 0) + inst = cls(cipher, fd.read(cipher_data_len)) + inst._extra_info = extra_info - if is_filepath(qmcv2_filething): - with open(qmcv2_filething, mode='rb') as qmcv2_fileobj: - instance = operation(qmcv2_fileobj) + return inst + + if isinstance(filething_or_info, tuple): + filething_or_info: tuple[Path | IO[bytes], QMCv2FileInfo | None] + if len(filething_or_info) != 2: + raise TypeError( + "first argument 'filething_or_info' must be a file path, a file object, " + "or a tuple of probe_qmc(), probe_qmcv2() returns" + ) + filething, fileinfo = 
filething_or_info else: - qmcv2_fileobj = verify_fileobj(qmcv2_filething, 'binary', - verify_readable=True, - verify_seekable=True - ) - instance = operation(qmcv2_fileobj) + filething, fileinfo = probe_qmcv2(filething_or_info) - instance._name = getattr(qmcv2_fileobj, 'name', None) + if fileinfo is None: + raise CrypterCreatingError( + f"{repr(filething)} is not a QMCv2 file" + ) + elif not isinstance(fileinfo, QMCv2FileInfo): + raise TypeError( + f"second element of the tuple must be QMCv2FileInfo or None, not {type(fileinfo).__name__}" + ) + + if isfilepath(filething): + with open(filething, mode='rb') as fileobj: + instance = operation(fileobj) + instance._name = Path(filething) + else: + fileobj = verify_fileobj(filething, 'binary', + verify_readable=True, + verify_seekable=True + ) + fileobj_sourcefile = getattr(fileobj, 'name', None) + instance = operation(fileobj) + + if fileobj_sourcefile is not None: + instance._name = Path(fileobj_sourcefile) return instance @@ -603,7 +977,9 @@ def to_file(self, mix_key1: BytesLike = None, mix_key2: BytesLike = None ) -> None: - """将当前 QMCv2 对象的内容保存到文件 ``qmcv2_filething``。 + """(已弃用,且将会在后续版本中删除。请尽快使用 ``QMCv2.save()`` 代替。) + + 将当前 QMCv2 对象的内容保存到文件 ``qmcv2_filething``。 第一个位置参数 ``qmcv2_filething`` 可以是文件路径(``str``、``bytes`` 或任何拥有方法 ``__fspath__()`` 的对象)。``qmcv2_filething`` @@ -627,103 +1003,205 @@ def to_file(self, 如果未提供这些参数,则会使用当前 QMCv2 对象的同名属性代替。 如果两者都为 ``None`` 或未提供,则会触发 ``CrypterSavingError``。 """ - - def operation(fileobj: IO[bytes]) -> None: - if tag_type == 'stag': - warnings.warn(CrypterSavingWarning( - 'the STag embedded in the file does not contain the master key. ' - 'You need to remember the master key yourself. ' - "Access the attribute 'self.master_key' to get the master key." + warnings.warn( + DeprecationWarning( + f'{type(self).__name__}.to_file() is deprecated, no longer used, ' + f'and may be removed in subsequent versions. ' + f'Use {type(self).__name__}.save() instead.' 
+ ) + ) + with_extra_info = False + if isinstance(self.extra_info, (QMCv2QTag, QMCv2STag)) and tag_type: + with_extra_info = True + if master_key_enc_ver == 1: + mix_key1 = None + mix_key2 = None + elif master_key_enc_ver == 2: + if mix_key1 is None: + mix_key1 = self.garble_key1 + if mix_key2 is None: + mix_key2 = self.garble_key2 + if mix_key1 is None and mix_key2 is None: + raise TypeError( + "arguments 'mix_key1' and 'mix_key2' are required to " + "encrypt the master key with QMCv2 Key Encryption V2" + ) + elif mix_key1 is None: + raise TypeError( + "argument 'mix_key1' is required to " + "encrypt the master key with QMCv2 Key Encryption V2" ) + elif mix_key2 is None: + raise TypeError( + "argument 'mix_key2' is required to " + "encrypt the master key with QMCv2 Key Encryption V2" ) - fileobj.write(self.getvalue(nocryptlayer=True)) - fileobj.write(self.stag.to_bytes(with_tail=True)) - elif tag_type == 'qtag': - if self.qtag is None: - raise CrypterCreatingError("unable to save the file: cannot generate QTag") - fileobj.write(self.getvalue(nocryptlayer=True)) - fileobj.write(self.qtag.to_bytes(with_tail=True)) - elif tag_type is None: - target_simple_key = simple_key - if target_simple_key is None: - if self.simple_key is None: - raise CrypterSavingError( - "argument 'simple_key' and attribute self.simple_key is not available, " - 'but it is required for the master key encryption' + else: + raise ValueError("argument 'master_key_enc_ver' must be 1 or 2, " + f"not {master_key_enc_ver}" + ) + if simple_key is None: + simple_key = self.core_key + return self.save(core_key=simple_key, + filething=qmcv2_filething, + garble_key1=mix_key1, + garble_key2=mix_key2, + with_extra_info=with_extra_info + ) + + def save(self, + core_key: BytesLike = None, + filething: FilePath | IO[bytes] = None, + garble_key1: BytesLike = None, + garble_key2: BytesLike = None, + with_extra_info: bool = False + ) -> None: + """将当前对象保存为一个新 QMCv2 文件。 + + 第一个参数 ``core_key`` 
一般是必需的,用于加密主密钥,以便嵌入到文件。 + 例外:参数 ``with_extra_info=True`` 且当前对象的属性 ``extra_info`` + 是一个 ``QMCv2STag`` 对象,此时它是可选的,其值会被忽略。 + + 第二个参数 ``filething`` 是可选的,如果提供此参数,需要是一个文件路径或文件对象。 + 可接受的文件路径类型包括:字符串、字节串、任何定义了 ``__fspath__()`` 方法的对象。 + 如果是文件对象,那么必须可读且可寻址(其 ``seekable()`` 方法返回 ``True``)。 + 如果未提供此参数,那么将会尝试使用当前对象的 ``source`` 属性;如果后者也不可用,则引发 + ``TypeError``。 + + 第三、第四个参数 ``garble_key1`` 和 ``garble_key2``,决定对主密钥进行加密的方法; + 如果提供,则需要两个一起提供,将会对主密钥采用 V2 加密;否则,对主密钥采用 V1 加密。 + 如果参数 ``with_extra_info=True`` 且当前对象的属性 ``extra_info`` + 是一个 ``QMCv2STag`` 对象,它们的值会被忽略。 + + 第五个参数 ``with_extra_info`` 如果为 ``True``,且当前对象的属性 ``extra_info`` 是 + ``QMCv2QTag`` 或 ``QMCv2STag`` 对象,那么这些对象将会被序列化后嵌入文件。 + + Args: + core_key: 核心密钥,用于加密主密钥,以便嵌入到文件 + filething: 目标文件的路径或文件对象 + garble_key1: 混淆密钥 1,用于使用 V2 加密方式加密主密钥 + garble_key2: 混淆密钥 2,用于使用 V2 加密方式加密主密钥 + with_extra_info: 是否在文件末尾添加额外信息(self.extra_info) + + Raises: + TypeError: 当前对象的属性 source 和参数 filething 都为空,无法保存文件;参数 core_key 和 master_key 都未提供,或者缺少 garble_key1 或 garble_key2 用于使用 V2 方式加密主密钥 + """ + if core_key is not None: + core_key = tobytes(core_key) + if garble_key1 is not None: + garble_key1 = tobytes(garble_key1) + if garble_key2 is not None: + garble_key2 = tobytes(garble_key2) + + def operation(fd: IO[bytes]) -> None: + fd.seek(0, 0) + extra_info = self.extra_info + + if with_extra_info: + if isinstance(extra_info, QMCv2STag): + warnings.warn( + CrypterSavingWarning( + "Extra info (self.extra_info) will be export to STag data, " + "which cannot save the master key. " + "So you should save the master key in other way. " + "Use 'self.master_key' to get it." ) - target_simple_key = self.simple_key - if len(self.master_key) == 128: - raise CrypterSavingError( - 'master key is not available: ' - 'maskey length is 128, most likely obtained by other means, ' - 'such as known plaintext attack. ' - 'Unable to recover the original master key from this key.' 
) - master_key = self.master_key - if master_key_enc_ver == 1: - master_key_encrypted = QMCv2KeyEncryptV1(target_simple_key).encrypt(master_key) - elif master_key_enc_ver == 2: - missing_mix_key_msg = '{names} not available, but {appell} required for ' \ - 'the master key encryption V2 encryption' - missed_mix_keys_appell = {} - target_mix_key1 = mix_key1 - if target_mix_key1 is None: - target_mix_key1 = self.mix_key1 - target_mix_key2 = mix_key2 - if target_mix_key2 is None: - target_mix_key2 = self.mix_key2 - if target_mix_key1 is None and target_mix_key2 is None: - missed_mix_keys_appell['names'] = \ - "argument 'mix_key1' and attribute self.mix_key1, " \ - "argument 'mix_key2' and attribute self.mix_key2 are" - missed_mix_keys_appell['appell'] = 'they are' - elif target_mix_key1 is None or target_mix_key2 is None: - missed_mix_keys_appell['appell'] = 'it is' - if target_mix_key1 is None: - missed_mix_keys_appell['names'] = \ - "argument 'mix_key1' and attribute self.mix_key1 is" - elif target_mix_key2 is None: - missed_mix_keys_appell['names'] = \ - "argument 'mix_key2' and attribute self.mix_key2 is" - print(missed_mix_keys_appell) - if missed_mix_keys_appell: - raise CrypterSavingError( - missing_mix_key_msg.format_map(missed_mix_keys_appell) - ) - master_key_encrypted = b'QQMusic EncV2,Key:' + QMCv2KeyEncryptV2( - target_simple_key, target_mix_key1, target_mix_key2 - ).encrypt(master_key) - else: - raise ValueError("argument 'master_key_enc_ver' must be 1 or 2, " - f"not {master_key_enc_ver}" - ) + tag_serialized = extra_info.dump() + fd.write(self.getvalue(nocryptlayer=True)) + fd.write(tag_serialized) + fd.write(len(tag_serialized).to_bytes(4, 'big')) + fd.write(b'STag') + + return + + master_key = self.master_key + if core_key is None: + raise TypeError( + "argument 'core_key' is required to encrypt the master key " + "before embed to file" + ) + if with_extra_info: + if isinstance(extra_info, QMCv2QTag): + master_key_encrypted = 
QMCv2KeyEncryptV1(core_key).encrypt(master_key) + master_key_encrypted_b64encoded = b64encode(master_key_encrypted) + tag_serialized = extra_info.dump(master_key_encrypted_b64encoded) + fd.write(self.getvalue(nocryptlayer=True)) + fd.write(tag_serialized) + fd.write(len(tag_serialized).to_bytes(4, 'big')) + fd.write(b'QTag') + + return + + if garble_key1 is None and garble_key2 is None: # QMCv2 KeyencV1 + master_key_encrypted = QMCv2KeyEncryptV1(core_key).encrypt(master_key) master_key_encrypted_b64encoded = b64encode(master_key_encrypted) - master_key_encrypted_b64encoded_len = len(master_key_encrypted_b64encoded) - fileobj.write(self.getvalue(nocryptlayer=True)) - fileobj.write(master_key_encrypted_b64encoded) - fileobj.write(master_key_encrypted_b64encoded_len.to_bytes(4, 'little')) - elif isinstance(tag_type, str): - raise ValueError("argument 'tag_type' must be 'qtag', 'stag' or None, " - f"not {tag_type}" - ) - else: - raise TypeError(f"argument 'tag_type' must be str or None, " - f"not {type(tag_type).__name__}" - ) - - master_key_enc_ver = toint_nofloat(master_key_enc_ver) - if simple_key is not None: - simple_key = tobytes(simple_key) - if mix_key1 is not None: - mix_key1 = tobytes(mix_key1) - if mix_key2 is not None: - mix_key2 = tobytes(mix_key2) - - if is_filepath(qmcv2_filething): - with open(qmcv2_filething, mode='wb') as qmcv2_fileobj: - operation(qmcv2_fileobj) + else: # QMCv2 KeyEncV2 + if garble_key1 is None: + raise TypeError( + "argument 'garble_key1' is required to encrypt the master key " + "with QMCv2 Key Encryption V2 before embed to file" + ) + if garble_key2 is None: + raise TypeError( + "argument 'garble_key2' is required to encrypt the master key " + "with QMCv2 Key Encryption V2 before embed to file" + ) + master_key_encrypted = QMCv2KeyEncryptV2( + core_key, garble_key1, garble_key2 + ).encrypt(master_key) + master_key_encrypted_b64encoded = b64encode( + b'QQMusic EncV2,Key:' + master_key_encrypted + ) + 
fd.write(self.getvalue(nocryptlayer=True)) + fd.write(master_key_encrypted_b64encoded) + fd.write(len(master_key_encrypted_b64encoded).to_bytes(4, 'little')) + + return + + if filething is None: + if self.source is None: + raise TypeError( + "attribute 'self.source' and argument 'filething' are empty, " + "don't know which file to save to" + ) + filething = self.source + + if isfilepath(filething): + with open(filething, mode='wb') as fileobj: + return operation(fileobj) else: - qmcv2_fileobj = verify_fileobj(qmcv2_filething, 'binary', - verify_writable=True - ) - operation(qmcv2_fileobj) + fileobj = verify_fileobj(filething, 'binary', + verify_seekable=True, + verify_writable=True + ) + return operation(fileobj) + + @classmethod + def new(cls, encrypt_method: Literal['map', 'mask', 'rc4'], /): + """返回一个空 QMCv2 对象。 + + 第一个位置参数 ``encrypt_method`` 是必需的,用于指示使用的加密方式,支持以下值: + + - ``'map'`` 或 ``'mask'`` - 掩码表(Mask128) + - ``'rc4'`` - 强化版 RC4(HardenedRC4) + + Raises: + ValueError: encrypt_method 的值不符合上述条件 + """ + if encrypt_method in ('map', 'mask'): + cipher = Mask128.from_qmcv2_key256(make_random_ascii_string(256).encode('ascii')) + elif encrypt_method == 'rc4': + cipher = HardenedRC4(make_random_ascii_string(512).encode('ascii')) + elif isinstance(encrypt_method, str): + raise ValueError( + f"argument 'encrypt_method' must be 'map', 'mask', or 'rc4', " + f"not {repr(encrypt_method)}" + ) + else: + raise TypeError( + f"argument 'encrypt_method' must be str, " + f"not {type(encrypt_method).__name__}" + ) + + return cls(cipher) diff --git a/src/libtakiyasha/qmc/qmcconsts.py b/src/libtakiyasha/qmc/qmcconsts.py index dbafe4d..245fa66 100644 --- a/src/libtakiyasha/qmc/qmcconsts.py +++ b/src/libtakiyasha/qmc/qmcconsts.py @@ -5,8 +5,12 @@ from pathlib import Path from typing import Final +from ..miscutils import BINARIES_ROOTDIR + __all__ = ['KEY256_MAPPING'] +MODULE_BINARIES_ROOTDIR = BINARIES_ROOTDIR / 'qmc' / Path(__file__).stem + # KEY256_MAPPING = [[]] * 256 # # for i in 
range(128): @@ -17,5 +21,5 @@ # KEY256_MAPPING[real_idx].append(i) # # KEY256_MAPPING 可使用以上代码生成 -with open(Path(__file__).parent / 'binaries/Key256MappingData', 'rb') as f: +with open(MODULE_BINARIES_ROOTDIR / 'Key256MappingData', 'rb') as f: KEY256_MAPPING: Final[list[list[int]]] = pickle.load(f) diff --git a/src/libtakiyasha/qmc/qmcdataciphers.py b/src/libtakiyasha/qmc/qmcdataciphers.py index 9ebf323..69e644a 100644 --- a/src/libtakiyasha/qmc/qmcdataciphers.py +++ b/src/libtakiyasha/qmc/qmcdataciphers.py @@ -2,28 +2,83 @@ from __future__ import annotations from functools import lru_cache -from typing import Generator +from typing import Generator, Literal from .qmcconsts import KEY256_MAPPING -from ..common import StreamCipherSkel +from ..prototypes import KeyStreamBasedStreamCipherSkel from ..typedefs import BytesLike, IntegerLike -from ..typeutils import CachedClassInstanceProperty, tobytes, toint_nofloat +from ..typeutils import CachedClassInstanceProperty, tobytes, toint -__all__ = [ - 'HardenedRC4', - 'Mask128' -] +class Mask128(KeyStreamBasedStreamCipherSkel): + def __init__(self, mask128: BytesLike, /): + self._mask128 = tobytes(mask128) + if len(self._mask128) != 128: + raise ValueError(f"invalid mask length: should be 128, got {len(self._mask128)}") -class Mask128(StreamCipherSkel): - @property - def original_master_key(self) -> bytes | None: - if hasattr(self, '_original_master_key'): - return self._original_master_key + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'master': + return self._mask128 + elif keyname == 'original': + return getattr(self, '_original_qmcv2_key256', None) - @property - def mask128(self) -> bytes: - return self._mask128 + @classmethod + def cls_keystream(cls, + mask128: BytesLike, + nbytes: IntegerLike, + offset: IntegerLike, / + ) -> Generator[int, None, None]: + mask = tobytes(mask128) + if len(mask) != 128: + raise ValueError(f"invalid mask length: should be 128, got {len(mask)}") + nbytes = 
toint(nbytes) + offset = toint(offset) + if offset < 0: + raise ValueError("third argument 'offset' must be a non-negative integer") + if nbytes < 0: + raise ValueError("second argument 'nbytes' must be a non-negative integer") + + firstblk_data = mask * 256 # 前 32768 字节 + secondblk_data = firstblk_data[1:-1] # 第 32769 至 65535 字节 + startblk_data = firstblk_data + secondblk_data # 初始块:前 65535 字节 + startblk_len = len(startblk_data) + commonblk_data = firstblk_data[:-1] # 普通块:第 65536 字节往后每一个 32767 大小的块 + commonblk_len = len(commonblk_data) + + if 0 <= offset < startblk_len: + max_target_in_startblk_len = startblk_len - offset + target_in_commonblk_len = nbytes - max_target_in_startblk_len + target_in_startblk_len = min(nbytes, max_target_in_startblk_len) + yield from startblk_data[offset:offset + target_in_startblk_len] + if target_in_commonblk_len <= 0: + return + else: + offset = 0 + else: + offset -= startblk_len + target_in_commonblk_len = nbytes + + target_offset_in_commonblk = offset % commonblk_len + if target_offset_in_commonblk == 0: + target_before_commonblk_area_len = 0 + else: + target_before_commonblk_area_len = commonblk_len - target_offset_in_commonblk + yield from commonblk_data[target_offset_in_commonblk:target_offset_in_commonblk + target_before_commonblk_area_len] + target_in_commonblk_len -= target_before_commonblk_area_len + + target_overrided_whole_commonblk_count = target_in_commonblk_len // commonblk_len + target_after_commonblk_area_len = target_in_commonblk_len % commonblk_len + + for _ in range(target_overrided_whole_commonblk_count): + yield from commonblk_data + yield from commonblk_data[:target_after_commonblk_area_len] + + def keystream(self, + operation: Literal['encrypt', 'decrypt'], + nbytes: IntegerLike, + offset: IntegerLike, / + ) -> Generator[int, None, None]: + yield from self.cls_keystream(self._mask128, nbytes, offset) @classmethod def from_qmcv1_mask44(cls, mask44: BytesLike) -> Mask128: @@ -74,74 +129,36 @@ def 
from_qmcv2_key256(cls, key256: BytesLike) -> Mask128: mask128[idx128] = ((value << rotate) % 256) | ((value >> rotate) % 256) instance = cls(mask128) - instance._original_master_key = key256 + instance._original_qmcv2_key256 = key256 return instance - def __init__(self, mask128: BytesLike, /): - self._mask128 = tobytes(mask128) - if len(self._mask128) != 128: - raise ValueError(f"invalid mask length: should be 128, got {len(self._mask128)}") - - @classmethod - def cls_keystream(cls, - offset: IntegerLike, - length: IntegerLike, /, - mask128: BytesLike - ) -> Generator[int, None, None]: - mask128: bytes = tobytes(mask128) - if len(mask128) != 128: - raise ValueError(f"invalid mask length (should be 128, got {len(mask128)})") - firstblk_data = mask128 * 256 # 前 32768 字节 - secondblk_data = firstblk_data[1:-1] # 第 32769 至 65535 字节 - startblk_data = firstblk_data + secondblk_data # 初始块:前 65535 字节 - startblk_len = len(startblk_data) - commonblk_data = firstblk_data[:-1] # 普通块:第 65536 字节往后每一个 32767 大小的块 - commonblk_len = len(commonblk_data) - offset = toint_nofloat(offset) - length = toint_nofloat(length) - if offset < 0: - raise ValueError("first argument 'offset' must be a non-negative integer") - if length < 0: - raise ValueError("second argument 'length' must be a non-negative integer") - if 0 <= offset < startblk_len: - max_target_in_startblk_len = startblk_len - offset - target_in_commonblk_len = length - max_target_in_startblk_len - target_in_startblk_len = min(length, max_target_in_startblk_len) - yield from startblk_data[offset:offset + target_in_startblk_len] - if target_in_commonblk_len <= 0: - return - else: - offset = 0 - else: - offset -= startblk_len - target_in_commonblk_len = length +class HardenedRC4(KeyStreamBasedStreamCipherSkel): + def __init__(self, key: BytesLike, /): + self._key = tobytes(key) - target_offset_in_commonblk = offset % commonblk_len - if target_offset_in_commonblk == 0: - target_before_commonblk_area_len = 0 - else: - 
target_before_commonblk_area_len = commonblk_len - target_offset_in_commonblk - yield from commonblk_data[target_offset_in_commonblk:target_offset_in_commonblk + target_before_commonblk_area_len] - target_in_commonblk_len -= target_before_commonblk_area_len - - target_overrided_whole_commonblk_count = target_in_commonblk_len // commonblk_len - target_after_commonblk_area_len = target_in_commonblk_len % commonblk_len - - for _ in range(target_overrided_whole_commonblk_count): - yield from commonblk_data - yield from commonblk_data[:target_after_commonblk_area_len] + key_len = len(self._key) + if key_len == 0: + raise ValueError("first argument 'key' cannot be an empty bytestring") + if b'\x00' in self._key: + raise ValueError("first argument 'key' cannot contain null bytes") - def keystream(self, offset: IntegerLike, length: IntegerLike, /) -> Generator[int, None, None]: - yield from self.cls_keystream(offset, length, mask128=self._mask128) + self._box = bytearray(i % 256 for i in range(key_len)) + j = 0 + for i in range(key_len): + j = (j + self._box[i] + self._key[i]) % key_len + self._box[i], self._box[j] = self._box[j], self._box[i] + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'master': + return self._key -class HardenedRC4(StreamCipherSkel): @property + @lru_cache def hash_base(self) -> int: base = 1 - key = self._master_key + key = self._key for i in range(len(key)): v: int = key[i] @@ -161,29 +178,10 @@ def first_segment_size(self) -> int: def common_segment_size(self) -> int: return 5120 - @property - def master_key(self) -> bytes: - return self._master_key - - def __init__(self, master_key: BytesLike, /): - self._master_key = tobytes(master_key) - - key_len = len(self._master_key) - if key_len == 0: - raise ValueError("first argument 'master_key' cannot be an empty bytestring") - if b'\x00' in self._master_key: - raise ValueError("b'\\x00' cannot appear in the first argument 'master_key'") - - self._box = bytearray(i % 256 
for i in range(key_len)) - j = 0 - for i in range(key_len): - j = (j + self._box[i] + self._master_key[i % key_len]) % key_len - self._box[i], self._box[j] = self._box[j], self._box[i] - @lru_cache(maxsize=65536) def _get_segment_skip(self, value: int) -> int: - key = self._master_key - key_len = len(self._master_key) + key = self._key + key_len = len(self._key) seed = key[value % key_len] idx = int(self.hash_base / ((value + 1) * seed) * 100) @@ -194,7 +192,7 @@ def _yield_first_segment_keystream(self, blksize: int, offset: int ) -> Generator[int, None, None]: - key = self._master_key + key = self._key for i in range(offset, offset + blksize): yield key[self._get_segment_skip(i)] @@ -202,7 +200,7 @@ def _yield_common_segment_keystream(self, blksize: int, offset: int ) -> Generator[int, None, None]: - key_len = len(self._master_key) + key_len = len(self._key) box = self._box.copy() j, k = 0, 0 @@ -216,14 +214,18 @@ def _yield_common_segment_keystream(self, if i >= 0: yield box[(box[j] + box[k]) % key_len] - def keystream(self, offset: IntegerLike, length: IntegerLike, /) -> Generator[int, None, None]: - pending = toint_nofloat(length) + def keystream(self, + operation: Literal['encrypt', 'decrypt'], + nbytes: IntegerLike, + offset: IntegerLike, / + ) -> Generator[int, None, None]: + pending = toint(nbytes) done = 0 - offset = toint_nofloat(offset) + offset = toint(offset) if offset < 0: - raise ValueError("first argument 'offset' must be a non-negative integer") + raise ValueError("third argument 'offset' must be a non-negative integer") if pending < 0: - raise ValueError("second argument 'length' must be a non-negative integer") + raise ValueError("second argument 'nbytes' must be a non-negative integer") def mark(p: int) -> None: nonlocal pending, done, offset @@ -251,4 +253,4 @@ def mark(p: int) -> None: mark(self.common_segment_size) if pending > 0: - yield from self._yield_common_segment_keystream(length - done, offset) + yield from 
self._yield_common_segment_keystream(nbytes - done, offset) diff --git a/src/libtakiyasha/qmc/qmckeyciphers.py b/src/libtakiyasha/qmc/qmckeyciphers.py index 455cf9d..7174708 100644 --- a/src/libtakiyasha/qmc/qmckeyciphers.py +++ b/src/libtakiyasha/qmc/qmckeyciphers.py @@ -4,40 +4,40 @@ from base64 import b64decode, b64encode from math import tan -from ..common import CipherSkel from ..exceptions import CipherDecryptingError, CipherEncryptingError +from ..prototypes import CipherSkel from ..stdciphers import TarsCppTCTEAWithModeCBC from ..typedefs import BytesLike from ..typeutils import tobytes __all__ = [ - 'make_simple_key', + 'make_core_key', 'QMCv2KeyEncryptV1', 'QMCv2KeyEncryptV2' ] -def make_simple_key(salt: int, length: int) -> bytes: +def make_core_key(salt: int, length: int) -> bytes: return bytes(int(abs(tan(salt + _ * 0.1) * 100)) for _ in range(length)) class QMCv2KeyEncryptV1(CipherSkel): - @property - def simple_key(self) -> bytes: - return self._simple_key + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'master': + return self._core_key - def __init__(self, simple_key: BytesLike, /): - self._simple_key = tobytes(simple_key) + def __init__(self, key: BytesLike, /): + self._core_key = tobytes(key) self._half_of_keysize = TarsCppTCTEAWithModeCBC.master_key_size // 2 - if len(self._simple_key) != self._half_of_keysize: - raise ValueError(f"invalid length of simple key: " - f"should be {self._half_of_keysize}, not {len(self._simple_key)}" + if len(self._core_key) != self._half_of_keysize: + raise ValueError(f"invalid length of core key: " + f"should be {self._half_of_keysize}, not {len(self._core_key)}" ) self._key_buf = bytearray(TarsCppTCTEAWithModeCBC.master_key_size) for idx in range(TarsCppTCTEAWithModeCBC.blocksize): - self._key_buf[idx << 1] = self._simple_key[idx] + self._key_buf[idx << 1] = self._core_key[idx] def encrypt(self, plaindata: BytesLike, /) -> bytes: plaindata = tobytes(plaindata) @@ -76,26 +76,26 @@ def 
decrypt(self, cipherdata: BytesLike, /) -> bytes: class QMCv2KeyEncryptV2(QMCv2KeyEncryptV1): - @property - def mix_key1(self) -> bytes: - return self._mix_key1 - - @property - def mix_key2(self) -> bytes: - return self._mix_key2 - - def __init__(self, simple_key: BytesLike, mix_key1: BytesLike, mix_key2: BytesLike, /): - self._mix_key1 = tobytes(mix_key1) - self._mix_key2 = tobytes(mix_key2) - - self._encrypt_stage1_decrypt_stage2_tea_cipher = TarsCppTCTEAWithModeCBC(self._mix_key2, + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'master': + return self._core_key + elif keyname == 'garble1': + return self._garble_key1 + elif keyname == 'garble2': + return self._garble_key2 + + def __init__(self, key: BytesLike, garble_key1: BytesLike, garble_key2: BytesLike, /): + self._garble_key1 = tobytes(garble_key1) + self._garble_key2 = tobytes(garble_key2) + + self._encrypt_stage1_decrypt_stage2_tea_cipher = TarsCppTCTEAWithModeCBC(self._garble_key2, rounds=32 ) - self._encrypt_stage2_decrypt_stage1_tea_cipher = TarsCppTCTEAWithModeCBC(self._mix_key1, + self._encrypt_stage2_decrypt_stage1_tea_cipher = TarsCppTCTEAWithModeCBC(self._garble_key1, rounds=32 ) - super().__init__(simple_key) + super().__init__(key) def encrypt(self, plaindata: BytesLike, /) -> bytes: plaindata = tobytes(plaindata) @@ -104,11 +104,15 @@ def encrypt(self, plaindata: BytesLike, /) -> bytes: qmcv2_key_encv1_key_encrypted_b64encoded = b64encode(qmcv2_key_encv1_key_encrypted) try: - encrypt_stage1 = self._encrypt_stage1_decrypt_stage2_tea_cipher.encrypt(qmcv2_key_encv1_key_encrypted_b64encoded) + encrypt_stage1 = self._encrypt_stage1_decrypt_stage2_tea_cipher.encrypt( + qmcv2_key_encv1_key_encrypted_b64encoded + ) except Exception as exc: raise CipherEncryptingError('QMCv2 key encrypt v2 stage 1 key encrypt failed') from exc try: - encrypt_stage2 = self._encrypt_stage2_decrypt_stage1_tea_cipher.encrypt(encrypt_stage1) + encrypt_stage2 = 
self._encrypt_stage2_decrypt_stage1_tea_cipher.encrypt( + encrypt_stage1 + ) except Exception as exc: raise CipherEncryptingError('QMCv2 key encrypt v2 stage 2 key encrypt failed') from exc @@ -119,11 +123,15 @@ def decrypt(self, cipherdata: BytesLike, /) -> bytes: # cipherdata should be the result of b64decode with the first 18 characters removed try: - decrypt_stage1: bytes = self._encrypt_stage2_decrypt_stage1_tea_cipher.decrypt(cipherdata, zero_check=True) + decrypt_stage1: bytes = self._encrypt_stage2_decrypt_stage1_tea_cipher.decrypt( + cipherdata, zero_check=True + ) except Exception as exc: raise CipherDecryptingError('QMCv2 key encrypt v2 stage 1 key decrypt failed') from exc try: - decrypt_stage2: bytes = self._encrypt_stage1_decrypt_stage2_tea_cipher.decrypt(decrypt_stage1, zero_check=True) # this is in fact the QMCv2 Key Encrypt V1 key + decrypt_stage2: bytes = self._encrypt_stage1_decrypt_stage2_tea_cipher.decrypt( + decrypt_stage1, zero_check=True + ) # this is in fact the QMCv2 Key Encrypt V1 key except Exception as exc: raise CipherDecryptingError('QMCv2 key encrypt v2 stage 2 key decrypt failed') from exc diff --git a/src/libtakiyasha/stdciphers.py b/src/libtakiyasha/stdciphers.py index 4a3f38a..8422a65 100644 --- a/src/libtakiyasha/stdciphers.py +++ b/src/libtakiyasha/stdciphers.py @@ -8,16 +8,16 @@ import io except ImportError: import _pyio as io -from typing import Generator +from typing import Generator, Literal from pyaes import AESModeOfOperationECB from pyaes.util import append_PKCS7_padding, strip_PKCS7_padding -from .common import StreamCipherSkel, CipherSkel +from .prototypes import KeyStreamBasedStreamCipherSkel, CipherSkel from .exceptions import CipherDecryptingError from .typedefs import IntegerLike, BytesLike from .miscutils import bytestrxor -from .typeutils import CachedClassInstanceProperty, tobytes, toint_nofloat +from .typeutils import CachedClassInstanceProperty, tobytes, toint __all__ = [ 'StreamedAESWithModeECB', @@ -34,9 +34,9 @@ class StreamedAESWithModeECB(CipherSkel): def 
blocksize(self) -> int: return 16 - @property - def master_key(self) -> bytes: - return self._key + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'master': + return self._key def __init__(self, key: BytesLike, /) -> None: self._key = tobytes(key) @@ -59,10 +59,9 @@ class TEAWithModeECB(CipherSkel): def blocksize(self) -> int: return 16 - @property - def master_key(self) -> bytes: - """The main key.""" - return self._key + def getkey(self, keyname: str = 'master') -> bytes | None: + if keyname == 'master': + return self._key def __init__(self, key: BytesLike, @@ -71,8 +70,8 @@ def __init__(self, magic_number: IntegerLike = 0x9e3779b9 ) -> None: self._key = tobytes(key) - self._rounds = toint_nofloat(rounds) - self._delta = toint_nofloat(magic_number) + self._rounds = toint(rounds) + self._delta = toint(magic_number) if len(self._key) != self.blocksize: raise ValueError(f"invalid key length: should be {self.blocksize}, not {len(self._key)}") @@ -91,7 +90,8 @@ def transvalues(cls, data: bytes, key: bytes) -> tuple[int, int, int, int, int, return v0, v1, k0, k1, k2, k3 def encrypt(self, plaindata: BytesLike, /) -> bytes: - v0, v1, k0, k1, k2, k3 = self.transvalues(tobytes(plaindata), self.master_key) + master_key = self.getkey('master') + v0, v1, k0, k1, k2, k3 = self.transvalues(tobytes(plaindata), master_key) delta = self._delta rounds = self._rounds @@ -108,7 +108,8 @@ def encrypt(self, plaindata: BytesLike, /) -> bytes: return v0.to_bytes(4, 'big') + v1.to_bytes(4, 'big') def decrypt(self, cipherdata: BytesLike, /) -> bytes: - v0, v1, k0, k1, k2, k3 = self.transvalues(tobytes(cipherdata), self.master_key) + master_key = self.getkey('master') + v0, v1, k0, k1, k2, k3 = self.transvalues(tobytes(cipherdata), master_key) delta = self._delta rounds = self._rounds @@ -126,6 +127,10 @@ def decrypt(self, cipherdata: BytesLike, /) -> bytes: class TarsCppTCTEAWithModeCBC(CipherSkel): + def getkey(self, keyname: str = 'master') -> bytes | None: + if 
keyname == 'master': + return self._lower_level_tea_cipher.getkey('master') + @CachedClassInstanceProperty def blocksize(self) -> int: return 8 @@ -142,11 +147,6 @@ def salt_len(self) -> int: def zero_len(self) -> int: return 7 - @property - def master_key(self) -> bytes: - """The main key.""" - return self._lower_level_tea_cipher.master_key - @property def lower_level_cipher(self) -> TEAWithModeECB: """The lower-level cipher in use.""" @@ -166,8 +166,8 @@ def __init__(self, magic_number: the magic number used for encryption/decryption """ key = tobytes(key) - rounds = toint_nofloat(rounds) - magic_number = toint_nofloat(magic_number) + rounds = toint(rounds) + magic_number = toint(magic_number) if len(key) != self.master_key_size: raise ValueError(f"invalid key length {len(key)}: " f"should be {self.master_key_size}, not {len(key)}" @@ -353,11 +353,7 @@ def crypt_block() -> None: return bytes(out_buf) -class ARC4(StreamCipherSkel): - @property - def master_key(self) -> bytes: - return self._key - +class ARC4(KeyStreamBasedStreamCipherSkel): def __init__(self, key: BytesLike, /) -> None: """A standard implementation of the RC4 encryption algorithm. @@ -390,13 +386,21 @@ def __init__(self, key: BytesLike, /) -> None: self._meta_keystream = bytes(meta_keystream) - def keystream(self, offset: IntegerLike, length: IntegerLike, /) -> Generator[int, None, None]: - offset = toint_nofloat(offset) - length = toint_nofloat(length) + def getkey(self, keyname: str = 'master') -> bytes: + if keyname == 'master': + return self._key + + def keystream(self, + operation: Literal['encrypt', 'decrypt'], + nbytes: IntegerLike, + offset: IntegerLike, / + ) -> Generator[int, None, None]: + offset = toint(offset) + nbytes = toint(nbytes) if offset < 0: - raise ValueError("first argument 'offset' must be a non-negative integer") - if length < 0: - raise ValueError("second argument 'length' must be a non-negative integer") + raise ValueError("third argument 'offset' must be a non-negative integer") + if nbytes < 0: + raise ValueError("second argument 'nbytes' must be a non-negative integer") 
- for i in range(offset, offset + length): + for i in range(offset, offset + nbytes): yield self._meta_keystream[i % 256] diff --git a/src/libtakiyasha/typedefs.py b/src/libtakiyasha/typedefs.py index 96c7982..60136a5 100644 --- a/src/libtakiyasha/typedefs.py +++ b/src/libtakiyasha/typedefs.py @@ -4,7 +4,7 @@ import array import mmap from os import PathLike -from typing import ByteString, Iterable, Protocol, Sequence, SupportsBytes, SupportsIndex, SupportsInt, TypeVar, Union, runtime_checkable +from typing import ByteString, Iterable, Iterator, Literal, Protocol, Sequence, SupportsBytes, SupportsIndex, SupportsInt, TypeVar, Union, runtime_checkable __all__ = [ 'T', @@ -20,6 +20,7 @@ 'FilePath', 'CipherProto', 'StreamCipherProto', + 'KeyStreamBasedStreamCipherProto', 'StreamCipherBasedCryptedIOProto' ] @@ -41,6 +42,9 @@ @runtime_checkable class CipherProto(Protocol): + def getkey(self, keyname: str = 'master') -> bytes | None: + raise NotImplementedError + def encrypt(self, plaindata: BytesLike, /) -> bytes: raise NotImplementedError @@ -50,7 +54,26 @@ def decrypt(self, cipherdata: BytesLike, /) -> bytes: @runtime_checkable class StreamCipherProto(Protocol): - def keystream(self, offset: IntegerLike, length: IntegerLike, /) -> Iterable[int]: + def getkey(self, keyname: str = 'master') -> bytes | None: + raise NotImplementedError + + def encrypt(self, plaindata: BytesLike, offset: IntegerLike = 0, /) -> bytes: + raise NotImplementedError + + def decrypt(self, cipherdata: BytesLike, offset: IntegerLike = 0, /) -> bytes: + raise NotImplementedError + + +@runtime_checkable +class KeyStreamBasedStreamCipherProto(Protocol): + def getkey(self, keyname: str = 'master') -> bytes | None: + raise NotImplementedError + + def keystream(self, + operation: Literal['encrypt', 'decrypt'], + nbytes: IntegerLike, + offset: IntegerLike = 0, / + ) -> Iterator[int]: raise NotImplementedError def encrypt(self, plaindata: BytesLike, offset: IntegerLike = 0, /) -> bytes: @@ -62,6 +85,14 @@ 
def decrypt(self, cipherdata: BytesLike, offset: IntegerLike = 0, /) -> bytes: @runtime_checkable class StreamCipherBasedCryptedIOProto(Protocol): + @property + def cipher(self) -> StreamCipherProto | KeyStreamBasedStreamCipherProto: + raise NotImplementedError + + @property + def master_key(self) -> bytes | None: + raise NotImplementedError + def read(self, size: IntegerLike = -1, /) -> bytes: raise NotImplementedError diff --git a/src/libtakiyasha/typeutils.py b/src/libtakiyasha/typeutils.py index 2ebe48e..037152e 100644 --- a/src/libtakiyasha/typeutils.py +++ b/src/libtakiyasha/typeutils.py @@ -11,8 +11,8 @@ 'CachedClassInstanceProperty', 'tobytes', 'tobytearray', - 'toint_nofloat', - 'is_filepath', + 'toint', + 'isfilepath', 'verify_fileobj' ] @@ -128,55 +128,57 @@ def __delete__(self, obj: T) -> None: super().__delete__(obj) -def tobytes(byteslike: BytesLike) -> bytes: - """Try to convert ``byteslike`` to ``bytes``. +def tobytes(byteslike: BytesLike, /) -> bytes: + """A wrapper around ``bytes()``. - Does not apply to objects of type ``int``; passing such a value raises ``TypeError``. + This function tries to convert ``byteslike`` to ``bytes``. + + Unlike ``bytes()``, this function cannot create a ``bytes`` object from a + length, i.e. it does not accept ``int`` objects as arguments. Passing one raises + ``TypeError``. This function also does not encode strings. """ - if isinstance(byteslike, int): - # guard against calls such as bytes(1000) + if isinstance(byteslike, int) and not hasattr(byteslike, '__bytes__'): + # guard against calls such as bytes(114514) raise TypeError(f"a bytes-like object is required, not '{type(byteslike).__name__}'") - elif isinstance(byteslike, bytes): - return byteslike else: return bytes(byteslike) -def tobytearray(byteslike: BytesLike) -> bytearray: - """Try to convert ``byteslike`` to ``bytearray``. +def tobytearray(byteslike: BytesLike, /) -> bytearray: + """A wrapper around ``bytearray()``. + + This function tries to convert ``byteslike`` to ``bytearray``. - Does not apply to objects of type ``int``; passing such a value raises ``TypeError``. + Unlike ``bytearray()``, this function cannot create a ``bytearray`` object from a + length, i.e. it does not accept ``int`` objects as arguments. Passing one raises + ``TypeError``. This function also does not encode strings. """ - if isinstance(byteslike, int): - 
# guard against calls such as bytearray(1000) + if isinstance(byteslike, int) and not hasattr(byteslike, '__bytes__'): + # guard against calls such as bytearray(114514) raise TypeError(f"a bytes-like object is required, not '{type(byteslike).__name__}'") - elif isinstance(byteslike, bytearray): - return byteslike else: return bytearray(byteslike) -def toint_nofloat(integerlike: IntegerLike) -> int: - """Try to convert ``integerlike`` to ``int``. +def toint(integerlike: IntegerLike, /) -> int: + """A wrapper around ``int()``. - Does not apply to objects of type ``float`` or objects that have a ``__float__`` attribute. - Passing such a value raises ``TypeError``. + Although ``int()`` can convert floats, this function does not accept them. Passing one raises + ``TypeError``. """ if isinstance(integerlike, float): raise TypeError(f"'{type(integerlike).__name__}' object cannot be interpreted as an integer") - elif isinstance(integerlike, int): - return integerlike else: return int(integerlike) -def is_filepath(obj) -> bool: +def isfilepath(obj, /) -> bool: """Determine whether the object ``obj`` can be treated as a file path. Only objects of type ``str`` or ``bytes``, or objects that have a ``__fspath__`` attribute, are treated as file paths. """ - return isinstance(obj, (str, bytes)) or hasattr(obj, '__fspath__') + return isinstance(obj, (str, bytes, bytearray)) or hasattr(obj, '__fspath__') def verify_fileobj(fileobj: IO[str | bytes],