feat(sources): Source bundles for debug files #154

Merged
merged 79 commits into from
Jul 10, 2019


@jan-auer jan-auer commented Jun 24, 2019

This PR contains two major additions to the debuginfo crate:

  • DebugSession::source_by_path now exposes source code contents that are stored within object files. However, this is not implemented for any of the existing object file formats yet. Whether an object file might contain sources at all is exposed via ObjectLike::has_source.

  • A new object file format, SourceBundle, allows for platform-independent bundling of source context. It is a ZIP archive containing all source files referenced by an object file. Currently, the list of files is derived by walking the debug information in the object file.

SourceBundle

Source bundles are ZIP files with a well-defined internal structure. A bundle starts with a binary header, followed by a ZIP stream:

static BUNDLE_MAGIC: [u8; 4] = *b"SYSB";

#[repr(C, packed)]
struct SourceBundleHeader {
    magic: [u8; 4], // Magic bytes header.
    version: u32,   // Version of the bundle.
}
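
To illustrate how a reader might validate this header before handing the rest of the data to a ZIP library, here is a minimal sketch. The helper name `split_header` is hypothetical, and the byte order of `version` is an assumption (little-endian); the description above does not specify it.

```rust
/// Magic bytes at the start of every source bundle (from the description).
const BUNDLE_MAGIC: [u8; 4] = *b"SYSB";

/// Size of the packed header: 4 magic bytes plus a 4-byte version.
const HEADER_SIZE: usize = 8;

/// Hypothetical helper: returns the bundle version and the remaining ZIP
/// stream, or `None` if the data does not start with a valid header.
///
/// ASSUMPTION: the version field is little-endian.
fn split_header(data: &[u8]) -> Option<(u32, &[u8])> {
    if data.len() < HEADER_SIZE || !data.starts_with(&BUNDLE_MAGIC) {
        return None;
    }
    let version = u32::from_le_bytes([data[4], data[5], data[6], data[7]]);
    Some((version, &data[HEADER_SIZE..]))
}
```

The returned slice starts right after the eight header bytes, so it can be wrapped in a cursor and passed to whichever ZIP reader is in use.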

The internal structure of the ZIP file is as follows:

manifest.json
files/
  index.js
  index.js.map

The manifest describes the entire contents of the bundle. Files that are not listed in the manifest are skipped. For example, the manifest of a bundle containing one Windows and one Unix source file would look like this:

{
    "files": {
        "files/C/Users/test/myapp/main.cpp": {
            "path": "C:\\Users\\test\\myapp\\main.cpp",
            "type": "source"
        },
        "files/usr/local/include/somelib/somelib.h": {
            "path": "/usr/local/include/somelib/somelib.h",
            "type": "source"
        }
    },
    // ... arbitrary other attributes
}
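
The relationship between the manifest keys and the original paths can be sketched as follows. This is a minimal illustration, not the actual implementation: `FileInfo`, `bundle_path` and `lookup` are hypothetical names, and the mapping rule (forward slashes, no drive colon, `files/` prefix) is inferred only from the two entries shown above.

```rust
use std::collections::HashMap;

/// One manifest entry, mirroring the `path` and `type` attributes above.
#[derive(Debug, Clone, PartialEq)]
struct FileInfo {
    path: String, // original path as referenced by the debug information
    ty: String,   // file type, e.g. "source"
}

/// Hypothetical helper: derives the ZIP entry name for an original path.
/// ASSUMPTION: rule inferred from the example manifest keys; it simply
/// drops every colon, which is only adequate for drive letters.
fn bundle_path(original: &str) -> String {
    let normalized = original.replace('\\', "/").replace(':', "");
    format!("files/{}", normalized.trim_start_matches('/'))
}

/// Looks up a ZIP entry in the manifest. Entries that are not listed
/// yield `None`, which corresponds to skipping the file.
fn lookup<'a>(
    manifest: &'a HashMap<String, FileInfo>,
    zip_name: &str,
) -> Option<&'a FileInfo> {
    manifest.get(zip_name)
}
```

With this rule, `C:\Users\test\myapp\main.cpp` maps to `files/C/Users/test/myapp/main.cpp`, matching the first key in the manifest.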

Path Canonicalization

Paths in the source bundle are stored in a canonical way. This makes it easier to match them from different platforms. This functionality is implemented in symbolic_common::path::clean_path. Roughly, it performs the following actions:

  • Replace backslashes with forward slashes
  • Remove . path components
  • Pop the parent directory for .. path components
  • Align to the root directory (e.g. drive letter, leading slash)
  • Remove double slashes

For instance, the path C:\Users\..\..\Program Files\\Hello becomes C:\Program Files\Hello.
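
The listed steps can be sketched roughly as below. This is a hypothetical re-implementation for illustration, not the code of symbolic_common::path::clean_path. It assumes that a path starting with a drive letter and a colon is absolute, and because it applies the first step literally, it returns forward slashes for Windows paths as well.

```rust
/// Hypothetical sketch of the listed steps; NOT the actual `clean_path`.
fn clean_path_sketch(path: &str) -> String {
    // Replace backslashes with forward slashes.
    let path = path.replace('\\', "/");

    // Split off the root: a drive letter such as `C:`, or a leading slash.
    // ASSUMPTION: a drive-letter prefix always denotes an absolute path.
    let bytes = path.as_bytes();
    let (root, rest) =
        if bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':' {
            (format!("{}/", &path[..2]), &path[2..])
        } else if path.starts_with('/') {
            ("/".to_string(), &path[..])
        } else {
            (String::new(), &path[..])
        };

    let mut parts: Vec<&str> = Vec::new();
    for component in rest.split('/') {
        match component {
            // Empty components come from double or leading slashes.
            "" | "." => {}
            ".." => {
                if matches!(parts.last(), Some(&last) if last != "..") {
                    // Pop the parent directory.
                    parts.pop();
                } else if root.is_empty() {
                    // Relative path: keep the `..` that cannot be resolved.
                    parts.push("..");
                }
                // Rooted path: stay aligned to the root directory.
            }
            other => parts.push(other),
        }
    }

    format!("{}{}", root, parts.join("/"))
}
```

Components are collected on a stack so that `..` can remove the most recent directory, while a non-empty root keeps the result from climbing above the drive letter or leading slash.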

TODO

  • Speed up file extraction. At the moment, all line records need to be walked. Instead, there could be something like DebugSession::file_names() -> Iterator<&str> that just emits all source file names.
  • Read source contents from the LLVM DWARF extension?
  • Read source contents from PDBs?
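
For the first item, the current approach of walking all line records amounts to something like the sketch below. `LineRecord` here is a hypothetical stand-in, not the debuginfo crate's type, and the free function `file_names` only mimics the shape of the proposed `DebugSession::file_names()`.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a line record; not the debuginfo crate's type.
struct LineRecord<'a> {
    file: &'a str, // source file this record belongs to
    line: u32,     // line number, irrelevant for file extraction
}

/// Walks every line record and emits each source file name once, in
/// sorted order. This is the slow path the TODO item wants to avoid: a
/// dedicated accessor could read the file table without touching lines.
fn file_names<'a>(records: &[LineRecord<'a>]) -> impl Iterator<Item = &'a str> {
    let unique: BTreeSet<&'a str> = records.iter().map(|r| r.file).collect();
    unique.into_iter()
}
```

The cost of this approach grows with the number of line records rather than the number of files, which is why a direct iterator over file names would be faster.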

@jan-auer jan-auer changed the title feat(sources): Initialize a sources module feat(sources): Source bundles for debug files Jun 24, 2019
jan-auer and others added 27 commits June 26, 2019 11:25
The ObjectLike trait should only contain methods that generally make
sense for all types of object files. The debug file name is a special
construct of PE files, and therefore only resides on PeObject. The same
is true for `BreakpadObject::name`, for instance.
* master:
  ref(unreal): Update unreal parsing (#152)
  fix(minidump): Do not emit default CFI for the .ra register (#157)
@jan-auer jan-auer marked this pull request as ready for review July 10, 2019 09:33
@jan-auer jan-auer merged commit 49839b2 into master Jul 10, 2019
@jan-auer jan-auer deleted the feat/sources branch July 10, 2019 09:41