| 1 | +--- |
| 2 | +title: ".cargo directories explained" |
| 3 | +date: 2023-01-24 |
| 4 | +author: Daniel Bevenius |
| 5 | +--- |
| 6 | + |
| 7 | +This post takes a closer look at the `.cargo` directory, with a focus on the |
| 8 | +`git` and `registry` directories. |
| 9 | + |
| 10 | +## .cargo/git directory |
| 11 | +If we list this directory we will find two subdirectories, namely |
| 12 | +`db` and `checkouts`. |
| 13 | + |
| 14 | +If we list the contents of either of those directories we will see that a |
| 15 | +hash is appended to every directory name. For example: |
| 16 | +``` |
| 17 | +~/.cargo/git/db/sigstore-rs-874f7064c0c10336/ |
| 18 | +``` |
| 19 | +This is a hash of the URL of the git repository. To verify this there is a |
| 20 | +[command line tool](https://github.com/trustification/source-distributed#print-git-project-hash) |
| 21 | +that can be used: |
| 22 | +```console |
| 23 | +$ cargo r --quiet --bin project-hash -- -u https://github.com/sigstore/sigstore-rs.git |
| 24 | +https://github.com/sigstore/sigstore-rs.git: 874f7064c0c10336 |
| 25 | +``` |
| 26 | +And we can check this hash against the hash above. |
| 27 | + |
| 28 | +The directories in `.cargo/git/db` are bare git repositories. The |
| 29 | +directories in `.cargo/git/checkouts` hold the checked-out sources, with one |
| 30 | +subdirectory for each revision (short hash) used by Cargo. |
| 31 | + |
| 32 | +## .cargo/registry directory |
| 33 | +The local dependencies from crates.io are located in `~/.cargo/registry`: |
| 34 | +```console |
| 35 | +$ ls ~/.cargo/registry/ |
| 36 | +cache CACHEDIR.TAG index src |
| 37 | +``` |
| 38 | +There can be multiple registries, each with its own directory under `index`: |
| 39 | +```console |
| 40 | +$ ls ~/.cargo/registry/index/ |
| 41 | +github.com-1ecc6299db9ec823 |
| 42 | +``` |
| 43 | +Now this was a little confusing to me as I did not expect a github.com directory |
| 44 | +here. It turns out that Cargo communicates with registries through a git |
| 45 | +repository which is called the index. For crates.io that repository is hosted on GitHub at |
| 46 | +https://github.com/rust-lang/crates.io-index. |
| 47 | + |
| 48 | +Let's clone this index and take a look at it: |
| 49 | +```console |
| 50 | +$ git clone https://github.com/rust-lang/crates.io-index.git |
| 51 | +$ cd crates.io-index/ |
| 52 | +``` |
| 53 | +If we list the contents of this directory we will see a number of subdirectories |
| 54 | +whose names are one or two characters long (letters, digits, or symbols). There is also |
| 55 | +a `config.json` file. |
| 56 | + |
| 57 | +Now, notice that this index does not contain any crates: |
| 58 | +```console |
| 59 | +$ find . -name '*.crate' | wc -l |
| 60 | +0 |
| 61 | +``` |
| 62 | +Instead, the index stores a list of versions for all known packages. Each |
| 63 | +crate has a single file, and that file contains one line (a JSON object) for each |
| 64 | +version. |
| 65 | + |
| 66 | +Let's take a look at the `drg` crate: |
| 67 | +```console |
| 68 | +$ cat 3/d/drg |
| 69 | +{"name":"drg","vers":"0.1.0","deps":[],"cksum":"c6bfa8b0b1bcd485d5f783e77faf13ba9453e7ab78991936e50d6cfdca23d647","features":{},"yanked":true} |
| 70 | +{"name":"drg","vers":"0.2.1","deps":[{"name":"anyhow","req":"^1.0","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"chrono","req":"^0.4","features":["serde"],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"clap","req":"^2.33.3","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"oauth2","req":"^3.0","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"qstring","req":"^0.7.2","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"reqwest","req":"^0.11","features":["blocking","json"],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"serde","req":"^1.0","features":["derive"],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"serde_json","req":"^1.0","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"strum","req":"^0.20","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"strum_macros","req":"^0.20","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"tempfile","req":"^3.2.0","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"tiny_http","req":"^0.8.0","features":[],"optional":false,"default_features":true,"target":null,"kind":"normal"},{"name":"url","req":"^2.2.1","features":["serde"],"optional":false,"default_features":true,"target":null,"kind":"normal"}],"cksum":"cfb067bfabd64c3b4732a3afd2b9a757a88120f6dac6400eae5b865732be0404","features":{},"yanked":false} |
| 71 | +... |
| 72 | +``` |
| 73 | +Notice that there are three directories named `1`, `2`, and `3` which are for |
| 74 | +crates that have one, two, or three characters in their name. The `3` directory |
| 75 | +is further split by the first character of the name, which is why `drg` above is found in `3/d/`. |
| 76 | + |
| 77 | +For crates with longer names, the first directory is named after the first two |
| 78 | +characters of the crate name, and the subdirectory under it is named after the |
| 79 | +following two characters. |
| 80 | +For example, to find the `drogue-device` crate we would look in |
| 81 | +`dr` as the first directory, and then `og` as the subdirectory: |
| 82 | +```console |
| 83 | +$ cat ./dr/og/drogue-device | jq |
| 84 | +{ |
| 85 | + "name": "drogue-device", |
| 86 | + "vers": "0.0.0", |
| 87 | + "deps": [], |
| 88 | + "cksum": "2acc1a9827b5cd933ebef9824415789012f5202b6bcacddaae2c214486ac996a", |
| 89 | + "features": {}, |
| 90 | + "yanked": false |
| 91 | +} |
| 92 | +``` |
| 93 | +When new versions of this crate are released a new entry/line in this file will |
| 94 | +be created. |
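
The directory layout described above can be sketched as a small shell function. This is only an illustration of the rules in this post, not Cargo's own code: the name `index_path` is hypothetical, and the lower-casing step assumes that file names in the index are stored in lower case.

```shell
# Hypothetical helper: print the path of a crate's file inside the index.
index_path() {
  # Assumption: file names in the index are lower case.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case ${#name} in
    0) return 1 ;;                        # no crate name given
    1) printf '1/%s\n' "$name" ;;         # one-character names live in 1/
    2) printf '2/%s\n' "$name" ;;         # two-character names live in 2/
    3) printf '3/%s/%s\n' \
         "$(printf '%s' "$name" | cut -c 1)" \
         "$name" ;;                       # 3/<first character>/<name>
    *) printf '%s/%s/%s\n' \
         "$(printf '%s' "$name" | cut -c 1-2)" \
         "$(printf '%s' "$name" | cut -c 3-4)" \
         "$name" ;;                       # <first two>/<next two>/<name>
  esac
}

index_path drg            # prints 3/d/drg
index_path drogue-device  # prints dr/og/drogue-device
```

Run from the root of the cloned index, `cat "$(index_path drogue-device)"` would then show the same file as the `cat` command above.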
| 95 | + |
| 96 | +Updating the index is fairly cheap: it amounts to a normal git fetch followed by a |
| 97 | +fast-forward. |
| 98 | + |
| 99 | +Alright, so we now have an efficient way to look up a crate version and its |
| 100 | +dependencies but we haven't seen any crates yet. This is where the file |
| 101 | +`config.json` comes into play: |
| 102 | +```console |
| 103 | +$ cat config.json |
| 104 | +{ |
| 105 | + "dl": "https://crates.io/api/v1/crates", |
| 106 | + "api": "https://crates.io" |
| 107 | +} |
| 108 | +``` |
| 109 | +`dl` stands for `download` and is the base URL from which a specific crate |
| 110 | +version can be downloaded. Cargo stores the result in the `.cargo/registry/cache` directory. |
| 111 | + |
| 112 | +We can do this manually using the value of `dl`: |
| 113 | +```console |
| 114 | +$ curl -v -L https://crates.io/api/v1/crates/drg/0.1.0/download --output drg-0.1.0.crate |
| 115 | +``` |
| 116 | +And we should then be able to list the content of this crate: |
| 117 | +```console |
| 118 | +$ tar tvf drg-0.1.0.crate |
| 119 | +-rw-r--r-- 0/0 74 2021-03-18 15:57 drg-0.1.0/.cargo_vcs_info.json |
| 120 | +-rw-r--r-- 110147/110147 8 2021-03-18 15:55 drg-0.1.0/.gitignore |
| 121 | +-rw-r--r-- 0/0 134 2021-03-18 15:57 drg-0.1.0/Cargo.lock |
| 122 | +-rw-r--r-- 0/0 754 2021-03-18 15:57 drg-0.1.0/Cargo.toml |
| 123 | +-rw-r--r-- 110147/110147 327 2021-03-18 15:56 drg-0.1.0/Cargo.toml.orig |
| 124 | +-rw-r--r-- 110147/110147 45 2021-03-18 15:55 drg-0.1.0/src/main.rs |
| 125 | +``` |
| 126 | +Cargo will download crates to the `.cargo/registry/cache` directory which |
| 127 | +will only contain the downloaded crates, the `.crate` compressed tar files. |
| 128 | +These never change for a version so they don't have to be downloaded again. |
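
The `cksum` field seen in the index entries earlier is the SHA-256 digest of the `.crate` file, so a downloaded crate can be checked against the index. The following is a minimal sketch of such a check, assuming `sha256sum` from GNU coreutils is available. The function name `verify_crate` is hypothetical and is not part of Cargo.

```shell
# Hypothetical helper: succeed only if the SHA-256 digest of a .crate file
# equals the `cksum` value recorded for that version in the index.
verify_crate() {
  crate_file=$1
  expected=$2
  # sha256sum prints "<digest>  <file>"; keep only the digest.
  actual=$(sha256sum "$crate_file" | cut -d ' ' -f 1)
  [ "$actual" = "$expected" ]
}
```

Called with the path to a `.crate` file and the `cksum` value from the matching line in the index, the function returns success only when the two agree.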
| 129 | + |
| 130 | +The `src` directory is where the downloaded crates in the cache directory are |
| 131 | +unpacked: |
| 132 | +```console |
| 133 | +$ ls ~/.cargo/registry/src/ |
| 134 | +github.com-1ecc6299db9ec823 |
| 135 | +``` |
| 136 | + |
| 137 | +The hash appended is a hash of the identifier of the crate registry, |
| 138 | +in this case `crates.io`. To verify this there is a |
| 139 | +[command line tool](https://github.com/trustification/source-distributed#print-cargo-index-hash) |
| 140 | +that can be used: |
| 141 | +```console |
| 142 | +$ cargo r --quiet --bin index-dir-hash |
| 143 | +crates-io: 1ecc6299db9ec823 |
| 144 | +``` |
| 145 | +And we can check this hash against the hash above. |
| 146 | + |
| 147 | +Hopefully this post clarifies what some of the directories under the `.cargo` |
| 148 | +directory are used for. |