Db stuffz #3

Merged
merged 2 commits into from
Sep 21, 2014

Conversation

stefanpenner
Contributor

No description provided.

@@ -57,6 +57,7 @@ impl App {
}

pub fn db_setup(&self) {
print!(" - setting up the db\n")
Member

This is more idiomatically written as:

println!(" - setting up the db");
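A minimal sketch of why the two spellings are interchangeable: `println!` appends the newline itself, so the explicit `\n` in the `print!` call is redundant. The helpers `manual_newline` and `macro_newline` are hypothetical and exist only so the two outputs can be compared as strings; they are not part of the pull request.

```rust
use std::fmt::Write;

/// Hypothetical helper: builds the line the way the original diff did,
/// with an explicit trailing newline.
fn manual_newline(msg: &str) -> String {
    format!("{}\n", msg)
}

/// Hypothetical helper: builds the same line with `writeln!`, which adds
/// the newline on its own, just as `println!` does for stdout.
fn macro_newline(msg: &str) -> String {
    let mut out = String::new();
    writeln!(out, "{}", msg).expect("writing to a String cannot fail");
    out
}

fn main() {
    let msg = " - setting up the db";
    // Both spellings produce identical text.
    assert_eq!(manual_newline(msg), macro_newline(msg));
    // The form suggested in the review comment.
    println!("{}", msg);
}
```

Note that the suggested form also ends the statement with a semicolon, which the line in the diff omits.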

alexcrichton added a commit that referenced this pull request Sep 21, 2014
@alexcrichton alexcrichton merged commit cb135a3 into rust-lang:master Sep 21, 2014
@tarcieri tarcieri mentioned this pull request Dec 12, 2015
someguynamedmatt added a commit to someguynamedmatt/crates.io that referenced this pull request Feb 22, 2018
# This is the 1st commit message:

/crates page, better mobile styling

# The commit message #2 will be skipped:

# changed data-test attr for class

# The commit message #3 will be skipped:

# bottom positioning of header element on crates page
someguynamedmatt added a commit to someguynamedmatt/crates.io that referenced this pull request Apr 2, 2018
sgrif added a commit to sgrif/crates.io that referenced this pull request Mar 14, 2019
Our database spends most of its time processing /api/v1/crates with no
parameters other than pagination. This query is the main one hit by
crawlers, and it is taking over 100ms to run, so it's at the top of our
list (for posterity's sake, #2 is copying `crate_downloads` during
backups, #3 and #4 are the updates run from bin/update-downloads, and #5
is the query run from the download endpoint).

The query is having to perform the full join between crates and
recent_downloads, and then count the results of that. Since we have no
search parameters of any kind, this count is equivalent to just counting
the crates table, which we can do much more quickly. We still need to do
the count over the whole thing if there's any where clause, but we can
optimize the case where there's no search.

This implicitly relies on the fact that we're only changing the select
clause in branches where we're also setting a where clause. Diesel 2
will probably have a feature that lets us avoid this. We could also
refactor the "exact match" check to be client side instead of the DB and
get rid of all the cases where we modify the select clause.

Before:

```
 Limit  (cost=427.87..470.65 rows=100 width=877) (actual time=109.698..109.739 rows=100 loops=1)
   ->  WindowAgg  (cost=0.14..10119.91 rows=23659 width=877) (actual time=109.277..109.697 rows=1100 loops=1)
         ->  Nested Loop Left Join  (cost=0.14..9966.13 rows=23659 width=869) (actual time=0.051..85.429 rows=23659 loops=1)
               ->  Index Scan using index_crates_name_ordering on crates  (cost=0.08..7604.30 rows=23659 width=860) (actual time=0.037..34.975 rows=23659 loops=1)
               ->  Index Scan using recent_crate_downloads_crate_id on recent_crate_downloads  (cost=0.06..0.10 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=23659)
                     Index Cond: (crate_id = crates.id)
 Planning time: 1.307 ms
 Execution time: 111.840 ms
```

After:

```
 Limit  (cost=1052.34..1094.76 rows=100 width=877) (actual time=11.536..12.026 rows=100 loops=1)
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=627.96..627.96 rows=1 width=8) (actual time=4.966..4.966 rows=1 loops=1)
           ->  Index Only Scan using packages_pkey on crates crates_1  (cost=0.06..616.13 rows=23659 width=0) (actual time=0.015..3.513 rows=23659 loops=1)
                 Heap Fetches: 811
   ->  Subquery Scan on t  (cost=0.14..10037.11 rows=23659 width=877) (actual time=5.019..11.968 rows=1100 loops=1)
         ->  Nested Loop Left Join  (cost=0.14..9966.13 rows=23659 width=869) (actual time=0.051..6.831 rows=1100 loops=1)
               ->  Index Scan using index_crates_name_ordering on crates  (cost=0.08..7604.30 rows=23659 width=860) (actual time=0.038..3.331 rows=1100 loops=1)
               ->  Index Scan using recent_crate_downloads_crate_id on recent_crate_downloads  (cost=0.06..0.10 rows=1 width=12) (actual time=0.003..0.003 rows=1 loops=1100)
                     Index Cond: (crate_id = crates.id)
 Planning time: 1.377 ms
 Execution time: 12.106 ms
```
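
The branch described in the commit message above (count the whole join only when a where clause is present, otherwise count the `crates` table directly) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the function `total_count_sql` is hypothetical, and the SQL strings are only inferred from the table and column names visible in the query plans.

```rust
/// Hypothetical sketch: pick the SQL used to compute the total row count.
/// `where_clause` is `None` when the request carries only pagination.
fn total_count_sql(where_clause: Option<&str>) -> String {
    match where_clause {
        // With a filter, the count has to cover the filtered join result.
        Some(cond) => format!(
            "SELECT COUNT(*) FROM crates \
             LEFT JOIN recent_crate_downloads \
             ON recent_crate_downloads.crate_id = crates.id \
             WHERE {}",
            cond
        ),
        // Without a filter, counting the crates table alone gives the
        // same number, per the commit message.
        None => "SELECT COUNT(*) FROM crates".to_string(),
    }
}

fn main() {
    println!("{}", total_count_sql(None));
    println!("{}", total_count_sql(Some("crates.name = 'serde'")));
}
```

In the "After" plan, the cheap path shows up as the `InitPlan` node: an index-only scan over `crates` that replaces the `WindowAgg` over the full join.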
bors added a commit that referenced this pull request Mar 20, 2019
Optimize our most time consuming query

Turbo87 pushed a commit that referenced this pull request Dec 8, 2022
Switch from `futures-cpupool` to `tokio-threadpool`
Turbo87 pushed a commit that referenced this pull request Jan 4, 2023
Replace HashMap<&'static str, Box<Any>> with TypeMap.
Turbo87 pushed a commit that referenced this pull request Jan 4, 2023
Update dependencies and version
Turbo87 pushed a commit that referenced this pull request Jan 4, 2023
Turbo87 pushed a commit that referenced this pull request Jan 4, 2023
Add dyn keyword to address warnings
Turbo87 pushed a commit that referenced this pull request Jan 4, 2023
Bump dependencies for new alpha release
Turbo87 pushed a commit that referenced this pull request Jan 4, 2023
2 participants