
Enable recovery from zero capacity #129

Merged: 2 commits merged into master, Apr 19, 2018

Conversation

@djones6 (Contributor) commented Apr 18, 2018

Description

Successor to #128

Allows the ConnectionPool to try to grow by one connection in the scenario where it is at zero capacity, and prevents threads from blocking (either indefinitely, or with a timeout) in the case where the database is unreachable.

Motivation and Context

When connectivity to a database is lost, the ConnectionPool logic introduced in #127 tries to ensure that dead connections are not returned to the user: the dead connection is thrown away and replaced with a newly generated one (which will rejoin the pool as normal when the user is finished).

However, if the database is down, the connection generator could fail, leaving the ConnectionPool with no option other than to return nil. This shrinks the pool, as there is now one fewer connection than before, and eventually the pool will reach zero capacity. This is not the same as 'empty': at zero capacity, not only is the pool empty, there are also no 'good' connections currently on loan which will eventually return.

Once we hit zero capacity, the existing code has no way to grow the pool (since this was done during the 'take a connection from the pool' stage).
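To picture the recovery path this PR adds, here is a rough Swift sketch of the 'take a connection' step. This is illustrative only, not the code in this PR: capacity and takeItem() are assumed names for the pool's internals, and tryToEnsureNonZeroCapacity() is the method quoted in the review below.

// Illustrative sketch of the 'take a connection' path with zero-capacity recovery.
public func getConnection() -> Connection? {
    if capacity == 0 {
        // Every connection has been discarded as dead, and none are on loan.
        // Rather than blocking for a release that will never happen, try to
        // grow the pool by a single connection.
        guard tryToEnsureNonZeroCapacity() else {
            // The database is still unreachable; return nil instead of blocking.
            return nil
        }
    }
    // Normal path: take a connection from the pool (waiting if necessary).
    return takeItem()
}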

Other considerations

This PR does not introduce any new API to make it easier for a user to determine whether they should keep trying or give up (i.e. there is no isDatabaseReachable API). This might be a useful addition in the future.

This PR also does not address the inconsistency between Postgres and MySQL where a Postgres connection is only flagged as disconnected once a query has failed to execute. I added some thoughts on this subject to Kitura/Swift-Kuery-PostgreSQL#49.

Testing

I have tested this on Linux, using a Kuery-Postgres based benchmark, both stepping through the 'DB is down' scenario by locally starting/stopping Postgres, and running load against it (and observing it recover as I stop/start the database).

As an aside, I had to slightly modify this code to make the generator return a nil connection (i.e. fail) if connect() fails; at the moment it logs the error but returns the bad connection anyway. While doing this, it wasn't clear to me how I could do it properly if the connect() callback were called asynchronously (it's currently synchronous, so I could cheat).
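For context, a generator modified along those lines might look roughly like the sketch below. It assumes the Swift-Kuery-PostgreSQL API of the time (a synchronous connect(onCompletion:) callback); the host, port and database option are placeholder values, not taken from this PR.

import SwiftKuery
import SwiftKueryPostgreSQL

// Sketch of a pool generator that fails (returns nil) if connect() fails.
let generator: () -> Connection? = {
    let connection = PostgreSQLConnection(host: "localhost", port: 5432,
                                          options: [.databaseName("mydb")])
    var connectError: QueryError? = nil
    // connect() currently invokes its completion handler synchronously, so the
    // error can be captured here; an asynchronous callback would need a
    // different approach, as noted above.
    connection.connect { error in
        connectError = error
    }
    // Return nil on failure so the pool never hands out a dead connection.
    return connectError == nil ? connection : nil
}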

@codecov-io commented Apr 18, 2018

Codecov Report

Merging #129 into master will decrease coverage by 0.63%.
The diff coverage is 0%.


@@            Coverage Diff             @@
##           master     #129      +/-   ##
==========================================
- Coverage   85.19%   84.55%   -0.64%     
==========================================
  Files          43       43              
  Lines        3579     3606      +27     
==========================================
  Hits         3049     3049              
- Misses        530      557      +27
Flag                                        Coverage Δ
#SwiftKuery                                 84.55% <0%> (-0.64%) ⬇️

Impacted Files                              Coverage Δ
Sources/SwiftKuery/ConnectionPool.swift     0% <0%> (ø) ⬆️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c6c3c25...b340ecd.

@DunnCoding left a comment

Discussed the tryTo prefix of the method names with you already. I'm not a fan of this; it doesn't look very 'Swifty'.

// still have zero capacity.
//
// Returns true iff the pool has a non-zero capacity.
private func tryToEnsureNonZeroCapacity() -> Bool {
@ianpartridge (Contributor) commented:

Why not just take the lock up front, before checking capacity?

@DunnCoding commented Apr 18, 2018

We need to check capacity again after we lock.

We check whether capacity is equal to zero and, if it is, we lock the pool. However, a connection may have been added to the pool in the small window between checking the capacity and locking the pool, which would increase the capacity and remove the need for us to grow it.

I discussed this with Dave and suggested adding a comment here to explain what we are doing and why, as it does look strange that we check capacity twice.

@djones6 (Contributor, Author) commented:

@ianpartridge You're right, this needs commenting.

We don't want to lock up front, because locking involves a semaphore (synchronization), and this is on the critical path. In normal circumstances the capacity is greater than zero, so we don't want to pay the cost of locking unless we have to.

If the first check succeeds, then we (probably) have to do some work, so we lock and then check again. We have to check again, because multiple threads could simultaneously check for capacity == 0 outside the lock, and we don't want them all to generate an item.
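In other words, this is the classic check-lock-check (double-checked) pattern. A minimal sketch, assuming hypothetical internal helpers lockPoolLock(), unlockPoolLock() and generateItem() (the real implementation may differ):

// Sketch of the check-lock-check pattern described above.
private func tryToEnsureNonZeroCapacity() -> Bool {
    // Fast path: in normal operation capacity > 0, so avoid the semaphore.
    if capacity > 0 {
        return true
    }
    lockPoolLock()
    defer { unlockPoolLock() }
    // Re-check under the lock: another thread may already have grown the pool
    // between our unlocked check and acquiring the lock.
    if capacity == 0 {
        // Only the thread holding the lock attempts to generate a connection;
        // this can still fail if the database remains unreachable.
        generateItem()
    }
    return capacity > 0
}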

@djones6 djones6 merged commit a95ae01 into master Apr 19, 2018
@djones6 djones6 deleted the issue.zerocapacity branch April 19, 2018 09:22