
Enable recovery from zero capacity #129

Merged: 2 commits merged into master, Apr 19, 2018

Conversation

@djones6 (Contributor) commented Apr 18, 2018

Description

Successor to #128

Allows the ConnectionPool to try to grow by one connection in the scenario where it is at zero capacity, and prevents threads from blocking (either indefinitely, or with a timeout) in the case where the database is unreachable.

Motivation and Context

When connectivity to a database is lost, the ConnectionPool logic introduced in #127 tries to ensure that dead connections are not returned to the user: the dead connection is thrown away and replaced with a newly generated one (which will rejoin the pool as normal when the user is finished).

However, if the database is down, the connection generator could fail, leaving the ConnectionPool with no option other than to return nil. This shrinks the pool, as there is now one fewer connection than before, and eventually the pool will reach zero capacity. This is not the same as 'empty': at zero capacity, not only is the pool empty, there are also no 'good' connections currently on loan which will eventually return.

Once we hit zero capacity, the existing code has no way to grow the pool (since this was done during the 'take a connection from the pool' stage).
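To picture the recovery path this PR adds, here is a rough Swift sketch of the 'take a connection' step. This is illustrative only, not the code in this PR: capacity and takeItem() are assumed names for the pool's internals, and tryToEnsureNonZeroCapacity() is the method quoted in the review below.

// Illustrative sketch of the 'take a connection' path with zero-capacity recovery.
public func getConnection() -> Connection? {
    if capacity == 0 {
        // Every connection has been discarded as dead, and none are on loan.
        // Rather than blocking for a release that will never happen, try to
        // grow the pool by a single connection.
        guard tryToEnsureNonZeroCapacity() else {
            // The database is still unreachable; return nil instead of blocking.
            return nil
        }
    }
    // Normal path: take a connection from the pool (waiting if necessary).
    return takeItem()
}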

Other considerations

This PR does not introduce any new API to make it easier for a user to determine whether they should keep trying or give up (i.e. there is no isDatabaseReachable API). This might be a useful addition in the future.

This PR also does not address the inconsistency between Postgres and MySQL where a Postgres connection is only flagged as disconnected once a query has failed to execute. I added some thoughts on this subject to Kitura/Swift-Kuery-PostgreSQL#49.

Testing

I have tested this on Linux, using a Kuery-Postgres based benchmark, both stepping through the 'DB is down' scenario by locally starting/stopping Postgres, and running load against it (and observing it recover as I stop/start the database).

As an aside, I had to slightly modify this code to make the generator return a nil connection (i.e. fail) if connect() fails; at the moment it logs the error but returns the bad connection anyway. While doing this, it wasn't clear to me how I could do it properly if the connect() callback were called asynchronously (it's currently synchronous, so I could cheat).
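For context, a generator modified along those lines might look roughly like the sketch below. It assumes the Swift-Kuery-PostgreSQL API of the time (a synchronous connect(onCompletion:) callback); the host, port and database option are placeholder values, not taken from this PR.

import SwiftKuery
import SwiftKueryPostgreSQL

// Sketch of a pool generator that fails (returns nil) if connect() fails.
let generator: () -> Connection? = {
    let connection = PostgreSQLConnection(host: "localhost", port: 5432,
                                          options: [.databaseName("mydb")])
    var connectError: QueryError? = nil
    // connect() currently invokes its completion handler synchronously, so the
    // error can be captured here; an asynchronous callback would need a
    // different approach, as noted above.
    connection.connect { error in
        connectError = error
    }
    // Return nil on failure so the pool never hands out a dead connection.
    return connectError == nil ? connection : nil
}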

@codecov-io commented Apr 18, 2018

Codecov Report

Merging #129 into master will decrease coverage by 0.63%.
The diff coverage is 0%.


@@            Coverage Diff             @@
##           master     #129      +/-   ##
==========================================
- Coverage   85.19%   84.55%   -0.64%     
==========================================
  Files          43       43              
  Lines        3579     3606      +27     
==========================================
  Hits         3049     3049              
- Misses        530      557      +27
Flag                                        Coverage Δ
#SwiftKuery                                 84.55% <0%> (-0.64%) ⬇️

Impacted Files                              Coverage Δ
Sources/SwiftKuery/ConnectionPool.swift     0% <0%> (ø) ⬆️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c6c3c25...b340ecd.

@DunnCoding left a comment

Discussed the tryTo prefix of the method names with you already. I'm not a fan of this; it doesn't look very 'Swifty'.

// still have zero capacity.
//
// Returns true iff the pool has a non-zero capacity.
private func tryToEnsureNonZeroCapacity() -> Bool {
@ianpartridge (Contributor) commented:

Why not just take the lock up front, before checking capacity?

@DunnCoding commented Apr 18, 2018

We need to check capacity again after we lock.

We check whether capacity is equal to zero and, if it is, we lock the pool. However, a connection may have been added to the pool in the small window between checking the capacity and locking the pool, which would increase the capacity and remove the need for us to grow it.

I discussed this with Dave and suggested adding a comment here to explain what we are doing and why, as it does look strange that we check capacity twice.

@djones6 (Contributor, Author) commented:

@ianpartridge You're right, this needs commenting.

We don't want to lock up front, because locking involves a semaphore (synchronization), and this is on the critical path. In normal circumstances the capacity is greater than zero, so we don't want to pay the cost of locking unless we have to.

If the first check succeeds, then we (probably) have to do some work, so we lock and then check again. We have to check again, because multiple threads could simultaneously check for capacity == 0 outside the lock, and we don't want them all to generate an item.
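In other words, this is the classic check-lock-check (double-checked) pattern. A minimal sketch, assuming hypothetical internal helpers lockPoolLock(), unlockPoolLock() and generateItem() (the real implementation may differ):

// Sketch of the check-lock-check pattern described above.
private func tryToEnsureNonZeroCapacity() -> Bool {
    // Fast path: in normal operation capacity > 0, so avoid the semaphore.
    if capacity > 0 {
        return true
    }
    lockPoolLock()
    defer { unlockPoolLock() }
    // Re-check under the lock: another thread may already have grown the pool
    // between our unlocked check and acquiring the lock.
    if capacity == 0 {
        // Only the thread holding the lock attempts to generate a connection;
        // this can still fail if the database remains unreachable.
        generateItem()
    }
    return capacity > 0
}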

@djones6 djones6 merged commit a95ae01 into master Apr 19, 2018
@djones6 djones6 deleted the issue.zerocapacity branch April 19, 2018 09:22