Skip to content
This repository has been archived by the owner on Aug 18, 2020. It is now read-only.

[CDEC-439] Cleaning ntp client #3264

Merged
merged 8 commits into from
Jul 27, 2018
Merged

[CDEC-439] Cleaning ntp client #3264

merged 8 commits into from
Jul 27, 2018

Conversation

coot
Copy link
Contributor

@coot coot commented Jul 17, 2018

Description

I cleaned the ntp client a bit.

The ntp client resolves addresses and takes first ipv4 and ipv6 addresses. It now will also try the other address in case of failure (it may happen that ipv4/6 is not supported by the os).

And I added tests:

  • serialization / deserialization of ntp packets
  • computation of clock offset
  • the coercions between ntp time and unix time

Linked issue

https://iohk.myjetbrains.com/youtrack/issue/CDEC-439

Type of change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • 🛠 New feature (non-breaking change which adds functionality)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)
  • 🏭 Refactoring that does not change existing functionality but does improve things like code readability, structure etc
  • 🔨 New or improved tests for existing code
  • ⛑ git-flow chore (backport, hotfix, etc)

Developer checklist

  • I have read the style guide document, and my code follows the code style of this project.
  • If my code deals with exceptions, it follows the guidelines.
  • I have updated any documentation accordingly, if needed. Documentation changes can be reflected in opening a PR on cardanodocs.com, amending the inline Haddock comments, any relevant README file or one of the document listed in the docs directory.

Testing checklist

  • I have added tests to cover my changes.
  • All new and existing tests passed.

QA Steps

Screenshots (if available)

@coot coot requested review from avieth and dcoutts as code owners July 17, 2018 18:51

let poll = ntpPollDelay (ncSettings cli)

_ <- withAsync
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks like you are discarding the result of a computation. Since the last statement in the computation is cancel a whose result type is IO (), the above could be written as:

    () <- withAsync

which is more obviously correct and hence easier for the reviewer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. PureScript has a class for types which can be ignored without a warning (with an instance for ()), it would be nice to have it in Haskell.

ncStatus <- newTVarIO NtpSyncPending
-- using async so the ntp thread will be left running even if the parent
-- thread finished.
_ <- liftIO $ async (spawnNtpClient ntpSettings ncStatus)
Copy link
Member

@erikd erikd Jul 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many times is this function likely to be called? Does spawnNtpClient ever terminate?

If it never terminates, how do we make sure we don't get hundreds, thousands or even more threads?

If if just runs, does it job and then terminates, it would be nice to have a comment to that effect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just runs once, a comment should be in place indeed.

(<>) <$> (Option <$> createAndBindSock IPv4 addrs)
<*> (Option <$> createAndBindSock IPv6 addrs)

handlerE :: SomeException -> IO (Option Sockets)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using Control.Exception.Safe in this module so SomeException does not include async exxecptions like stack and heap overflow, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a bit inconsistent though: Ntp.Client is using safe-exceptions while Ntp.Util is using Contorl.Exception (but it's never catching async exceptions, only IOException and one custom exception). I think we should just catch IOExceptions in Ntp.Client as well rather than SomeException.

@coot
Copy link
Contributor Author

coot commented Jul 18, 2018

@erikd I updated with your comments with two additions:

  • now whole Ntp is using Control.Exception catching IOException, instead of mixing safe-exceptions with Control.Exception.
  • I moved NtpConfiguration from infra to networking

@coot coot force-pushed the coot/cdec-439 branch 2 times, most recently from 5bb9194 to 8e17a6f Compare July 19, 2018 12:03
= EBFirst !a
| EBSecond !b
| EBBoth !a !b
deriving (Show, Eq, Ord)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aka the these datatype, though this is strict

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the strictness doesn't matter that much, since we are just holding max three values, so if you prefer we can pull these as a dependency. these hasn't been yet used anywhere in cardano-sl but it's a small dependency so I don't mind.

{-# LANGUAGE TypeApplications #-}

module Test.NtpSpec
( spec
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙌

let offset = clockOffsetPure npoNtpPacket npoDestinationTime
in npoOffset === offset
describe "realMcsToNtp" $ do
it "should be right inverse of ntpToRealMcs" $ property $ \(NtpMicrosecond x) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hspec has a shorthand for this; it msg (property f) can be prop msg f

x === uncurry ntpToRealMcs (realMcsToNtp x)
it "should be left inverse of ntpToRealMcs" $ property $ \(NtpTime x@(sec, frac)) ->
let (sec', frac') = realMcsToNtp (uncurry ntpToRealMcs x)
in sec === sec' .&&. frac `div` 4310 === frac' `div` 4310
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4310 is a bit of a magic number here. What's the significance? Could it be named?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is 1000000 div 232, in the ntp spec time is represented in seconds + a fraction of a second which resolved to 232 picoseconds, so there are exactly 4310 of them in a milisecond.

Copy link
Contributor

@avieth avieth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will give it another pass later. It's a lot to take in!

@@ -66,7 +75,7 @@ data NtpClientSettings = NtpClientSettings
data NtpClient = NtpClient
{ ncSockets :: TVar Sockets
-- ^ Ntp client sockets: ipv4 / ipv6 / both.
, ncState :: TVar (Maybe [NtpOffset])
, ncState :: TVar [NtpOffset]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there previously a semantic distinction between Nothing and Just []?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was, but it wasn't necessary. It was used to not accept request which came late. Every packet comes with information when it was sent so we can discard it based on that information. This is now done in clockOffset function

(do
-- wait for responses and update status
_ <- timeout respTimeout waitForResponses
updateStatus cli
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this part be done purely? Instead of having waitForResponses mutate some TVar, have it return the responses directly, then make updateStatus take it as an argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I was actually thinking about doing this too. I haven't done it because we need ncState :: NtpClient -> TVar [NtpOffset] variable to communicate between the listener thread and sending thread. But we can change how the client is run: if we start the listener thread in this place we will not need this tvar. The communication will be done just using ncStatus.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to listen on two sockets, so I think there's no way around impure ncState. Maybe you have some other idea?

Copy link
Contributor

@avieth avieth Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if having a receiving thread (startReceive) is the wrong way around. Do we ever expect to get a message without first sending one? That's to say, are we acting as an ntp server as well as a client? If not, maybe we should recvFrom the socket on the same thread after sending on it. That would give a simpler implementation, with less concurrency and fewer mutable cells.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but I haven't changed that. I just didn't want to rewrite everything. I will add a todo, and let's move on.

handleIOException addressFamily e = do
logDebug $ sformat ("startReceive failed with reason: "%shown) e
threadDelay 5000000
udpLocalAddresses >>= createAndBindSock addressFamily >>= \case
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we somehow reuse/consolidate mkSockets from below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving this a second thought: in this place we are just recreating the socket that failed, while mkSockets creates both of them, i.e. IPv4 and IPv6; so I don't think we should reuse it here.


instance Binary NtpPacket where
put NtpPacket{..} = do
putWord8 ntpParams
putWord8 0
putInt8 ntpPoll
putWord8 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A change to the binary format? Is this some standardised NTP format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, I added bits just so we have a test that put and get are inverse of each other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added quickcheck test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They weren't inverse prior to this change?
So I'm guessing the replicateM_ 4 $ putWord8 0 was there beecause ntpPoll was assumed to be 0. Here there's a putInt8 rather than putWord8. Same encoding?
Then there's the replicateM_ 9 $ PutWord32be 0, which is replaced by replicateM_ 5 $ putWord32be 0 but where are the remaining 4? ntpOriginTime ^.. each and ntpReceivedTime ^.. each give 2 Word32s each? Lens combinators are annoying hard to search for.

coot added 2 commits July 27, 2018 10:13
* use `These` from `these` package
* `prop` in tests
* comment why `4310` appears in the test
Copy link
Contributor

@avieth avieth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some final requests. If you know that everything is still working then LGTM.


instance Binary NtpPacket where
put NtpPacket{..} = do
putWord8 ntpParams
putWord8 0
putInt8 ntpPoll
putWord8 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They weren't inverse prior to this change?
So I'm guessing the replicateM_ 4 $ putWord8 0 was there beecause ntpPoll was assumed to be 0. Here there's a putInt8 rather than putWord8. Same encoding?
Then there's the replicateM_ 9 $ PutWord32be 0, which is replaced by replicateM_ 5 $ putWord32be 0 but where are the remaining 4? ntpOriginTime ^.. each and ntpReceivedTime ^.. each give 2 Word32s each? Lens combinators are annoying hard to search for.

let -- microseconds
secMicro :: Integer
secMicro = (fromIntegral sec - ntpTimestampDelta) * 1000000
-- Each fraction resolves to 232 pico seconds (`232 ≈ 10000000/4294`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an extra 0 in the numerator.

The explanation could be better. We divide 1 second into 2 ^ 32 parts, giving 2.3283064365386963e-10 as the quantum. A picosecond is 10e-12 of a second, so this is 232 picoseconds.

Copy link
Contributor Author

@coot coot Jul 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't. Though previously Int was used as type for poll in NtpPacket we put 0 :: Word8 when encoding it (because it does not matter). I changed Int to Int8, the spec sais that:

Some time values are represented in exponent format, including the
precision, time constant, and poll interval. These are in 8-bit
signed integer format in log2 (log base 2) seconds.

So I Int8 should be used. Before it was Int. We ignore this field, but let's follow the spec.

But put and get weren't inverses because when we were encoding a packet we always put 0 in the origin and received time. This also does not matter for us, since we're just a client, but why not follow the spec if it does not cost much.

Indeed each the result of realMcsToNtp return a pair of Word32 (seconds since 1 Jan 1900 and fractions).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I put this comment under wrong thread.

@coot coot merged commit b4e6c5d into develop Jul 27, 2018
@coot coot deleted the coot/cdec-439 branch July 30, 2018 08:51
KtorZ pushed a commit that referenced this pull request Nov 9, 2018
KtorZ pushed a commit to input-output-hk/cardano-wallet-legacy that referenced this pull request Nov 9, 2018
…hk/coot/cdec-439

[CDEC-439] Cleaning ntp client
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants