Fix/issue 56 #57
@mmolari Just a few thoughts (perhaps you've already considered some of these): The

It is also customary to expose random seeds as CLI arguments (or as function parameters, in libraries). This way you don't need to invent any internal magic, such as deducing seeds from input data or from the current weather, and users who require deterministic, reproducible results can simply provide the same value through the CLI (e.g.

From my experience, to achieve reproducible results it is essential for multi-threaded applications (where thread scheduling is determined by the operating system and is outside the application's control) either to ensure consistent ordering when joining results (e.g. by sorting them according to the order of the original inputs) or to allow users to opt out of multi-threading (with something like

If you opt for sorting or a similar technique, it sometimes also makes sense to allow the user to toggle the sorting on and off, with something like

In general, offloading decisions to the user often makes code simpler and reduces the feeling of magic when using a piece of software, not to mention that it reduces the amount of work for you :). Also, advanced users often appreciate more control. This is not without downsides, though: CLI args are a public interface to maintain forever, and when there are many flags they can make the interface harder for beginners to use (or to use correctly). So sane defaults are very important.
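To make the two suggestions above concrete, here is a minimal Python sketch, not code from this project. The flag names `--seed` and `--jobs` and the `process` function are hypothetical and chosen only for illustration. It shows a seed exposed on the command line, an opt-out from multi-threading, and results joined in the order of the original inputs regardless of which thread finishes first.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor, as_completed


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="toy reproducibility example")
    # Hypothetical flags, named here only for illustration.
    parser.add_argument("--seed", type=int, default=0,
                        help="random seed; pass the same value to reproduce a run")
    parser.add_argument("--jobs", type=int, default=4,
                        help="number of worker threads; 1 disables multi-threading")
    return parser.parse_args(argv)


def process(item):
    # Stand-in for the real per-input work.
    return item.upper()


def run(inputs, jobs):
    if jobs <= 1:
        # Opt-out path: plain sequential loop, trivially ordered.
        return [process(x) for x in inputs]
    results = [None] * len(inputs)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Remember each task's input position so completion order is irrelevant.
        futures = {pool.submit(process, x): i for i, x in enumerate(inputs)}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    return results
```

With this layout, `run(inputs, 1)` and `run(inputs, 8)` return the same list, and the parsed `args.seed` would be handed to whatever random number generator the program uses.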
Thanks @ivan-aksamentov! All of these suggestions are much appreciated. Concerning the flag, I originally chose

For the random seeding, I was thinking of always setting the same random seed and not giving the user the option to control it, since the only thing this seed controls is the random names of blocks. It should not affect the graph structure in any other way. I also do the seeding in a way that is robust to parallelization: irrespective of the number of threads and the scheduling of operations, the results are always the same and are saved in the same order. For these reasons I was thinking of setting a standard random seed in the code and not exposing any interface to change it. But do you think it's better to still give the user the option?
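One common way to get seeding that does not depend on thread scheduling is to derive an independent generator for each task from a fixed seed and the task's index, rather than sharing a single generator across threads. The sketch below illustrates that idea in Python. It is an assumption about how such a scheme could look, not the implementation in this pull request, and `FIXED_SEED`, `task_rng`, `block_name` and `name_blocks` are hypothetical names.

```python
import hashlib
import random
import string
from concurrent.futures import ThreadPoolExecutor

FIXED_SEED = 0  # hypothetical hard-coded seed


def task_rng(seed, task_index):
    # Hash (seed, index) into an integer so each task gets its own stream,
    # independent of which thread runs it or when.
    digest = hashlib.sha256(f"{seed}:{task_index}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def block_name(seed, task_index, length=10):
    rng = task_rng(seed, task_index)
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(length))


def name_blocks(n_blocks, jobs, seed=FIXED_SEED):
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(pool.map(lambda i: block_name(seed, i), range(n_blocks)))
```

Because each name depends only on the seed and the task index, `name_blocks(20, 1)` and `name_blocks(20, 8)` produce identical lists.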
This pull request:
Fixes issue 56, which was due to an edge case in the block-merging procedure. It also:
-v flag. If activated, consistency checks are performed at each merger, and it is verified that all of the input sequences can be exactly reconstructed from the graph.
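As an illustration of what an exact-reconstruction check can look like, here is a small Python sketch on a toy graph model. The data layout is an assumption made for this example: blocks are plain strings, and each path is a list of (block id, forward strand) pairs. The project's real data structures are likely richer, and the names `reconstruct` and `check_graph` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def reconstruct(path, blocks):
    # Concatenate the blocks along the path, flipping reverse-strand ones.
    parts = []
    for block_id, forward in path:
        seq = blocks[block_id]
        parts.append(seq if forward else reverse_complement(seq))
    return "".join(parts)


def check_graph(blocks, paths, inputs):
    """Return the names of input sequences the graph fails to reproduce."""
    return [
        name
        for name, seq in inputs.items()
        if name not in paths or reconstruct(paths[name], blocks) != seq
    ]
```

An empty list from `check_graph` means every input sequence was recovered exactly. Running such a check after every merger is expensive, which is one reason to keep it behind an opt-in flag.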