This repository has been archived by the owner on Aug 30, 2022. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 28
XP-227 - Merge v0.3.0 changes into master
#233
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* XP-255 change setup author to XAIN AG & remove codeowners for now * XP-255 black reformat
* XP-265 remove benchmark directory * XP-265 remove datasets directory * XP-265 remove references to either examples or benchmark directory * XP-265 remove benchmark participant * XP-265 remove benchmark config relations * XP-265 remove benchmark coordinator, related types and helpers * XP-265 review comments + changed type import * XP-265 commenting out benchmarks docs * XP-243 remove flags from xain-fl codebase * XP-255 update codeowners and authors in setup (#195) * XP-255 change setup author to XAIN AG & remove codeowners for now * XP-255 black reformat * XP-265 remove benchmark directory * XP-265 remove datasets directory * XP-265 remove references to either examples or benchmark directory * XP-265 remove benchmark participant * XP-265 remove benchmark config relations * XP-265 remove benchmark coordinator, related types and helpers * XP-265 review comments + changed type import * XP-265 commenting out benchmarks docs * XP-243 remove flags from xain-fl codebase * XP-265 add ticket number for deciding about EvoAgg
* XP-261 move tests to tests dir * XP-261 add todo's to pytest.ini * XP-261 fix pytest.ini after rebase
* XP-354 Remove proto files XP-354 add xain proto package * XP-354 Fix isort errors * XP-354 Remove cproto dependencies * XP-354 Update git workflow * XP-354 Remove unused import * XP-354 Remove _pb2 glob * XP-354 Remove protobuf in dockerfiles
* XP-273: remove scripts/test_slow.sh This script is basically a one-liner (`pytest --run-slow`), and is effectively a noop since there is no test marked as slow. * XP-273: remove scripts/setup{_pyenv}.sh setup_pyenv.sh suffers multiple issues that justify its removal: - it is very intrusive: it downloads software in the user home directory, edits their zshrc/bashrc and installs software using the system package manager which requires root permissions - it can break the user configuration if they already have a pyenv installation - it only works in very specific cases: if the user uses a Debian based distro with bash as a default shell or MacOS with zsh setup.sh is not particularly useful because most python developers will do `pip install -e .[dev]` by themself. * XP-273: script/format.sh: print the commands The commands in this script can make changes to the repo, so users may appreciate seeing the actual commands being run. * XP-273: make scripts/rm_cache.sh less aggressive - print the `rm` commands being run so that users know what's being deleted - do not delete the `__pycache` directories so aggressively. For instance I usually create a virtualenv in each projects I'm working on and don't want to have a script touching it. * XP-273: remove scripts/ls_running_instances.sh This script requires private credentials to run and is specific to our infrastructure so it does not make sense to keep it public.
* XP-357 make controller parametrisable * XP-357 add TODO on valid attribute values * XP-357 minor spelling corrections * XP-357 simplify Controller.get_num_ids_to_select()
* XP-384 remove unused files * XP-384 update CI jobs
* XP-374 Clean up docs * XP-374 Add version number
* XP-271 fix pylint issues * XP-271 fix pylint issues * XP-271 fix isort * XP-271 disable fixme warning * XP-271 add pytest protobuf env * XP-271 align scripts with ci settings * XP-271 add docstring todos * XP-271 update sphinx config
* XP-424 Remove unused packages * XP-424 Move typing-extensions to dev
* XP-373 reactivate tests by removing comments * XP-373 fix test_end_training * XP-373 move dependency to tests_require
* XP-433 Fix docker headings * XP-433 Replace bold with h4
* XP-119 Close channel after each test * XP-119 Remove unused import * XP-119 Remove TODO
* XP-422 update coordinator metric handling * XP-422 update xain-proto import * XP-422 update sdk import * XP-422 correct test mocking destinations
* XP-308: store aggregated weights in S3 buckets Refs: https://xainag.atlassian.net/browse/XP-40 https://xainag.atlassian.net/browse/XP-308 https://xainag.atlassian.net/browse/XP-309 https://xainag.atlassian.net/browse/XP-311 Summary: This commits adds a new module `xain-fl.coordinator.store`, which contains a `Store()` class that can read from and write to S3 buckets. At the end of each round, the aggregated weights are uploaded to an S3 bucket. Note that as a first step, the aggregated weights are still sent to the participants via gRPC. The versioning is straightforward: the key for storing each set of weights is the round number corresponding to the weights. Implementation notes: We're trying to not have IO related code in the Coordinator class, so the Store instance is held by the CoordinatorGrpc instead of Coordinator. However, since the Coordinator is currently generating the grpc messages, which are inherently coupled to the IO logic. This means that if at some point the messages need to contain information related to storage, we'll end up with storage logic in the Coordinator, and things might become pretty messy. Thus, we should already start thinking about refactoring xain-fl.coordinator such that the state machine is completely decoupled for the IO logic. * XP-308: run isort, black and mypy on the tests package This is not directly related to task but was done as part of writing the tests for the storage logic. Refs: https://xainag.atlassian.net/browse/XP-40 https://xainag.atlassian.net/browse/XP-308 https://xainag.atlassian.net/browse/XP-309 https://xainag.atlassian.net/browse/XP-311 * XP-308: Add unit test doing an entire round Refs: https://xainag.atlassian.net/browse/XP-40 https://xainag.atlassian.net/browse/XP-308 https://xainag.atlassian.net/browse/XP-309 https://xainag.atlassian.net/browse/XP-311 Summary: This is the first test where multiple participants connect and do a round of training. Implementation notes: - Because we use the peer address to identify a participant, we have to use a port forwarding trick to make distinct connections to the coordinator. This is implemented in the `tests.port_forwarding` module - To test the store implementation, we use a custom TestStore class that writes data in memory instead of using an actual S3 bucket. That way our tests don't depend on an external service. This is implemented in the `tests.store` module. * XP-308: docker: install git and make pip verbose Git is needed to fetch package from git repositories (xain-proto for instance). This commit also makes `pip install` verbose because the command can take a very long time to execute when gRPC and protobuf are compiled, and without any output, it may seem like the command hangs. * XP-308: don't install dev dependencies in docker The docutils is a transitive dev dependency that is causing problem. Specifically, when installing dev dependencies in docker, `docutils==0.16` gets installed, but `boto3` is incompatible with it and `xain-fl` crashes at runtime: ``` Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 583, in _build_master ws.require(__requires__) File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 900, in require needed = self.resolve(parse_requirements(requirements)) File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 791, in resolve raise VersionConflict(dist, req).with_context(dependent_req) pkg_resources.ContextualVersionConflict: (docutils 0.16 (/usr/local/lib/python3.7/site-packages), Requirement.parse('docutils<0.16,>=0.10'), {'botocore'}) ``` I don't know why pip pulls this version of docutils though, since we don't have any dependency that requires such a high version number: ``` $ pipdeptree -r -p docutils docutils==0.16 - botocore==1.13.50 [requires: docutils>=0.10,<0.16] - boto3==1.10.48 [requires: botocore>=1.13.48,<1.14.0] - xain-fl==0.2.0 [requires: boto3==1.10.48] - s3transfer==0.2.1 [requires: botocore>=1.12.36,<2.0.0] - boto3==1.10.48 [requires: s3transfer>=0.2.0,<0.3.0] - xain-fl==0.2.0 [requires: boto3==1.10.48] - m2r==0.2.1 [requires: docutils] - readme-renderer==24.0 [requires: docutils>=0.13.1] - twine==2.0.0 [requires: readme-renderer>=21.0] - Sphinx==2.2.1 [requires: docutils>=0.12] ``` * XP-308: print debugging information for spurious failure
* XP-208 model sum aggregator as a mock for testing * XP-208 add test integrating P and C * XP-208 add identity controller as a mock for test * XP-208 small typo in id-controller docstring * XP-208 add coordinator test fixture with mocked components * XP-208 test for top-level function start_participant * XP-436 fix bug: coordinator not advertising FINISHED state * XP-436 revert logging level * XP-308: store aggregated weights in S3 buckets (#215) * XP-308: store aggregated weights in S3 buckets Refs: https://xainag.atlassian.net/browse/XP-40 https://xainag.atlassian.net/browse/XP-308 https://xainag.atlassian.net/browse/XP-309 https://xainag.atlassian.net/browse/XP-311 Summary: This commits adds a new module `xain-fl.coordinator.store`, which contains a `Store()` class that can read from and write to S3 buckets. At the end of each round, the aggregated weights are uploaded to an S3 bucket. Note that as a first step, the aggregated weights are still sent to the participants via gRPC. The versioning is straightforward: the key for storing each set of weights is the round number corresponding to the weights. Implementation notes: We're trying to not have IO related code in the Coordinator class, so the Store instance is held by the CoordinatorGrpc instead of Coordinator. However, since the Coordinator is currently generating the grpc messages, which are inherently coupled to the IO logic. This means that if at some point the messages need to contain information related to storage, we'll end up with storage logic in the Coordinator, and things might become pretty messy. Thus, we should already start thinking about refactoring xain-fl.coordinator such that the state machine is completely decoupled for the IO logic. * XP-308: run isort, black and mypy on the tests package This is not directly related to task but was done as part of writing the tests for the storage logic. Refs: https://xainag.atlassian.net/browse/XP-40 https://xainag.atlassian.net/browse/XP-308 https://xainag.atlassian.net/browse/XP-309 https://xainag.atlassian.net/browse/XP-311 * XP-308: Add unit test doing an entire round Refs: https://xainag.atlassian.net/browse/XP-40 https://xainag.atlassian.net/browse/XP-308 https://xainag.atlassian.net/browse/XP-309 https://xainag.atlassian.net/browse/XP-311 Summary: This is the first test where multiple participants connect and do a round of training. Implementation notes: - Because we use the peer address to identify a participant, we have to use a port forwarding trick to make distinct connections to the coordinator. This is implemented in the `tests.port_forwarding` module - To test the store implementation, we use a custom TestStore class that writes data in memory instead of using an actual S3 bucket. That way our tests don't depend on an external service. This is implemented in the `tests.store` module. * XP-308: docker: install git and make pip verbose Git is needed to fetch package from git repositories (xain-proto for instance). This commit also makes `pip install` verbose because the command can take a very long time to execute when gRPC and protobuf are compiled, and without any output, it may seem like the command hangs. * XP-308: don't install dev dependencies in docker The docutils is a transitive dev dependency that is causing problem. Specifically, when installing dev dependencies in docker, `docutils==0.16` gets installed, but `boto3` is incompatible with it and `xain-fl` crashes at runtime: ``` Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 583, in _build_master ws.require(__requires__) File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 900, in require needed = self.resolve(parse_requirements(requirements)) File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 791, in resolve raise VersionConflict(dist, req).with_context(dependent_req) pkg_resources.ContextualVersionConflict: (docutils 0.16 (/usr/local/lib/python3.7/site-packages), Requirement.parse('docutils<0.16,>=0.10'), {'botocore'}) ``` I don't know why pip pulls this version of docutils though, since we don't have any dependency that requires such a high version number: ``` $ pipdeptree -r -p docutils docutils==0.16 - botocore==1.13.50 [requires: docutils>=0.10,<0.16] - boto3==1.10.48 [requires: botocore>=1.13.48,<1.14.0] - xain-fl==0.2.0 [requires: boto3==1.10.48] - s3transfer==0.2.1 [requires: botocore>=1.12.36,<2.0.0] - boto3==1.10.48 [requires: s3transfer>=0.2.0,<0.3.0] - xain-fl==0.2.0 [requires: boto3==1.10.48] - m2r==0.2.1 [requires: docutils] - readme-renderer==24.0 [requires: docutils>=0.13.1] - twine==2.0.0 [requires: readme-renderer>=21.0] - Sphinx==2.2.1 [requires: docutils>=0.12] ``` * XP-308: print debugging information for spurious failure * XP-208 join monitor_thread to confirm response to terminate_event * XP-208 change mock patch string to object called rather than defined * XP-208 model sum aggregator as a mock for testing * XP-208 add test integrating P and C * XP-208 add identity controller as a mock for test * XP-208 small typo in id-controller docstring * XP-208 add coordinator test fixture with mocked components * XP-208 test for top-level function start_participant * XP-436 fix bug: coordinator not advertising FINISHED state * XP-436 revert logging level * XP-208 join monitor_thread to confirm response to terminate_event * XP-208 change mock patch string to object called rather than defined * XP-436 suggested documentation fixes from review * XP-208 mock the store (following rebase) * XP-208 fix import * XP-208 test_start_participant: mark as slow, comment on aggregation Co-authored-by: Corentin Henry <little-dude@users.noreply.github.com>
* XP-480 adjust variable names and imports * XP-480 temporarly fix isort * XP-480 update install req * XP-480 enable isort * XP-408 update isort config * XP-480 update sequence diagram * XP-480 update proto import
* XP-333 Replace numproto with xain-proto * XP-333 Format setup.py * XP-333 Update xain-proto * XP-333 Update docs
* XP-505: use sphinx-autodoc-typehints Ref: https://xainag.atlassian.net/browse/XP-505 This sphinx extension will allow us to simplify docstrings by not having to explicitely declaire each type, since types can be infered from the type hints in the function signatures. It also automatically generate working links, many of which are currently broken. * XP-505: clean up xain_fl.coordinator.coordinator doc Ref: https://xainag.atlassian.net/browse/XP-505
* XP-508 Replace circleci badge with github workflow * XP-508 Add link * XP-508 Fix link * XP-508 Fix styles * XP-508 Escape parentheses
On some platforms, /bin/bash does not exist. Using /usr/bin/env let users choose which `bash` should be used to run the scripts. References: - https://xainag.atlassian.net/browse/XP-498 - https://unix.stackexchange.com/questions/29608/why-is-it-better-to-use-usr-bin-env-name-instead-of-path-to-name-as-my
* XP-505: cleanup docstrings in xain_fl.coordinator This is a follow-up of #224 Reference: https://xainag.atlassian.net/browse/XP-505 * XP-505: fix links to grpc documentation Cc: Robert Steiner <robertt.debug@gmail.com>
…-proto`) (#230) * XP-227 Prepare release of v0.3.0 (matching version with `xain-fl` and `xain-sdk`) * XP-227 Update setup.py to point to released xain-proto=0.3.0
Robert-Steiner
approved these changes
Jan 23, 2020
rsaffi
changed the title
XP-277 - Merge v0.3.0 changes into
XP-227 - Merge v0.3.0 changes into Jan 23, 2020
master
master
dkravetz
approved these changes
Jan 23, 2020
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
References
XP-227 and XP-356
Summary
For keeping
master
consistent with what was released onv0.3.0
, we're merging intomaster
from the commit ondevelopment
where the release was done from.Are there any open tasks/blockers for the ticket?
No
Reviewer checklist
Reviewer agreement:
Merge request checklist
XP-XXX <a description in imperative form>
.XP-XXX <a description in imperative form>
.Code checklist
XP-XXX-<a_small_stub>
.