Benchmarking FL strategies on FLamby with benchopt

This benchmark is dedicated to tuning cross-silo FL strategies on FLamby's datasets. The goal is to maximize the metric averaged across the validation/test clients, using the model provided with each dataset:

$$\max_{\theta} \frac{1}{K} \sum_{k=0}^{K-1} m(f_{\theta}(X_{k}), y_{k})$$

where $K$ is the number of clients participating in the federated learning training, $p$ (or n_features) is the number of features, $\theta$ denotes the model parameters, of dimension $N$, and $m$ is the metric of interest:

$$X \in \mathbb{R}^{n \times p}, \quad \theta \in \mathbb{R}^N$$

To ease comparison, we fix the number of local updates per round to 100 and the maximum number of rounds to 120 (12 * 10).
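As a rough illustration of the objective above, the sketch below averages a metric over clients, each client counting equally. The names average_client_metric, accuracy, and threshold_model, as well as the toy data, are assumptions made for this example; they are not part of FLamby or benchopt.

```python
def accuracy(y_pred, y_true):
    """Fraction of correct predictions (a stand-in for the metric m)."""
    correct = sum(int(p == t) for p, t in zip(y_pred, y_true))
    return correct / len(y_true)


def average_client_metric(predict, clients, metric):
    """Unweighted average of metric(predict(X_k), y_k) over the clients."""
    scores = [metric(predict(X_k), y_k) for X_k, y_k in clients]
    return sum(scores) / len(scores)


def threshold_model(X):
    """Toy model f_theta: predicts 1 when the first feature is positive."""
    return [int(row[0] > 0) for row in X]


# Hypothetical toy clients, each a pair (X_k, y_k).
clients = [
    ([[1.0], [-1.0]], [1, 0]),                       # 2 of 2 correct
    ([[2.0], [3.0], [-0.5], [0.5]], [1, 0, 0, 1]),   # 3 of 4 correct
]

score = average_client_metric(threshold_model, clients, accuracy)
print(score)  # (1.0 + 0.75) / 2 = 0.875
```

A solver that improves this averaged score on the validation clients is, under these assumptions, improving the quantity the benchmark maximizes.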

Try to beat FLamby's baselines by adding your own solver!

You can even use your favorite Python FL frameworks, such as substra or FedBioMed, to build your solver!

Install

First, clone FLamby and install it using the following commands (see the API doc if needed):

$ git clone https://github.com/owkin/FLamby.git
$ cd FLamby
$ conda create -n benchmark_flamby
$ conda activate benchmark_flamby
$ pip install -e ".[all_extra]" # Note that the all_extra option installs all dependencies for all 7 datasets

This benchmark can then be run on Fed-TCGA-BRCA's validation sets using the following commands, which launch a grid search over all parameters found in utils/common.py for the FederatedAveraging strategy, running 120 rounds (--max-runs 12, i.e. 12 * 10 rounds) with 100 local updates per round:

$ pip install -U benchopt
$ cd ..
$ git clone https://github.com/owkin/benchmark_flamby
$ cd benchmark_flamby
$ benchopt run --timeout 24h --max-runs 12 -s FederatedAveraging -d Fed-TCGA-BRCA

To test specific hyper-parameter values, fill a YAML config file with the appropriate hyper-parameters for each solver, following the example_config.yml file:

$ benchopt run --config ./example_config.yml

Or pass them directly on the command line:

$ benchopt run -s FederatedAveraging[batch_size=32,learning_rate=0.031622776601683794]
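To try several hyper-parameter values this way, the solver selectors can be generated programmatically. The following is a minimal sketch: the helper solver_selectors and the grid values are hypothetical and chosen only for illustration (the grid actually used by the benchmark lives in utils/common.py).

```python
from itertools import product


def solver_selectors(solver_name, grid):
    """Return one 'name[key=value,...]' selector per point of the grid."""
    keys = list(grid)
    selectors = []
    for values in product(*(grid[k] for k in keys)):
        params = ",".join(f"{k}={v}" for k, v in zip(keys, values))
        selectors.append(f"{solver_name}[{params}]")
    return selectors


# Hypothetical grid, for illustration only.
grid = {"batch_size": [32, 64], "learning_rate": [0.1, 0.01]}

for selector in solver_selectors("FederatedAveraging", grid):
    # Prints one command per grid point; nothing is executed here.
    print(f"benchopt run -s {selector}")
```

Depending on the shell, the bracketed selector may need to be quoted so that it is not interpreted as a glob pattern.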

For the whole benchmark on Fed-TCGA-BRCA, we successively run all hyper-parameters of the grid for all strategies. To reproduce the results, launch the following command (note that it takes several hours to complete, but the results can be cached):

$ bash launch_validation_benchmarks.sh

This script should reproduce the HTML plot shown in the results for Fed-TCGA-BRCA and produce a config file with the best validation hyper-parameters for each strategy.
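The selection step can be pictured as keeping, for each strategy, the grid point with the highest validation metric. The sketch below is not the script's actual implementation: the function best_per_strategy, the row layout, the name OtherStrategy, and the numbers are all made up for illustration.

```python
def best_per_strategy(results):
    """Keep, for each strategy, the row with the highest validation metric."""
    best = {}
    for row in results:
        name = row["strategy"]
        if name not in best or row["val_metric"] > best[name]["val_metric"]:
            best[name] = row
    return best


# Made-up rows for illustration only; these are not benchmark results.
results = [
    {"strategy": "FederatedAveraging", "params": {"learning_rate": 0.1}, "val_metric": 0.5},
    {"strategy": "FederatedAveraging", "params": {"learning_rate": 0.01}, "val_metric": 0.75},
    {"strategy": "OtherStrategy", "params": {"learning_rate": 0.1}, "val_metric": 0.25},
]

for name, row in best_per_strategy(results).items():
    print(name, row["params"], row["val_metric"])
```

Selecting on validation clients only, and reporting on test clients afterwards, keeps the test metric from being used for tuning.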

To produce the final plot on the test sets, run:

$ benchopt run --timeout 24h --config ./best_config_test_Fed-TCGA-BRCA.yml

To benchmark on other FLamby datasets, follow FLamby's instructions to download each dataset (for example, Fed-Heart-Disease's download instructions). Once the dataset is downloaded, run the same commands with a different dataset argument, i.e.:

For the validation:

$ bash launch_validation_benchmarks.sh Fed-Heart-Disease

For the results on the test sets:

$ benchopt run --timeout 24h --config ./best_config_found_for_heart_disease.yml

Use benchopt run -h for more details about these options, or visit https://benchopt.github.io/api.html.

FAQ

Unfortunately, some of FLamby's dependencies still rely on old sklearn versions; see the sklearn doc for ways to fix it. One way is to set the SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL environment variable to True. On Linux, do:

$ export SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True

$ ModuleNotFoundError: No module named 'flamby.whatever'
