
AI-SDC/SACRO-ML


SACRO-ML


An increasing body of work has shown that machine learning (ML) models may expose confidential properties of the data on which they are trained. This has resulted in a wide range of proposed attack methods with varying assumptions that exploit the model structure and/or behaviour to infer sensitive information.

The sacroml package is a collection of tools and resources for managing the statistical disclosure control (SDC) of trained ML models. In particular, it provides:

  • A safemodel package that extends commonly used ML models to provide ante-hoc SDC by assessing the theoretical risk posed by the training regime (such as the combination of hyperparameters, dataset, and architecture) before (potentially) costly model fitting is performed. It also ensures that best practice is followed with respect to privacy, e.g., using differentially private optimisers where available. For large models and datasets, ante-hoc analysis can yield significant time and cost savings by avoiding the training of models that intensive post-hoc analysis would likely find disclosive.
  • An attacks package that provides post-hoc SDC by assessing the empirical disclosure risk of a classification model through a variety of simulated attacks after training. It offers an integrated suite of attacks with a common application programming interface (API) and is designed to accommodate additional state-of-the-art attacks as they become available. In addition to membership inference attacks (MIA), such as the likelihood ratio attack (LiRA), and attribute inference attacks, the package provides novel structural attacks that report cheap-to-compute metrics, which can indicate model disclosiveness after fitting but before running more computationally expensive MIAs.
  • Summaries of the results are written in a simple human-readable report.
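The ante-hoc check described above amounts to comparing a proposed training configuration against recommended safe ranges before any fitting occurs. The sketch below illustrates the idea only: the parameter names and thresholds here are purely hypothetical, and the actual safemodel classes, rules, and API in sacroml differ per estimator.

```python
# Hypothetical safe ranges for illustration; real safemodel rules
# are estimator-specific and maintained inside the sacroml package.
SAFE_PARAMS = {
    "min_samples_leaf": ("min", 5),  # leaves smaller than this risk memorisation
    "max_depth": ("max", 8),         # very deep trees can encode individual records
}

def preliminary_check(params):
    """Compare proposed hyperparameters to safe ranges before training.

    Returns (ok, messages): ok is True when no rule is violated, and
    messages lists each violated rule in human-readable form.
    """
    msgs = []
    for name, (kind, bound) in SAFE_PARAMS.items():
        value = params.get(name)
        if value is None:
            continue
        if kind == "min" and value < bound:
            msgs.append(f"{name}={value} is below the safe minimum {bound}")
        if kind == "max" and value > bound:
            msgs.append(f"{name}={value} exceeds the safe maximum {bound}")
    return (not msgs, msgs)
```

Because such a check inspects only the configuration, it costs essentially nothing compared with fitting the model and then running simulated attacks.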

Installation

Python Package Index

$ pip install sacroml

Note: macOS users may need to install libomp due to a dependency on XGBoost:

$ brew install libomp

Conda

$ conda install sacroml

Usage

Quick-start example:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from sacroml.attacks.likelihood_attack import LIRAAttack
from sacroml.attacks.target import Target

# Load dataset
X, y = load_breast_cancer(return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Fit model
model = RandomForestClassifier(min_samples_split=2, min_samples_leaf=1)
model.fit(X_train, y_train)

# Wrap model and data
target = Target(
    model=model,
    dataset_name="breast cancer",
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
)

# Create an attack object and run the attack
attack = LIRAAttack(n_shadow_models=100, output_dir="output_example")
attack.attack(target)
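The LIRAAttack above trains shadow models and scores each record with a likelihood ratio. As a rough illustration of the underlying statistic only (not sacroml's implementation), LiRA compares the target model's logit-scaled confidence on a record against Gaussian fits to confidences from shadow models trained with and without that record:

```python
import math
import statistics

def logit(p, eps=1e-6):
    """Map a confidence in (0, 1) onto the log-odds scale used by LiRA."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def gaussian_logpdf(x, mu, sigma):
    """Log-density of x under a Gaussian with mean mu and std sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - (x - mu) ** 2 / (2.0 * sigma**2)

def lira_score(target_conf, in_confs, out_confs):
    """Log-likelihood ratio for one record.

    in_confs / out_confs are confidences from shadow models trained
    with / without the record. Positive scores suggest the record was
    a training member; negative scores suggest it was not.
    """
    x = logit(target_conf)
    ins = [logit(c) for c in in_confs]
    outs = [logit(c) for c in out_confs]
    lp_in = gaussian_logpdf(x, statistics.mean(ins), statistics.stdev(ins))
    lp_out = gaussian_logpdf(x, statistics.mean(outs), statistics.stdev(outs))
    return lp_in - lp_out
```

For example, a record on which the target model is about as confident as shadow models that saw it during training receives a positive score, while a record matching the "out" distribution receives a negative one. Increasing n_shadow_models tightens the two Gaussian fits at the cost of more training runs.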

For more information, see the examples.

Documentation

See API documentation.

Contributing

See our contributing guide.

Acknowledgement

This work was supported by UK Research and Innovation as part of the Data and Analytics Research Environments UK (DARE UK) programme, delivered in partnership with Health Data Research UK (HDR UK) and Administrative Data Research UK (ADR UK). The specific projects were Semi-Automated Checking of Research Outputs (SACRO; MC_PC_23006), Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMATTER; MC_PC_21033), and TREvolution (MC_PC_24038). This project has also been supported by MRC and EPSRC (PICTURES; MR/S010351/1).
