Analysing Physician-Patient Referral Network Topology
Apparent is a Python toolkit for analyzing patient referral flows within US healthcare systems using medical claims data (Medicare). It provides functionality for building patient referral networks and analyzing them via discrete curvature and persistent homology, with the aim of supporting further research into using network analysis to improve the efficiency and equity of the US healthcare system.
- Prototype Tool: apparent.topology.rocks
- Paper: Analyzing Physician-Patient Referral Networks Using Discrete Curvature and Persistent Homology
- Build Networks: Generate directed and undirected graphs representing referral relationships between physicians and patients.
- Describe Networks: Compute node- and edge-level features like degree, clustering coefficients, betweenness centrality, curvature measures, and persistence diagrams.
- Compare Networks: Measure pairwise similarity or distances between networks using curvature metrics and topological features.
- Embed Networks: Map networks into a lower-dimensional vector space for visualization and machine learning tasks.
- Cluster Networks: Group networks into clusters based on structural or functional similarity, leveraging techniques like k-means, hierarchical clustering, and DBSCAN.
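To make the curvature features listed above more concrete, here is a minimal sketch of the simplest combinatorial form of Forman curvature on an unweighted, undirected graph, where an edge (u, v) gets the value 4 - deg(u) - deg(v). The function name `forman_curvature` and the toy edge list are illustrative assumptions, not part of the Apparent API, and the toolkit may well use a weighted or augmented variant.

```python
from collections import defaultdict


def forman_curvature(edges):
    """Simplest Forman curvature for an unweighted, undirected simple graph.

    Hypothetical helper for illustration only; not the Apparent implementation.
    Each edge (u, v) is assigned 4 - deg(u) - deg(v).
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)

    curvature = {}
    for u, v in edges:
        if u == v:
            continue
        curvature[(u, v)] = 4 - len(neighbors[u]) - len(neighbors[v])
    return curvature


# Toy referral graph: physician "A" is a hub; "B" and "C" also share patients.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
toy_curvature = forman_curvature(toy_edges)
for edge, value in toy_curvature.items():
    print(edge, value)
```

Edges attached to high-degree hubs receive more negative values, which is why curvature is useful for separating hub-like from chain-like referral structure.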
APPARENT uses uv as the package manager, which provides faster dependency resolution and installation.
git clone https://github.com/aidos-lab/apparent.git
cd apparent
If you don't already have uv, install it with pip:
pip install uv
To install dependencies, run:
uv sync
This creates a .venv folder in the root directory.
Activate the new virtual environment with:
source .venv/bin/activate
touch .env
echo APPARENT_URL="https://apparent.topology.rocks/us_physician_referral_networks.csv" >> .env
This stores the location of the database in the APPARENT_URL variable.
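For readers who want to check what ends up in the .env file, the following sketch parses KEY=VALUE lines using only the standard library. The helper `parse_env` is a hypothetical name introduced here; how Apparent itself reads the file is not shown in this README, so treat this as an assumption-laden illustration.

```python
def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text into a dict.

    Hypothetical helper for illustration; blank lines and '#' comments
    are skipped, and surrounding quotes around values are removed.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# The echo command above writes a line like this one into .env.
example = "APPARENT_URL=https://apparent.topology.rocks/us_physician_referral_networks.csv\n"
settings = parse_env(example)
print(settings["APPARENT_URL"])
```

To inspect a real file, pass the result of `pathlib.Path(".env").read_text()` to the same function.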
Most actions can be completed using the Apparent object, including the following functionality:
- Pull Data: Extract data from our datasette tool (or local instance).
- Build Networks: Construct physician-patient referral graphs from the extracted data.
- Describe Networks: Compute network features such as curvature, centrality, and clustering coefficients.
- Compare Networks: Analyze pairwise distances between networks using metrics like Forman curvature and Ollivier-Ricci curvature.
- Embed Networks: Reduce dimensionality for visualization and machine learning.
- Cluster Networks: Group similar networks using clustering algorithms like KMeans, DBSCAN, and hierarchical clustering.
Here's a quick example of how to pull specific physician referral networks using apparent!
from apparent import Apparent
# Initialize Apparent
A = Apparent()
# Example SQL query for fetching data
my_query = """
SELECT
hospital_atlas_data.hsa,
hospital_atlas_data.year,
hospital_atlas_data.latitude,
hospital_atlas_data.longitude
FROM
hospital_atlas_data
WHERE
hospital_atlas_data.year = 2017
LIMIT
10;
"""
# Pull data from the database
A.pull(my_query)
# Build referral networks
A.build_networks()
# Compare networks based on Forman curvature
A.compare(measure="forman_curvature")
# Embed networks into a lower-dimensional space
A.embed()
# Cluster networks based on structural similarity
A.cluster_networks()
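As a rough illustration of what comparing networks by curvature can mean, the sketch below computes the 1-Wasserstein distance between two empirical distributions of edge curvature values. This is one common choice of distance and is an assumption on our part; the README does not state which distance `A.compare` uses. The function name `wasserstein_1d` and the sample values are hypothetical.

```python
from bisect import bisect_right


def wasserstein_1d(a, b):
    """1-Wasserstein distance between two non-empty 1-D samples.

    Integrates the absolute difference of the two empirical CDFs
    over the merged set of sample points.
    """
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(a, left) / len(a)  # fraction of a <= left
        cdf_b = bisect_right(b, left) / len(b)  # fraction of b <= left
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


# Made-up edge curvature values for two small networks.
curvatures_a = [-1, -1, 0, 0]
curvatures_b = [-3, -2, 0]
distance = wasserstein_1d(curvatures_a, curvatures_b)
print(distance)
```

Computing this for every pair of networks gives a distance matrix, which is the kind of input that embedding and clustering steps typically consume.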
Contributions are welcome! To contribute:
- Fork the repository.
- Create a new branch (git checkout -b feature-name).
- Commit changes (git commit -m 'Add feature').
- Push to your branch (git push origin feature-name).
- Open a Pull Request.
This project uses pytest for testing. The tests are divided into two categories: unit and integration.
Unit tests run against the live apparent.topology.rocks service. These tests are run automatically in CI on pushes to main and develop.
To run the unit tests locally, you will need to set the APPARENT_URL environment variable in a .env file in the root of the project:
echo APPARENT_URL="https://apparent.topology.rocks/us_physician_referral_networks.csv" >> .env
Then, you can run the unit tests:
pytest -m unit
Warning: The remote service has the following known limitations:
- It is not possible to pull the largest networks.
- There may be HTTP errors for oversized queries.
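One possible way to work around oversized queries is to split a query into pages with LIMIT and OFFSET, which SQLite (the engine behind Datasette) supports. The sketch below only builds the query strings; the helper `paginate_query` is hypothetical, and whether paging avoids the HTTP errors on the remote service is an assumption that has not been verified here.

```python
def paginate_query(base_query, page_size, n_pages):
    """Yield copies of base_query with LIMIT/OFFSET appended, one per page.

    Hypothetical helper for illustration. The base query should not contain
    its own LIMIT clause and should include an ORDER BY so that the pages
    are stable across requests.
    """
    base = base_query.strip().rstrip(";")
    for page in range(n_pages):
        yield f"{base}\nLIMIT {page_size} OFFSET {page * page_size};"


base = """
SELECT hospital_atlas_data.hsa, hospital_atlas_data.year
FROM hospital_atlas_data
WHERE hospital_atlas_data.year = 2017
ORDER BY hospital_atlas_data.hsa;
"""
pages = list(paginate_query(base, page_size=100, n_pages=3))
for query in pages:
    print(query)
    print("---")
```

Each generated string could then be passed to `A.pull` in turn, keeping every individual request small.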
A script is provided to simplify running the integration tests. This script handles:
- Downloading the raw dataset (under data/us_physician_referral_networks.db).
- Launching a local Datasette server.
- Executing the integration test suite.
Warning: The dataset is large (approximately 8 GB) and may take considerable time to download depending on your internet speed.
To execute the script, run the following command from the root directory:
bash tests/run-integration-tests.sh
This project is licensed under the BSD-3 License. See the LICENSE file for details.
For questions, feedback, or collaboration opportunities, please contact the AIDOS Lab.