Some statistical terminology fixes #232

Merged · 1 commit · Apr 29, 2025
20 changes: 10 additions & 10 deletions jupyter/sum-to-zero/sum_to_zero_evaluation.qmd
@@ -167,7 +167,7 @@ The spatial models are taken from a set of notebooks available from GitHub r
In this section we consider a model which estimates per-demographic disease prevalence rates for a population.
The model is taken from the Gelman and Carpenter, 2020
[Bayesian Analysis of Tests with Unknown Specificity and Sensitivity](https://doi.org/10.1111/rssc.12435).
It combines a model for multilevel regression and post-stratification with a likelihood that
It combines a model for multilevel regression and post-stratification with a model that
accounts for test sensitivity and specificity.
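
As a quick illustration of that adjustment (not part of the notebook; the prevalence, sensitivity, and specificity values below are made up), the probability that a test comes back positive mixes true positives with false positives:

```python
# Hypothetical values, for illustration only.
p = 0.05      # true prevalence in one demographic cell (assumed)
sens = 0.90   # test sensitivity: P(test positive | infected) (assumed)
spec = 0.95   # test specificity: P(test negative | not infected) (assumed)

# A positive test is either a true positive or a false positive on a true negative.
p_sample = p * sens + (1 - p) * (1 - spec)
print(p_sample)  # approximately 0.0925: the expected share of positive tests
```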

The data consists of:
@@ -221,7 +221,7 @@ transformed parameters {
vector[N] p_sample = p * sens + (1 - p) * (1 - spec);
}
model {
pos_tests ~ binomial(tests, p_sample); // likelihood
pos_tests ~ binomial(tests, p_sample); // data model
// scale normal priors on sum_to_zero_vectors
beta_age ~ normal(0, s_age * sigma_age);
beta_eth ~ normal(0, s_eth * sigma_eth);
@@ -240,7 +240,7 @@ In the `generated quantities` block we use Stan's PRNG functions to populate
the true weights for the categorical coefficient vectors, and the relative percentages
of per-category observations.
Then we use a set of nested loops to generate the data for each demographic,
using the PRNG equivalent of the model likelihood.
using the PRNG equivalent of the data model.

The full data-generating program is in file [gen_binomial_4_preds.stan](https://github.com/stan-dev/example-models/tree/master/jupyter/sum-to-zero/stan/binomial_4_preds_ozs.stan).
The helper function `simulate_data` in file `utils.py` sets up the data-generating program
@@ -249,7 +249,7 @@ observations per category, and baseline disease prevalence, test specificity and
This allows us to create datasets for large and small populations
and for finer or more coarse-grained sets of categories.
The larger the number of strata overall, the more observations are needed to get good coverage.
Because the modeled data `pos_tests` is generated according to the Stan model's likelihood,
Because the modeled data `pos_tests` is generated according to the Stan model,
the model is a priori well-specified with respect to the data.
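
As a rough sketch of what "generated according to the Stan model" means here (this is not the notebook's generator; the cell counts, prevalences, and test characteristics below are placeholders), the observed counts can be drawn with NumPy from the same binomial data model:

```python
import numpy as np

rng = np.random.default_rng(1234)

# Placeholder inputs; the notebook's generator derives per-demographic prevalence
# from the regression coefficients rather than fixing it directly.
tests = np.array([150, 200, 80, 120])     # tests per demographic cell (assumed)
p = np.array([0.03, 0.06, 0.10, 0.04])    # true prevalence per cell (assumed)
sens, spec = 0.90, 0.95                   # test sensitivity and specificity (assumed)

# Same adjustment as in the transformed parameters block, then the PRNG
# equivalent of pos_tests ~ binomial(tests, p_sample).
p_sample = p * sens + (1 - p) * (1 - spec)
pos_tests = rng.binomial(tests, p_sample)
```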


@@ -473,15 +473,15 @@ In the case of lots of observations and only a few categories they do a better j
* In almost all cases, estimates for each parameter are the same across implementations to 2 significant figures.
In a few cases they are off by 0.01; where they are off, the percentage of observations for that parameter is correspondingly low.

* The `sum_to_zero_vector` implementation has the highest number of effective samples per second,
* The `sum_to_zero_vector` implementation has the highest effective sample size per second,
excepting a few individual parameters for which the hard sum-to-zero performs equally well.


#### Model efficiency
#### Sampling efficiency

Model efficiency is measured by iterations per second, however, as the draws from the MCMC sampler
may be correlated, we need to compute the number of effective samples across all chains
divided by the total sampling time - this is *ESS_bulk/s*, the effective samples per second.
Sampling efficiency is measured by iterations per second; however, as the draws from the MCMC sampler
may be correlated, we need to compute the effective sample size across all chains
divided by the total sampling time: this is *ESS_bulk/s*, the effective sample size per second.
The following table shows the average runtime for 100 runs
of each of the three models on large and small datasets.
This data was generated by script `eval_efficiencies.py`.
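
A minimal sketch of how ESS_bulk/s could be computed with CmdStanPy and ArviZ (this is not `eval_efficiencies.py`; the Stan file and data file names are placeholders, and wall-clock time stands in for the sampler's reported timing):

```python
import time
import arviz as az
from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file="binomial_model.stan")  # placeholder for one of the three implementations

t0 = time.perf_counter()
fit = model.sample(data="binomial_data.json", chains=4, seed=12345)  # placeholder data file
elapsed = time.perf_counter() - t0  # total sampling time (wall clock)

# Bulk effective sample size per parameter, pooled across all chains.
ess_bulk = az.ess(az.from_cmdstanpy(fit), method="bulk")
ess_per_second = ess_bulk / elapsed  # ESS_bulk/s
print(ess_per_second)
```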
@@ -784,7 +784,7 @@ brklyn_qns_data = {"N":brklyn_qns_gdf.shape[0],

#### Model fitting

The BYM2 model requires many warmup iterations in order to reach convergence for all parameters,
The BYM2 model requires many warmup iterations in order to MCMC to converge for all parameters,

[Review comment, Member] "in order for MCMC to converge"

including hyperparameters `rho` and `sigma`.
We run all three models using the same seed, in order to make the initial parameters as similar
as possible.
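
A sketch of that fitting loop with CmdStanPy, assuming hypothetical file names for the three BYM2 variants and an illustrative warmup length; only `brklyn_qns_data` and the use of a shared seed come from the notebook:

```python
from cmdstanpy import CmdStanModel

# Hypothetical file names for the three sum-to-zero implementations.
variants = ["bym2_ozs.stan", "bym2_hard.stan", "bym2_soft.stan"]

fits = {}
for stan_file in variants:
    model = CmdStanModel(stan_file=stan_file)
    # Same seed for every variant so the randomly generated initial values
    # are as similar as possible across models.
    fits[stan_file] = model.sample(
        data=brklyn_qns_data,   # as assembled earlier in the notebook
        chains=4,
        seed=12345,
        iter_warmup=2000,       # "many warmup iterations"; the exact number is assumed
        iter_sampling=1000,
    )
```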