Hexp calculation is erroneous #47

zkamvar · 2015-07-16T18:04:42Z

The problem:

After re-reading Nei (1978), I realized that my implementation of Hexp is:

(N/(N - 1)) * (1 - sum(p^2))

Where N is the number of allelic states and p is the vector of allele frequencies. Nei's definition is:

(kn/(kn - 1)) * (1 - sum(p^2))

Where n is the number of observed samples at a locus and k is the ploidy (to account for dosage).
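To make the difference concrete, here is a minimal Python sketch (not poppr's R code) of both versions, reading each formula as the correction factor multiplied by the whole of 1 - sum(p^2). The function names and the example frequencies are made up for illustration.

```python
def nei_hexp(freqs, n_samples, ploidy):
    """Nei's (1978) unbiased expected heterozygosity at one locus.

    freqs     -- allele frequencies p (assumed to sum to 1)
    n_samples -- n, the number of observed samples at the locus
    ploidy    -- k, the ploidy
    """
    copies = ploidy * n_samples  # kn: allele copies sampled
    if copies < 2:
        raise ValueError("need at least two allele copies")
    return copies / (copies - 1) * (1.0 - sum(p * p for p in freqs))


def allelic_state_hexp(freqs):
    """The erroneous variant: corrects by N, the number of allelic states."""
    states = len(freqs)
    if states < 2:
        raise ValueError("need at least two allelic states")
    return states / (states - 1) * (1.0 - sum(p * p for p in freqs))


# Hypothetical locus: three alleles observed in ten diploid samples.
freqs = [0.5, 0.3, 0.2]
print(nei_hexp(freqs, n_samples=10, ploidy=2))  # about 0.653
print(allelic_state_hexp(freqs))                # about 0.93
```

With only a few allelic states, N/(N - 1) is much larger than kn/(kn - 1), so the erroneous version inflates the estimate considerably.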

User-facing impacts:

poppr()
locus_table()

What needs to be fixed:

internal function locus_table_pegas()

Impacts after fix:

Polyploids - Because the calculation depends on ploidy, it would be inappropriate to calculate this for polyploids, whose allele dosage is ambiguous.
locus_table(type = "genotype") - Hexp will change to the unbiased Simpson's index.

Unfortunately, I might have to wait until just before August to submit this patch, lest I anger our CRAN overlords.

zkamvar · 2015-07-16T23:28:01Z

To ensure that things are working correctly, I will write a test based on the example presented in Kosman (2003).

For haploids and diploids, the calculation will return the size-corrected index. For polyploids, locus_table will return a corrected Simpson's index while poppr will return Simpson's index. It's all very confusing...

zkamvar · 2015-07-17T03:41:59Z

Currently, the strategy is:

If it's polyploid, change to unbiased Simpson's index over alleles:

(n/(n - 1)) * (1 - sum(p_i^2))

This way a measure can actually be reached instead of having complaints of missing data in the result.

Currently, locus_table will report a different column name, and poppr probably should as well.

Now, I just need to fix the documentation:

adjust documentation in locus_table()
adjust documentation in poppr()
adjust documentation in vignette (that was wrong anyways 😩)

zkamvar · 2015-07-17T04:45:05Z

The new column name for poppr and locus_table is Mu.

zkamvar · 2015-07-17T22:31:27Z

Scratch that, reverse it.

The calculation will be

(n/(n - 1)) * (1 - sum(p^2))

where n is the number of observed alleles. This will impact polyploids and mixed ploidy populations by increasing diversity, but it's better than using kN, which would increase it even more.
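As a rough illustration of this final form, the Python sketch below computes the index directly from the alleles observed at a locus. It assumes that "number of observed alleles" means the total count of allele copies actually seen (not the number of distinct alleles); the function name and the data are hypothetical, and this is not poppr's R code.

```python
from collections import Counter


def hexp_from_alleles(alleles):
    """(n/(n - 1)) * (1 - sum(p^2)), with n the count of observed allele copies."""
    counts = Counter(alleles)
    n = sum(counts.values())  # assumption: n = total allele copies observed
    if n < 2:
        raise ValueError("need at least two observed alleles")
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Three diploid samples with full dosage (A/A, A/B, B/C): n = 6 = kn,
# so the result matches Nei's estimator for this locus.
print(hexp_from_alleles(["A", "A", "A", "B", "B", "C"]))  # about 0.733

# Two tetraploid samples with ambiguous dosage: only the distinct alleles
# seen in each genotype are counted, so n = 5 rather than kn = 8.
print(hexp_from_alleles(["A", "B", "A", "B", "C"]))  # about 0.8
```

For haploids and diploids with complete data, n equals kn under this reading, so nothing changes for them; only data with missing or ambiguous alleles get a different correction factor.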

zkamvar added a commit that referenced this issue Jul 17, 2015

checking for ploidy and correcting.

4c7f7ad

Addresses #47

zkamvar added a commit that referenced this issue Jul 17, 2015

Add test from Kosman, 2003.

a9b9fa8

This addresses #47

zkamvar added a commit that referenced this issue Jul 17, 2015

change alleles to allele

e00fd80

more adventures in #47!

zkamvar added a commit that referenced this issue Jul 17, 2015

dataploid != datploid

54e0aa5

This was why I was getting weird values. The clouds are beginning to clear on issue #47

zkamvar added a commit that referenced this issue Jul 17, 2015

fixed tests for locustable because of #47

fe2f1fe

zkamvar added a commit that referenced this issue Jul 17, 2015

change output from uSimpson to Mu

e703515

In another thrilling installment of addressing #47, we changed the output of locus_table to be Mu and not uSimpson because it's easier to type and I have a direct reference!

zkamvar added a commit that referenced this issue Jul 17, 2015

change poppr output to Mu as well

05aa5b2

update documentation and tests for #47

zkamvar added a commit that referenced this issue Jul 17, 2015

update documentation.

c86fc92

ALL FOR THE GLORY OF #47!

zkamvar added a commit that referenced this issue Jul 17, 2015

Changed equations and tests back.

ea10410

See issue #47

zkamvar mentioned this issue Jul 17, 2015

Hexp fix #48

Merged

zkamvar closed this as completed in d2c2c66 Jul 17, 2015

zkamvar mentioned this issue Jul 18, 2015

add skip_on_cran() to test-filter.R #49

Closed
