Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error using win.ia function in poppr #179

Closed
4 of 5 tasks
MarisaMiller opened this issue Apr 22, 2018 · 10 comments
Closed
4 of 5 tasks

Error using win.ia function in poppr #179

MarisaMiller opened this issue Apr 22, 2018 · 10 comments
Assignees

Comments

@MarisaMiller
Copy link

MarisaMiller commented Apr 22, 2018

Please place an "x" in all the boxes that apply

  • I have the most recent version of poppr and R
  • I have found a bug
  • I want to request a new feature

Hello,

I was recently using poppr for analysis of LD with the standardized index of association, and ran into an issue with the win.ia function. I have successfully used the samp.ia function with the same data included below, so I know that function works fine.

Below are the steps of my analysis with comments for clarity. Here is also a link to some Rdata for one of the genlight objects I am analyzing so you can reproduce: https://s3.msi.umn.edu/poppr_files/gl_1990_rmNA_LD.Rdata

#I am using poppr 2.7.1 and R version 3.4.0
#This genlight object was generated with vcfR2genlight from vcfR (1.7.0), and entirely non-typed loci (NAs) were removed

load("gl_1990_rmNA_LD.Rdata")

#I then tried to run the command below, but I get the error below.

win_ia_1990 <- win.ia(gl_1990_rmNA_LD, window = 10000L, min.snps = 3L)

#Error in seq.default(nwin) : 'from' must be a finite number
#In addition: Warning message:
#In `position<-`(`*tmp*`, value = xpos) :
 # NAs introduced by coercion to integer range

I've narrowed it down to the region of the function below. I haven't taken a look at what the adjust_position function does, but it appears related to that. Perhaps there's something wrong with my input file though?

  if (chromos) {
    CHROM <- chromosome(x)
    x <- adjust_position(x, chromosome_buffer, window)
  }

Please let me know if you have any questions or need clarification.


@zkamvar zkamvar self-assigned this Apr 23, 2018
@zkamvar
Copy link
Member

zkamvar commented Apr 29, 2018

Hi @MarisaMiller,

Thank you for reporting this issue! I have done a little checking and it indeed is a bug. Specifically, it's an integer overflow that's due to some poor engineering on my part in December of 2016 (see bb212fd and #117 for details).

Also, thank you for your patience. I've been in a bit of a transition phase and am now able to catch up on these things.

zkamvar added a commit that referenced this issue Apr 30, 2018
zkamvar added a commit that referenced this issue Apr 30, 2018
@MarisaMiller
Copy link
Author

Thanks for being so responsive @zkamvar!

zkamvar added a commit that referenced this issue Apr 30, 2018
zkamvar added a commit that referenced this issue Apr 30, 2018
This further addresses #179

This "works" at the moment, but tests do
not pass, but since I'm going to change the
logic for how chromosome boundaries are set up,
I will need to most likely rewrite the tests.
zkamvar added a commit that referenced this issue Apr 30, 2018
@zkamvar
Copy link
Member

zkamvar commented Apr 30, 2018

No problem! I'll be tagging my commits with this issue number so you can keep track of the progress and I will let you know when I've got something stable.

By the way, if you ever feel like contributing to poppr in the future, feel free to submit a pull request 😸

zkamvar added a commit that referenced this issue May 1, 2018
zkamvar added a commit that referenced this issue May 1, 2018
zkamvar added a commit that referenced this issue May 1, 2018
zkamvar added a commit that referenced this issue May 1, 2018
This addresses #179 [skip ci]

The current tests will not pass.
zkamvar added a commit that referenced this issue May 1, 2018
@zkamvar
Copy link
Member

zkamvar commented May 1, 2018

Hi @MarisaMiller, I believe I have a working version now. Do you want to try it out? You can download it with one of the various remote installer packages in R like devtools, remotes, or ghit:

remotes::install_github("grunwaldlab/poppr@positiion")

Let me know if that works for you and I will merge it into the master branch and will aim to have it on CRAN in a couple of weeks.

@MarisaMiller
Copy link
Author

Hi @zkamvar, I just tested it out and it seems to be working fine! My only suggestion, if not too much trouble, is to also give the window coordinates with the contig (or chromosome) name? This might be useful for extracting the precise coordinates of interesting windows. I think it might be challenging to extract this information after the fact, particularly if the min.snps filter is used and the windows returned are not necessarily contiguous along each contig/chrom. But, this isn't really necessary just a suggestion.

@MarisaMiller
Copy link
Author

I just saw that there are NAs returned in the output of win.ia, so I assume these are windows that don't meet the min.snps cutoff? In that case, returning the coordinates is not really necessary :)

@zkamvar
Copy link
Member

zkamvar commented May 1, 2018

Yes, the NA positions are those that don't meet the min.snps cutoff. You can check this manually:

tmp <- tempfile(fileext = ".Rdata")
download.file("https://s3.msi.umn.edu/poppr_files/gl_1990_rmNA_LD.Rdata", destfile = tmp)
library("poppr")
load(tmp)
chroms <- chromosome(gl_1990_rmNA_LD)
# Calculate for first six chromosomes
win_ia_1990 <- win.ia(gl_1990_rmNA_LD[, chroms %in% head(levels(chroms))], window = 10000L, min.snps = 3L, threads = 4L)

# Split results by chromosomes
win_by_chrom <- split(win_ia_1990, names(win_ia_1990))

# Find the NAs
lapply(win_by_chrom, function(i) which(is.na(i)))
#> $`000000F`
#> named integer(0)
#> 
#> $`000001F`
#> named integer(0)
#> 
#> $`000002F`
#> named integer(0)
#> 
#> $`000003F`
#> named integer(0)
#> 
#> $`000004F`
#> named integer(0)
#> 
#> $`000005F`
#> 000005F 000005F 
#>      80      85

# Show how many loci are in these windows.
# 
# window 80: one locus
nLoc(gl_1990_rmNA_LD[, position(gl_1990_rmNA_LD) %in% 790001:800000 & chromosome(gl_1990_rmNA_LD) == "000005F"])
#> [1] 1
# window 84: thirty loci
nLoc(gl_1990_rmNA_LD[, position(gl_1990_rmNA_LD) %in% 830001:840000 & chromosome(gl_1990_rmNA_LD) == "000005F"])
#> [1] 30
# window 85: one locus
nLoc(gl_1990_rmNA_LD[, position(gl_1990_rmNA_LD) %in% 840001:850000 & chromosome(gl_1990_rmNA_LD) == "000005F"])
#> [1] 1

Created on 2018-05-01 by the reprex package (v0.2.0).

In the new version, the windows should tile from position one all the way to include the last position. The code here shows how the number of windows are calculated from the position vector and the chromosome vector:

poppr/R/bitwise.r

Lines 419 to 424 in cbd21a6

CHROM <- as.character(chromosome(x))
chrom_names <- unique(CHROM)
pos_per_chrom <- split(xpos, CHROM)[chrom_names]
win_per_chrom <- ceiling(vapply(pos_per_chrom, max, integer(1))/window)
names(win_per_chrom) <- chrom_names
nwin <- sum(win_per_chrom)

That being said, I don't see why I can't include an option to add the terminal position of the window to the names of the vector. Thanks for the suggestion!

zkamvar added a commit that referenced this issue May 1, 2018
@zkamvar
Copy link
Member

zkamvar commented May 1, 2018

I've gone ahead and added your suggestion as a new option called name_window, which defaults to TRUE. Please download the new version (with the same command above) and let me know if the output is useful 😁

@MarisaMiller
Copy link
Author

I just tried the new option and it looks great! Thanks for adding that in! I was thinking of doing something like the strategy you posted prior to adding the option, but it wouldn't have been nearly as elegant 😆

zkamvar added a commit that referenced this issue May 2, 2018
zkamvar added a commit that referenced this issue May 2, 2018
zkamvar added a commit that referenced this issue May 2, 2018
zkamvar added a commit that referenced this issue May 2, 2018
This further addresses #179

This "works" at the moment, but tests do
not pass, but since I'm going to change the
logic for how chromosome boundaries are set up,
I will need to most likely rewrite the tests.
zkamvar added a commit that referenced this issue May 2, 2018
zkamvar added a commit that referenced this issue May 2, 2018
zkamvar added a commit that referenced this issue May 2, 2018
zkamvar added a commit that referenced this issue May 2, 2018
zkamvar added a commit that referenced this issue May 2, 2018
This addresses #179 [skip ci]

The current tests will not pass.
zkamvar added a commit that referenced this issue May 2, 2018
zkamvar added a commit that referenced this issue May 2, 2018
@zkamvar
Copy link
Member

zkamvar commented May 2, 2018

Awesome! The changes have been merged into the master branch. Thank you again for the very detailed and straightforward report and timely followups.

@zkamvar zkamvar closed this as completed May 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants