Open
Description
I am trying to filter variants by chromosome and position.
Here's my code:
fn = ps.seqExample('1KG_autosomes_phase3_shapeit2_mvncall_integrated_v5_20130502_lzma.seq.gds')
f = ps.SeqArrayFile()
f.open(fn)
f.FilterReset()
# filter for chromosome
chrs_all = f.GetData('chromosome')
chrs_sel = chrs_all == '22'
f.FilterSet2(variant=chrs_sel)
# returns: # of selected variants: 1,103,547
pos_on_chrom = f.GetData('position')
# there are 3 variants with positions < 16050319
f.FilterSet2(variant=pos_on_chrom<16050319, intersect=True)
# returns: # of selected variants: 1,103,547
Given that there 3 variants with position < 16050319 shouldnt the last call to FilterSet2 return 3 variants?
Update
Of course some simple boolean slicing does the job, maybe I misunderstood the intention of intersect=True
?
sel = np.zeros_like(chrs_all, dtype=bool)
np.putmask(sel, chrs_sel, pos_on_chrom < 16050319) # there are 3 variants with positions < 16050319
f.FilterSet2(variant=sel)
# returns: # of selected variants: 3
Metadata
Metadata
Assignees
Labels
No labels