Skip to content

Speeding up whichND

Zakariyya Mughal edited this page Jan 5, 2015 · 1 revision

If you have code like this:

$im = rfits('some_image.fits')
for $val ($im->uniq->list){
    next unless $val;
    $coord = whichND($im==$val)->qsortvec;
##do something with $coord and $val and $im
}

where $im is a large image (1024 x 1024, for example) that is mostly zeroes, but has regions of integer labels, sort of like this,:

[
[ 0 0 1 1 0 0]
[ 2 0 0 1 0 0]
[ 2 2 0 0 0 3]
[ 0 0 0 0 3 0]
]

then you may notice that that code is very slow, because the entire image $im must be searched for matches $im->uniq->nelem-1 times. In one example where there were ~2500 unique values, the above code fragment took about 30 seconds to run, and that doesn't even include doing anything interesting!

An alternate method of getting the $coord piddles is below:

use PDL::Math; #not included in PDL::LiteF, needed for floor() 
my $vind = $im->flat->qsorti;  #because of the qsort, similarly-labeled pixels of interest don't even have to be contiguous
my $vals = $im->flat->index($vind)->sever; #we can avoid two qsort calls

my ($x,$y) = $im->one2nd($vind);

my ($v_r_a,$v_r_b)=rle($vals);  #run-length-encode the $vals
#rle says only the elements up to the first 0 in $v_r_a should be considered, so we pare those down
($v_r_a,$v_r_b)=where($v_r_a,$v_r_b,$v_r_a!=0);

my $vra_cumsum = cumusumover($v_r_a);

my ($val,$coord);
for my $i(0..$v_r_b->nelem-1){
  $val = $v_r_b->(($i));
  next unless $val;
  my ($start,$end);
  $end = $vra_cumsum(($i))-1;
  $start = $i ? $vra_cumsum(($i-1)) : 0;  #trinary op needed for the first element because of how rle() does counting.
  $coord = $x($start:$end)->glue(1,$y($start:$end))->transpose->qsortvec;
}

The qsortvec at the end there is probably extraneous, but was useful for comparing the output of this method with the original whichND code. Notice there is no check for equality inside the loop (or anywhere at all except the $v_r_a and _b pare-down, and I suppose implicitly in the qsorts). Everything is accomplished using slices. This code fragment takes less than 0.5 seconds, a factor of 60 improvement in speed. Obviously this will only work for sparse images like my sample above. Generalizing it to more than 2 dimensions, shouldn't be too hard.

Clone this wiki locally