
wfmash killed #168

Closed
mictadlo opened this issue Feb 14, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@mictadlo

mictadlo commented Feb 14, 2024

Description of the bug

Hi,
wfmash always gets killed.

Command used and terminal output

conda activate samtools

cat ../genomes/ChinaLab/NbeHZ1_genome_1.0.fa ../genomes/JapanLab/Nbe_v1_scf.fa ../genomes/LAB360/NbLab360.genome.fasta ../genomes/UsaLab/Niben261_genome.fasta > ../genomes/chnVSjapVSauVSusa.fasta

bgzip -@ 4 ../genomes/chnVSjapVSauVSusa.fasta
samtools faidx ../genomes/chnVSjapVSauVSusa.fasta.gz
3.3G  chnVSjapVSauVSusa.fasta.gz
2.8M  chnVSjapVSauVSusa.fasta.gz.gzi
854K  chnVSjapVSauVSusa.fasta.gz.fai
NXF_OPTS='-Xms1g -Xmx4g'

nextflow run nf-core/pangenome \
      -r 1.0.0 \
      --wfmash_chunks 1200 \
      --input ../genomes/chnVSjapVSauVSusa.fasta.gz \
      --n_haplotypes 4 \
      --outdir results \
      -profile singularity \
      -resume

What am I missing?



### Relevant files
[nextflow.log.txt](https://github.com/nf-core/pangenome/files/14276661/nextflow.log.txt)




### System information

PBSpro, HPC, OpenSuse
@mictadlo mictadlo added the bug Something isn't working label Feb 14, 2024
@mictadlo
Author

Please find the logs here:
nextflow.log.txt

@subwaystation
Collaborator

How much RAM does the node provide? How large are your input sequences again?

@mictadlo
Author

We have various nodes: some have 300 GB, some 1 TB, and one has 6 TB of RAM. The sysadmin told me that the pipeline has 74 GB hardcoded.

Each FASTA file is around 2.8 GB.

> abyss-fac ChinaLab/NbeHZ1_genome_1.0.fa JapanLab/Nbe_v1_scf.fa  LAB360/NbLab360.genome.fasta UsaLab/Niben261_genome.fasta

n	n:500	L50	min	N75	N50	N25	E-size	max	sum	name
2831	2831	10	1000	136e6	143.2e6	148.2e6	142.1e6	183.5e6	2.929e9	ChinaLab/NbeHZ1_genome_1.0.fa
1668	1668	10	1000	132.1e6	141.7e6	145e6	132.2e6	184.4e6	2.926e9	JapanLab/Nbe_v1_scf.fa
20	20	10	58.94e6	132.9e6	140.6e6	145.2e6	144.4e6	180.8e6	2.806e9	LAB360/NbLab360.genome.fasta
17639	17627	10	522	130.9e6	137.1e6	142.2e6	138.5e6	176.9e6	2.751e9	UsaLab/Niben261_genome.fasta

The combined, compressed and indexed file:

  • 3.3G chnVSjapVSauVSusa.fasta.gz
  • 2.8M chnVSjapVSauVSusa.fasta.gz.gzi
  • 854K chnVSjapVSauVSusa.fasta.gz.fai

@subwaystation
Collaborator

Why did you limit the RAM to 74G? Please try using more, e.g. 200G, until it runs through. You will have to find out how much RAM is required.

@mictadlo
Author

I did not limit the memory. I just submitted with:

NXF_OPTS='-Xms1g -Xmx4g'

nextflow run nf-core/pangenome \
      -r 1.0.0 \
      --wfmash_chunks 1200 \
      --input ../genomes/chnVSjapVSauVSusa.fasta.gz \
      --n_haplotypes 4 \
      --outdir results \
      -profile singularity \
      -resume

PBSpro showed that nf-NFCORE_PANGE requested 73.7 GB of memory. This is strange, because according to the config file it should use 4 GB of RAM:

    withName:'WFMASH_MAP_ALIGN|WFMASH_MAP|SEQWISH|ODGI_BUILD|ODGI_UNCHOP|ODGI_SORT|ODGI_LAYOUT|WFMASH_MAP_COMMUNITY|ODGI_SQUEEZE' {
        cpus = 4
        memory = 4.GB
    }

Therefore, I don't understand how the 74 GB memory request was calculated. Furthermore, running:

> grep WFMASH_MAP .nextflow.log | wc -l
68

What am I doing wrong?

@subwaystation
Collaborator

Hi @mictadlo,
I am confused as well... Could you please ask about the unexpected memory allocation on Slack, so I can pull in other people?

In the meantime, can you try adding --wfmash_exclude_delim "#"? You may have a huge number of contigs in your input, which blows up the search space if wfmash is not aware that they actually belong to the same haplotype.
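A quick way to check how many sequences the combined input actually contains (file name assumed from the commands earlier in this thread; bgzip output is gzip-compatible, so zcat works):

```shell
# Count FASTA headers in the combined, bgzip-compressed input.
# A count far above the expected chromosome number indicates many
# unplaced contigs inflating the mapping search space.
zcat chnVSjapVSauVSusa.fasta.gz | grep -c '^>'
```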

Note to self: I need to add --wfmash_exclude_delim to the docs. I somehow missed it.

@subwaystation
Collaborator

Aaah, I had this parameter hidden. Fixing it now.

@subwaystation
Collaborator

How are things going here?

@mictadlo
Author

Hi, I did not understand --wfmash_exclude_delim, so I removed all unplaced contigs and kept only the chromosomes. Furthermore, I used:

> cat nextflow.config 
process {
    withName:'WFMASH_MAP' {
        memory = 200.GB
        cpus = 24
        time = 100.h
    }
    withName:'ODGI_STATS' {
        memory = 36.GB
        cpus = 1
        time = 24.h
    }
    withName:'SMOOTHXG' {
        memory = 200.GB
        cpus = 24
        time = 50.h
    }
    withName:'GFAFFIX' {
        memory = 48.GB
        cpus = 1
        time = 24.h
    }
}

After the above steps, the pipeline finished successfully.

@subwaystation
Collaborator

Congrats on a successful run. Happy to hear it!

@mictadlo The idea would have been to format all your sequence names according to https://github.com/pangenome/PanSN-spec. With --wfmash_exclude_delim you skip mappings between sequences that share the same name prefix before the given delimiter character. This is helpful when several sequences originate from the same haplotype. In your case the delimiter would be #. This would save time during the WFMASH_MAP step, but since you already finished, you can neglect this :)
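For future reference, the PanSN renaming could be sketched like this before concatenating the assemblies. The sample names (NbeHZ1, NbeV1) and output file names here are illustrative assumptions, not from the thread; only the input file names come from the commands above:

```shell
# Prefix each assembly's FASTA headers with a PanSN-style
# "sample#haplotype#" tag so wfmash can group sequences per haplotype.
sed 's/^>/>NbeHZ1#1#/' NbeHZ1_genome_1.0.fa > NbeHZ1.pansn.fa
sed 's/^>/>NbeV1#1#/'  Nbe_v1_scf.fa        > NbeV1.pansn.fa

# Combine and compress, then --wfmash_exclude_delim "#" skips
# within-haplotype mappings.
cat NbeHZ1.pansn.fa NbeV1.pansn.fa | bgzip -@ 4 > combined.pansn.fa.gz
```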
