
wfmash killed #168

Closed
mictadlo opened this issue Feb 14, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@mictadlo

mictadlo commented Feb 14, 2024

Description of the bug

Hi,
wfmash always gets killed.

Command used and terminal output

conda activate samtools

cat ../genomes/ChinaLab/NbeHZ1_genome_1.0.fa ../genomes/JapanLab/Nbe_v1_scf.fa ../genomes/LAB360/NbLab360.genome.fasta ../genomes/UsaLab/Niben261_genome.fasta > ../genomes/chnVSjapVSauVSusa.fasta

bgzip -@ 4 ../genomes/chnVSjapVSauVSusa.fasta
samtools faidx ../genomes/chnVSjapVSauVSusa.fasta.gz
3.3G  chnVSjapVSauVSusa.fasta.gz
2.8M  chnVSjapVSauVSusa.fasta.gz.gzi
854K  chnVSjapVSauVSusa.fasta.gz.fai
NXF_OPTS='-Xms1g -Xmx4g'

nextflow run nf-core/pangenome \
      -r 1.0.0 \
      --wfmash_chunks 1200 \
      --input ../genomes/chnVSjapVSauVSusa.fasta.gz \
      --n_haplotypes 4 \
      --outdir results \
      -profile singularity \
      -resume

What am I missing?



### Relevant files
[nextflow.log.txt](https://github.com/nf-core/pangenome/files/14276661/nextflow.log.txt)




### System information

PBSpro, HPC, OpenSuse
@mictadlo mictadlo added the bug Something isn't working label Feb 14, 2024
@mictadlo
Author

Please find the logs here:
nextflow.log.txt

@subwaystation
Collaborator

How much RAM does the node provide? How large are your input sequences again?

@mictadlo
Author

We have various nodes: some have 300 GB, some 1 TB, and one has 6 TB of RAM. The sysadmin told me that the pipeline has 74 GB hardcoded.

Each FASTA file is around 2.8 GB.

> abyss-fac ChinaLab/NbeHZ1_genome_1.0.fa JapanLab/Nbe_v1_scf.fa  LAB360/NbLab360.genome.fasta UsaLab/Niben261_genome.fasta

n	n:500	L50	min	N75	N50	N25	E-size	max	sum	name
2831	2831	10	1000	136e6	143.2e6	148.2e6	142.1e6	183.5e6	2.929e9	ChinaLab/NbeHZ1_genome_1.0.fa
1668	1668	10	1000	132.1e6	141.7e6	145e6	132.2e6	184.4e6	2.926e9	JapanLab/Nbe_v1_scf.fa
20	20	10	58.94e6	132.9e6	140.6e6	145.2e6	144.4e6	180.8e6	2.806e9	LAB360/NbLab360.genome.fasta
17639	17627	10	522	130.9e6	137.1e6	142.2e6	138.5e6	176.9e6	2.751e9	UsaLab/Niben261_genome.fasta

The combined, compressed and indexed file:

  • 3.3G chnVSjapVSauVSusa.fasta.gz
  • 2.8M chnVSjapVSauVSusa.fasta.gz.gzi
  • 854K chnVSjapVSauVSusa.fasta.gz.fai

@subwaystation
Collaborator

Why did you limit the RAM to 74G? Please try using more, e.g. 200G, until it runs through. You will have to find out how much RAM is required.

@mictadlo
Author

I did not limit the memory. I just submitted with:

NXF_OPTS='-Xms1g -Xmx4g'

nextflow run nf-core/pangenome \
      -r 1.0.0 \
      --wfmash_chunks 1200 \
      --input ../genomes/chnVSjapVSauVSusa.fasta.gz \
      --n_haplotypes 4 \
      --outdir results \
      -profile singularity \
      -resume

PBSpro showed that nf-NFCORE_PANGE requested 73.7 GB of memory. This is strange, because according to the config file it should use 4 GB of RAM:

    withName:'WFMASH_MAP_ALIGN|WFMASH_MAP|SEQWISH|ODGI_BUILD|ODGI_UNCHOP|ODGI_SORT|ODGI_LAYOUT|WFMASH_MAP_COMMUNITY|ODGI_SQUEEZE' {
        cpus = 4
        memory = 4.GB
    }

Therefore, I don't understand how the 74 GB memory request was calculated. Furthermore, running:

> grep WFMASH_MAP .nextflow.log | wc -l
68

What am I doing wrong?

@subwaystation
Collaborator

Hi @mictadlo,
I am confused as well... Could you please ask about the unexpected memory allocation on Slack, so I can pull in other people?

In the meantime, can you try adding --wfmash_exclude_delim "#"? You may have a huge number of contigs in your input, which blows up the search space if wfmash is not aware that they actually belong to the same haplotype.
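A quick way to check how many sequences the combined input actually contains (file name assumed from the commands earlier in this thread; bgzip output is gzip-compatible, so zcat works):

```shell
# Count FASTA headers in the combined, bgzip-compressed input.
# A count far above the expected chromosome number indicates many
# unplaced contigs inflating the mapping search space.
zcat chnVSjapVSauVSusa.fasta.gz | grep -c '^>'
```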

Note to self: I need to add --wfmash_exclude_delim to the docs. I somehow missed it.

@subwaystation
Collaborator

Aaah, I had this parameter hidden. Fixing it now.

@subwaystation
Collaborator

How are things going here?

@mictadlo
Author

Hi, I did not understand --wfmash_exclude_delim, so I removed all unplaced contigs and kept only the chromosomes. Furthermore, I used:

> cat nextflow.config 
process {
    withName:'WFMASH_MAP' {
        memory = 200.GB
        cpus = 24
        time = 100.h
    }
    withName:'ODGI_STATS' {
        memory = 36.GB
        cpus = 1
        time = 24.h
    }
    withName:'SMOOTHXG' {
        memory = 200.GB
        cpus = 24
        time = 50.h
    }
    withName:'GFAFFIX' {
        memory = 48.GB
        cpus = 1
        time = 24.h
    }
}

After the above steps, the pipeline finished successfully.

@subwaystation
Collaborator

Congrats on a successful run. Happy to hear it!

@mictadlo The idea would have been to format all your sequence names according to https://github.com/pangenome/PanSN-spec. With --wfmash_exclude_delim you skip mappings between sequences that share the same name prefix before the given delimiter character. This is helpful when several sequences originate from the same haplotype. In your case the delimiter would be #. This would save time during the WFMASH_MAP step, but since you already finished, you can neglect this :)
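For future reference, the PanSN renaming could be sketched like this before concatenating the assemblies. The sample names (NbeHZ1, NbeV1) and output file names here are illustrative assumptions, not from the thread; only the input file names come from the commands above:

```shell
# Prefix each assembly's FASTA headers with a PanSN-style
# "sample#haplotype#" tag so wfmash can group sequences per haplotype.
sed 's/^>/>NbeHZ1#1#/' NbeHZ1_genome_1.0.fa > NbeHZ1.pansn.fa
sed 's/^>/>NbeV1#1#/'  Nbe_v1_scf.fa        > NbeV1.pansn.fa

# Combine and compress, then --wfmash_exclude_delim "#" skips
# within-haplotype mappings.
cat NbeHZ1.pansn.fa NbeV1.pansn.fa | bgzip -@ 4 > combined.pansn.fa.gz
```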
