rattle correction step giving error #50

dvirdi01 · 2023-10-10T17:09:33Z

I ran rattle correct on my input files through snakemake. I get an error message saying this:

Error in rule cluster_correction:
jobid: 13
input: data/.../.../samplefile.fastq
output: data/RATTLE_out/samplefile/corrected.fq, data/RATTLE_out/samplefile/uncorrected.fq, data/RATTLE_out/samplefile/consensi.fq
log: log/RATTLE_log/samplefile_correct.out, log/RATTLE_log/samplefile_correct.err (check log file(s) for error details)
shell:
/storage/.../.../bin/RATTLE/rattle correct -i data/.../.../samplefile.fastq -c data/RATTLE_out/samplefile/clusters.out -o data/RATTLE_out/samplefile/corrected.fq data/RATTLE_out/samplefile/uncorrected.fq data/RATTLE_out/samplefile/consensi.fq -t 48 > log/RATTLE_log/samplefile.out 2> log/RATTLE_log/samplefile.err
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Error executing rule cluster_correction on cluster (jobid: 13, external: 2761217, jobscript: /storage/.../.../.../.snakemake/tmp.tz0fhacf/snakejob.cluster_correction.13.sh). For error details see the cluster log and the log files of the involved rule(s).

When I open samplefile.err it says: "Reading fasta file... Done" and when I open samplefile.out it is empty.

I also get this message below:

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=2761217.0 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: valiant1: task 0: Out Of Memory
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=2761217.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

I gave it 100 GB of RAM to begin with, but I guess it wasn't enough. Is there a way to know how much RAM I need to give it before I run the snakemake command?

eileen-xue · 2023-10-11T01:22:03Z

Hi,

You can have a look at the memory usage figure in our paper.

Otherwise, I need more information like the number of reads or the fastq file size to give you a RAM estimation.

Your 'samplefile.out' file should not be empty, because it is a binary file. You need to look at the file size to check whether it is empty.

Eileen
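The two figures asked for above (number of reads and FASTQ file size) can be collected with a short script. This is a minimal sketch, not part of RATTLE: the function name `fastq_stats` is hypothetical, and it assumes an uncompressed FASTQ file with the standard four lines per record.

```python
import os


def fastq_stats(path):
    """Return (size in bytes, number of reads) for an uncompressed FASTQ file."""
    size_bytes = os.path.getsize(path)
    # Count lines in binary mode to avoid decoding overhead on large files.
    with open(path, "rb") as handle:
        n_lines = sum(1 for _ in handle)
    # Assumption: every record occupies exactly 4 lines (no wrapped sequences).
    return size_bytes, n_lines // 4
```

Calling `fastq_stats` on the input file of the failing job gives the two numbers to report back when asking for a RAM estimate.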

dvirdi01 · 2023-10-11T15:42:49Z

Hi, I checked the output files for some of the processes that did run. It created consensi.fq, uncorrected.fq and corrected.fq, but they are all 0 bytes. I am not sure why this is happening. This was my Snakemake rule:

rule cluster_correction:
    input: "data/.../.../{sample}.fastq"
    output:
        touch("data/.../{sample}/corrected.fq"),
        touch("data/.../{sample}/uncorrected.fq"),
        touch("data/.../{sample}/consensi.fq")
    params:
        clusters = "data/.../{sample}/clusters.out"
    log:
        out = "log/.../{sample}_correct.out",
        err = "log/.../{sample}_correct.err"
    threads:
        48
    resources:
        mem = 100
    shell:
        """/storage/.../.../.../.../rattle correct
        -i {input}
        -c {params.clusters}
        -o {output}
        -t {threads}
        > {log.out}
        2> {log.err}
        """

To add on: the same happened with my rattle cluster_summary step. It created a tsv file, but it was also 0 bytes.

eileen-xue · 2023-10-12T00:36:56Z

Hi,

This problem seems to come from the clustering step rather than the error correction step.

Please answer the following questions to help us identify the issue and provide a solution.

1. Is your clustering step output file (clusters.out) 0 bytes in size?
2. What is your clustering step command, and what is in the log for your clustering step?
3. Did you hit the out-of-memory issue in your clustering step? Normally, clustering uses more memory than error correction.

Eileen
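Checking many per-sample outputs for 0-byte files by hand is tedious, so a small helper can do it. This is a sketch only; the function name `empty_outputs` and the example paths are hypothetical. Note that Snakemake's `touch()` flag creates the file itself when a job finishes, so a 0-byte output does not by itself show that RATTLE wrote anything.

```python
import os


def empty_outputs(paths):
    """Return the paths that are missing or have a size of 0 bytes."""
    return [
        p for p in paths
        if not os.path.exists(p) or os.path.getsize(p) == 0
    ]


# Hypothetical usage: list suspicious outputs for one sample directory.
# print(empty_outputs(["out/sample/clusters.out", "out/sample/corrected.fq"]))
```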

dvirdi01 · 2023-10-12T15:37:06Z

None of my clusters.out files are 0 bytes, so I think the cluster and cluster extraction steps were working.
This was my rule for the clustering step:

input: "data/.../..../{sample}.fastq.gz"
output:
    touch("data/.../{sample}.done")
params:
    outdir = "data/..../{sample}"
log:
    out = "log/.../{sample}.out",
    err = "log/.../{sample}.err"
threads:
    48
resources:
    mem = 200
shell:
    """mkdir -p {params.outdir};
    /storage/.../.../.../.../rattle cluster
    --input {input}
    --output {params.outdir}
    --threads {threads}
    --verbose
    > {log.out}
    2> {log.err}"""

In my log, my sample.out file says "Reads: ...some number..." and my sample.err says:

[================================================================================] 67715/67715 (100%)
Iteration 0.3 complete
[================================================================================] 24054/24054 (100%)
Iteration 0.25 complete
[================================================================================] 11360/11360 (100%)
Iteration 0.2 complete
[================================================================================] 7204/7204 (100%)
Iteration 0 complete
Gene clustering done
5507 gene clusters found

I think I did run out of memory for some of the files. For those, I re-ran the step with more memory allocated.

eileen-xue · 2023-10-13T03:01:32Z

Hi,

Your RATTLE error correction command is incorrect. To specify the outputs, you don't need to list every output file's name and location. You only need to give an output folder location, like -o [out_dir].

Hope this helps.
Eileen
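The difference between passing three file names and passing one folder to -o can be made concrete by building the command as an argument list. This is a minimal sketch under assumptions: the helper name `rattle_correct_cmd` and all paths are hypothetical, and only the flags already shown in this thread (-i, -c, -o, -t) are used.

```python
import shlex


def rattle_correct_cmd(rattle_bin, fastq, clusters, out_dir, threads):
    """Build the argument list for `rattle correct` with one output directory."""
    return [
        rattle_bin, "correct",
        "-i", fastq,
        "-c", clusters,
        "-o", out_dir,       # a single folder, not three output file names
        "-t", str(threads),
    ]


# Hypothetical paths, for illustration only.
cmd = rattle_correct_cmd(
    "rattle", "sample.fastq", "out/sample/clusters.out", "out/sample", 48
)
print(shlex.join(cmd))
```

In a Snakemake rule, the same idea would mean passing a directory (for example through `params`) to -o rather than interpolating the full `{output}` list.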

dvirdi01 · 2023-10-13T15:39:45Z

Hi, isn't that what I did though? I gave the output file location as params.outdir?

Edit: Oh I think I get what you were saying
I had this earlier for my error correction step in my smk file:

output:
touch("data/.../{sample}/corrected.fq"),
touch("data/.../{sample}/uncorrected.fq"),
touch("data/.../{sample}/consensi.fq")

but I should change it to:

output:
touch("data/.../{sample}")

Is this ^ what you meant? Also, in my Snakefile I had:

rule all:
    input:
       expand("data/..../{sample}/{filename}.fq",  
       sample = config['samples'], filename = ["corrected", "uncorrected", "consensi"])

Would I need to change the expand command in my snakefile?

Also, how about my cluster_summary.tsv file being empty? Was it due to the same error? I did not run the cluster extraction and cluster summary steps from snakemake; I ran them directly from the command line for all my files. This is what I had:

./rattle extract_clusters -i /storage/.../.../.../.../.../.../sample.fastq  -c /storage/.../.../.../.../.../sample/clusters.out -o /storage/.../..../.../.../.../sample/clusters --fastq

./rattle cluster_summary -i /storage/.../.../.../.../.../.../sample.fastq -c /storage/.../.../.../.../.../sample/clusters.out > /storage/.../.../.../.../.../sample/cluster_summary.tsv

Why did this command produce an empty tsv file?

eileen-xue · 2023-10-16T01:39:09Z

1. Your new output command is correct.
   If you want to use multiple fastq files as input, the format should be -i input_1.fq,input_2.fq,...,input_n.fq. All files must be separated by commas; no spaces or line breaks are allowed. Don't use Snakemake expand for RATTLE input, because the expanded list is not comma-separated.
   Also, I don't understand why you are using corrected.fq, uncorrected.fq, consensi.fq as input. This will make your input and output the exact same files.
2. Your command looks correct.
   Possible issues:
   - Inputs of the cluster step and cluster_summary step are not the same.
   - Your input.fastq file or clusters.out file location is incorrect.
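The comma-separated format described above for multiple inputs can be produced with a small helper. This is a sketch under assumptions: the function name `rattle_input_arg` is hypothetical, and the validation simply enforces the "no space or line break" rule stated in this reply.

```python
def rattle_input_arg(paths):
    """Join FASTQ paths into the single comma-separated value passed after -i."""
    for p in paths:
        # A comma or any whitespace inside a path would break the list format.
        if "," in p or any(ch.isspace() for ch in p):
            raise ValueError(f"path cannot contain commas or whitespace: {p!r}")
    return ",".join(paths)


# Hypothetical file names, for illustration only.
print(rattle_input_arg(["input_1.fq", "input_2.fq", "input_3.fq"]))
```

Inside a Snakemake rule, the same join could be done in a `params` function such as `lambda wildcards, input: ",".join(input)`, so that the shell command receives one comma-separated string instead of a space-separated list.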

dvirdi01 · 2023-10-16T02:27:14Z

Hi, thanks for the reply. I didn’t understand why I would need to skip the extract_clusters step. Wouldn’t that step be necessary for the next step, which is cluster correction?

On Sun, Oct 15, 2023 at 7:39 PM Eileen Xue wrote:

1. Your new output command is correct. If you want to use multiple fastq files as input, the format should be -i input_1.fq,input_2.fq,...,input_n.fq. All files must be separated by comma, no space or line break is allowed. Don't use Snakemake expand for RATTLE input, expand will create new lines. Also, I don't understand why using corrected.fq, uncorrected.fq, consensi.fq as input. This will make your input and output the exact same file.
2. Your command looks correct. You can skip extract_clusters. Possible issues: Inputs of the cluster step and cluster_summary step are not the same. Your input.fastq file and clusters.out file location is incorrect.

eileen-xue · 2023-10-16T02:54:42Z

extract_clusters and cluster_summary are designed to make the cluster step results readable. Only the cluster step is required before the correction step.
