Merge pull request #352 from JoseEspinosa/updates
Updates
JoseEspinosa committed Jun 28, 2023
2 parents 3182e0f + 2a5f3f4 commit 3cff26f
Showing 18 changed files with 162 additions and 152 deletions.
198 changes: 86 additions & 112 deletions conf/modules.config

Large diffs are not rendered by default.

30 changes: 15 additions & 15 deletions docs/output.md
@@ -108,19 +108,19 @@ The library-level alignments associated with the same sample are merged and subs
<details markdown="1">
<summary>Output files</summary>

- `<ALIGNER>/mergedLibrary/`
- `<ALIGNER>/merged_library/`
- `*.bam`: Merged library-level, coordinate sorted `*.bam` files after the marking of duplicates, and filtering based on various criteria. The file suffix for the final filtered files will be `*.mLb.clN.*`. If you specify the `--save_align_intermeds` parameter then two additional sets of files will be present. These represent the unfiltered alignments with duplicates marked (`*.mLb.mkD.*`), and in the case of paired-end datasets the filtered alignments before the removal of orphan read pairs (`*.mLb.flT.*`).
- `<ALIGNER>/mergedLibrary/samtools_stats/`
- `<ALIGNER>/merged_library/samtools_stats/`
- SAMtools `*.flagstat`, `*.idxstats` and `*.stats` files generated from the alignment files.
- `<ALIGNER>/mergedLibrary/picard_metrics/`
- `<ALIGNER>/merged_library/picard_metrics/`
- `*_metrics`: Alignment QC files from picard CollectMultipleMetrics.
- `*.metrics.txt`: Metrics file from MarkDuplicates.
- `<ALIGNER>/mergedLibrary/picard_metrics/pdf/`
- `<ALIGNER>/merged_library/picard_metrics/pdf/`
- `*.pdf`: Alignment QC plot files from picard CollectMultipleMetrics.
- `<ALIGNER>/mergedLibrary/preseq/`
- `<ALIGNER>/merged_library/preseq/`
- `*.lc_extrap.txt`: Preseq expected future yield file.

> **NB:** File names in the resulting directory (i.e. `<ALIGNER>/mergedLibrary/`) will have the '`.mLb.`' suffix.
> **NB:** File names in the resulting directory (i.e. `<ALIGNER>/merged_library/`) will have the '`.mLb.`' suffix.
</details>

@@ -141,7 +141,7 @@ The [Preseq](http://smithlabresearch.org/software/preseq/) package is aimed at p
<details markdown="1">
<summary>Output files</summary>

- `<ALIGNER>/mergedLibrary/bigwig/`
- `<ALIGNER>/merged_library/bigwig/`
- `*.bigWig`: Normalised bigWig files scaled to 1 million mapped reads.

</details>
@@ -153,12 +153,12 @@ The [bigWig](https://genome.ucsc.edu/goldenpath/help/bigWig.html) format is in a
<details markdown="1">
<summary>Output files</summary>

- `<ALIGNER>/mergedLibrary/phantompeakqualtools/`
- `<ALIGNER>/merged_library/phantompeakqualtools/`
- `*.spp.out`, `*.spp.pdf`: phantompeakqualtools output files.
- `*_mqc.tsv`: MultiQC custom content files.
- `<ALIGNER>/mergedLibrary/deepTools/plotFingerprint/`
- `<ALIGNER>/merged_library/deepTools/plotFingerprint/`
- `*.plotFingerprint.pdf`, `*.plotFingerprint.qcmetrics.txt`, `*.plotFingerprint.raw.txt`: plotFingerprint output files.
- `<ALIGNER>/mergedLibrary/deepTools/plotProfile/`
- `<ALIGNER>/merged_library/deepTools/plotProfile/`
- `*.computeMatrix.mat.gz`, `*.computeMatrix.vals.mat.tab`, `*.plotProfile.pdf`, `*.plotProfile.tab`, `*.plotHeatmap.pdf`, `*.plotHeatmap.mat.tab`: plotProfile output files.

</details>
@@ -188,10 +188,10 @@ The results from deepTools plotProfile gives you a quick visualisation for the g
<details markdown="1">
<summary>Output files</summary>

- `<ALIGNER>/mergedLibrary/macs2/<PEAK_TYPE>/`
- `<ALIGNER>/merged_library/macs2/<PEAK_TYPE>/`
- `*.xls`, `*.broadPeak` or `*.narrowPeak`, `*.gappedPeak`, `*summits.bed`: MACS2 output files - the files generated will depend on whether MACS2 has been run in _narrowPeak_ or _broadPeak_ mode.
- `*.annotatePeaks.txt`: HOMER peak-to-gene annotation file.
- `<ALIGNER>/mergedLibrary/macs2/<PEAK_TYPE>/qc/`
- `<ALIGNER>/merged_library/macs2/<PEAK_TYPE>/qc/`
- `macs2_peak.plots.pdf`: QC plots for MACS2 peaks.
- `macs2_annotatePeaks.plots.pdf`: QC plots for peak-to-gene feature annotation.
- `*.FRiP_mqc.tsv`, `*.peak_count_mqc.tsv`, `annotatepeaks.summary_mqc.tsv`: MultiQC custom-content files for FRiP score, peak count and peak-to-gene ratios.
@@ -217,7 +217,7 @@ Various QC plots per sample including number of peaks, fold-change distribution,
<details markdown="1">
<summary>Output files</summary>

- `<ALIGNER>/mergedLibrary/macs2/<PEAK_TYPE>/consensus/<ANTIBODY>/`
- `<ALIGNER>/merged_library/macs2/<PEAK_TYPE>/consensus/<ANTIBODY>/`
- `*.bed`: Consensus peak-set across all samples in BED format.
- `*.saf`: Consensus peak-set across all samples in SAF format. Required by featureCounts for read quantification.
- `*.featureCounts.txt`: Read counts across all samples relative to consensus peak-set.
@@ -245,7 +245,7 @@ The [featureCounts](http://bioinf.wehi.edu.au/featureCounts/) tool is used to co
<details markdown="1">
<summary>Output files</summary>

- `<ALIGNER>/mergedLibrary/macs2/<PEAK_TYPE>/consensus/<ANTIBODY>/deseq2/`
- `<ALIGNER>/merged_library/macs2/<PEAK_TYPE>/consensus/<ANTIBODY>/deseq2/`
- `*.sample.dists.txt`: Spreadsheet containing sample-to-sample distance across each consensus peak.
- `*.plots.pdf`: File containing PCA and hierarchical clustering plots.
- `*.dds.RData`: File containing R `DESeqDataSet` object generated by DESeq2, with either
@@ -254,7 +254,7 @@ The [featureCounts](http://bioinf.wehi.edu.au/featureCounts/) tool is used to co
`readRDS` to give user control of the eventual object name.
- `*pca.vals.txt`: Matrix of values for the first 2 principal components.
- `R_sessionInfo.log`: File containing information about R, the OS and attached or loaded packages.
- `<ALIGNER>/mergedLibrary/macs2/<PEAK_TYPE>/consensus/<ANTIBODY>/sizeFactors/`
- `<ALIGNER>/merged_library/macs2/<PEAK_TYPE>/consensus/<ANTIBODY>/sizeFactors/`
- `*.txt`, `*.RData`: Files containing DESeq2 sizeFactors per sample.

</details>
3 changes: 3 additions & 0 deletions modules/local/bam_remove_orphans.nf
@@ -17,6 +17,9 @@ process BAM_REMOVE_ORPHANS {
tuple val(meta), path("${prefix}.bam"), emit: bam
path "versions.yml" , emit: versions

when:
task.ext.when == null || task.ext.when

script: // This script is bundled with the pipeline, in nf-core/chipseq/bin/
def args = task.ext.args ?: ''
prefix = task.ext.prefix ?: "${meta.id}"
8 changes: 5 additions & 3 deletions modules/local/bedtools_genomecov.nf
@@ -15,11 +15,13 @@ process BEDTOOLS_GENOMECOV {
tuple val(meta), path("*.txt") , emit: scale_factor
path "versions.yml" , emit: versions

when:
task.ext.when == null || task.ext.when

script:
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"

def pe = meta.single_end ? '' : '-pc'
def extend = (meta.single_end && params.fragment_size > 0) ? "-fs ${params.fragment_size}" : ''
"""
SCALE_FACTOR=\$(grep '[0-9] mapped (' $flagstat | awk '{print 1000000/\$1}')
echo \$SCALE_FACTOR > ${prefix}.scale_factor.txt
@@ -30,7 +32,7 @@ process BEDTOOLS_GENOMECOV {
-bg \\
-scale \$SCALE_FACTOR \\
$pe \\
$extend \\
$args \\
| sort -T '.' -k1,1 -k2,2n > ${prefix}.bedGraph
cat <<-END_VERSIONS > versions.yml
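
Note on the `SCALE_FACTOR` line in this hunk: it takes the mapped-read count from a `samtools flagstat` report and divides one million by it, which matches the "scaled to 1 million mapped reads" wording in docs/output.md. The sketch below reproduces that step outside Nextflow as a rough illustration. The flagstat excerpt and its read counts are invented, and the variable names `flagstat_file` and `scale_factor` are hypothetical; only the `grep | awk` pipeline is taken from the module.

```shell
#!/bin/sh
# Hypothetical flagstat excerpt; a real report comes from `samtools flagstat`.
flagstat_file=$(mktemp)
cat > "$flagstat_file" <<'EOF'
2500000 + 0 in total (QC-passed reads + QC-failed reads)
2000000 + 0 mapped (80.00% : N/A)
2000000 + 0 primary mapped (80.00% : N/A)
EOF

# Same pipeline as the module: the pattern '[0-9] mapped (' selects the
# "N + 0 mapped (" line but not "primary mapped (", because the digit must be
# followed directly by " mapped (". awk then divides 1e6 by the first field.
scale_factor=$(grep '[0-9] mapped (' "$flagstat_file" | awk '{print 1000000/$1}')
echo "$scale_factor"

rm -f "$flagstat_file"
```

With 2,000,000 mapped reads the factor is below 1, so coverage is scaled down; libraries with fewer than one million mapped reads are scaled up.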
8 changes: 5 additions & 3 deletions modules/local/deseq2_qc.nf
@@ -26,10 +26,12 @@ process DESEQ2_QC {
path "size_factors" , optional:true, emit: size_factors
path "versions.yml" , emit: versions

when:
task.ext.when == null || task.ext.when

script:
def args = task.ext.args ?: ''
def peak_type = params.narrow_peak ? 'narrowPeak' : 'broadPeak'
def prefix = task.ext.prefix ?: "${meta.id}"
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
"""
deseq2_qc.r \\
--count_file $counts \\
3 changes: 3 additions & 0 deletions modules/local/frip_score.nf
@@ -14,6 +14,9 @@ process FRIP_SCORE {
tuple val(meta), path("*.txt"), emit: txt
path "versions.yml" , emit: versions

when:
task.ext.when == null || task.ext.when

script:
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
3 changes: 3 additions & 0 deletions modules/local/genome_blacklist_regions.nf
@@ -17,6 +17,9 @@ process GENOME_BLACKLIST_REGIONS {
path '*.bed' , emit: bed
path "versions.yml", emit: versions

when:
task.ext.when == null || task.ext.when

script:
def file_out = "${sizes.simpleName}.include_regions.bed"
if (blacklist) {
3 changes: 3 additions & 0 deletions modules/local/gtf2bed.nf
@@ -14,6 +14,9 @@ process GTF2BED {
path '*.bed' , emit: bed
path "versions.yml", emit: versions

when:
task.ext.when == null || task.ext.when

script: // This script is bundled with the pipeline, in nf-core/chipseq/bin/
"""
gtf2bed \\
4 changes: 4 additions & 0 deletions modules/local/igv.nf
@@ -20,8 +20,12 @@ process IGV {
output:
path "*files.txt" , emit: txt
path "*.xml" , emit: xml
path fasta , emit: fasta
path "versions.yml", emit: versions

when:
task.ext.when == null || task.ext.when

script: // scripts are bundled with the pipeline in nf-core/chipseq/bin/
def consensus_dir = "${aligner_dir}/mergedLibrary/macs2/${peak_dir}/consensus/*"
"""
3 changes: 3 additions & 0 deletions modules/local/multiqc.nf
@@ -56,6 +56,9 @@ process MULTIQC {
path "*_plots" , optional:true, emit: plots
path "versions.yml" , emit: versions

when:
task.ext.when == null || task.ext.when

script:
def args = task.ext.args ?: ''
def custom_config = params.multiqc_config ? "--config $mqc_custom_config" : ''
3 changes: 3 additions & 0 deletions modules/local/multiqc_custom_peaks.nf
@@ -14,6 +14,9 @@ process MULTIQC_CUSTOM_PEAKS {
tuple val(meta), path("*.peak_count_mqc.tsv"), emit: count
tuple val(meta), path("*.FRiP_mqc.tsv") , emit: frip

when:
task.ext.when == null || task.ext.when

script:
def prefix = task.ext.prefix ?: "${meta.id}"
"""
3 changes: 3 additions & 0 deletions modules/local/multiqc_custom_phantompeakqualtools.nf
@@ -16,6 +16,9 @@ process MULTIQC_CUSTOM_PHANTOMPEAKQUALTOOLS {
tuple val(meta), path("*.spp_rsc_mqc.tsv") , emit: rsc
tuple val(meta), path("*.spp_correlation_mqc.tsv"), emit: correlation

when:
task.ext.when == null || task.ext.when

script:
def prefix = task.ext.prefix ?: "${meta.id}"
"""
3 changes: 3 additions & 0 deletions modules/local/plot_homer_annotatepeaks.nf
@@ -17,6 +17,9 @@ process PLOT_HOMER_ANNOTATEPEAKS {
path '*.tsv' , emit: tsv
path "versions.yml", emit: versions

when:
task.ext.when == null || task.ext.when

script: // This script is bundled with the pipeline, in nf-core/chipseq/bin/
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "annotatepeaks"
6 changes: 5 additions & 1 deletion modules/local/plot_macs2_qc.nf
@@ -8,15 +8,19 @@ process PLOT_MACS2_QC {

input:
path peaks
val is_narrow_peak

output:
path '*.txt' , emit: txt
path '*.pdf' , emit: pdf
path "versions.yml", emit: versions

when:
task.ext.when == null || task.ext.when

script: // This script is bundled with the pipeline, in nf-core/chipseq/bin/
def args = task.ext.args ?: ''
def peak_type = params.narrow_peak ? 'narrowPeak' : 'broadPeak'
def peak_type = is_narrow_peak ? 'narrowPeak' : 'broadPeak'
"""
plot_macs2_qc.r \\
-i ${peaks.join(',')} \\
3 changes: 2 additions & 1 deletion modules/local/samplesheet_check.nf
@@ -18,10 +18,11 @@ process SAMPLESHEET_CHECK {
task.ext.when == null || task.ext.when

script: // This script is bundled with the pipeline, in nf-core/chipseq/bin/
def args = task.ext.args ?: ''
"""
check_samplesheet.py \\
$samplesheet \\
samplesheet.valid.csv
$args
cat <<-END_VERSIONS > versions.yml
"${task.process}":
3 changes: 3 additions & 0 deletions modules/local/star_genomegenerate.nf
@@ -16,6 +16,9 @@ process STAR_GENOMEGENERATE {
path "star" , emit: index
path "versions.yml", emit: versions

when:
task.ext.when == null || task.ext.when

script:
def args = (task.ext.args ?: '').tokenize()
def memory = task.memory ? "--limitGenomeGenerateRAM ${task.memory.toBytes() - 100000000}" : ''
21 changes: 8 additions & 13 deletions subworkflows/local/prepare_genome.nf
@@ -41,13 +41,7 @@ workflow PREPARE_GENOME {
ch_fasta = GUNZIP_FASTA ( [ [:], params.fasta ] ).gunzip.map{ it[1] }
ch_versions = ch_versions.mix(GUNZIP_FASTA.out.versions)
} else {
ch_fasta = file(params.fasta)
}

// Make fasta file available if reference saved or IGV is run
if (params.save_reference || !params.skip_igv) {
file("${params.outdir}/genome/").mkdirs()
ch_fasta.copyTo("${params.outdir}/genome/")
ch_fasta = Channel.value(file(params.fasta))
}

//
@@ -107,14 +101,15 @@ workflow PREPARE_GENOME {
ch_gene_bed = GUNZIP_GENE_BED ( [ [:], params.gene_bed ] ).gunzip.map{ it[1] }
ch_versions = ch_versions.mix(GUNZIP_GENE_BED.out.versions)
} else {
ch_gene_bed = file(params.gene_bed)
ch_gene_bed = Channel.value(file(params.gene_bed))
}
}

//
// Create chromosome sizes file
//
ch_chrom_sizes = CUSTOM_GETCHROMSIZES ( [ [:], ch_fasta ] ).sizes.map{ it[1] }
CUSTOM_GETCHROMSIZES ( ch_fasta.map { [ [:], it ] } )
ch_chrom_sizes = CUSTOM_GETCHROMSIZES.out.sizes.map { it[1] }
ch_fai = CUSTOM_GETCHROMSIZES.out.fai.map{ it[1] }
ch_versions = ch_versions.mix(CUSTOM_GETCHROMSIZES.out.versions)

@@ -144,7 +139,7 @@ workflow PREPARE_GENOME {
ch_bwa_index = file(params.bwa_index)
}
} else {
ch_bwa_index = BWA_INDEX ( [ [:], ch_fasta ] ).index
ch_bwa_index = BWA_INDEX ( ch_fasta.map { [ [:], it ] } ).index
ch_versions = ch_versions.mix(BWA_INDEX.out.versions)
}
}
@@ -162,7 +157,7 @@ workflow PREPARE_GENOME {
ch_bowtie2_index = [ [:], file(params.bowtie2_index) ]
}
} else {
ch_bowtie2_index = BOWTIE2_BUILD ( [ [:], ch_fasta ] ).index
ch_bowtie2_index = BOWTIE2_BUILD ( ch_fasta.map { [ [:], it ] } ).index
ch_versions = ch_versions.mix(BOWTIE2_BUILD.out.versions)
}
}
@@ -180,7 +175,7 @@ workflow PREPARE_GENOME {
ch_chromap_index = [ [:], file(params.chromap_index) ]
}
} else {
ch_chromap_index = CHROMAP_INDEX ( [ [:], ch_fasta ] ).index
ch_chromap_index = CHROMAP_INDEX ( ch_fasta.map { [ [:], it ] } ).index
ch_versions = ch_versions.mix(CHROMAP_INDEX.out.versions)
}
}
@@ -195,7 +190,7 @@ workflow PREPARE_GENOME {
ch_star_index = UNTAR_STAR_INDEX ( [ [:], params.star_index ] ).untar.map{ it[1] }
ch_versions = ch_versions.mix(UNTAR_STAR_INDEX.out.versions)
} else {
ch_star_index = file(params.star_index)
ch_star_index = Channel.value(file(params.star_index))
}
} else {
ch_star_index = STAR_GENOMEGENERATE ( ch_fasta, ch_gtf ).index
9 changes: 5 additions & 4 deletions workflows/chipseq.nf
@@ -205,7 +205,7 @@ workflow CHIPSEQ {
ch_samtools_stats = FASTQ_ALIGN_BOWTIE2.out.stats
ch_samtools_flagstat = FASTQ_ALIGN_BOWTIE2.out.flagstat
ch_samtools_idxstats = FASTQ_ALIGN_BOWTIE2.out.idxstats
ch_versions = ch_versions.mix(FASTQ_ALIGN_BOWTIE2.out.versions.first())
ch_versions = ch_versions.mix(FASTQ_ALIGN_BOWTIE2.out.versions)
}

//
@@ -229,7 +229,7 @@ workflow CHIPSEQ {
ch_samtools_stats = FASTQ_ALIGN_CHROMAP.out.stats
ch_samtools_flagstat = FASTQ_ALIGN_CHROMAP.out.flagstat
ch_samtools_idxstats = FASTQ_ALIGN_CHROMAP.out.idxstats
ch_versions = ch_versions.mix(FASTQ_ALIGN_CHROMAP.out.versions.first())
ch_versions = ch_versions.mix(FASTQ_ALIGN_CHROMAP.out.versions)
}

//
@@ -274,7 +274,7 @@ workflow CHIPSEQ {
PICARD_MERGESAMFILES (
ch_sort_bam
)
ch_versions = ch_versions.mix(PICARD_MERGESAMFILES.out.versions.first().ifEmpty(null))
ch_versions = ch_versions.mix(PICARD_MERGESAMFILES.out.versions.first())

//
// SUBWORKFLOW: Mark duplicates & filter BAM files after merging
@@ -549,7 +549,8 @@ workflow CHIPSEQ {
// MODULE: MACS2 QC plots with R
//
PLOT_MACS2_QC (
ch_macs2_peaks.collect{it[1]}
ch_macs2_peaks.collect{it[1]},
params.narrow_peak
)
ch_versions = ch_versions.mix(PLOT_MACS2_QC.out.versions)
