- RNAseq-workflow - A repository for setting up an RNA-seq workflow, with detailed instructions and code for each analysis and visualization step.
- Analysis_STAR.Rmd - RNA-seq analysis pipeline for `STAR` counts. Prerequisites:
  - A path to the data folder. This folder should have 3 subfolders:
    - `02_STAR-align` - gzipped count files with the `.tab` extension, output by the `STAR` aligner
    - `results` - folder where the results will be stored
    - `data` - must contain the `sample_annotation.csv` file, example below
- GSEA.Rmd - EnrichR (non-directional) and GSEA (directional) analysis using KEGG, GO, MSigDb.
- GSEA_figures.Rmd - Visualization of GSEA enrichment results as horizontal barplots.
- Figure_heatmap.Rmd - makes a heatmap of the top 50 differentially expressed genes. Uses `TMP.xlsx` produced by `Analysis*.Rmd`. May use a custom signature of genes. Includes EnhancedVolcano and boxplots of selected genes.
- oncoEnrichR.Rmd - Cancer-dedicated gene set interpretation using the oncoEnrichR R package
- Pathview.Rmd - visualization of top KEGG pathways. Uses `DEGs.xlsx` produced by `Analysis*.Rmd`. Example
- calcTPM.R - a function to calculate TPMs from gene counts
- utils.R - helper functions
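
As a rough illustration of what a TPM calculation involves, the sketch below divides each gene's count by its length and rescales the resulting rates so that they sum to one million. It is a generic shell/awk example, not the code in calcTPM.R. The function name `calc_tpm` and the three-column tab-separated input (gene, raw count, gene length in bp) are assumptions made for this example.

```bash
# Sketch only: compute TPM from a tab-separated table with columns
#   gene <TAB> raw_count <TAB> gene_length_bp   (no header; hypothetical layout)
# TPM_i = (count_i / length_i) / sum_j(count_j / length_j) * 1e6
calc_tpm() {
  awk -F '\t' '
    {
      gene[NR] = $1
      rate[NR] = $2 / $3        # reads per base; assumes length > 0
      total += rate[NR]
    }
    END {
      # assumes at least one gene has a non-zero count (total > 0)
      for (i = 1; i <= NR; i++)
        printf "%s\t%.1f\n", gene[i], rate[i] / total * 1e6
    }
  ' "$@"
}

# Usage: two genes of equal length with counts 10 and 30
printf 'geneA\t10\t1000\ngeneB\t30\t1000\n' | calc_tpm
```

Because the values are rescaled within each sample, TPMs always sum to one million per sample, so the function should be run on one sample's counts at a time.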
misc - Outdated scripts
- `Analysis_featurecounts.Rmd` - RNA-seq analysis pipeline for `featureCount` counts. Prerequisites:
  - A path to the data folder. This folder should have 3 subfolders:
    - `03_featureCount` - gzipped count files output by `featureCount`
    - `results` - folder where the results will be stored
    - `data` - must contain the `sample_annotation.csv` file. The annotation file should have a "Sample" column with sample names, plus any other annotation columns. Include a "Group" column containing the covariate of interest. Example:
# Sample,Group
VLI10_AA_S61_L006_R1_001.txt.gz,AA
VLI10_AA_S61_L007_R1_001.txt.gz,AA
VLI10_AA_S61_L008_R1_001.txt.gz,AA
VLI11_C_S62_L006_R1_001.txt.gz,C
VLI11_C_S62_L007_R1_001.txt.gz,C
VLI11_C_S62_L008_R1_001.txt.gz,C
- `Figure_clusterProfiler_nes.Rmd` - Takes the results of edgeR analysis from an Excel file, performs GO and KEGG GSEA and plots the results as horizontal barplots, sorted by normalized enrichment score (NES). Example
- `Figure_clusterProfiler_asis.Rmd` - Takes the results of edgeR analysis from an Excel file, performs GO and KEGG GSEA and plots the results as horizontal barplots, sorted by p-value, as they come out of the enrichment analysis.
- enrichR_analysis.Rmd - Analyze gene lists using enrichR. Analyze all genes, and up- and downregulated genes separately. Uses `DEGs.xlsx` produced by `Analysis*.Rmd`.
- enrichR_plot.Rmd - barplot of selected enrichment results, similar to Example. WIP
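
The `sample_annotation.csv` requirements described above (a "Sample" column, a "Group" column, optional extra columns) can be checked before running an analysis. The following is a minimal sketch in shell/awk, not part of the repository. The function name `check_annotation` is hypothetical, and the check assumes a simple comma-separated file without quoted fields. Since the example above shows the header as `# Sample,Group`, a leading `# ` on the header line is tolerated.

```bash
# Sketch only: verify that an annotation CSV has non-empty "Sample" and
# "Group" columns. Prints one message per problem; exit status is 0 if OK.
check_annotation() {
  awk -F ',' '
    BEGIN { bad = 0 }
    { sub(/\r$/, "") }                       # tolerate Windows line endings
    NR == 1 {
      sub(/^# */, "")                        # tolerate a "# " prefix on the header
      for (i = 1; i <= NF; i++) col[$i] = i  # map column name to its position
      if (!("Sample" in col)) { print "missing column: Sample"; bad = 1 }
      if (!("Group" in col))  { print "missing column: Group";  bad = 1 }
      if (bad) exit                          # jump to END, which sets the status
      next
    }
    $(col["Sample"]) == "" { print "line " NR ": empty Sample"; bad = 1 }
    $(col["Group"])  == "" { print "line " NR ": empty Group";  bad = 1 }
    END { exit bad }
  ' "$1"
}

# Usage (path is a placeholder):
#   check_annotation /path/to/data/sample_annotation.csv && echo "annotation OK"
```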
Scripts for running RNA-seq preprocessing steps on a cluster using the PBS job submission system. `subread-featurecounts` scripts are in the dcaf/ngs.rna-seq repository.
- submit00_fastqc.sh - FASTQC on raw FASTQ files
- MultiQC commands to summarize QC reports generated by TrimGalore and STAR
multiqc --filename multiqc_01_trimmed.html --outdir multiqc_01_trimmed 01_trimmed/
multiqc --filename multiqc_02_STAR-align.html --outdir multiqc_02_STAR-align 02_STAR-align/
- submit01_trimgalore.sh - Adapter trimming using TrimGalore
- submit02_STAR-index.sh - Index the genome for the STAR aligner
- submit02_STAR.sh - Align samples using STAR. Requires the `input01_toStarAlign.list` text file listing the input files: each line contains one or more comma-separated file names, with a space separating the first-read files from the second-read files
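
To make the list-file format above concrete, here is a small sketch that reads such a file and reports, for each line, whether it describes single-end or paired-end input and how many files make up the first read. This is an illustration only, not code from submit02_STAR.sh. The function name `describe_star_list` and the FASTQ file names in the usage comment are hypothetical. The format resembles what STAR's `--readFilesIn` option accepts, but how the script passes it to STAR is not shown here.

```bash
# Sketch only: summarize each line of a list file where comma-separated
# file names form one read, and a space separates read 1 from read 2.
# Output columns (tab-separated): layout, number of read-1 files, read 1, read 2
describe_star_list() {
  awk '
    NF == 0 { next }                          # skip blank lines
    {
      n = split($1, files, ",")               # count files listed for read 1
      layout = (NF >= 2) ? "paired-end" : "single-end"
      mate2  = (NF >= 2) ? $2 : "-"           # "-" marks an absent second read
      printf "%s\t%d\t%s\t%s\n", layout, n, $1, mate2
    }
  ' "$1"
}

# Usage, with a hypothetical list file containing the line
#   s1_L1_R1.fq.gz,s1_L2_R1.fq.gz s1_L1_R2.fq.gz,s1_L2_R2.fq.gz
# (one sample sequenced on two lanes, paired-end):
#   describe_star_list input01_toStarAlign.list
```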
CaSpER pipeline detecting CNVs from RNA-seq data
Dedicated repository with detailed instructions: mdozmorov/CaSpER_pipeline
- submit05_BAFExtract-index.sh - indexing the genome for BAFExtract
- submit05_BAFExtract.sh - BAFExtract run
- DESeq results to pathways in 60 Seconds with the fgsea package, https://stephenturner.github.io/deseq-to-fgsea/
- A Shiny app for visualizing DESeq2 results by Zuguang Gu. Tweet