VIRSorter_global_phage_signal.csv is not machine readable #37

rec3141 · 2019-03-22T17:24:05Z

Hi, thanks for your program, it's very useful. One small change that would make it more so would be to make the output file easily machine readable (e.g. directly importable into R or Python). Right now the comments and headers are interspersed with the data, but there's really no need for that. You also have a "Category" column with non-distinct numbers (e.g. "Phage Category 1" and "Prophage Category 1" get the same entry). You also mangle the FASTA headers from the original file (e.g. replacing '.' with '_'), which makes it unnecessarily difficult to match up to the original data files. Thanks

rec3141 · 2019-03-22T18:54:13Z

In case anyone else is working in R, here's how I imported it:


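# virsorterfile: path to VIRSorter_global_phage_signal.csv
# read every line, including the "##" section and header lines
vs.pred <- read.csv(virsorterfile,quote="",header=FALSE)
# take the column names from the second line of the file
vs.head <- read.table(virsorterfile,sep=",",quote="",header=TRUE,comment.char="",skip=1,nrows=1)
colnames(vs.pred) <- colnames(vs.head)
colnames(vs.pred)[1] <- "vs.id"
# section names (e.g. phage vs. prophage) come from the "##" lines that contain "category"
vs.cats <- do.call(rbind,strsplit(x=as.character(vs.pred$vs.id[grep("category",vs.pred$vs.id)]),split=" - ",fixed=TRUE))[,2]
vs.num <- grep("category",vs.pred$vs.id)
# prefix each row's Category with the name of the section it falls under
vs.pred$Category <- paste(c("",rep.int(vs.cats, c(vs.num[-1],nrow(vs.pred)) - vs.num)), vs.pred$Category)
# drop the "##" lines now that their information has been used
vs.pred <- vs.pred[-grep("#",vs.pred$vs.id),]

# undo the header mangling to recover the original contig names
vs.pred$node <- gsub(pattern="VIRSorter_",replacement="",x=vs.pred$vs.id)
vs.pred$node <- gsub(pattern="-circular",replacement="",x=vs.pred$node)
vs.pred$node <- gsub(pattern="cov_(\\d+)_",replacement="cov_\\1.",x=vs.pred$node,perl=FALSE)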
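
For anyone working in Python instead, a rough equivalent of the R import above might look like the sketch below. It uses only the standard library and is not part of VirSorter. It assumes the same layout the R code relies on: "##" section lines that contain "category" and separate their parts with " - ", "##" column-header lines, and plain data rows. The function names, the `Section` and `node` keys, and the sample text are made up for illustration.

```python
import csv
import io
import re


def restore_node_id(vs_id):
    """Undo the header mangling, mirroring the three gsub() calls in the R code."""
    node = vs_id.replace("VIRSorter_", "")
    node = node.replace("-circular", "")
    # Assumes the original name had a "cov_<digits>.<digits>" field.
    return re.sub(r"cov_(\d+)_", r"cov_\1.", node)


def read_virsorter_signal(handle):
    """Parse the interleaved file into a list of dicts, one per data row."""
    header = None
    section = ""
    rows = []
    # QUOTE_NONE treats quote characters as plain text, like quote="" in R.
    for fields in csv.reader(handle, quoting=csv.QUOTE_NONE):
        if not fields or not fields[0].strip():
            continue  # skip blank lines
        first = fields[0]
        if first.startswith("##"):
            text = first.lstrip("#").strip()
            if "category" in first:
                # Section line: keep the second " - " separated part.
                parts = text.split(" - ")
                section = parts[1] if len(parts) > 1 else text
            else:
                # Column-header line.
                header = [text] + [f.strip() for f in fields[1:]]
            continue
        if header is None:
            raise ValueError("data row found before a column-header line")
        row = dict(zip(header, fields))
        row["Section"] = section  # distinguishes phage from prophage rows
        row["node"] = restore_node_id(fields[0])
        rows.append(row)
    return rows


# Made-up sample in the assumed layout, for illustration only.
sample = (
    "## 1 - Complete phage contigs - category 1 (sure)\n"
    "## Contig_id,Nb genes contigs,Fragment,Nb genes,Category\n"
    "VIRSorter_NODE_1_length_5000_cov_12_5-circular,10,frag_a,10,1\n"
    "## 4 - Prophages - category 1 (sure)\n"
    "## Contig_id,Nb genes contigs,Fragment,Nb genes,Category\n"
    "VIRSorter_NODE_2_length_9000_cov_3_25,20,frag_b,8,1\n"
)
rows = read_virsorter_signal(io.StringIO(sample))
for row in rows:
    print(row["Section"], row["Category"], row["node"])
```

To read a real file, pass an open handle instead, e.g. `with open(path, newline="") as fh: rows = read_virsorter_signal(fh)`. Unlike the R version, this keeps the section name in its own `Section` key rather than pasting it into `Category`, so the original column stays untouched.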

simroux · 2019-03-25T21:08:41Z

Hi,

Thanks for the suggestion, and thanks a lot for sharing the R code to import VirSorter results. Unfortunately, there is no support for VirSorter development anymore, so I can't commit to any timeframe in which these different issues may be fixed, but I have linked the R code in the Readme to help any user who would like to do the same type of import.

Best,
Simon
