VIRSorter_global_phage_signal.csv is not machine readable #37

Open
rec3141 opened this issue Mar 22, 2019 · 2 comments

Comments


rec3141 commented Mar 22, 2019

Hi, thanks for your program, it's very useful. A small change that would make it more so would be to make the output file easily machine readable (e.g. directly importable into R or Python). Right now the comment lines and column headers are interspersed with the data, but there's really no need for that. You also have a "Category" column with non-distinct numbers (e.g. "Phage Category 1" and "Prophage Category 1" get the same entry). You also mangle the FASTA headers from the original file (e.g. replacing '.' with '_'), which makes it unnecessarily difficult to match the results back to the original data files. Thanks.
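One possible workaround for the renamed headers, sketched in Python: apply the same renaming to the original FASTA headers and use the result as a lookup key. This is only a sketch under assumptions: it assumes that '.' is the only character replaced with '_' and that a `VIRSorter_` prefix is added to each name. The helper names `mangle` and `build_lookup` and the sample headers are made up for illustration.

```python
def mangle(header, prefix="VIRSorter_"):
    """Approximate the renaming: add a prefix and turn '.' into '_'.

    Assumption: only '.' is replaced. Extend this function if other
    characters turn out to be rewritten as well.
    """
    return prefix + header.replace(".", "_")


def build_lookup(original_headers):
    """Map each mangled name back to the original FASTA header."""
    lookup = {}
    for header in original_headers:
        key = mangle(header)
        # Two different headers can collapse to the same mangled name.
        if key in lookup and lookup[key] != header:
            raise ValueError(f"ambiguous mangled name: {key}")
        lookup[key] = header
    return lookup


# Hypothetical headers, for illustration only.
originals = ["NODE_1_length_5000_cov_12.5", "NODE_2_length_3000_cov_7.25"]
lookup = build_lookup(originals)
print(lookup["VIRSorter_NODE_1_length_5000_cov_12_5"])
```

Building the lookup from the original headers avoids guessing which underscores used to be dots, and the collision check flags names that can no longer be told apart after renaming.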

Author

rec3141 commented Mar 22, 2019

In case anyone else is working in R, here's how I imported it:


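# virsorterfile: path to the VirSorter global phage signal CSV
# Read every line, including the "##" section and header lines
vs.pred <- read.csv(virsorterfile, quote="", header=FALSE)
# Take the column names from the header on line 2 of the file
vs.head <- read.table(virsorterfile, sep=",", quote="", header=TRUE, comment.char="", skip=1, nrows=1)
colnames(vs.pred) <- colnames(vs.head)
colnames(vs.pred)[1] <- "vs.id"
# Section lines contain "category"; the text between the " - " separators names the section
vs.num <- grep("category", vs.pred$vs.id)
vs.cats <- do.call(rbind, strsplit(x=as.character(vs.pred$vs.id[vs.num]), split=" - ", fixed=TRUE))[,2]
# Prefix each row's Category with the name of the section it falls in
vs.pred$Category <- paste(c("", rep.int(vs.cats, c(vs.num[-1], nrow(vs.pred)) - vs.num)), vs.pred$Category)
# Drop the section and header lines, keeping only the data rows
vs.pred <- vs.pred[!grepl("#", vs.pred$vs.id, fixed=TRUE),]

# Recover the original contig names: strip the prefix and suffix, restore the '.' after cov_
vs.pred$node <- gsub(pattern="VIRSorter_", replacement="", x=vs.pred$vs.id)
vs.pred$node <- gsub(pattern="-circular", replacement="", x=vs.pred$node)
vs.pred$node <- gsub(pattern="cov_(\\d+)_", replacement="cov_\\1.", x=vs.pred$node, perl=FALSE)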
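For anyone working in Python instead, roughly the same import can be sketched with the standard library alone. This is not part of VirSorter. The function name `read_virsorter_signal` and the extra `Section` key are made up here, and the sample text is invented and simplified (real files have more columns). The layout it assumes, section lines and header lines both starting with `#`, is inferred from the R code above.

```python
import csv
import io


def read_virsorter_signal(handle):
    """Parse the sectioned CSV into a list of dicts, one per data row."""
    header = None
    section = ""
    rows = []
    # QUOTE_NONE mirrors quote="" in the R code: quotes are ordinary characters.
    for fields in csv.reader(handle, quoting=csv.QUOTE_NONE):
        if not fields:
            continue  # skip blank lines
        first = fields[0]
        if first.startswith("#"):
            text = first.lstrip("#").strip()
            if len(fields) > 1:
                # Header line: comma-separated column names.
                header = [text] + [name.strip() for name in fields[1:]]
            elif " - " in text:
                # Section line: keep the part between the " - " separators.
                section = text.split(" - ")[1]
            continue
        if header is None:
            raise ValueError("data row found before any header line")
        row = dict(zip(header, fields))
        row["Section"] = section  # disambiguates the repeated Category numbers
        rows.append(row)
    return rows


# Invented, simplified sample in the assumed layout.
sample = "\n".join([
    "## 1 - Complete phage contigs - category 1 (sure)",
    "## Contig_id,Nb genes,Category",
    "VIRSorter_NODE_1_length_5000_cov_12_5,8,1",
    "## 4 - Prophages - category 1 (sure)",
    "## Contig_id,Nb genes,Category",
    "VIRSorter_NODE_2_length_3000_cov_7_25,5,1",
])
rows = read_virsorter_signal(io.StringIO(sample))
for row in rows:
    print(row["Section"], row["Category"], row["Contig_id"])
```

Keeping the section name in its own key, rather than pasting it into `Category` as the R code does, leaves the numeric category usable for filtering. To read a real file, pass an open file handle instead of the `io.StringIO` sample.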

Owner

simroux commented Mar 25, 2019

Hi,

Thanks for the suggestion, and thanks a lot for sharing the R code to import VirSorter results. Unfortunately, there is no support for VirSorter development anymore, so I can't commit to any timeframe for fixing these issues, but I have linked the R code in the Readme to help any user who would like to do the same type of import.

Best,
Simon
