Using gfatools asm #30

Open
aafshinfard opened this issue Feb 17, 2023 · 0 comments

aafshinfard commented Feb 17, 2023

Hi,
Thank you for your work on gfatools.
I have an assembly graph that I want to process with gfatools asm (to simplify the graph, e.g. by popping bubbles) and then output the scaffolds.
The graph is built from draft assembly contigs plus connections I inferred from long reads. Each connection is either an edge or a gap, depending on whether the estimated distance is negative (the contigs overlap) or positive (there is a gap between them).
I would like some advice on what I should provide to gfatools asm to get the best out of it. In my analysis, each connection (edge or gap) is backed by supporting long reads, so I have a weight (the number of supporting reads) for each connection that may be useful. I currently store it in a tag (FC:i:, see the example below), but I am not sure whether gfatools will take it into account.
I also have distance estimates that I wanted to output on G-lines for both gaps and edges (I don't have alignments, only distance estimates, including for overlapping contigs). However, I found out that negative distances, i.e. overlapping contigs, are not allowed on G-lines, so I will have to format those as E-lines somehow. If so, are the start/end positions important for the analysis, or can I put in placeholder values?
Toy example:

H	VN:Z:2.0
graph [scaf_num=None]
S	1	49057	*
S	2	33803	*
S	3	22222	*
G	*	2-	1-	3340	*	FC:i:20
# 20 reads support the above gap and the gap size is 3340
G	*	1+	3-	4000	*	FC:i:6
# 6 reads support the above gap and the gap size is 4000
G	*	1-	2-	-300	*	FC:i:15
# 15 reads support this connection and the contigs overlap by 300 bp, but this seems to be an invalid G-line and should probably be converted to an E-line?
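
To make the E-line question concrete, here is a minimal Python sketch of one possible conversion for the last record. It is not gfatools code, and whether gfatools asm reads or uses such E-lines is exactly what is being asked here. The function name `gap_to_edge` and the `seg_len` dictionary are hypothetical. The sketch assumes that the first segment precedes the second in the given orientations, that the overlap is a simple dovetail of `-distance` bases, that positions are given on each segment's forward strand (with `$` marking the segment end, as in GFA2), and that the unknown alignment is written as `*`.

```python
def gap_to_edge(g_line, seg_len):
    """Rewrite a GFA2 G-line with a negative distance as a dovetail E-line.

    g_line  -- one whitespace-separated G-line, e.g. "G * 1- 2- -300 * FC:i:15"
    seg_len -- hypothetical dict mapping segment name to its length in bp
    """
    fields = g_line.split()
    if len(fields) < 6 or fields[0] != "G":
        raise ValueError("not a G-line")
    gid, ref1, ref2, dist = fields[1:5]
    tags = fields[6:]                      # optional tags such as FC:i:15
    overlap = -int(dist)
    if overlap <= 0:
        raise ValueError("distance is not negative; keep it as a G-line")

    def interval(ref, at_end):
        # Overlapping interval in forward-strand coordinates of the segment.
        name, strand = ref[:-1], ref[-1]
        length = seg_len[name]
        if overlap > length:
            raise ValueError("overlap longer than segment " + name)
        # The end of an oriented '+' segment (or the start of a '-' one)
        # is the suffix of the forward sequence; otherwise it is the prefix.
        if (strand == "+") == at_end:
            beg, end = length - overlap, length
        else:
            beg, end = 0, overlap

        def fmt(pos):
            # GFA2 marks a position equal to the segment length with '$'.
            return f"{pos}$" if pos == length else str(pos)

        return fmt(beg), fmt(end)

    beg1, end1 = interval(ref1, at_end=True)    # tail of the first segment
    beg2, end2 = interval(ref2, at_end=False)   # head of the second segment
    return "\t".join(["E", gid, ref1, ref2, beg1, end1, beg2, end2, "*", *tags])


seg_len = {"1": 49057, "2": 33803, "3": 22222}
print(gap_to_edge("G\t*\t1-\t2-\t-300\t*\tFC:i:15", seg_len))
# fields: E  *  1-  2-  0  300  33503  33803$  *  FC:i:15
```

The positions here are derived only from the segment lengths and the estimated overlap, so they are approximate rather than alignment-based; the FC:i: tag is carried over unchanged.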

Lastly, is there any additional information that gfatools could benefit from? I can potentially prepare and provide that too.
