agat_sp_filter_feature_by_attribute_value.pl should be able to filter by multiple values #426

Closed
Mitmischer opened this issue Feb 19, 2024 · 5 comments · Fixed by #464

Comments

@Mitmischer

Is your feature request related to a problem? Please describe.
I have a set of genes and I want to filter the GFF accordingly. agat_sp_filter_feature_by_attribute_value.pl only handles a single comparison, so I can only filter one gene at a time.

Describe the solution you'd like
The --value option should accept multiple values.

Describe alternatives you've considered
I could filter one gene at a time and then reassemble the file, but that's cumbersome.

@Juke34
Collaborator

Juke34 commented Feb 20, 2024

Could you provide an example of what you would like the command line to look like?

@Juke34
Collaborator

Juke34 commented Apr 18, 2024

I do not understand what you mean by "I could only filter one gene at a time".
agat_sp_filter_feature_by_attribute_value.pl can filter all the genes at once, according to the value constraint you set on a selected attribute.

@fuesseler

Hello!
I think I ran into a problem similar to the OP's: I wanted to use multiple values with this command.

I want to filter my GFF file according to multiple values of the attribute "gene_biotype", not just a single one.
So, something like this (which did not work; no filtering happened):
/usr/local/bin/agat_sp_filter_feature_by_attribute_value.pl --gff GCF_035594765.1_rAnoCar3.1.pri_genomic.agatfiltered.incomplgcm_pmstopcodons.gff -a gene_biotype --value lncRNA,rRNA,misc_RNA,tRNA,miRNA,snoRNA,ncRNA,snRNA,transcribed_pseudogene,V_segment --out GCF_035594765.1_rAnoCar3.1.pri_genomic.agatfiltered.agatfiltered_nopseudo.proteincoding.gff

Alternatively, a feature for "reverse" filtering (keeping only features with the desired attribute value and discarding all others) would also be nice. Then one could (in my example) filter for gene_biotype = proteincoding and discard the rest.
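
To make the two requested behaviours concrete, here is a minimal Python sketch of what "discard any of several values" and "keep only the listed values" could mean for a GFF3 file. It is not how AGAT works internally: the function names and the keep_matching parameter are made up for illustration, and the filter is purely line-based, so unlike AGAT it does not follow parent/child relationships between features.

```python
def parse_attributes(column9):
    """Parse a GFF3 attributes column ('key=value;key=value') into a dict."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def filter_gff_lines(lines, attribute, values, keep_matching=False):
    """Yield GFF3 lines, discarding (or keeping only) the features whose
    `attribute` equals any entry in `values`.

    Hypothetical helper: line-based only, no feature hierarchy handling.
    """
    wanted = set(values)
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line  # pass comments and blank lines through untouched
            continue
        fields = line.rstrip("\n").split("\t")
        attrs = parse_attributes(fields[8]) if len(fields) >= 9 else {}
        matched = attrs.get(attribute) in wanted
        # Discard mode drops the matches; keep mode drops everything else.
        if matched == keep_matching:
            yield line
```

With this sketch, the list from my command would be passed as values in the default discard mode, while the "reverse" case would pass a single value with keep_matching=True.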

On a tangentially related note, the Wiki https://agat.readthedocs.io/en/latest/tools/agat_sp_filter_feature_by_attribute_value.html still lists the command as "agat_sp_select_feature_by_attribute_value.pl", which caused me some confusion while trying to use it.

@Juke34
Collaborator

Juke34 commented Apr 19, 2024

Thank you for your feedback @fuesseler !
Sounds doable

@fuesseler

@Juke34 Awesome, looking forward to it :)

Juke34 added a commit that referenced this issue Jun 3, 2024
@Juke34 Juke34 mentioned this issue Jun 3, 2024
Merged
Juke34 added a commit that referenced this issue Jun 3, 2024
* fix mispelling script name in doc

* add string_sep_to_hash subroutine

* fix #426 add --value_insensitive param and possibility to use a list as input of --value param
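
For readers wondering what turning a separated string into a lookup structure might involve, the following Python sketch shows one plausible reading. The thread does not show the actual Perl string_sep_to_hash subroutine or how --value_insensitive is implemented, so the function names, the comma default, and the case_insensitive parameter here are all assumptions made for illustration.

```python
def string_sep_to_lookup(raw, sep=",", case_insensitive=False):
    """Split a separated string into a set of values for fast membership tests.

    Hypothetical Python analogue; the real Perl subroutine may behave differently.
    """
    items = (item.strip() for item in raw.split(sep))
    if case_insensitive:
        return {item.lower() for item in items if item}
    return {item for item in items if item}


def value_matches(value, lookup, case_insensitive=False):
    """Return True if `value` is one of the entries in `lookup`."""
    candidate = value.lower() if case_insensitive else value
    return candidate in lookup
```

The lookup is built once from the --value string and then consulted once per feature, which keeps the per-feature check cheap even for long value lists.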
3 participants