Export generating unexpected data #87

Open
arkhan19 opened this issue Nov 17, 2021 · 6 comments

@arkhan19

Paired gene export is giving different results than expected. I have selected both TRA and TRB in the filters. What exactly does this option do? I am getting a different number of samples depending on whether it is enabled.

What I am doing:

  1. Selected TRA and TRB in Filters.
  2. Exported all the data, either paired, TRA or TRB.

What I am getting:
When Paired gene export is enabled:

TRB    42767
TRA    30002

When Paired gene export is disabled:

TRB    42658
TRA    29469

What is expected:
I am under the impression that some entries have no pair available, and that there should be more of them than entries whose pair is available. So why am I getting more samples when the option is enabled? This issue is based only on intuition; I haven't checked the code.

@arkhan19 changed the title from "Export Issues" to "Export generating unexpected data" on Nov 17, 2021
@bvdmitri
Member

Hey! This might be expected: even if you tick both TRA and TRB, there may still be entries that do not satisfy the other filters. In contrast, paired gene export ignores any other filters you specified. If you are sure this is not the case, you can try to diff your results and give a bit more context here so we can figure it out together.
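One way to do the suggested diff is sketched below, using only the Python standard library. It assumes the exports are tab-separated files with a header row, and the column names in `KEY_COLUMNS` are assumptions that should be adjusted to match the real export header. The rows in the demo are made-up toy values, not real database entries.

```python
import csv
import io
from collections import Counter

# Assumed column names; adjust to the header of your actual export.
KEY_COLUMNS = ("Gene", "CDR3", "V", "J", "Species")


def load_rows(handle, key_columns=KEY_COLUMNS):
    """Read a tab-separated export and count rows by their key columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(tuple(row[col] for col in key_columns) for row in reader)


def diff_exports(paired_handle, unpaired_handle):
    """Return (rows only in the paired export, rows only in the other export)."""
    paired = load_rows(paired_handle)
    unpaired = load_rows(unpaired_handle)
    # Counter subtraction keeps only positive counts, so duplicates are respected.
    return paired - unpaired, unpaired - paired


def count_by_gene(rows):
    """Sum row counts per gene; assumes the gene is the first key column."""
    totals = Counter()
    for key, n in rows.items():
        totals[key[0]] += n
    return totals


# Toy data standing in for two exports (hypothetical values).
HEADER = "Gene\tCDR3\tV\tJ\tSpecies\n"
PAIRED_TSV = (
    HEADER
    + "TRA\tCAAAF\tTRAV1\tTRAJ1\tHomoSapiens\n"
    + "TRB\tCASSF\tTRBV1\tTRBJ1\tHomoSapiens\n"
    + "TRB\tCASSXF\tTRBV2\tTRBJ2\tMusMusculus\n"
)
UNPAIRED_TSV = (
    HEADER
    + "TRA\tCAAAF\tTRAV1\tTRAJ1\tHomoSapiens\n"
    + "TRB\tCASSF\tTRBV1\tTRBJ1\tHomoSapiens\n"
)

only_paired, only_unpaired = diff_exports(
    io.StringIO(PAIRED_TSV), io.StringIO(UNPAIRED_TSV)
)
extra_per_gene = count_by_gene(only_paired)
print("extra rows per gene in paired export:", dict(extra_per_gene))
print("rows missing from paired export:", sum(only_unpaired.values()))
```

With real files, the `io.StringIO` objects can be replaced by handles from `open(path, newline="")`. Inspecting the rows in `only_paired` should show which entries the paired export adds.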

@bvdmitri
Member

@f3n1xx For example, does the output match for you if you go to the Meta filters panel and tick both Include non-canonical and Include unmapped V/J filter options?

@arkhan19
Author

arkhan19 commented Nov 23, 2021

I have filtered mouse and human species with both TRA and TRB selected. No other filters were changed.

@arkhan19
Author

> @f3n1xx For example, does the output match for you if you go to the Meta filters panel and tick both Include non-canonical and Include unmapped V/J filter options?

My samples increased by 2 thousand.

@bvdmitri
Member

> I have filtered mouse and human species with both TRA and TRB selected. No other filters were changed.

Yes, but there are some other filters that are enabled by default (as I mentioned, Include non-canonical as an example).

> my samples increased by 2 thousand

Can you check whether you still get different results with both the Include non-canonical and Include unmapped V/J filters ticked? In my case, if I select TRA and TRB for mouse and human species and tick both Include non-canonical and Include unmapped V/J, the Include paired export option does not make any difference in the number of samples.

@bvdmitri
Member

@f3n1xx By default, vdjdb.cdr3.net does not show or return spurious CDR3 sequences (unless you explicitly enable this), but the "export paired" option ignores this setting and returns all available data, whether it is spurious or not.

So I would say what you observe is expected behaviour, and your exported data should have some extra entries that are not included by default.
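The behaviour described in this thread can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual VDJdb implementation: the `Record` class, the `export` function, and the flag names are all hypothetical, and the model simply assumes that the meta filters hide non-canonical and unmapped entries by default while paired export bypasses those filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical entry; the two flags mark what default filters hide."""
    gene: str
    cdr3: str
    non_canonical: bool = False
    unmapped_vj: bool = False


def apply_filters(records, genes, include_non_canonical=False,
                  include_unmapped_vj=False):
    """Keep records of the selected genes that pass the meta filters."""
    kept = []
    for rec in records:
        if rec.gene not in genes:
            continue
        if rec.non_canonical and not include_non_canonical:
            continue
        if rec.unmapped_vj and not include_unmapped_vj:
            continue
        kept.append(rec)
    return kept


def export(records, genes, paired_export=False, **meta_filters):
    """Toy export: paired export is assumed to ignore the meta filters."""
    if paired_export:
        return [rec for rec in records if rec.gene in genes]
    return apply_filters(records, genes, **meta_filters)


# Made-up toy records, not real database entries.
RECORDS = [
    Record("TRA", "CAAAF"),
    Record("TRB", "CASSF"),
    Record("TRB", "CASSXF", non_canonical=True),
    Record("TRA", "CAAXF", unmapped_vj=True),
]
GENES = {"TRA", "TRB"}

default_rows = export(RECORDS, GENES)
paired_rows = export(RECORDS, GENES, paired_export=True)
all_ticked_rows = export(RECORDS, GENES, include_non_canonical=True,
                         include_unmapped_vj=True)
print(len(default_rows), len(paired_rows), len(all_ticked_rows))
```

In this model the paired export returns more rows than the default export, and the two match once both meta filters are ticked, which mirrors the observation reported above.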
