TRUST4 module #316

Closed

@mapo9 mapo9 commented Mar 25, 2024

This PR adds the TRUST4 module in a new subworkflow.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/airrflow branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@mapo9 mapo9 requested a review from ggabernet March 25, 2024 08:28
@mapo9 mapo9 marked this pull request as draft March 25, 2024 08:28

mapo9 commented Mar 25, 2024

Currently, for single-cell data, the workflow fails at the SINGLE_CELL_QC_AND_FILTERING:SINGLE_CELL_QC step with the following error:

  Quitting from lines 210-213 [removeDoublets] (_main.Rmd)
  Error in removeDoublets():
  ! The column cell_id contains no data
  Backtrace:
   1. enchantr::removeDoublets(...)
  Warning messages:
  1: replacing previous import 'data.table::first' by 'dplyr::first' when loading 'enchantr' 
  2: replacing previous import 'data.table::last' by 'dplyr::last' when loading 'enchantr' 
  3: replacing previous import 'data.table::between' by 'dplyr::between' when loading 'enchantr' 
  Execution halted

Work dir:
  /home/kymmp01/workdir/pipeline_dev/trust42airrflow/airrflow/test_flow/work/40/5d32e7d034d6123f08d2cc50bda49f

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

 -- Check '.nextflow.log' file for details
ERROR ~ Corruption: 
        Descriptor does not contain a meta-nextfile entry
        Descriptor does not contain a meta-lognumber entry
        Descriptor does not contain a last-sequence-number entry

 -- Check '.nextflow.log' file for details

I am pretty sure the cause is exactly what the error message says: the cell_id column contains no data.
Currently, I am using OUT_airr.tsv to feed the TRUST4 results into the Immcantation workflow.
In this file, each sequence is one row.
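
To confirm this before rerunning the pipeline, one could check which columns of the AIRR table are actually empty. The following is a minimal sketch, not part of the PR: the helper name `empty_columns` and the two-row example table are hypothetical, and it assumes a tab-separated file with a header row in which a missing value is an empty string.

```python
import csv
import io


def empty_columns(tsv_text, columns):
    """Return the columns that are absent or hold no non-empty value."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    has_data = {name: False for name in columns}
    for row in reader:
        for name in columns:
            # row.get() yields None for a missing column, "" for an empty cell
            if (row.get(name) or "").strip():
                has_data[name] = True
    return [name for name in columns if not has_data[name]]


# Hypothetical stand-in for a per-sequence AIRR table with an empty cell_id
example = (
    "sequence_id\tcell_id\tjunction_aa\n"
    "s1\t\tCARDYW\n"
    "s2\t\tCARGGW\n"
)
print(empty_columns(example, ["sequence_id", "cell_id"]))
```

For a real file, the text could be read with `open(path).read()` and passed to the same function.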

TRUST4 also provides an out_barcode_airr.tsv for single-cell data, in which each cell_id is one row.
This file gives the clonotypes per barcode, with "consensus_count" being the number of reads/UMIs supporting the contig. The barcode_airr file is a lot larger and contains many redundant junctions, because different cells can present the same clone.
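
The redundancy can be made visible by counting how many distinct barcodes share each junction. This is only a sketch under assumptions: the helper `cells_per_junction` and the example rows are hypothetical, and it assumes the table carries the AIRR columns `cell_id` and `junction_aa`.

```python
import csv
import io
from collections import defaultdict


def cells_per_junction(tsv_text):
    """Map each junction_aa to the number of distinct cell_ids carrying it."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    cells = defaultdict(set)
    for row in reader:
        cells[row["junction_aa"]].add(row["cell_id"])
    return {junction: len(ids) for junction, ids in cells.items()}


# Hypothetical per-barcode table: two cells present the same junction
barcode_example = (
    "sequence_id\tcell_id\tjunction_aa\n"
    "c1\tAAAC\tCARDYW\n"
    "c2\tAAAG\tCARDYW\n"
    "c3\tAAAT\tCARGGW\n"
)
print(cells_per_junction(barcode_example))
```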

I think we have 2 options now:

  1. use barcode_airr.tsv for analysis
  2. use the OUT_airr.tsv and set "cell_id" to the "sequence_id" of each row
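
Option 2 could look roughly like the sketch below. It is an illustration only, not code from this PR: the function `fill_cell_id` and the example table are hypothetical, and it assumes a tab-separated AIRR table with a `sequence_id` column. Note that under this approach every contig is treated as its own cell.

```python
import csv
import io


def fill_cell_id(tsv_text):
    """Return the table with empty cell_id values replaced by sequence_id."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fields = list(reader.fieldnames or [])
    if "cell_id" not in fields:
        # Add the column if the table does not have it at all
        fields.append("cell_id")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fields, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        if not (row.get("cell_id") or "").strip():
            row["cell_id"] = row["sequence_id"]
        writer.writerow(row)
    return out.getvalue()


# Hypothetical per-sequence table with an empty cell_id column
per_sequence = (
    "sequence_id\tcell_id\tjunction_aa\n"
    "s1\t\tCARDYW\n"
    "s2\t\tCARGGW\n"
)
print(fill_cell_id(per_sequence))
```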

The discussion with the TRUST4 developers about this can be found here.

This might also help to understand the difference:
[Screenshot 2024-03-22 at 16 45 28]

@nf-core nf-core deleted a comment from github-actions bot Apr 11, 2024

github-actions bot commented Apr 11, 2024

nf-core lint overall result: Failed ❌

Posted for pipeline commit 6f9e5af

+| ✅ 206 tests passed       |+
#| ❔   6 tests were ignored |#
!| ❗   1 tests had warnings |!
-| ❌   1 tests failed       |-

❌ Test failures:

  • nextflow_config - Config default value incorrect: params.umi_position is set as R1 in nextflow_schema.json but is null in nextflow.config.

❗ Test warnings:

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.13.1
  • Run at 2024-04-12 08:11:43


mapo9 commented Apr 11, 2024

I have worked on this issue some more and have now tested the barcode_airr file as well. It contains cell_ids, and the Immcantation framework in airrflow runs through fine with this file.
I am adding the test AIRR files here as well.

Here you can also find the respective airrflow results.

It would be great if you could check whether the results look reasonable to you, or whether we need to find another solution for integrating TRUST4.
