Error in Aggregate the individual GEX runs into a single AnnData object #52
Comments
Hi there, thanks for trying conga, and thanks for the feedback. This error suggests that the list "all_data" is empty, which may be because the body of the preceding loop never executed. The loop runs over the folders found by the glob command, gex_datasets = sorted(glob.glob('*-CD3')).
Could you check whether the expected folders are present in the directory where the notebook is running? These would be the *-CD3 folders that contain the GEX counts data.
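The check suggested above can be sketched in a few lines of Python. This is only an illustration: `find_gex_dirs` is a hypothetical helper name, not part of conga, and it uses nothing beyond the standard library. It fails early with a message that names the directory being searched, instead of letting an empty list reach the concatenation step.

```python
import glob
import os


def find_gex_dirs(base_dir=".", pattern="*-CD3"):
    """Return the sorted folders under base_dir whose names match pattern.

    Hypothetical helper for illustration; not part of conga.
    """
    matches = sorted(glob.glob(os.path.join(base_dir, pattern)))
    # Keep only directories, since each GEX dataset is expected to be a folder
    gex_dirs = [path for path in matches if os.path.isdir(path)]
    if not gex_dirs:
        raise FileNotFoundError(
            f"No folders matching {pattern!r} in {os.path.abspath(base_dir)!r}; "
            f"the notebook is running in {os.getcwd()!r}"
        )
    return gex_dirs
```

Calling `find_gex_dirs()` with no arguments searches the directory the notebook is running in, which is what the relative pattern '*-CD3' does as well.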
Thank you for your help! I have solved this error by changing the directory that is read:
gex_datasets = sorted(glob.glob('/home/shpc_100668/conga/GSE144469_RAW/*-CD3'))
My command:
gex_datasets = sorted(glob.glob('/home/shpc_100668/conga/GSE144469_RAW/*-CD3'))
Error:
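The error text that followed the command above is not included here, so its cause is unknown. One thing that does change when the glob pattern becomes an absolute path is the result of `gex_dir.split('-')[0]` in the loop from the original post: the donor ID then carries the directory part with it. The sketch below shows the difference and one possible way to take the donor ID from the folder name alone. The folder name `C1-CD3` and the helper `donor_from_gex_dir` are hypothetical, chosen for illustration.

```python
import os


def donor_from_gex_dir(gex_dir):
    """Return the donor ID encoded in a folder name such as 'C1-CD3'.

    Hypothetical helper for illustration; not part of conga.
    """
    # Drop any leading directories (and a trailing slash) before splitting
    folder_name = os.path.basename(os.path.normpath(gex_dir))
    return folder_name.split("-")[0]


# 'C1-CD3' is an assumed folder name, used only as an example
relative = "C1-CD3"
absolute = "/home/shpc_100668/conga/GSE144469_RAW/C1-CD3"

print(relative.split("-")[0])        # donor ID only
print(absolute.split("-")[0])        # directory part is included
print(donor_from_gex_dir(absolute))  # donor ID only
```

If the donor ID does include the directory part, steps that rely on it, such as matching barcode suffixes or building output file names, may behave differently than they do with a relative path.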
Hello,
conga is a wonderful tool!
I ran into an issue while exploring the fancy_conga_pipeline_with_batches_and_gammadelta_tcrs notebook.
My command:
gex_datasets = sorted(glob.glob('*-CD3'))
diseases = ['C','NC','CT'] # colitis, no-colitis, healthy control
contigs_file = '/home/shpc_100668/conga/GSE144469_RAW/GSE144469_TCR_filtered_contig_annotations_all.csv'
all_contigs = pd.read_csv(contigs_file)
all_data = []
for donor_num, gex_dir in enumerate(gex_datasets):
    # The folder name is also the donor ID
    donor = gex_dir.split('-')[0]
    donor_contigs = all_contigs[all_contigs.barcode.str.endswith(donor)].copy()
    # change the barcode suffix to '-1' to match the GEX data
    donor_contigs['barcode'] = donor_contigs.barcode.str.split('-').str.get(0)+'-1'
    donor_contigs_file = f'{donor}_abtcr_filtered_contigs.csv'
    donor_contigs.to_csv(donor_contigs_file)
    # process the contigs to generate conga clonotypes
    donor_clones_file = f'{donor}_abtcr_clones.tsv'
    make_10x_clones_file(
        donor_contigs_file,
        organism = 'human', # using 'human' for TCRab
        clones_file = donor_clones_file,
        stringent = True, # (the default) see Note #1 on clonotype filtering
    )
    # read the GEX data and the clonotypes into CoNGA
    adata = conga.preprocess.read_dataset(
        gex_dir, '10x_mtx', donor_clones_file,
        allow_missing_kpca_file=True)
    disease = donor[:-1]
    adata.obs['disease'] = disease
    adata.obs['disease_int'] = diseases.index(disease) # conga batch ids are integers
    adata.obs['donor'] = donor
    adata.obs['donor_int'] = donor_num # conga batch ids are integers
    all_data.append( adata )
new_adata = all_data[0].concatenate(all_data[1:])
new_adata.write('merged_gex_abtcr.h5ad')
Error:
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_1354605/1967687937.py in <module>
     33
     34 # concatenate the datasets
---> 35 new_adata = all_data[0].concatenate(all_data[1:])
     36 #save the aggregate AnnData object
     37 new_adata.write('merged_gex_abtcr.h5ad')
IndexError: list index out of range
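For reference, indexing an empty Python list raises exactly this exception, which is what happens when the loop body never runs and nothing is appended. The sketch below reproduces that with a plain list and shows one possible guard. `split_first_and_rest` is a hypothetical helper, not part of conga or AnnData.

```python
def split_first_and_rest(datasets):
    """Return (first, rest), failing with a clear message on an empty list.

    Hypothetical helper for illustration; not part of conga or AnnData.
    """
    if not datasets:
        raise ValueError(
            "No datasets were loaded; check that the glob pattern matched "
            "at least one folder before concatenating."
        )
    return datasets[0], datasets[1:]


all_data = []  # what the list looks like when the loop body never runs
try:
    all_data[0]
except IndexError as err:
    print(f"IndexError: {err}")
```

With a guard like this, the concatenation step could be written as `first, rest = split_first_and_rest(all_data)` followed by `first.concatenate(rest)`, so an empty list is reported with a more direct message.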
I'm really at a loss as to how to proceed, and any guidance would be much appreciated!
Thank you for your kind help!