CIAA: Integrated Proteomics and Structural Modeling for Understanding Cysteine Reactivity with Iodoacetamide Alkyne
Cysteine residues play key roles in protein structure and function and can serve as targets for chemical probes and even drugs. Chemoproteomic studies have revealed that heightened cysteine reactivity towards electrophilic probes, such as iodoacetamide alkyne (IAA), is indicative of likely residue functionality. However, while the cysteine coverage of chemoproteomic studies has increased substantially, these methods still provide only a partial assessment of proteome-wide cysteine reactivity, with cysteines from low-abundance proteins and hard-to-detect peptides still largely refractory to chemoproteomic analysis. Here we integrate cysteine chemoproteomic reactivity datasets with structure-guided computational analysis to delineate key structural features of proteins that favor elevated cysteine reactivity towards IAA.
https://chemrxiv.org/engage/chemrxiv/article-details/678722edfa469535b90fe695
- data: directory that contains all the csv files
- isotop_pdb.csv: input raw dataset
- isotop_training_set.csv: input training dataset
- isotop_test_set.csv: input test dataset
- isotop_alphafold_pdb_set.csv: input alphafold validation dataset
- isotop_alphafold_denovo_set.csv: input alphafold validation dataset
- isotop_residue_function.csv: input for isotop_residue_function_barplot.ipynb
- list_found_peptides.csv: input for isotop_pdb_filtering.ipynb
- final_selection.csv: input for isotop_pdb_filtering.ipynb
- isotop_training_nonredundant_complete_final_identifiers.csv: input for isotop_pdb_filtering.ipynb
- isotop_pdb_meric_state.csv: input for isotop_pdb_filtering.ipynb
- isotop_descriptors.csv: input dataset for isotop_descriptor_pearson_correlations.ipynb
- make_dataset.py: Prepare PDB structure list using input files
- download_pdbs.py: Download PDB files from input pdb_files.txt
- reduce_pdbs.sh: Clean PDBs of heteroatoms, ligands, and add hydrogens
- get_neighbor_graphs.py: Find neighbors of cysteines in PDB structure list
- get_simple_descriptors.py: Calculate raw structural descriptors of cysteines
- get_simple_descriptors.py: One hot encode raw structural descriptors of cysteines
- isotop_residue_function_barplot.ipynb: Analysis for classifying cysteines based on function
- isotop_pdb_filtering.ipynb: Analysis for filtering PDB structures
- isotop_descriptor_pearson_correlations.ipynb: Analysis for computing pearson correlation coefficients
- ciaa_hyperreactive_cysteines_model.ipynb: Analysis for developing the CIAA model based on input files
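For readers unfamiliar with the statistic used in the descriptor correlation notebook, the Pearson correlation coefficient between two descriptors is their covariance divided by the product of their standard deviations. The sketch below computes it with the standard library only. It is an illustration, not code from the notebook, and the function name `pearson_r` and the example values are made up for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered products and centered squares; the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for two hypothetical descriptors.
descriptor_a = [1.0, 2.0, 3.0, 4.0]
descriptor_b = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(descriptor_a, descriptor_b)  # perfectly linear, so r is 1 up to rounding
```

In practice, `pandas.DataFrame.corr` or `scipy.stats.pearsonr` compute the same quantity on full descriptor tables.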
- ciaa_results: directory that contains all the csv files and png images from the model
- ciaa_rf_classifier_model.pkl: output from ciaa_hyperreactive_cysteines_model.ipynb
- Python 3
- NumPy
- SciPy
- Pandas
- Matplotlib
- Scikit-learn (ML models)
- MDAnalysis (used for calculating hydrogen bonds)
- ProDy (for building biological units; http://www.bahargroup.org/prody/manual/getprody.html)
- Modeller (for building incomplete sidechains)
- Reduce (for adding hydrogens; http://kinemage.biochem.duke.edu/software/reduce/)
- DSSP (for secondary structure and solvent-accessible surface area of cysteines; https://swift.cmbi.umcn.nl/gv/dssp/index.html)
- propka
conda create -n cysteine_reactivity python=3
conda activate cysteine_reactivity
conda install -c conda-forge -c salilab numpy scipy pandas matplotlib scikit-learn \
pymol-open-source mdanalysis modeller propka notebook
python3 scripts/make_dataset.py
Create an input file named pdb_files.txt that lists the PDB accessions to download, and store it in the data folder (example: data/pdb_files.txt)
python3 scripts/download_pdbs.py -i 'pdb_files.txt'
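The input file for the download step could be generated with a short script like the one below. This is a minimal sketch: it assumes the file holds one accession per line, which is not stated here and should be checked against download_pdbs.py. The helper name `write_pdb_list` is hypothetical, and the accessions in the usage note are only examples.

```python
from pathlib import Path


def write_pdb_list(accessions, out_path="data/pdb_files.txt"):
    """Write PDB accessions to a text file, one per line (assumed format)."""
    # Strip whitespace, skip empty entries, and drop duplicates in input order.
    cleaned = [a.strip() for a in accessions if a.strip()]
    unique = list(dict.fromkeys(cleaned))
    out = Path(out_path)
    # Create the target folder if it does not exist yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("\n".join(unique) + "\n")
    return unique
```

For example, `write_pdb_list(["1UBQ", "4HHB"])` writes two lines to data/pdb_files.txt relative to the current working directory.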
bash ../scripts/reduce_pdbs.sh
python3 scripts/get_neighbor_graphs.py
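To illustrate what a cysteine neighbor search can look like, the sketch below parses ATOM records from PDB-formatted text and lists residues that have any atom within a distance cutoff of a cysteine SG atom. It is not the implementation in get_neighbor_graphs.py: the 6.0 Å cutoff, the function names, and the toy coordinates are all assumptions made for this example.

```python
import math


def atom_line(serial, name, resname, chain, resid, x, y, z):
    """Build a fixed-column PDB ATOM record (toy helper for this example)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resid:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atoms(pdb_text):
    """Parse ATOM records using the fixed columns of the PDB format."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atoms.append({
            "name": line[12:16].strip(),
            "resname": line[17:20].strip(),
            "chain": line[21],
            "resid": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


def cysteine_neighbors(atoms, cutoff=6.0):
    """Map each cysteine (chain, resid) to residues within cutoff of its SG atom."""
    neighbors = {}
    for sg in atoms:
        if sg["resname"] != "CYS" or sg["name"] != "SG":
            continue
        key = (sg["chain"], sg["resid"])
        found = set()
        for other in atoms:
            # Skip atoms that belong to the cysteine itself.
            if (other["chain"], other["resid"]) == key:
                continue
            if math.dist(sg["xyz"], other["xyz"]) <= cutoff:
                found.add((other["chain"], other["resid"], other["resname"]))
        neighbors[key] = sorted(found)
    return neighbors


# Toy structure: one cysteine, one residue 3 Å away, one residue 10 Å away.
toy_pdb = "\n".join([
    atom_line(1, "SG", "CYS", "A", 10, 0.0, 0.0, 0.0),
    atom_line(2, "NE2", "HIS", "A", 12, 3.0, 0.0, 0.0),
    atom_line(3, "CA", "LEU", "A", 50, 10.0, 0.0, 0.0),
])
neighbors = cysteine_neighbors(parse_atoms(toy_pdb))
# neighbors == {("A", 10): [("A", 12, "HIS")]}
```

The all-pairs loop is fine for a sketch. For full structures, a spatial index such as the neighbor search utilities in MDAnalysis or a k-d tree would scale better.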
Update the path /Users/{user_name}/anaconda3/bin/mkdssp so that it points to your local installation of DSSP
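One way to find the local DSSP path is to look up the executable on the current PATH. The sketch below uses only the Python standard library. The names it tries (`mkdssp`, `dssp`) are common names for DSSP binaries, but which one is present on a given machine is an assumption, and the helper name `find_dssp` is hypothetical.

```python
import shutil


def find_dssp(candidates=("mkdssp", "dssp")):
    """Return the full path of the first DSSP executable found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    dssp_path = find_dssp()
    if dssp_path is None:
        print("DSSP executable not found on PATH; install it or set the path manually.")
    else:
        print(f"DSSP executable: {dssp_path}")
```

Running this inside the activated conda environment reports the path that environment would use, if DSSP is installed there.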
python3 scripts/get_simple_descriptors.py
python3 scripts/get_simple_descriptors.py
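As an illustration of the one-hot encoding step, the sketch below turns a categorical descriptor column into binary indicator columns using only the standard library. The column name `secondary_structure`, its values, and the cysteine labels are hypothetical and are not taken from the CIAA descriptor files. The actual script may encode its descriptors differently.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 column per observed category."""
    # Sort the categories so the output columns are deterministic.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


# Hypothetical descriptor rows for three cysteines.
rows = [
    {"cysteine": "C10", "secondary_structure": "helix"},
    {"cysteine": "C25", "secondary_structure": "sheet"},
    {"cysteine": "C77", "secondary_structure": "helix"},
]
encoded = one_hot_encode(rows, "secondary_structure")
```

With pandas installed, `pandas.get_dummies` provides comparable encoding for whole DataFrames.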
Step 7: Map PDB Residues to SIFTS (optional)
Boatner, L., Eberhardt, J., Shikwana, F., Holcomb, M., Lee, P., Houk, K., Forli, S. & Backus, K. (2025). CIAA: Integrated Proteomics and Structural Modeling for Understanding Cysteine Reactivity with Iodoacetamide Alkyne.