Assessing enriched PTM functions


As described in the Running PTM-POSE section, PTM-POSE provides several options for annotating PTMs with functional information drawn from various databases. Often, we will want to dig deeper into the specific functions, processes, interactions, etc. associated with the proteins in our dataset. First, we can check which annotations are available for analysis, and which of them have already been appended using the annotate module:

from ptm_pose import helpers
from ptm_pose.analyze import annotations
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

spliced_ptms, altered_flanks = helpers.load_example_data(spliced_ptms = True, altered_flanks = True)


annot_categories = annotations.get_available_annotations(spliced_ptms)
annot_categories
    Database         Annotation Type       Appended to PTM data?
0   iKiP             Enzyme                No
1   PhosphoSitePlus  Enzyme                No
2   PhosphoSitePlus  Disease               No
3   PhosphoSitePlus  Interactions          No
4   PhosphoSitePlus  Function              No
5   PhosphoSitePlus  Process               No
6   PhosphoSitePlus  Perturbation          No
7   PTMsigDB         Perturbation-DIA      No
8   PTMsigDB         Perturbation-DIA2     No
9   PTMsigDB         Pathway-NetPath       No
10  PTMsigDB         Pathway-WikiPathways  No
11  PTMsigDB         Perturbation-PRM      No
12  PTMsigDB         Pathway-BI            No
13  DEPOD            Enzyme                No
14  RegPhos          Enzyme                No


This tells us which databases are available and the types of annotation each one provides. Let’s take a closer look at the function information from PhosphoSitePlus:

ptms_with_annotation, annotation_counts = annotations.get_ptm_annotations(spliced_ptms, database = "PhosphoSitePlus", annotation_type = 'Function')
print('Specific PTMs with annotation:')
ptms_with_annotation
3 PTMs removed due to insignificant splice event (p < 0.05, dpsi >= 0.1): (33.33%)
Final number of PTMs to be assessed: 6
Specific PTMs with annotation:
   Gene     UniProtKB Accession  Residue  PTM Position in Isoform  Modification Class  PhosphoSitePlus:Function                           dPSI           Significance                         Impact
0  CEACAM1  P13688               S        461.0                    Phosphorylation     activity, inhibited                                0.525          1.73943268451e-09                    Included
1  SPHK2    Q9NRA0               S        419.0                    Phosphorylation     intracellular localization                         0.253          0.0129400018182                      Included
2  SPHK2    Q9NRA0               S        421.0                    Phosphorylation     intracellular localization                         0.253          0.0129400018182                      Included
3  TSC2     P49815               S        981.0                    Phosphorylation     activity, inhibited;intracellular localization...  -0.219         4.18472157275e-05                    Excluded
4  YAP1     P46937               K        342.0                    Ubiquitination      protein degradation                                -0.188;-0.161  0.000211254197372;4.17884655686e-07  Excluded
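
When a PTM is affected by more than one splice event, its dPSI and Significance entries appear as semicolon-separated strings, as in the YAP1 row above. A minimal sketch of unpacking such entries into numeric pairs follows; the helper name `split_events` is hypothetical and is not part of PTM-POSE.

```python
def split_events(dpsi, significance):
    """Pair up semicolon-separated dPSI and significance values.

    Hypothetical helper (not part of PTM-POSE): accepts either a number
    or a string such as '-0.188;-0.161'.
    """
    dpsi_values = [float(v) for v in str(dpsi).split(";")]
    sig_values = [float(v) for v in str(significance).split(";")]
    if len(dpsi_values) != len(sig_values):
        raise ValueError("dPSI and significance must describe the same events")
    return list(zip(dpsi_values, sig_values))


# Values copied from the YAP1 row of the table above
events = split_events("-0.188;-0.161", "0.000211254197372;4.17884655686e-07")
for dpsi, p in events:
    print(f"dPSI = {dpsi:+.3f}, p = {p:.3g}")
```

Each pair can then be filtered or summarized independently, for example to keep only the event with the largest absolute dPSI.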


From this, we note a total of 5 impacted PTMs from 4 genes that have function information available. Rather than searching through these manually for common functions, we can inspect the annotation counts object, which lists the number of PTMs associated with each annotation, including a breakdown by the type of impact (included [dPSI > 0], excluded [dPSI < 0], or altered flanking sequence):

print('Number of PTMs associated with each annotation:')
annotation_counts
Number of PTMs associated with each annotation:
                                   All Impacted  Included  Excluded
PhosphoSitePlus:Function
intracellular localization         3             2.0       1
activity, inhibited                2             1.0       1
molecular association, regulation  1             0.0       1
protein degradation                1             0.0       1
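
To make the structure of this table concrete, the sketch below tallies annotation terms by impact using only the standard library. It illustrates the idea and is not PTM-POSE's implementation; `count_annotations` is a hypothetical name, and the rows are copied from the PTM table above, with the TSC2 row left out because its annotation is truncated in the displayed output.

```python
from collections import Counter

# Rows copied from the PTM table above: (PTM, PhosphoSitePlus:Function, Impact).
# The TSC2 row is omitted because its annotation is truncated in the output.
rows = [
    ("CEACAM1_S461", "activity, inhibited", "Included"),
    ("SPHK2_S419", "intracellular localization", "Included"),
    ("SPHK2_S421", "intracellular localization", "Included"),
    ("YAP1_K342", "protein degradation", "Excluded"),
]


def count_annotations(rows):
    """Tally PTMs per annotation term, overall and split by impact."""
    counts = {}
    for _ptm, annotation, impact in rows:
        # A single PTM can carry several terms separated by ';'
        for term in annotation.split(";"):
            term = term.strip()
            per_term = counts.setdefault(term, Counter())
            per_term["All Impacted"] += 1
            per_term[impact] += 1
    return counts


annotation_tally = count_annotations(rows)
for term, per_term in sorted(annotation_tally.items()):
    print(term, dict(per_term))
```

Because the TSC2 row is omitted, these tallies are smaller than the counts reported by PTM-POSE for the terms that TSC2 carries.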


To better visualize these functions, we can also plot the number of PTMs associated with each function:

# plot annotations without collapsing similar terms
fig, ax = plt.subplots(ncols = 2, figsize = (6, 3))
fig.subplots_adjust(wspace = 2)
annotations.plot_annotation_counts(spliced_ptms = spliced_ptms, altered_flanks = altered_flanks, ax = ax[0], collapse_on_similar = False, database = 'PhosphoSitePlus', annot_type = 'Function', top_terms = 10)
ax[0].set_title('Full Annotation')
# plot annotations after collapsing similar terms
annotations.plot_annotation_counts(spliced_ptms = spliced_ptms, altered_flanks = altered_flanks, ax = ax[1], collapse_on_similar = True, database = 'PhosphoSitePlus', annot_type = 'Function', top_terms = 10)
ax[1].set_title('Collapsed Annotation')
[Figure with two panels: 'Full Annotation' and 'Collapsed Annotation']
Text(0.5, 1.0, 'Collapsed Annotation')
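
The `collapse_on_similar` option merges closely related terms. Judging from the outputs in this example, where a term such as "activity, inhibited" is reported as "activity" once collapsed, one plausible rule is to keep only the text before the first comma. The sketch below implements that assumed rule; it is inferred from the example output, PTM-POSE's own logic may differ, and the function names are hypothetical.

```python
def collapse_term(term):
    """Keep only the part of an annotation term before the first comma.

    Assumed rule inferred from the example outputs; PTM-POSE's own
    collapsing logic may differ.
    """
    return term.split(",")[0].strip()


def collapse_annotation(annotation):
    """Collapse every ';'-separated term, dropping duplicates but keeping order."""
    collapsed = []
    for term in annotation.split(";"):
        short = collapse_term(term)
        if short and short not in collapsed:
            collapsed.append(short)
    return ";".join(collapsed)


print(collapse_annotation("activity, inhibited"))
print(collapse_annotation("molecular association, regulation"))
# Illustrative combined entry: two 'activity' terms merge into one
print(collapse_annotation("activity, induced;activity, inhibited"))
```

Collapsing trades specificity for statistical power: fewer, broader terms each gather more PTMs, but the direction of an effect (for example, inhibited versus induced) is lost.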

We can also perform enrichment analysis to identify whether any functions are present more often than expected, using collapsed annotations to increase the likelihood of finding enriched terms:

enrichment = annotations.annotation_enrichment(spliced_ptms, database = 'PhosphoSitePlus', annotation_type = 'Function', collapse_on_similar=True)
enrichment.head()
                            Fraction Impacted  p-value   Adjusted p-value  PTM
PhosphoSitePlus:Function
intracellular localization  3/3006             0.394168  1.0               SPHK2_S419;SPHK2_S421;TSC2_S981
protein degradation         1/1645             0.710863  1.0               YAP1_K342
activity                    NaN                NaN       NaN               CEACAM1_S461;TSC2_S981
molecular association       NaN                NaN       NaN               TSC2_S981
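
For intuition about what an enrichment p-value measures, a common choice for this kind of over-representation analysis is a one-sided hypergeometric test. Whether `annotation_enrichment` uses exactly this test is an assumption here, and the function name and counts below are hypothetical toy values, not taken from the example data. A minimal standard-library sketch:

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: PTMs in the background
    K: background PTMs carrying the annotation term
    n: impacted PTMs
    k: impacted PTMs carrying the annotation term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    # Count the draws of n PTMs that contain at least k annotated PTMs
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy numbers chosen for illustration only, not taken from the example data
p_value = hypergeom_enrichment_p(k=3, n=10, K=20, N=1000)
print(f"p = {p_value:.3g}")
```

A small p-value indicates that the term appears among the impacted PTMs more often than random sampling from the background would predict. When many terms are tested, a multiple-testing correction is normally applied as well, which is the role of the Adjusted p-value column.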


