Inspecting number of PTMs with annotation information available

As described in the Running PTM-POSE section, PTM-POSE provides various options for annotating PTMs with functional information from several databases. However, PTM functional information is inherently sparse, so most annotations will only provide information on a handful of PTMs. For this reason, it can be useful to probe which types of annotations are available and how many PTMs have an annotation of each type. This can be done using the annotations functions in the analyze module of PTM-POSE.

Note: This example assumes that you have already run the PTM-POSE pipeline and have annotated PTMs with at least one layer of information.

from ptm_pose import helpers
from ptm_pose.analyze import annotations
import matplotlib.pyplot as plt

# Load the example differential inclusion data
spliced_ptms = helpers.load_example_data(spliced_ptms = True)
# List the annotation types available for these PTMs
available_annotations = annotations.get_available_annotations(spliced_ptms)
available_annotations
    Database         Annotation Type        Appended to PTM data?
0   iKiP             Enzyme                 No
1   PhosphoSitePlus  Enzyme                 No
2   PhosphoSitePlus  Disease                No
3   PhosphoSitePlus  Interactions           No
4   PhosphoSitePlus  Function               No
5   PhosphoSitePlus  Process                No
6   PhosphoSitePlus  Perturbation           No
7   PTMsigDB         Perturbation-DIA       No
8   PTMsigDB         Perturbation-DIA2      No
9   PTMsigDB         Pathway-NetPath        No
10  PTMsigDB         Pathway-WikiPathways   No
11  PTMsigDB         Perturbation-PRM       No
12  PTMsigDB         Pathway-BI             No
13  DEPOD            Enzyme                 No
14  RegPhos          Enzyme                 No
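
To see how these annotation types are spread across databases, the rows printed above can be tallied by database. The sketch below copies the rows into a plain Python list instead of calling PTM-POSE, so the names `available` and `per_database` are illustrative assumptions and not part of the PTM-POSE API.

```python
from collections import Counter

# (database, annotation type) pairs copied from the table printed above
available = [
    ("iKiP", "Enzyme"),
    ("PhosphoSitePlus", "Enzyme"),
    ("PhosphoSitePlus", "Disease"),
    ("PhosphoSitePlus", "Interactions"),
    ("PhosphoSitePlus", "Function"),
    ("PhosphoSitePlus", "Process"),
    ("PhosphoSitePlus", "Perturbation"),
    ("PTMsigDB", "Perturbation-DIA"),
    ("PTMsigDB", "Perturbation-DIA2"),
    ("PTMsigDB", "Pathway-NetPath"),
    ("PTMsigDB", "Pathway-WikiPathways"),
    ("PTMsigDB", "Perturbation-PRM"),
    ("PTMsigDB", "Pathway-BI"),
    ("DEPOD", "Enzyme"),
    ("RegPhos", "Enzyme"),
]

# Count how many annotation types each database contributes
per_database = Counter(database for database, _ in available)

for database, n_types in per_database.most_common():
    print(f"{database}: {n_types} annotation type(s)")
```

If `available_annotations` is a pandas DataFrame, as the printed output suggests, calling `value_counts()` on its Database column should give the same tally.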


You can also visualize the number of PTMs with annotation information available for each annotation type using a bar plot.

annotations.plot_available_annotations(spliced_ptms)
plt.tight_layout()
plt.show()
[Figure: plot PTM annotations]

As you can see, only a few PTMs have information available for each annotation type: the most is 9 of the 184 differentially included sites (about 5%), which have been associated with a biological process. While this should be taken into consideration when analyzing these annotations, we can still glean some useful information and identify potentially interesting proteins/sites to dig deeper into. Let’s look at the PTMs that have been associated with a biological process:

ptms_with_annotation, annotation_counts = annotations.get_ptm_annotations(spliced_ptms, database = "PhosphoSitePlus", annotation_type = 'Process')
print('Specific PTMs with annotation:')
ptms_with_annotation
3 PTMs removed due to insignificant splice event (p < 0.05, dpsi >= 0.1): (33.33%)
Final number of PTMs to be assessed: 6
Specific PTMs with annotation:
   Gene     UniProtKB Accession  Residue  PTM Position in Isoform  Modification Class  PhosphoSitePlus:Function                           dPSI           Significance                         Impact
0  CEACAM1  P13688               S        461.0                    Phosphorylation     activity, inhibited                                0.525          1.73943268451e-09                    Included
1  SPHK2    Q9NRA0               S        419.0                    Phosphorylation     intracellular localization                         0.253          0.0129400018182                      Included
2  SPHK2    Q9NRA0               S        421.0                    Phosphorylation     intracellular localization                         0.253          0.0129400018182                      Included
3  TSC2     P49815               S        981.0                    Phosphorylation     molecular association, regulation;intracellula...  -0.219         4.18472157275e-05                    Excluded
4  YAP1     P46937               K        342.0                    Ubiquitination      protein degradation                                -0.188;-0.161  4.17884655686e-07;0.000211254197372  Excluded
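
Some rows in this output, such as YAP1 K342, list several dPSI and significance values joined by semicolons. The sketch below shows one way to split such fields back into individual values. It assumes that the values are paired by position (first dPSI with first significance value) and that a positive dPSI corresponds to the Included label, which matches the rows shown but is not confirmed here. The helper `split_events` is hypothetical and not part of PTM-POSE.

```python
def split_events(dpsi_field, significance_field):
    """Pair up the semicolon-separated dPSI and significance values of one PTM."""
    dpsi_values = [float(v) for v in str(dpsi_field).split(";")]
    sig_values = [float(v) for v in str(significance_field).split(";")]
    if len(dpsi_values) != len(sig_values):
        raise ValueError("dPSI and significance fields list different numbers of values")
    # Assumption: values are paired by position
    return list(zip(dpsi_values, sig_values))


# Field values copied from the YAP1 K342 row above
events = split_events("-0.188;-0.161", "4.17884655686e-07;0.000211254197372")

for dpsi, p_value in events:
    # Assumption: positive dPSI means 'Included', negative means 'Excluded'
    direction = "Included" if dpsi > 0 else "Excluded"
    print(f"dPSI={dpsi:+.3f}, p={p_value:.2e}, {direction}")
```

The same helper also handles rows with a single value, since splitting a string without a semicolon returns a one-element list.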


We can also look at the number of PTMs associated with each annotation:

print('Number of PTMs associated with each annotation:')
annotation_counts
Number of PTMs associated with each annotation:
                                   All Impacted  Included  Excluded
PhosphoSitePlus:Function
intracellular localization                    3       2.0         1
activity, inhibited                           2       1.0         1
molecular association, regulation             1       0.0         1
protein degradation                           1       0.0         1
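
To illustrate how a count table of this shape relates to the per-PTM table, the sketch below tallies annotations by impact by hand. It is an illustration only and not how PTM-POSE computes its counts. It uses just the four displayed rows whose annotation is not truncated (the TSC2 row is left out), so its totals are lower than those printed above. The names `ptm_rows` and `counts` are assumptions made for this example.

```python
from collections import defaultdict

# (annotation field, impact) copied from the PTM rows with a complete annotation
ptm_rows = [
    ("activity, inhibited", "Included"),         # CEACAM1 S461
    ("intracellular localization", "Included"),  # SPHK2 S419
    ("intracellular localization", "Included"),  # SPHK2 S421
    ("protein degradation", "Excluded"),         # YAP1 K342
]

counts = defaultdict(lambda: {"All Impacted": 0, "Included": 0, "Excluded": 0})

for annotation_field, impact in ptm_rows:
    # Split on ';' only: single annotations such as 'activity, inhibited' contain commas
    for annotation in annotation_field.split(";"):
        annotation = annotation.strip()
        counts[annotation]["All Impacted"] += 1
        counts[annotation][impact] += 1

# Print annotations from most to least frequent
for annotation, row in sorted(counts.items(), key=lambda item: -item[1]["All Impacted"]):
    print(annotation, row)
```

Splitting on semicolons rather than commas matters here, because a comma can be part of a single annotation label.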


Note: you could also do this analysis for altered flanking sequences by replacing spliced_ptms with altered_flanks in the above code.
