Explore features of a protein of interest


ProteomeScoutAPI provides easy access to the features associated with any protein of interest. Suppose, for example, that we want to better understand the PTMs, domains, and other features of the tyrosine kinase SRC (UniProt ID: P12931).

First, initialize the API:

[4]:
import proteomeScoutAPI
api = proteomeScoutAPI.ProteomeScoutAPI()

Let’s first get an idea of the structure of SRC. Which domains are present in the protein according to the InterPro definitions?

[7]:
api.get_domains('P12931', domain_type = 'interpro')
[7]:
[('SH3_domain', '84', '145', 'IPR001452'),
 ('SH2', '149', '248', 'IPR000980'),
 ('Prot_kinase_dom', '270', '523', 'IPR000719')]

Here, we can see that SRC has three domains, returned as a list of tuples. Each tuple contains the domain name, the start and end positions in the protein, and the InterPro ID.
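Note that the positions in these tuples are strings, so they need to be converted to integers before they can be compared with residue numbers. The following is a minimal sketch that uses the tuples printed above as literal data. The helper name `domain_at` is hypothetical (it is not part of ProteomeScoutAPI), and it assumes the start and end positions are inclusive.

```python
# Domain tuples copied from the output above: (name, start, end, InterPro ID)
domains = [
    ('SH3_domain', '84', '145', 'IPR001452'),
    ('SH2', '149', '248', 'IPR000980'),
    ('Prot_kinase_dom', '270', '523', 'IPR000719'),
]

def domain_at(position, domains):
    """Hypothetical helper: names of domains whose range contains `position`.

    Assumes start and end are inclusive, 1-based residue numbers.
    """
    return [name for name, start, end, _ in domains
            if int(start) <= position <= int(end)]

print(domain_at(419, domains))  # residue 419 falls in the kinase domain
print(domain_at(530, domains))  # residue 530 lies outside all three domains
```

In a live session, the literal list could be replaced by the return value of api.get_domains('P12931', domain_type = 'interpro').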

If you prefer, this list can easily be converted to a dataframe:

[10]:
import pandas as pd
domains = api.get_domains('P12931', domain_type = 'interpro')
domains = pd.DataFrame(domains, columns = ['Domain_Name', 'Start', 'End', 'InterPro_ID'])
domains
[10]:
Domain_Name Start End InterPro_ID
0 SH3_domain 84 145 IPR001452
1 SH2 149 248 IPR000980
2 Prot_kinase_dom 270 523 IPR000719
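Since the tuples hold the positions as strings, the Start and End columns of this dataframe will not be numeric until they are converted. As a rough sketch, the snippet below rebuilds the same dataframe from the literal values shown above, converts the two columns with pd.to_numeric, and adds a `Length` column. The `Length` column is an illustration only and assumes both endpoints are inclusive.

```python
import pandas as pd

# Same data as the output above, written out literally so the example runs
# without the ProteomeScout data file.
domains = pd.DataFrame(
    [('SH3_domain', '84', '145', 'IPR001452'),
     ('SH2', '149', '248', 'IPR000980'),
     ('Prot_kinase_dom', '270', '523', 'IPR000719')],
    columns=['Domain_Name', 'Start', 'End', 'InterPro_ID'],
)

# Positions arrive as strings, so convert them before doing arithmetic.
domains[['Start', 'End']] = domains[['Start', 'End']].apply(pd.to_numeric)

# Domain length in residues, treating both endpoints as inclusive (assumption).
domains['Length'] = domains['End'] - domains['Start'] + 1
print(domains[['Domain_Name', 'Length']])
```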

If we also want the structures, PTMs, or other details about the protein, we can follow the same approach using api.get_structures(), api.get_macro_molecular(), or api.get_PTMs().

In some cases, you might want both the PTMs and information about where each PTM is located in the protein. An additional function, api.get_annotated_PTMs(), provides this quickly. It outputs a dataframe containing the standard information about each PTM (position, residue, type of modification, and evidence) and indicates whether the PTM falls within a specific domain, structure, or macromolecular structure.

[11]:
api.get_annotated_PTMs('P12931')
[11]:
|  | Position | Residue | Modification_Type | evidence | Domain_Names_InterPro | InterPro_IDs | Domain_Names_UniProt | Structures | Macro_Molecular_Structures |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | G | N-myristoyl glycine | 1395,1575 |  |  |  |  | Disordered |
| 1 | 12 | S | Phosphoserine | 1323,1395,1575,1718,1719,1790 |  |  |  |  | Disordered;Basic and acidic residues |
| 2 | 17 | S | Phosphoserine | 1886,1323,1395,1575,1718,1723,1790,1803 |  |  |  |  | Disordered;Basic and acidic residues |
| 3 | 35 | S | Phosphoserine | 1575,1790,1803 |  |  |  |  | Disordered |
| 4 | 37 | T | Phosphothreonine | 1575,1790 |  |  |  |  | Disordered |
| 5 | 39 | S | Phosphoserine | 1575,1790 |  |  |  |  | Disordered |
| 6 | 43 | S | Phosphoserine | 1790 |  |  |  |  | Disordered |
| 7 | 51 | S | Phosphoserine | 1790 |  |  |  |  | Disordered |
| 8 | 62 | K | N6-acetyllysine | 1790 |  |  |  |  |  |
| 9 | 69 | S | Phosphoserine | 1323,1575,1718,1723,1790,1803 |  |  |  |  |  |
| 10 | 70 | S | Phosphoserine | 1575,1790 |  |  |  |  |  |
| 11 | 72 | T | Phosphothreonine | 1575,1790 |  |  |  |  |  |
| 12 | 74 | T | Phosphothreonine | 1323,1395,1575,1790,1803 |  |  |  |  |  |
| 13 | 75 | S | Phosphoserine | 1886,1323,1395,1575,1718,1719,1723,1790,1803 |  |  |  |  |  |
| 14 | 93 | Y | Phosphotyrosine | 1790 | SH3_domain | IPR001452 | SH3 | STRAND |  |
| 15 | 97 | S | Phosphoserine | 1323,1395,1575,1790 | SH3_domain | IPR001452 | SH3 |  |  |
| 16 | 104 | S | Phosphoserine | 1723,1790 | SH3_domain | IPR001452 | SH3 |  |  |
| 17 | 182 | T | Phosphothreonine | 1790 | SH2 | IPR000980 | SH2 | STRAND |  |
| 18 | 183 | T | Phosphothreonine | 1790 | SH2 | IPR000980 | SH2 | STRAND |  |
| 19 | 187 | Y | Phosphotyrosine | 1886,1521,1575,1787,1790,1803 | SH2 | IPR000980 | SH2 | STRAND |  |
| 20 | 216 | Y | Phosphotyrosine | 1323,1395,1575,1790 | SH2 | IPR000980 | SH2 | STRAND |  |
| 21 | 246 | T | Phosphothreonine | 1790 | SH2 | IPR000980 | SH2 |  |  |
| 22 | 250 | T | Phosphothreonine | 1790 |  |  |  |  |  |
| 23 | 255 | T | Phosphothreonine | 1790 |  |  |  |  |  |
| 24 | 269 | S | Phosphoserine | 1790 |  |  |  | HELIX |  |
| 25 | 304 | T | Phosphothreonine | 1790 | Prot_kinase_dom | IPR000719 | Protein kinase | TURN |  |
| 26 | 306 | S | Phosphoserine | 1395,1575,1790 | Prot_kinase_dom | IPR000719 | Protein kinase |  |  |
| 27 | 338 | Y | Phosphotyrosine | 1323,1575,1790 | Prot_kinase_dom | IPR000719 | Protein kinase | STRAND |  |
| 28 | 348 | S | Phosphoserine | 1790 | Prot_kinase_dom | IPR000719 | Protein kinase | STRAND |  |
| 29 | 385 | Y | Phosphotyrosine | 1790 | Prot_kinase_dom | IPR000719 | Protein kinase |  |  |
| 30 | 419 | Y | Phosphotyrosine | 1886,1323,1395,1575,1718,1719,1790,1803 | Prot_kinase_dom | IPR000719 | Protein kinase | HELIX |  |
| 31 | 420 | T | Phosphothreonine | 1575,1790 | Prot_kinase_dom | IPR000719 | Protein kinase | HELIX |  |
| 32 | 430 | K | Ubiquitination | 1790 | Prot_kinase_dom | IPR000719 | Protein kinase | HELIX |  |
| 33 | 439 | Y | Phosphotyrosine | 1323,1575,1790,1803 | Prot_kinase_dom | IPR000719 | Protein kinase | HELIX |  |
| 34 | 511 | T | Phosphothreonine | 1575,1803 | Prot_kinase_dom | IPR000719 | Protein kinase |  |  |
| 35 | 522 | Y | Phosphotyrosine | 1575,1790,1803 | Prot_kinase_dom | IPR000719 | Protein kinase | TURN |  |
| 36 | 525 | S | Phosphoserine | 1575,1790 |  |  |  |  |  |
| 37 | 526 | T | Phosphothreonine | 1790 |  |  |  |  |  |
| 38 | 530 | Y | Phosphotyrosine | 1323,1395,1575,1787,1790,1803,1886 |  |  |  |  |  |
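
Because the result is a regular pandas dataframe, it can be filtered with standard pandas operations. The sketch below selects phosphotyrosines located in the kinase domain. To stay runnable without the ProteomeScout data, it builds a small dataframe from a few rows and columns of the output above. Writing empty cells as '' and storing positions as integers are assumptions about how the real dataframe holds these values.

```python
import pandas as pd

# A few rows copied from the output above. Empty cells are written as ''
# (an assumption; the real dataframe may use NaN instead).
ptms = pd.DataFrame({
    'Position': [93, 187, 419, 430, 530],
    'Residue': ['Y', 'Y', 'Y', 'K', 'Y'],
    'Modification_Type': ['Phosphotyrosine', 'Phosphotyrosine',
                          'Phosphotyrosine', 'Ubiquitination',
                          'Phosphotyrosine'],
    'Domain_Names_InterPro': ['SH3_domain', 'SH2', 'Prot_kinase_dom',
                              'Prot_kinase_dom', ''],
})

# Substring match rather than equality, in case a cell lists several domains.
# na=False treats missing values as "not in the domain".
in_kinase = ptms['Domain_Names_InterPro'].str.contains(
    'Prot_kinase_dom', regex=False, na=False)
is_ptyr = ptms['Modification_Type'] == 'Phosphotyrosine'

kinase_ptyr = ptms.loc[in_kinase & is_ptyr, ['Position', 'Residue']]
print(kinase_ptyr)
```

With the full dataframe, replacing the literal `ptms` with the result of api.get_annotated_PTMs('P12931') should allow the same two masks to be applied unchanged.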