Explore features of a protein of interest
ProteomeScoutAPI gives easy access to the features associated with any protein of interest. Suppose, for example, that we want to better understand the PTMs, domains, and other features of the tyrosine kinase SRC (UniProt ID P12931).
First, initialize the API:
[4]:
import proteomeScoutAPI
api = proteomeScoutAPI.ProteomeScoutAPI()
Let’s first get an idea of the structure of SRC. Which domains are present in the protein according to InterPro?
[7]:
api.get_domains('P12931', domain_type = 'interpro')
[7]:
[('SH3_domain', '84', '145', 'IPR001452'),
('SH2', '149', '248', 'IPR000980'),
('Prot_kinase_dom', '270', '523', 'IPR000719')]
Here, we can see that SRC has three domains, returned as a list of tuples. Each tuple contains the domain name, the start position in the protein, the end position in the protein, and the InterPro ID.
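The start and end positions in the output above are strings, so they need converting to integers before any numeric comparison. The following is a minimal sketch that hard-codes the output shown above so it runs without the API. `domain_at` is a hypothetical helper, not part of ProteomeScoutAPI, and it assumes the start and end positions are inclusive.

```python
# Output of api.get_domains('P12931', domain_type='interpro'), copied from above
domains = [
    ('SH3_domain', '84', '145', 'IPR001452'),
    ('SH2', '149', '248', 'IPR000980'),
    ('Prot_kinase_dom', '270', '523', 'IPR000719'),
]


def domain_at(domains, position):
    """Return the name of the domain containing `position`, or None.

    Hypothetical helper; assumes start and end positions are inclusive.
    """
    for name, start, end, _interpro_id in domains:
        # Positions arrive as strings, so cast before comparing
        if int(start) <= position <= int(end):
            return name
    return None


print(domain_at(domains, 419))  # Prot_kinase_dom
print(domain_at(domains, 62))   # None
```

Residue 419 falls inside the kinase domain, while residue 62 lies before the SH3 domain and so matches nothing.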
We can easily convert this list to a pandas DataFrame if we prefer:
[10]:
import pandas as pd
domains = api.get_domains('P12931', domain_type = 'interpro')
domains = pd.DataFrame(domains, columns = ['Domain_Name', 'Start', 'End', 'InterPro_ID'])
domains
[10]:
| | Domain_Name | Start | End | InterPro_ID |
|---|---|---|---|---|
| 0 | SH3_domain | 84 | 145 | IPR001452 |
| 1 | SH2 | 149 | 248 | IPR000980 |
| 2 | Prot_kinase_dom | 270 | 523 | IPR000719 |
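Once the domains are in a dataframe, it is often convenient to make the positions numeric and derive a length column. This sketch rebuilds the dataframe from the values shown above rather than calling the API. It assumes the positions arrive as strings (as in the tuple output) and that the end position is inclusive. The `Length` column is added here for illustration only.

```python
import pandas as pd

# Same values as returned by api.get_domains('P12931', domain_type='interpro')
domains = pd.DataFrame(
    [
        ('SH3_domain', '84', '145', 'IPR001452'),
        ('SH2', '149', '248', 'IPR000980'),
        ('Prot_kinase_dom', '270', '523', 'IPR000719'),
    ],
    columns=['Domain_Name', 'Start', 'End', 'InterPro_ID'],
)

# Cast the string positions to integers so arithmetic works
domains = domains.astype({'Start': int, 'End': int})

# Domain length in residues, assuming an inclusive end position
domains['Length'] = domains['End'] - domains['Start'] + 1
print(domains[['Domain_Name', 'Length']])
```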
If we also want the structures, PTMs, or other details about the protein, we can repeat the steps above using api.get_structures(), api.get_macro_molecular(), or api.get_PTMs().
In some cases, we may want both the PTMs and information about where each PTM is located in the protein. An additional function, api.get_annotated_PTMs(), provides this quickly. It returns a dataframe containing the standard information about each PTM (position, residue, and type of modification) and also indicates whether the PTM falls within a specific domain, structure, or macromolecular structure.
[11]:
api.get_annotated_PTMs('P12931')
[11]:
| | Position | Residue | Modification_Type | evidence | Domain_Names_InterPro | InterPro_IDs | Domain_Names_UniProt | Structures | Macro_Molecular_Structures |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | G | N-myristoyl glycine | 1395,1575 | | | | | Disordered |
| 1 | 12 | S | Phosphoserine | 1323,1395,1575,1718,1719,1790 | | | | | Disordered;Basic and acidic residues |
| 2 | 17 | S | Phosphoserine | 1886,1323,1395,1575,1718,1723,1790,1803 | | | | | Disordered;Basic and acidic residues |
| 3 | 35 | S | Phosphoserine | 1575,1790,1803 | | | | | Disordered |
| 4 | 37 | T | Phosphothreonine | 1575,1790 | | | | | Disordered |
| 5 | 39 | S | Phosphoserine | 1575,1790 | | | | | Disordered |
| 6 | 43 | S | Phosphoserine | 1790 | | | | | Disordered |
| 7 | 51 | S | Phosphoserine | 1790 | | | | | Disordered |
| 8 | 62 | K | N6-acetyllysine | 1790 | | | | | |
| 9 | 69 | S | Phosphoserine | 1323,1575,1718,1723,1790,1803 | | | | | |
| 10 | 70 | S | Phosphoserine | 1575,1790 | | | | | |
| 11 | 72 | T | Phosphothreonine | 1575,1790 | | | | | |
| 12 | 74 | T | Phosphothreonine | 1323,1395,1575,1790,1803 | | | | | |
| 13 | 75 | S | Phosphoserine | 1886,1323,1395,1575,1718,1719,1723,1790,1803 | | | | | |
| 14 | 93 | Y | Phosphotyrosine | 1790 | SH3_domain | IPR001452 | SH3 | STRAND | |
| 15 | 97 | S | Phosphoserine | 1323,1395,1575,1790 | SH3_domain | IPR001452 | SH3 | | |
| 16 | 104 | S | Phosphoserine | 1723,1790 | SH3_domain | IPR001452 | SH3 | | |
| 17 | 182 | T | Phosphothreonine | 1790 | SH2 | IPR000980 | SH2 | STRAND | |
| 18 | 183 | T | Phosphothreonine | 1790 | SH2 | IPR000980 | SH2 | STRAND | |
| 19 | 187 | Y | Phosphotyrosine | 1886,1521,1575,1787,1790,1803 | SH2 | IPR000980 | SH2 | STRAND | |
| 20 | 216 | Y | Phosphotyrosine | 1323,1395,1575,1790 | SH2 | IPR000980 | SH2 | STRAND | |
| 21 | 246 | T | Phosphothreonine | 1790 | SH2 | IPR000980 | SH2 | | |
| 22 | 250 | T | Phosphothreonine | 1790 | | | | | |
| 23 | 255 | T | Phosphothreonine | 1790 | | | | | |
| 24 | 269 | S | Phosphoserine | 1790 | | | | HELIX | |
| 25 | 304 | T | Phosphothreonine | 1790 | Prot_kinase_dom | IPR000719 | Protein kinase | TURN | |
| 26 | 306 | S | Phosphoserine | 1395,1575,1790 | Prot_kinase_dom | IPR000719 | Protein kinase | | |
| 27 | 338 | Y | Phosphotyrosine | 1323,1575,1790 | Prot_kinase_dom | IPR000719 | Protein kinase | STRAND | |
| 28 | 348 | S | Phosphoserine | 1790 | Prot_kinase_dom | IPR000719 | Protein kinase | STRAND | |
| 29 | 385 | Y | Phosphotyrosine | 1790 | Prot_kinase_dom | IPR000719 | Protein kinase | | |
| 30 | 419 | Y | Phosphotyrosine | 1886,1323,1395,1575,1718,1719,1790,1803 | Prot_kinase_dom | IPR000719 | Protein kinase | HELIX | |
| 31 | 420 | T | Phosphothreonine | 1575,1790 | Prot_kinase_dom | IPR000719 | Protein kinase | HELIX | |
| 32 | 430 | K | Ubiquitination | 1790 | Prot_kinase_dom | IPR000719 | Protein kinase | HELIX | |
| 33 | 439 | Y | Phosphotyrosine | 1323,1575,1790,1803 | Prot_kinase_dom | IPR000719 | Protein kinase | HELIX | |
| 34 | 511 | T | Phosphothreonine | 1575,1803 | Prot_kinase_dom | IPR000719 | Protein kinase | | |
| 35 | 522 | Y | Phosphotyrosine | 1575,1790,1803 | Prot_kinase_dom | IPR000719 | Protein kinase | TURN | |
| 36 | 525 | S | Phosphoserine | 1575,1790 | | | | | |
| 37 | 526 | T | Phosphothreonine | 1790 | | | | | |
| 38 | 530 | Y | Phosphotyrosine | 1323,1395,1575,1787,1790,1803,1886 | | | | | |
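Because the annotated output is a dataframe, ordinary pandas filtering can pull out the PTMs of interest, for example the phosphotyrosines inside the kinase domain. The sketch below uses a few rows copied by hand from the output above instead of calling the API. It assumes that `Position` holds integers and that an empty annotation is an empty string. The real dataframe may use strings or NaN instead, in which case the casts and comparisons would need adjusting.

```python
import pandas as pd

# A handful of rows copied from the api.get_annotated_PTMs('P12931') output
ptms = pd.DataFrame(
    {
        'Position': [12, 93, 187, 419, 430, 530],
        'Residue': ['S', 'Y', 'Y', 'Y', 'K', 'Y'],
        'Modification_Type': [
            'Phosphoserine', 'Phosphotyrosine', 'Phosphotyrosine',
            'Phosphotyrosine', 'Ubiquitination', 'Phosphotyrosine',
        ],
        # Assumption: PTMs outside any InterPro domain carry an empty string
        'Domain_Names_InterPro': [
            '', 'SH3_domain', 'SH2', 'Prot_kinase_dom', 'Prot_kinase_dom', '',
        ],
    }
)

# Phosphotyrosines that fall within the protein kinase domain
is_ptyr = ptms['Modification_Type'] == 'Phosphotyrosine'
in_kinase = ptms['Domain_Names_InterPro'].str.contains('Prot_kinase_dom', regex=False)
kinase_ptyr = ptms[is_ptyr & in_kinase]
print(kinase_ptyr['Position'].tolist())  # [419]

# Number of PTMs per InterPro domain, ignoring PTMs outside any domain
in_domain = ptms['Domain_Names_InterPro'] != ''
counts = ptms.loc[in_domain, 'Domain_Names_InterPro'].value_counts()
print(counts)
```

A substring match is used rather than strict equality because some annotation columns in the output above join several values with semicolons.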