Identify protein interactions that may be impacted by splicing of PTMs#

Post translational modifications (PTMs) often facilitate protein interactions, either through direct binding of domains specific to that particular modification (e.g. SH2 domains binding to phosphorylated tyrosines) or through allosteric effects that change the conformation of the protein to either enhance or disrupt interactions. We provide functions to annotate spliced PTMs with relevant protein interactions and to identify key PTMs that may disrupt protein interaction networks.

Currently, we provide functions to process and analyze protein interaction data from PhosphoSitePlus, PTMInt, and PTMcode. We can also include enzyme-specific interactions (such as kinase substrate interactions through PhosphoSitePlus and RegPhos). First, we need to annotate the spliced PTMs with protein interactions (see rest of documentation for how to do this). Then, we can process the interactions across the different databases using the protein_interactions class to identify key PTMs that may disrupt protein interaction networks.

[1]:

from ptm_pose import analyze
import pandas as pd

# Load spliced ptm and altered flank data
spliced_ptms = pd.read_csv('spliced_ptms.csv')

interactions = analyze.protein_interactions(spliced_ptms)
interactions.get_interaction_network()

interactions.network_data.head()

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[1], line 5
      2 import pandas as pd
      4 # Load spliced ptm and altered flank data
----> 5 spliced_ptms = pd.read_csv('spliced_ptms.csv')
      7 interactions = analyze.protein_interactions(spliced_ptms)
      8 interactions.get_interaction_network()

File ~/miniconda3/envs/pose_doc/lib/python3.12/site-packages/pandas/io/parsers/readers.py:1026, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, date_format, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options, dtype_backend)
   1013 kwds_defaults = _refine_defaults_read(
   1014     dialect,
   1015     delimiter,
   (...)
   1022     dtype_backend=dtype_backend,
   1023 )
   1024 kwds.update(kwds_defaults)
-> 1026 return _read(filepath_or_buffer, kwds)

File ~/miniconda3/envs/pose_doc/lib/python3.12/site-packages/pandas/io/parsers/readers.py:620, in _read(filepath_or_buffer, kwds)
    617 _validate_names(kwds.get("names", None))
    619 # Create the parser.
--> 620 parser = TextFileReader(filepath_or_buffer, **kwds)
    622 if chunksize or iterator:
    623     return parser

File ~/miniconda3/envs/pose_doc/lib/python3.12/site-packages/pandas/io/parsers/readers.py:1620, in TextFileReader.__init__(self, f, engine, **kwds)
   1617     self.options["has_index_names"] = kwds["has_index_names"]
   1619 self.handles: IOHandles | None = None
-> 1620 self._engine = self._make_engine(f, self.engine)

File ~/miniconda3/envs/pose_doc/lib/python3.12/site-packages/pandas/io/parsers/readers.py:1880, in TextFileReader._make_engine(self, f, engine)
   1878     if "b" not in mode:
   1879         mode += "b"
-> 1880 self.handles = get_handle(
   1881     f,
   1882     mode,
   1883     encoding=self.options.get("encoding", None),
   1884     compression=self.options.get("compression", None),
   1885     memory_map=self.options.get("memory_map", False),
   1886     is_text=is_text,
   1887     errors=self.options.get("encoding_errors", "strict"),
   1888     storage_options=self.options.get("storage_options", None),
   1889 )
   1890 assert self.handles is not None
   1891 f = self.handles.handle

File ~/miniconda3/envs/pose_doc/lib/python3.12/site-packages/pandas/io/common.py:873, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
    868 elif isinstance(handle, str):
    869     # Check whether the filename is to be opened in binary mode.
    870     # Binary mode does not support 'encoding' and 'newline'.
    871     if ioargs.encoding and "b" not in ioargs.mode:
    872         # Encoding
--> 873         handle = open(
    874             handle,
    875             ioargs.mode,
    876             encoding=ioargs.encoding,
    877             errors=errors,
    878             newline="",
    879         )
    880     else:
    881         # Binary mode
    882         handle = open(handle, ioargs.mode)

FileNotFoundError: [Errno 2] No such file or directory: 'spliced_ptms.csv'

We can also calculate interaction stats to identify proteins that are most impacted or relevant to spliced PTMs and the protein interaction network

[2]:

interactions.get_interaction_stats()

interactions.network_stats.head()

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 interactions.get_interaction_stats()
      3 interactions.network_stats.head()

NameError: name 'interactions' is not defined

If we want to focus on a specific protein, we can summarize information about a single protein in the network. In this case, let’s look at TSC2, which loses pS981 upon ESRP1 knockdown

[3]:

interactions.summarize_protein_network(protein = 'TSC2')

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 interactions.summarize_protein_network(protein = 'TSC2')

NameError: name 'interactions' is not defined

We can also visualize the network…

[4]:

interactions.plot_interaction_network(interacting_node_size = 10)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 interactions.plot_interaction_network(interacting_node_size = 10)

NameError: name 'interactions' is not defined

…and the centrality of proteins in the network

[5]:

interactions.plot_network_centrality(centrality_measure='Degree')

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 interactions.plot_network_centrality(centrality_measure='Degree')

NameError: name 'interactions' is not defined