Identify protein interactions that may be impacted by splicing of PTMs
Post-translational modifications (PTMs) often facilitate protein interactions, either through direct binding of domains specific to a particular modification (e.g., SH2 domains binding to phosphorylated tyrosines) or through allosteric effects that change the protein's conformation and thereby enhance or disrupt interactions. We provide functions to annotate spliced PTMs with relevant protein interactions and to identify key PTMs that may disrupt protein interaction networks.
Currently, we provide functions to process and analyze protein interaction data from PhosphoSitePlus, PTMInt, and PTMcode. We can also include enzyme-specific interactions (such as kinase-substrate interactions from PhosphoSitePlus and RegPhos). First, we need to annotate the spliced PTMs with protein interactions (see the rest of the documentation for how to do this). Then, we can process the interactions across the different databases using the protein_interactions class to identify key PTMs that may disrupt protein interaction networks.
[1]:
from ptm_pose import analyze
import pandas as pd

# Load spliced PTM data (already annotated with protein interactions)
spliced_ptms = pd.read_csv('spliced_ptms.csv')

# Combine interactions across databases into a single network
interactions = analyze.protein_interactions(spliced_ptms)
interactions.get_interaction_network()

# Preview the resulting interaction table
interactions.network_data.head()
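Note that `pd.read_csv` raises a `FileNotFoundError` if the input file is not present in the working directory. A minimal sketch of a path check that fails early with a more actionable message might look like the following; `resolve_input` is a hypothetical helper written for illustration and is not part of ptm_pose.

```python
from pathlib import Path


def resolve_input(path):
    """Return an absolute Path to an existing file, or raise a descriptive error.

    Hypothetical helper for illustration; not part of ptm_pose.
    """
    candidate = Path(path).expanduser().resolve()
    if not candidate.is_file():
        raise FileNotFoundError(
            f"Could not find '{candidate}'. Annotate your spliced PTMs with "
            "protein interactions and save them to this path first."
        )
    return candidate


# Intended usage (requires pandas and an existing file):
# spliced_ptms = pd.read_csv(resolve_input('spliced_ptms.csv'))
```

The check only confirms that the file exists; it does not validate the columns that the protein_interactions class expects.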
We can also calculate interaction statistics to identify the proteins that are most impacted by, or most relevant to, spliced PTMs in the protein interaction network.
[2]:
interactions.get_interaction_stats()
interactions.network_stats.head()
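To make the idea of per-protein interaction statistics concrete, the sketch below counts, for each modified protein in a toy edge list, how many distinct interaction partners it has and how many distinct PTM sites are linked to those interactions. The edge format, protein names, and site labels are all hypothetical, and this is not how `get_interaction_stats()` is implemented; it only illustrates the kind of summary such a table may contain.

```python
from collections import defaultdict

# Hypothetical edges: (modified protein, PTM site, interacting protein)
toy_edges = [
    ("PROT_A", "S10", "PROT_B"),
    ("PROT_A", "S10", "PROT_C"),
    ("PROT_A", "Y55", "PROT_B"),
    ("PROT_D", "T7", "PROT_A"),
]


def summarize_edges(edges):
    """Count distinct partners and distinct PTM sites per modified protein."""
    partners = defaultdict(set)
    sites = defaultdict(set)
    for modified, site, interactor in edges:
        partners[modified].add(interactor)
        sites[modified].add(site)
    return {
        protein: {
            "n_partners": len(partners[protein]),
            "n_ptm_sites": len(sites[protein]),
        }
        for protein in partners
    }


stats = summarize_edges(toy_edges)
for protein, row in sorted(stats.items()):
    print(protein, row)
```

Using sets means that an interaction reported by more than one database is counted once per protein pair.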
If we want to focus on a specific protein, we can summarize the information about that protein in the network. In this case, let’s look at TSC2, which loses pS981 upon ESRP1 knockdown.
[3]:
interactions.summarize_protein_network(protein = 'TSC2')
We can also visualize the network…
[4]:
interactions.plot_interaction_network(interacting_node_size = 10)
…and the centrality of proteins in the network
[5]:
interactions.plot_network_centrality(centrality_measure='Degree')
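For reference, the degree centrality of a node is commonly defined as its number of neighbors divided by the number of other nodes in the network. The sketch below computes it by hand for a toy undirected network. The protein names are hypothetical, and whether `plot_network_centrality` normalizes degree in this way is an assumption.

```python
from collections import defaultdict

# Hypothetical undirected interactions between proteins
toy_interactions = [
    ("PROT_A", "PROT_B"),
    ("PROT_A", "PROT_C"),
    ("PROT_A", "PROT_D"),
    ("PROT_B", "PROT_C"),
]


def degree_centrality(pairs):
    """Return degree / (n - 1) for every node in an undirected edge list."""
    neighbors = defaultdict(set)
    for u, v in pairs:
        if u == v:
            continue  # ignore self-interactions
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


centrality = degree_centrality(toy_interactions)
# Rank proteins from most to least connected
for protein, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{protein}: {score:.2f}")
```

Proteins with high degree centrality interact with many partners, so a spliced PTM on such a protein may affect a larger part of the network.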