Frequently Asked Questions
If you do not see your question here, feel free to submit it using this form.
Data sources
Q: Where does the information in the ptm_coordinates file come from? Do I need to generate this?
A: The ptm_coordinates file is generated by combining PTM information from ProteomeScout and PhosphoSitePlus with transcriptomic information from Ensembl. Most PTMs come from mass spectrometry-based measurements reported in the literature, with a select few from lower-throughput experiments. All PTMs are experimentally observed; none are predicted.
Importantly, we have included the ptm_coordinates file in the package itself, so there is no need for you to generate this on your own.
Q: Is any filtering done to remove PTMs that are present in proteomic databases but may be the result of noise in MS experiments?
A: No, all PTMs recorded in ProteomeScout or PhosphoSitePlus are included in this dataset. However, if you choose, you can filter PTMs based on their overall evidence (see the Filtering PTMs section of the documentation).
Data Preparation
Q: What type of data can be used with PTM-POSE?
A: Output from any splicing quantification tool that reports the chromosome, DNA strand, and start and stop positions of each splice event can be used with PTM-POSE. We have tested this approach predominantly with short-read RNA sequencing analyses from tools such as MATS, MAJIQ, and SpliceSeq, but PTM-POSE does not require any special dataset.
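As a rough illustration of the minimum information described above, the sketch below builds a small pandas table and checks that it carries the four positional fields. The column names (chromosome, strand, region_start, region_end, dPSI), the helper check_splice_table, and all values are hypothetical placeholders, not names required by PTM-POSE; rename them to match the output of your quantification tool.

```python
import pandas as pd

# Hypothetical column names for the positional fields; adjust to your tool's output.
REQUIRED_COLUMNS = ["chromosome", "strand", "region_start", "region_end"]


def check_splice_table(splice_data: pd.DataFrame) -> list:
    """Return the required positional columns that are missing from the table."""
    return [col for col in REQUIRED_COLUMNS if col not in splice_data.columns]


# Made-up example events; coordinates and dPSI values are illustrative only.
splice_data = pd.DataFrame(
    {
        "event_id": ["event_1", "event_2"],
        "chromosome": ["1", "X"],
        "strand": ["+", "-"],
        "region_start": [1000, 5000],
        "region_end": [1100, 5200],
        "dPSI": [0.25, -0.31],  # optional quantification column
    }
)

missing = check_splice_table(splice_data)
if missing:
    print("Missing columns:", missing)
else:
    print("All positional columns are present.")
```

A check like this is only a convenience before handing the table to PTM-POSE; it does not validate the coordinate system or genome build.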
Q: Are specific quantification values required for use with PTM-POSE?
A: No, quantification such as a delta PSI between two experimental groups is not needed for PTM-POSE. However, it can be useful to include this information and point to it with the dPSI_col parameter when running PTM-POSE, as this provides more flexibility for downstream analysis.
Interpreting Results
Q: Why do some PTMs have multiple entries and/or PSI values in the output?
A: Depending on the quantification tool, you might find cases where multiple events relate to the same or a similar region. PTM-POSE does not filter these out, as it relies on the splicing quantification tool and the user to determine the relevant splice events. We generally recommend removing PTM sites for which there are multiple conflicting events (PSI changing in opposite directions), as these may be misleading or a consequence of noise in the data.
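One way to apply the recommendation above is sketched here with pandas: PTMs whose events include both a positive and a negative delta PSI are dropped. The column names ptm_id and dPSI, the helper drop_conflicting_ptms, and the example values are assumptions made for illustration; the actual PTM-POSE output may label these fields differently.

```python
import pandas as pd


def drop_conflicting_ptms(ptm_events, ptm_col="ptm_id", dpsi_col="dPSI"):
    """Remove every PTM that has events with opposite signs of delta PSI."""
    ptm_ids = ptm_events[ptm_col]
    # For each row, does its PTM have at least one increasing / decreasing event?
    has_increase = (ptm_events[dpsi_col] > 0).groupby(ptm_ids).transform("any")
    has_decrease = (ptm_events[dpsi_col] < 0).groupby(ptm_ids).transform("any")
    return ptm_events[~(has_increase & has_decrease)]


# Made-up example: P1_S10 has conflicting events, the other PTMs do not.
ptm_events = pd.DataFrame(
    {
        "ptm_id": ["P1_S10", "P1_S10", "P2_T5", "P2_T5", "P3_Y7"],
        "dPSI": [0.30, -0.25, 0.20, 0.40, -0.10],
    }
)

consistent = drop_conflicting_ptms(ptm_events)
print(consistent)
```

PTMs with several events that all change in the same direction (P2_T5 here) are kept; whether to collapse them into a single value is a separate, analysis-specific choice.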
Q: What criteria are recommended to identify PTMs most likely to be impacted by splicing?
A: This is somewhat tool dependent and is still something we are exploring, but we have generally found the following guidelines most useful:
Focus on PTMs with delta PSI values greater than 20%, or in the top 15% of all changes in your data. These changes are the most likely to be measurable at the protein level.
Focus on PTMs that have previously been shown to be spliced across isoforms. If you include remove_novel = True in most PTM-POSE functions, PTMs that have not previously been observed to be spliced are filtered out automatically.
If your splicing quantification tool provides them, check the total junction counts attributed to the inclusion and exclusion forms. If these counts are low, you may want to be more cautious.
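The delta PSI and junction-count guidelines above could be applied along the following lines. This is a minimal pandas sketch under several assumptions: delta PSI is stored as a fraction (0.2 for 20%), the thresholds are applied to its absolute value, the column names (dPSI, inclusion_counts, exclusion_counts) and the helper select_large_changes are hypothetical, and the minimum count of 20 is an arbitrary placeholder rather than a recommended value.

```python
import pandas as pd


def select_large_changes(
    ptm_events,
    dpsi_col="dPSI",
    inc_col="inclusion_counts",
    exc_col="exclusion_counts",
    min_abs_dpsi=0.2,
    top_fraction=0.15,
    min_counts=20,
):
    """Keep events with |dPSI| above a threshold or in the top fraction of changes."""
    abs_dpsi = ptm_events[dpsi_col].abs()
    # |dPSI| value that separates the top `top_fraction` of changes from the rest
    top_cutoff = abs_dpsi.quantile(1 - top_fraction)
    keep = (abs_dpsi > min_abs_dpsi) | (abs_dpsi >= top_cutoff)
    selected = ptm_events[keep].copy()
    # Flag (rather than drop) events with few junction reads for either form
    selected["low_support"] = selected[[inc_col, exc_col]].min(axis=1) < min_counts
    return selected


# Made-up example values for illustration only.
ptm_events = pd.DataFrame(
    {
        "ptm_id": ["P1_S10", "P2_T5", "P3_Y7", "P4_K3", "P5_S8"],
        "dPSI": [0.05, -0.30, 0.22, 0.10, -0.02],
        "inclusion_counts": [150, 8, 90, 40, 200],
        "exclusion_counts": [120, 5, 60, 35, 180],
    }
)

selected = select_large_changes(ptm_events)
print(selected)
```

Low-count events are flagged instead of removed so that the decision about how cautious to be stays with the analyst. The remove_novel = True option mentioned above is set within PTM-POSE functions and is not reproduced here.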