DANSy

Overview

Domain Architecture Network Syntax (DANSy) is an analytical workflow that combines n-gram analysis with network theory to study protein domain architectures. This linguistic network approach abstracts the functional connections between proteins and can be used to analyze either entire collections of proteins (DANSy) or collections defined by changes in expression (deDANSy). DANSy requires reference files generated by CoDIAC, which ensures that a controlled vocabulary of domains and domain boundaries is fetched from InterPro.

In DANSy, we extract every domain n-gram from each protein domain architecture and represent each n-gram as a node in a network. If one domain n-gram is contained within another, an edge is drawn between the two n-gram nodes. This network can then be analyzed with additional tools, as shown in our Examples.
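The idea of extracting n-grams and linking nested ones can be illustrated with a minimal, standard-library sketch. This is not the DANSy API: the function names (`domain_ngrams`, `contains`, `build_network`) and the toy architectures are hypothetical, and the sketch assumes that an n-gram is a contiguous run of domains and that an edge joins any two n-grams where the shorter occurs inside the longer. DANSy itself may define n-grams and edges differently.

```python
from itertools import combinations


def domain_ngrams(architecture):
    """Return every contiguous domain n-gram (as a tuple) of one architecture."""
    n = len(architecture)
    return {tuple(architecture[i:j]) for i in range(n) for j in range(i + 1, n + 1)}


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous run inside `longer`."""
    k = len(shorter)
    return any(longer[i:i + k] == shorter for i in range(len(longer) - k + 1))


def build_network(architectures):
    """Collect n-gram nodes and draw an edge when one n-gram is found in another."""
    nodes = set()
    for arch in architectures:
        nodes |= domain_ngrams(arch)
    edges = set()
    for a, b in combinations(nodes, 2):
        short, long_ = sorted((a, b), key=len)
        # Assumption: only n-grams of different lengths can be nested.
        if len(short) < len(long_) and contains(long_, short):
            edges.add((short, long_))
    return nodes, edges


# Hypothetical toy architectures, listed N- to C-terminus.
architectures = [["SH3", "SH2", "Kinase"], ["SH2", "Kinase"]]
nodes, edges = build_network(architectures)
print(len(nodes), "n-gram nodes,", len(edges), "edges")
```

In practice the resulting node and edge sets could be handed to a graph library for downstream analysis; the plain sets here only keep the sketch dependency-free.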

For deDANSy, the base DANSy analysis is extended with differential expression results so that domain n-grams can be associated with either up- or down-regulated expression. The network is then analyzed to determine how separated the n-grams for each condition are from one another. deDANSy generates two scores that quantify how separated and how distinct the enriched n-grams are for each condition in a comparison of interest.
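As a rough illustration of the first step only, the sketch below tags n-grams by the direction of regulation of the proteins they come from. It does not reproduce deDANSy's enrichment testing or its two scores, which are described in the paper. The function names, the protein identifiers, and the `"up"`/`"down"` labels are all hypothetical.

```python
def ngrams(architecture):
    """Return every contiguous domain n-gram (as a tuple) of one architecture."""
    n = len(architecture)
    return {tuple(architecture[i:j]) for i in range(n) for j in range(i + 1, n + 1)}


def ngrams_by_condition(architectures, regulation):
    """Group n-grams by the regulation label of their source proteins."""
    tagged = {"up": set(), "down": set()}
    for protein, arch in architectures.items():
        direction = regulation.get(protein)  # None if not differentially expressed
        if direction in tagged:
            tagged[direction] |= ngrams(arch)
    return tagged


# Hypothetical inputs: architectures per protein and differential expression calls.
architectures = {
    "P1": ["SH3", "SH2", "Kinase"],
    "P2": ["SH2", "Kinase"],
    "P3": ["PH", "SH2"],
}
regulation = {"P1": "up", "P3": "down"}  # P2 is unchanged and therefore ignored

tagged = ngrams_by_condition(architectures, regulation)
shared = tagged["up"] & tagged["down"]      # n-grams seen in both conditions
up_only = tagged["up"] - tagged["down"]     # n-grams specific to up-regulation
down_only = tagged["down"] - tagged["up"]   # n-grams specific to down-regulation
```

Condition-specific and shared n-gram sets like these are the kind of input a separation analysis on the network would start from.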

Please see our bioRxiv paper for full details and the applications of DANSy.
