InterPro

CoDIAC.InterPro.appendRefFile(input_RefFile, outputfile)

Takes a reference file generated by CoDIAC.UniProt.makeRefFile and adds InterPro domain metadata as a new column (i.e., this appends domain information defined by InterPro to the UniProt reference).

Parameters:
input_RefFile: string

Name of the input reference file generated by CoDIAC.UniProt.makeRefFile

outputfile: string

Name of the CSV file written by this function

Returns:
df: pandas DataFrame

In addition to writing the dataframe to a CSV file (as defined by outputfile), this function returns the dataframe that was written.
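As a minimal sketch of how this function might be called, the wrapper below checks that the input reference file exists before handing it to appendRefFile. The wrapper name append_interpro_domains and the file names in the usage comment are hypothetical; only CoDIAC.InterPro.appendRefFile and its two arguments come from this page.

```python
from pathlib import Path


def append_interpro_domains(input_ref_file, output_file):
    """Hypothetical wrapper: validate the input path, then call appendRefFile.

    Returns the dataframe that appendRefFile returns (it is also written
    to output_file as a CSV, per the documentation above).
    """
    if not Path(input_ref_file).is_file():
        raise FileNotFoundError(f"Reference file not found: {input_ref_file}")

    # Imported here so the path check works even before CoDIAC is installed.
    from CoDIAC import InterPro

    return InterPro.appendRefFile(input_ref_file, output_file)


# Example call (file names are placeholders):
# df = append_interpro_domains("SH2_reference.csv", "SH2_reference_with_interpro.csv")
```

Failing early on a missing input file gives a clearer error than one raised from inside the library call.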

CoDIAC.InterPro.fetch_uniprotids(interpro_ID, REVIEWED=True, species='Homo sapiens')

Given an InterPro ID, fetch all records (or only reviewed records) for all species or for a specific species.

Examples: The first call is the more restrictive one, returning only reviewed human records; the second returns every record associated with the InterPro ID:
fetch_uniprotids('IPR000980', REVIEWED=True, species='Homo sapiens')  # human proteins, reviewed records only
fetch_uniprotids('IPR000980', REVIEWED=False, species='all')  # all species, reviewed and unreviewed records
Parameters:
interpro_ID: str

InterPro ID to search for

REVIEWED: bool

If True, only reviewed records will be returned

species: string

Scientific name of the species, as defined in the UniProt taxonomy, or 'all' to include every species. See here for taxonomy names: https://www.uniprot.org/taxonomy?query=*

Returns:
uniprot_ID_list: list

List of all UniProt IDs found in the search. If a species was set, only the UniProt IDs for that species are returned in this list; otherwise, the UniProt IDs for all species found in the search are returned.

species_dict: dict

Dictionary whose top-level keys are species scientific names. Each key points to an inner dict keyed by database source (‘reviewed’ or ‘unreviewed’), whose values are lists of the UniProt IDs found for that species under that source.
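To make the nested shape of species_dict concrete, here is a small sketch that walks a dictionary laid out as described above. The helper names flatten_ids and count_by_species are hypothetical, and the sample data is hand-written with placeholder IDs; it is not actual output from fetch_uniprotids.

```python
def flatten_ids(species_dict, source=None):
    """Collect IDs from a species_dict-shaped mapping.

    source: 'reviewed', 'unreviewed', or None to include both.
    """
    ids = []
    for by_source in species_dict.values():
        for src, id_list in by_source.items():
            if source is None or src == source:
                ids.extend(id_list)
    return ids


def count_by_species(species_dict):
    """Return the total number of IDs recorded for each species."""
    return {
        species: sum(len(id_list) for id_list in by_source.values())
        for species, by_source in species_dict.items()
    }


# Hand-written sample in the documented shape (placeholder IDs, not real output).
sample = {
    "Homo sapiens": {"reviewed": ["ID_H1", "ID_H2"], "unreviewed": ["ID_H3"]},
    "Mus musculus": {"reviewed": ["ID_M1"], "unreviewed": []},
}

reviewed_ids = flatten_ids(sample, source="reviewed")
counts = count_by_species(sample)
print(reviewed_ids)
print(counts)
```

With a real result, the same helpers could be applied to the species_dict returned by fetch_uniprotids, for example to see how many records each species contributes before building a reference file.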