Mutations

CoDIAC.mutations.OMIM_mutations(uniprot_refFile, api_key, ref_fastaFile, output_featureFile, domain_of_interest, N_offset=0, C_offset=0)

Generates a feature file of mutations that fall within the domain of interest, extracted from the OMIM database

Parameters:
uniprot_refFile : str

UniProt reference file from which the list of UniProt IDs is read

api_key : str

API key generated through OMIM; it is required to access the OMIM programmatic interface

ref_fastaFile : str

fasta file with reference sequences of the domains

output_featureFile : str

output feature file path

domain_of_interest : str

domain of interest whose mutations are reported (for example, SH2 domains)

N_offset : int

number of residues to add to or remove from the N-terminal end of the domain boundary

C_offset : int

number of residues to add to or remove from the C-terminal end of the domain boundary
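
The effect of N_offset and C_offset on a domain boundary can be sketched as follows. This is an illustration only, not CoDIAC's implementation: the helper name adjust_domain_boundaries and its seq_length argument are hypothetical, and the sign convention (a positive offset extends the domain outward, a negative offset trims it) is an assumption that should be checked against the CoDIAC source.

```python
def adjust_domain_boundaries(start, end, n_offset=0, c_offset=0, seq_length=None):
    """Apply N- and C-terminal offsets to a 1-based, inclusive domain range.

    ASSUMPTION (not confirmed by the CoDIAC docs): a positive offset extends
    the domain outward and a negative offset trims it.
    """
    # Extending at the N-terminus moves the start toward residue 1.
    new_start = max(1, start - n_offset)
    # Extending at the C-terminus moves the end toward the sequence end.
    new_end = end + c_offset
    if seq_length is not None:
        new_end = min(seq_length, new_end)
    if new_start > new_end:
        raise ValueError("offsets leave an empty domain range")
    return new_start, new_end


# Illustrative boundaries only: extend by 2 residues N-terminally, 3 C-terminally.
print(adjust_domain_boundaries(151, 248, n_offset=2, c_offset=3))  # prints (149, 251)
```

With the default offsets of 0 the range is returned unchanged, which mirrors the default arguments in the signatures above.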

CoDIAC.mutations.PDB_mutations(PDB_refFile, ref_fastaFile, output_featureFile, domain_of_interest, N_offset=0, C_offset=0)

Generates a feature file with mutations found on the domain of interest within PDB structures

Parameters:
PDB_refFile : str

input PDB reference file path

ref_fastaFile : str

fasta file with reference sequences of the domains

output_featureFile : str

output feature file path

domain_of_interest : str

domain of interest whose mutations are reported (for example, SH2 domains)

N_offset : int

number of residues to add to or remove from the N-terminal end of the domain boundary

C_offset : int

number of residues to add to or remove from the C-terminal end of the domain boundary

Returns:
feature file with mutations/variants reported in PDB structures that are present within the domain of interest

CoDIAC.mutations.gnomAD_mutations(fastafile, downloads_path, csvfiles_dir, output_feafile, N_offset=0, C_offset=0)

Generates feature files of mutations that fall within the domain of interest, extracted from gnomAD using each UniProt ID and its corresponding Ensembl ID

Parameters:
fastafile : str

reference fasta file created using UniProt.py with key_array_order = ['uniprot', 'gene', 'domain_num', 'start', 'end']

downloads_path : str

path to the downloads folder on the device where downloaded files are placed

csvfiles_dir : str

path of the directory to which the downloaded variant csv files are moved from the downloads folder and saved

output_feafile : str

path where the output feature file is saved

N_offset : int

number of residues to add to or remove from the N-terminal end of the domain boundary

C_offset : int

number of residues to add to or remove from the C-terminal end of the domain boundary
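
To make the key_array_order requirement on fastafile concrete, the sketch below maps the fields of one fasta header onto those keys. It is a hedged illustration rather than CoDIAC's parser: the function parse_reference_header is hypothetical, the '|' field separator is an assumption about the header layout, and the boundary values in the example header are made up.

```python
KEY_ARRAY_ORDER = ['uniprot', 'gene', 'domain_num', 'start', 'end']


def parse_reference_header(header, key_array_order=KEY_ARRAY_ORDER, sep='|'):
    """Map the fields of a fasta header onto the names in key_array_order.

    ASSUMPTION: fields are separated by '|'; check the files written by
    UniProt.py before relying on this layout.
    """
    fields = header.strip().lstrip('>').split(sep)
    if len(fields) != len(key_array_order):
        raise ValueError(
            f"expected {len(key_array_order)} fields, found {len(fields)}"
        )
    record = dict(zip(key_array_order, fields))
    # Domain boundaries are residue positions, so convert them to integers.
    for key in ('start', 'end'):
        if key in record:
            record[key] = int(record[key])
    return record


# Hypothetical header; the boundary values are illustrative only.
record = parse_reference_header('>P12931|SRC|1|151|248')
print(record['uniprot'], record['start'], record['end'])  # prints P12931 151 248
```

A header with a different number of fields raises ValueError, which makes a mismatch between the fasta file and the expected key_array_order visible early.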