[3]:
import os, shutil
from CoDIAC import AdjacencyFiles as af
from CoDIAC import PDB
import CoDIAC
Generate mmCIF files for PDBs of interest
[7]:
cif_path = './Data/CIF_files'
PDB_list = ['4JGH','3TL0','1D4W']
CoDIAC.PDB.download_cifFile(PDB_list, cif_path)
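Before moving on, it can help to confirm that every requested structure actually has a CIF file on disk. The following is a minimal sketch, not part of CoDIAC: `missing_files` is a hypothetical helper, and it assumes the files are named `<PDB_ID>.cif` inside `cif_path`, which is the naming the adjacency step later in this notebook relies on.

```python
from pathlib import Path

def missing_files(ids, directory, suffix):
    """Return the IDs that have no '<ID><suffix>' file in directory.

    Hypothetical helper for illustration; it only inspects the file system.
    """
    directory = Path(directory)
    return [i for i in ids if not (directory / f"{i}{suffix}").is_file()]

# Report which of the requested structures still lack a CIF file.
PDB_list = ['4JGH', '3TL0', '1D4W']
print(missing_files(PDB_list, './Data/CIF_files', '.cif'))
```

An empty list means all files are present; any ID that is printed would need to be downloaded again.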
Generate JSON files using Arpeggio
[8]:
json_path = './Data/JSON_files'
Arpeggio was run on the input mmCIF files for the PDBs of interest on a high-performance computing system, which can allocate more memory for these jobs. The batch script used is shown below.
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --mem=750MB
#SBATCH --time=50:00:00
#SBATCH --partition=standard
#SBATCH -A g_bme-naeglelab
#SBATCH -o SH2_PDB.output
#SBATCH -e SH2_PDB.error
module load anaconda
conda create -n arp-env python=3.7
source activate arp-env
conda install -c openbabel openbabel
pip install biopython
pip install git+https://github.com/PDBeurope/arpeggio.git@master#egg=arpeggio
# Run Arpeggio on every mmCIF file in the current directory
for fname in *.cif
do
    arpeggio "$fname" -sa
done
The .json files in this work were generated with the refactored version of Arpeggio made available by PDBe. pdbe-arpeggio (git release version 1.4.*) was the version in use when the contact maps were generated.
*The pip package can also be used for this purpose: pip install pdbe-arpeggio
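Once the batch job finishes, the JSON files need to end up in `json_path` so the next step can find them as `<PDB_ID>.json`. A minimal sketch of that hand-off is shown here. `collect_json` is a hypothetical helper, and the directory holding the Arpeggio output (`run_dir`) is an assumption, since the notebook does not state where the job wrote its results.

```python
import shutil
from pathlib import Path

def collect_json(ids, run_dir, json_dir):
    """Copy '<ID>.json' from run_dir into json_dir.

    Returns the IDs for which no JSON file was found in run_dir.
    Hypothetical helper; run_dir is wherever the Arpeggio job wrote its output.
    """
    run_dir, json_dir = Path(run_dir), Path(json_dir)
    json_dir.mkdir(parents=True, exist_ok=True)
    missing = []
    for pdb_id in ids:
        src = run_dir / f"{pdb_id}.json"
        if src.is_file():
            shutil.copy2(src, json_dir / src.name)
        else:
            missing.append(pdb_id)
    return missing

# Example call (the source directory is an assumed location):
# collect_json(PDB_list, './arpeggio_output', json_path)
```

Returning the missing IDs, rather than raising on the first one, makes it easy to see at a glance which jobs need to be rerun.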
Generate adjacency text files
[16]:
adj_path = './Data/Adjacency_files'
for PDB_ID in PDB_list:
    cif_file = cif_path+'/'+PDB_ID+'.cif'
    json_file = json_path+'/'+PDB_ID+'.json'
    newpath = adj_path+'/'+PDB_ID
    os.makedirs(newpath, exist_ok=True)  # create a directory for this PDB's adjacency files (safe to rerun)
    outfile = newpath
    CoDIAC.AdjacencyFiles.ProcessArpeggio(json_file, outfile, cif_file, small_molecule=False)  # generates the adjacency file
    CoDIAC.AdjacencyFiles.BinaryFeatures(PDB_ID, outfile, translate_resid=True)  # generates the binarized version
4JGH
4JGH
Adjacency File generated for 4JGH
Adjacency File with Binary features generated for 4JGH
3TL0
3TL0
Adjacency File generated for 3TL0
Adjacency File with Binary features generated for 3TL0
1D4W
1D4W
Adjacency File generated for 1D4W
Adjacency File with Binary features generated for 1D4W
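To check what was written for each structure, the per-PDB directories can be listed. This is a minimal sketch that makes no assumption about the output file names; `list_outputs` is a hypothetical helper that only reads the directory layout created above (one subdirectory per PDB ID under `adj_path`).

```python
from pathlib import Path

def list_outputs(adj_dir):
    """Map each PDB subdirectory of adj_dir to the sorted names of the files in it.

    Hypothetical helper; returns an empty dict if adj_dir does not exist.
    """
    adj_dir = Path(adj_dir)
    if not adj_dir.is_dir():
        return {}
    return {
        sub.name: sorted(f.name for f in sub.iterdir() if f.is_file())
        for sub in sorted(adj_dir.iterdir())
        if sub.is_dir()
    }

# Print the files found for each PDB ID.
for pdb_id, files in list_outputs('./Data/Adjacency_files').items():
    print(pdb_id, files)
```

A PDB ID that maps to an empty list indicates that its directory was created but no adjacency file was written into it.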