{ "cells": [ { "cell_type": "code", "execution_count": 3, "id": "3ef224aa-b7c9-4d42-85e9-b6ddf68ef77e", "metadata": {}, "outputs": [], "source": [ "import os, shutil\n", "from CoDIAC import AdjacencyFiles as af\n", "from CoDIAC import PDB\n", "import CoDIAC" ] }, { "cell_type": "markdown", "id": "5fe6f65d-fd11-4772-a261-e23d56b84f78", "metadata": {}, "source": [ "### Generate mmCIF files for PDBs of interest" ] }, { "cell_type": "code", "execution_count": 7, "id": "6657e221-e237-46f3-821a-0a0f92633be0", "metadata": {}, "outputs": [], "source": [ "cif_path = './Data/CIF_files'\n", "PDB_list = ['4JGH','3TL0','1D4W']\n", "CoDIAC.PDB.download_cifFile(PDB_list, cif_path)" ] }, { "cell_type": "markdown", "id": "c4796fba-d9b8-45c4-aa28-f6e7d38dbabf", "metadata": {}, "source": [ "### Generate JSON files using Arpeggio" ] }, { "cell_type": "code", "execution_count": 8, "id": "76263b92-e828-4066-8c4f-440dfff45ddc", "metadata": {}, "outputs": [], "source": [ "json_path = './Data/JSON_files'" ] }, { "cell_type": "markdown", "id": "d353afd2-bf59-40c9-a4bc-07973e603a8a", "metadata": {}, "source": [ "Arpeggio was run on the mmCIF files of the PDBs of interest on a high-performance computing system, which can allocate more memory for these jobs. The batch script used was:\n", "\n", "    #!/bin/bash\n", "    #SBATCH --nodes=1\n", "    #SBATCH --mem=750MB\n", "    #SBATCH --time=50:00:00\n", "    #SBATCH --partition=standard\n", "    #SBATCH -A g_bme-naeglelab\n", "    #SBATCH -o SH2_PDB.output\n", "    #SBATCH -e SH2_PDB.error\n", "\n", "    module load anaconda\n", "    conda create -n arp-env python=3.7\n", "    source activate arp-env\n", "    conda install -c openbabel openbabel\n", "    pip install biopython\n", "    pip install git+https://github.com/PDBeurope/arpeggio.git@master#egg=arpeggio\n", "\n", "    for fname in *.cif\n", "    do\n", "        arpeggio $fname -sa\n", "    done\n", "\n", "The refactored version of [Arpeggio](https://github.com/harryjubb/arpeggio) made available by 
[PDBe](https://github.com/PDBeurope/arpeggio) was used to generate the .json files in this work. pdbe-arpeggio (git release version 1.4.*) was used at the time of contact map generation.\n", "\n", "*The pip package can also be installed for this purpose with ```pip install pdbe-arpeggio```.\n" ] }, { "cell_type": "markdown", "id": "b07d96d7-149f-4d9a-aec5-36af9ced30d1", "metadata": {}, "source": [ "### Generate adjacency txt files" ] }, { "cell_type": "code", "execution_count": 16, "id": "2f83745c-be8e-4401-b991-8a4693da7b61", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4JGH\n", "4JGH\n", "Adjacency File generated for 4JGH\n", "Adjacency File with Binary features generated for 4JGH\n", "3TL0\n", "3TL0\n", "Adjacency File generated for 3TL0\n", "Adjacency File with Binary features generated for 3TL0\n", "1D4W\n", "1D4W\n", "Adjacency File generated for 1D4W\n", "Adjacency File with Binary features generated for 1D4W\n" ] } ], "source": [ "adj_path = './Data/Adjacency_files'\n", "for PDB_ID in PDB_list:\n", "    cif_file = cif_path+'/'+PDB_ID+'.cif'\n", "    json_file = json_path+'/'+PDB_ID+'.json'\n", "    newpath = adj_path+'/'+PDB_ID\n", "    os.makedirs(newpath, exist_ok=True)  # create a directory for this PDB's adjacency files; safe to re-run\n", "    outfile = newpath\n", "    CoDIAC.AdjacencyFiles.ProcessArpeggio(json_file, outfile, cif_file, small_molecule=False)  # generates the adjacency file\n", "    CoDIAC.AdjacencyFiles.BinaryFeatures(PDB_ID, outfile, translate_resid=True)  # generates the binarized version\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 5 }