pdb.quaternary()

Description

Determines the subunit composition of a given protein.

Usage

pdb.quaternary(pdb, keepfiles = FALSE)

Arguments

pdb the path to the PDB file of interest or a 4-letter PDB identifier.

keepfiles logical, if TRUE the fasta file containing the alignment of the subunits is saved in the current directory, together with the split pdb files.

Value

This function returns a list with four elements: (i) a distance matrix, (ii) the sequences, (iii) the chain IDs, (iv) the PDB ID used.

Details

The ptm package contains a number of ancillary functions that deal with Protein Data Bank (PDB) files. These functions may be useful when structural 3D data need to be analyzed.

Quaternary structure exists in proteins made up of two or more polypeptide chains (subunits), which may be identical or different. Such proteins are called oligomers. The quaternary structure describes how the subunits are arranged in the native protein. Given a PDB file of a protein with quaternary structure, the function pdb.quaternary() reports which chains are present, what their sequences are, and how far those sequences are from each other. For instance, let’s work with human deoxyhaemoglobin (2HHB) as an example:

library(ptm)
Hb <- pdb.quaternary('2hhb', keepfiles = TRUE)

First, we can check that this protein has four subunits (polypeptide chains), which are identified in the PDB file as:

Hb[[3]]
## [1] "A" "B" "C" "D"

Second, we can ask whether these subunits are or are not identical to each other:

Hb[[1]]
##           B         D         A         C
## B 0.0000000 0.0000000 0.7345532 0.7345532
## D 0.0000000 0.0000000 0.7345532 0.7345532
## A 0.7345532 0.7345532 0.0000000 0.0000000
## C 0.7345532 0.7345532 0.0000000 0.0000000

This distance matrix tells us that subunits A and C, on the one hand, and subunits B and D, on the other, are identical to each other. Furthermore, the sequence distance between A and B (or between C and D) is quantified as 0.7346.
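
The grouping described above can also be derived programmatically. The following is a minimal sketch in base R, not part of ptm: the tolerance of 1e-8 for treating a distance as zero and the names d, tol, key and groups are illustrative assumptions. The matrix is retyped from the printed output so that the snippet runs on its own; in a live session, d <- Hb[[1]] would take its place.

```r
# Distance matrix retyped from the printed output of Hb[[1]].
chains <- c("B", "D", "A", "C")
d <- matrix(c(0,         0,         0.7345532, 0.7345532,
              0,         0,         0.7345532, 0.7345532,
              0.7345532, 0.7345532, 0,         0,
              0.7345532, 0.7345532, 0,         0),
            nrow = 4, byrow = TRUE, dimnames = list(chains, chains))

# Assumed tolerance: distances below it are treated as "identical sequences".
tol <- 1e-8

# For every chain, build a label listing all chains at (near) zero distance.
key <- apply(d < tol, 1, function(same) paste(sort(chains[same]), collapse = "+"))

# Chains sharing a label belong to the same group of identical subunits.
groups <- split(names(key), key)
print(groups)
```

With the matrix shown, this yields two groups, one holding A and C and the other holding B and D.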

We can also ask for these sequences:

Hb[[2]]
## [[1]]
## [1] "VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR"
## 
## [[2]]
## [1] "VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH"
## 
## [[3]]
## [1] "VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR"
## 
## [[4]]
## [1] "VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH"
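
To make the sequence list easier to work with, the chain IDs can be attached as names. The sketch below is illustrative base R, not ptm functionality. It assumes that the sequences in Hb[[2]] are listed in the same order as the chain IDs in Hb[[3]], which is consistent with the output shown here but is not stated by the function's documentation. The two distinct sequences are retyped from the output so that the snippet is self-contained; the names alpha, beta, seqs and len are hypothetical.

```r
# The two distinct sequences, retyped from the printed output of Hb[[2]].
alpha <- "VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR"
beta  <- "VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH"

seqs   <- list(alpha, beta, alpha, beta)  # same shape as Hb[[2]]
chains <- c("A", "B", "C", "D")           # same content as Hb[[3]]

# Assumption: sequences and chain IDs are reported in the same order.
names(seqs) <- chains

# Number of residues per chain.
len <- vapply(seqs, nchar, integer(1))
print(len)

# Direct string comparison of two chains.
print(identical(seqs[["A"]], seqs[["C"]]))
print(identical(seqs[["A"]], seqs[["B"]]))
```

In a live session, seqs <- Hb[[2]] and chains <- Hb[[3]] would replace the retyped values.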

In addition, if we passed the argument keepfiles = TRUE to the function pdb.quaternary(), we can read the fasta file containing the alignment of these sequences:

library(bio3d)  # assumed source of read.fasta(); the printed "fasta" class matches bio3d
read.fasta(paste("./", Hb[[4]], ".fa", sep = ""))
##     1        .         .         .         .         .         .         70 
## B   VHLTPEEKSAVTALWGKV--NVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVL
## D   VHLTPEEKSAVTALWGKV--NVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVL
## A   V-LSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLSH-----GSAQVKGHGKKVA
## C   V-LSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLSH-----GSAQVKGHGKKVA
##     * *^* ^*^ * * ****     * *^*** *^ ^ ^* *  ^*  * ***      *   **^*****  
##     1        .         .         .         .         .         .         70 
## 
##    71        .         .         .         .         .         .         140 
## B   GAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVA
## D   GAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVA
## A   DALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVS
## C   DALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVS
##      * ^ ^^**^* ^      **^**  **^*** **^**   *^  ** *   **** * *   * ^* *  
##    71        .         .         .         .         .         .         140 
## 
##   141      148 
## B   NALAHKYH
## D   NALAHKYH
## A   TVLTSKYR
## C   TVLTSKYR
##       *  **^ 
##   141      148 
## 
## Call:
##   read.fasta(file = paste("./", Hb[[4]], ".fa", sep = ""))
## 
## Class:
##   fasta
## 
## Alignment dimensions:
##   4 sequence rows; 148 position columns (139 non-gap, 9 gap) 
## 
## + attr: id, ali, call
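
The alignment can also be inspected column by column. The sketch below uses only base R and the first 70 columns of rows B and A, retyped from the printed alignment. It builds a character matrix with one residue per cell, counts gaps, and computes a simple percent identity over ungapped columns. This is an illustration only: it is not claimed to be the measure behind the distance matrix returned by pdb.quaternary(), and the names rows, ali and pct_identity are hypothetical. In a live session, the ali element listed among the attributes of the fasta object could presumably be used instead of the retyped rows.

```r
# First alignment block (columns 1 to 70) for chains B and A, retyped from the output.
rows <- c(
  B = "VHLTPEEKSAVTALWGKV--NVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVL",
  A = "V-LSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLSH-----GSAQVKGHGKKVA"
)

# One row per sequence, one column per alignment position.
ali <- do.call(rbind, strsplit(rows, ""))

# Gaps per sequence and number of columns that contain at least one gap.
gaps_per_row <- rowSums(ali == "-")
gap_cols     <- sum(colSums(ali == "-") > 0)

# Percent identity over columns without gaps (illustrative definition).
ungapped     <- colSums(ali == "-") == 0
pct_identity <- 100 * sum(ali["B", ungapped] == ali["A", ungapped]) / sum(ungapped)

print(gaps_per_row)
print(gap_cols)
print(pct_identity)
```

For these two rows the sketch finds 9 gap-containing columns, which agrees with the "9 gap" figure in the printed summary, since the remaining columns of the alignment contain no gaps.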