Description
Renumbers the residue positions of a PDB sequence to match the corresponding UniProt sequence.
Usage
renum.pdb(pdb, chain, uniprot)
Arguments
pdb
the PDB ID or the path to a pdb file.
chain
the chain of interest.
uniprot
the UniProt ID.
Value
Returns a data frame containing the renumbered sequence.
See Also
aa.at(), is.at(), aa.comp(), renum.meto(), renum()
Dependencies
To use this function you will need MUSCLE installed on your computer.
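Before calling the function, it can help to confirm that a MUSCLE executable is visible from R. The following is a minimal sketch using only base R; `has_muscle()` is a hypothetical helper that is not part of ptm, and the executable name `muscle` is an assumption that may differ on your system.

```r
# Hypothetical helper (not part of ptm): returns TRUE if an executable
# named `exe` is found on the PATH. The default name "muscle" is an
# assumption; adjust it to match your installation.
has_muscle <- function(exe = "muscle") {
  unname(nzchar(Sys.which(exe)))
}

if (!has_muscle()) {
  message("MUSCLE was not found on the PATH; renum.pdb() may not work.")
}
```

`Sys.which()` returns an empty string when the program cannot be found, so the check reduces to testing for a non-empty path.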
Details
The ptm package offers a set of ancillary functions for routine tasks that are often needed when carrying out more elaborate analyses. Among these ancillary functions are:
- aa.at
- is.at
- aa.comp
- renum.pdb (the current document)
- renum.meto
- renum
In many cases the residue numbering of a sequence taken from a PDB structure does not match that of the corresponding UniProt sequence. When this happens, it is useful to have a tool such as renum.pdb() that renumbers the residues for us.
Let’s see an example using the protein alpha-1-antitrypsin as a model. The mature form of the protein (the sequence given by PDB) is formed by proteolytic cleavage of a precursor (the sequence given by UniProt), so the processed protein starts at position 25 of the precursor. Furthermore, the first 23 residues of the mature form are not resolved in the 3CWM structure. The first residue in the 3CWM structure therefore corresponds to asparagine 48 (25 + 23) in the UniProt sequence.
library(ptm)    # provides renum.pdb()
library(knitr)  # provides kable()
up_pdb <- renum.pdb(pdb = '3CWM', chain = 'A', uniprot = 'P01009')
kable(up_pdb[43:53, ])
| | aln_pos | uni_pos | uniprot | pdb | pdb_pos | pdb_renum |
|---|---|---|---|---|---|---|
| 43 | 43 | 43 | D | – | NA | NA |
| 44 | 44 | 44 | H | – | NA | NA |
| 45 | 45 | 45 | P | – | NA | NA |
| 46 | 46 | 46 | T | – | NA | NA |
| 47 | 47 | 47 | F | – | NA | NA |
| 48 | 48 | 48 | N | N | 1 | 48 |
| 49 | 49 | 49 | K | K | 2 | 49 |
| 50 | 50 | 50 | I | I | 3 | 50 |
| 51 | 51 | 51 | T | T | 4 | 51 |
| 52 | 52 | 52 | P | P | 5 | 52 |
| 53 | 53 | 53 | N | N | 6 | 53 |
We can see that renum.pdb() has renumbered all the residues of the PDB structure so that they match the UniProt numbering: the pdb_renum column holds the new positions.
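Once the table is available, it can be used to translate a position in the PDB sequence into the corresponding UniProt position. The sketch below re-enters rows 43 to 53 of the table above by hand (using "-" for the gap symbol) so that it runs without ptm, MUSCLE or network access; `pdb_to_uniprot()` is a hypothetical helper, not a ptm function, and the column types are assumed from the printed output.

```r
# Rows 43-53 of the table above, typed in by hand (an assumption about
# the column types; the real object comes from renum.pdb()).
up_pdb <- data.frame(
  aln_pos   = 43:53,
  uni_pos   = 43:53,
  uniprot   = c("D", "H", "P", "T", "F", "N", "K", "I", "T", "P", "N"),
  pdb       = c(rep("-", 5), "N", "K", "I", "T", "P", "N"),
  pdb_pos   = c(rep(NA, 5), 1:6),
  pdb_renum = c(rep(NA, 5), 48:53),
  row.names = 43:53
)

# Hypothetical helper (not part of ptm): look up the UniProt position
# that corresponds to one or more positions in the PDB sequence.
pdb_to_uniprot <- function(tbl, pos) {
  tbl$uni_pos[match(pos, tbl$pdb_pos)]
}

pdb_to_uniprot(up_pdb, c(1, 4))  # 48 51

# Keep only the residues that are resolved in the structure and check
# that they are all shifted by the same amount (48 - 1 = 47).
resolved <- up_pdb[!is.na(up_pdb$pdb_pos), ]
unique(resolved$pdb_renum - resolved$pdb_pos)  # 47
```

A position that is absent from the pdb_pos column yields NA, which makes unmatched lookups easy to spot.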