Renumbers the residue positions of a PDB sequence to match the corresponding UniProt sequence.
renum.pdb(pdb, chain, uniprot)
pdb: the PDB ID or the path to a PDB file.
chain: the chain of interest.
uniprot: the UniProt ID.
Returns a data frame containing the renumbered sequence.
To use this function, MUSCLE must be installed on your computer.
The ptm package offers a set of ancillary functions that carry out routine work, which may be needed when more elaborate analyses are required. Among these ancillary functions are:
On many occasions the residue numbering of a sequence taken from a PDB structure does not match the numbering of the corresponding UniProt sequence. When this happens it is useful to have a tool such as renum.pdb() that renumbers the residues for us.
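To make the idea of renumbering concrete, here is a minimal base R sketch of how positions can be mapped through a pairwise alignment. It is only an illustration: the two aligned strings are made-up toy data, the helper `number_residues()` is hypothetical, and nothing here is taken from the internals of renum.pdb().

```r
# Toy aligned sequences (hypothetical, not real protein data).
# '-' marks a gap introduced by the alignment.
aln_uniprot <- "MKTAYEDPQGDAAQK"
aln_pdb     <- "-----EDPQ-DAAQK"

# Split each aligned string into single characters.
up  <- strsplit(aln_uniprot, "")[[1]]
pdb <- strsplit(aln_pdb, "")[[1]]

# Hypothetical helper: a running count of non-gap characters gives each
# residue its position in its own sequence; gap columns get NA.
number_residues <- function(x) ifelse(x == "-", NA_integer_, cumsum(x != "-"))

mapping <- data.frame(
  aln_pos     = seq_along(up),
  uniprot     = up,
  uniprot_pos = number_residues(up),
  pdb         = pdb,
  pdb_pos     = number_residues(pdb),
  stringsAsFactors = FALSE
)

# Keep only the alignment columns where a PDB residue is present.
mapping[!is.na(mapping$pdb_pos), ]
```

In this toy case the first PDB residue lines up with position 6 of the longer sequence, which is the kind of shift a renumbering step has to account for.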
Let’s see an example using the protein alpha-1-antitrypsin as a model. The mature form of the protein (the sequence given by PDB) is formed by proteolytic cleavage of a precursor (the sequence given by UniProt). Thus, the processed protein starts at position 25 of the precursor. Furthermore, the first 23 residues of the mature form are not resolved in the structure given by 3CWM. Thus, the first residue in the 3CWM structure is residue 24 of the mature form, which corresponds to asparagine 48 in the UniProt sequence.
library(ptm)
library(knitr)  # provides kable()
up_pdb <- renum.pdb(pdb = '3CWM', chain = 'A', uniprot = 'P01009')
kable(up_pdb[43:53, ])
We can see that renum.pdb() has renumbered all the residues of the PDB structure so that they match the UniProt numbering.
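The position 48 quoted for the first resolved residue can also be checked by hand. The sketch below uses only the numbers given in the text; the variable names and the helper `to_precursor()` are hypothetical and are not part of the ptm package.

```r
# Values taken from the description of alpha-1-antitrypsin above.
mature_start <- 25  # precursor position of the first residue of the mature form
unresolved   <- 23  # leading residues of the mature form not resolved in 3CWM

# First resolved residue, counted within the mature form.
first_resolved_mature <- unresolved + 1

# Hypothetical helper: convert a mature-form position into a
# precursor (UniProt) position.
to_precursor <- function(pos) pos + mature_start - 1

to_precursor(first_resolved_mature)
```

This simple offset only holds when the two sequences differ by a fixed shift; when there are internal gaps or insertions, an alignment-based approach is needed.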