renum.pdb()

Description

Renumbers the residue positions of a PDB sequence to match the corresponding UniProt sequence.

Usage

renum.pdb(pdb, chain, uniprot)

Arguments

pdb the PDB ID or the path to a PDB file.

chain the chain of interest.

uniprot the UniProt ID.

Value

Returns a data frame containing the renumbered sequence.

Dependencies

To use this function you will need MUSCLE installed on your computer.
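Before calling the function, it can help to confirm that MUSCLE is reachable. The snippet below is only a sketch: `has_executable()` is a hypothetical helper (not part of ptm), and it assumes the binary is called `muscle` and sits on the PATH, which may differ on your system.

```r
# Hypothetical helper: TRUE if an executable named `exe` is found on the PATH.
# The name "muscle" is an assumption; adjust it to your installation.
has_executable <- function(exe = "muscle") {
  path <- Sys.which(exe)   # returns "" when the executable is not found
  unname(nzchar(path))
}

if (!has_executable("muscle")) {
  message("MUSCLE was not found on the PATH; renum.pdb() may fail.")
}
```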

Details

The ptm package offers a set of ancillary functions that carry out routine tasks often needed as part of more elaborate analyses. These ancillary functions include:

aa.at
is.at
aa.comp
renum.pdb (the current document)
renum.meto
renum

On many occasions, the residue numbering of a sequence taken from a PDB structure does not match that of the corresponding UniProt sequence. When this happens, it is useful to have a tool such as renum.pdb() that renumbers the residues for us.

Let’s see an example using the protein alpha-1-antitrypsin as a model. The mature form of the protein (the sequence given by PDB) is formed by proteolytic cleavage of a precursor (the sequence given by UniProt), so the processed protein starts at position 25 of the precursor. Furthermore, the first 23 residues of the mature form are not resolved in the 3CWM structure. The first residue in the 3CWM structure therefore corresponds to asparagine 48 in the UniProt sequence.
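The expected offset follows from simple arithmetic on the two numbers given above. As a quick check (the variable names are illustrative only):

```r
# Values taken from the paragraph above.
mature_start <- 25   # UniProt position of the first residue of the mature protein
unresolved   <- 23   # N-terminal residues of the mature form missing from 3CWM

# UniProt position of the first residue resolved in the structure.
first_resolved <- mature_start + unresolved
first_resolved
```

This gives 48, the UniProt position we expect for the first resolved residue.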

library(ptm)    # provides renum.pdb()
library(knitr)  # provides kable()
up_pdb <- renum.pdb(pdb = '3CWM', chain = 'A', uniprot = 'P01009')
kable(up_pdb[43:53, ])

	aln_pos	uni_pos	uniprot	pdb	pdb_pos	pdb_renum
43	43	43	D	-	NA	NA
44	44	44	H	-	NA	NA
45	45	45	P	-	NA	NA
46	46	46	T	-	NA	NA
47	47	47	F	-	NA	NA
48	48	48	N	N	1	48
49	49	49	K	K	2	49
50	50	50	I	I	3	50
51	51	51	T	T	4	51
52	52	52	P	P	5	52
53	53	53	N	N	6	53

We can see that renum.pdb() has renumbered all the residues of the PDB structure so that they match the UniProt numbering.
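Once the table is available, translating a position from one numbering to the other is a simple lookup. The sketch below does not call renum.pdb(); it retypes rows 48 to 53 of the output above by hand so that it runs on its own, and `pdb_to_uniprot()` is a hypothetical helper, not a ptm function.

```r
# Rows 48 to 53 of the table above, retyped by hand for illustration.
up_pdb_excerpt <- data.frame(
  aln_pos   = 48:53,
  uni_pos   = 48:53,
  uniprot   = c("N", "K", "I", "T", "P", "N"),
  pdb       = c("N", "K", "I", "T", "P", "N"),
  pdb_pos   = 1:6,
  pdb_renum = 48:53,
  stringsAsFactors = FALSE
)

# Hypothetical helper: UniProt position for a given original PDB position.
# Returns NA when the position is absent from the table.
pdb_to_uniprot <- function(tab, pos) {
  tab$uni_pos[match(pos, tab$pdb_pos)]
}

pdb_to_uniprot(up_pdb_excerpt, 3)   # isoleucine at PDB position 3
```

With the excerpt above, PDB position 3 maps to UniProt position 50. The same helper should work on the full data frame, since rows with NA in pdb_pos are never matched by a numeric position.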