p.scan()

Description

Scans the indicated protein in search of phosphosites.

Usage

p.scan(up_id, db = 'all')

Arguments

up_id a character string corresponding to the UniProt ID.

db the database to search. It should be one of ‘PSP’, ‘dbPTM’, ‘dbPAF’, ‘PhosPhAt’, ‘Phospho.ELM’, ‘all’.

Value

Returns a dataframe where each row corresponds to a phosphorylatable residue.

References

Hornbeck et al. Nucleic Acids Res. 2019 47:D433-D441.
Huang et al. Nucleic Acids Res. 2019 47:D298-D308.
Ullah et al. Sci. Rep. 2016 6:23534.
Durek et al. Nucleic Acids Res. 2010 38:D828-D834.
Dinkel et al. Nucleic Acids Res. 2011 39:D261-D267.

See Also

ac.scan(), meto.scan(), ni.scan(), gl.scan(), ub.scan(), su.scan(), dis.scan(), sni.scan(), me.scan(), ptm.scan(), reg.scan()

Details

Protein phosphorylation is one of the most important post-translational modifications (PTMs) and regulates a broad spectrum of biological processes.

Recent progress in phosphoproteomics has uncovered a flood of phosphorylation sites. Several curated databases of known phosphorylation sites have been described and are accessible.

The package ptm provides a function, p.scan(), that integrates information from different phosphorylation site databases and helps to identify phosphosites in a given protein. The UniProt ID of the protein of interest must be passed as the first argument. The second argument, db, indicates the database to be searched. The databases accessible for this purpose are:

  • PSP: PhosphoSitePlus.
  • dbPTM: database of Post-Translational Modifications.
  • dbPAF: database of Phosphosites in Animals and Fungi.
  • PhosPhAt: The Arabidopsis Protein Phosphorylation Site Database.
  • Phospho.ELM: a database of S/T/Y phosphorylation sites.
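Because the database labels are case-sensitive strings that are easy to mistype, it may help to check the value before running a query. The following is a minimal sketch; check_db() is a hypothetical helper that is not part of ptm, and whether p.scan() performs a similar check internally is not stated here.

```r
# Database labels copied from the Arguments section.
valid_db <- c('PSP', 'dbPTM', 'dbPAF', 'PhosPhAt', 'Phospho.ELM', 'all')

# check_db() is a hypothetical helper (an assumption, not a ptm function).
# It returns the label unchanged when valid and stops otherwise.
check_db <- function(db = 'all') {
  if (!is.character(db) || length(db) != 1 || !(db %in% valid_db)) {
    stop("db must be one of: ", paste(valid_db, collapse = ", "))
  }
  db
}

check_db('dbPAF')    # valid label, returned as is
# check_db('dbPFA')  # transposed letters: stops with an error
```

The checked value could then be passed on, for example as p.scan('P01009', db = check_db('dbPAF')).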

By default, p.scan() uses all of the above databases to search for phosphosites. Note that the number of phosphosites recovered for a protein may depend on the database consulted. For instance, suppose we are interested in the protein alpha-1-antitrypsin (P01009). If we search in dbPTM or dbPAF we recover the same 7 phosphosites:

mydbPTM <- p.scan('P01009', db = 'dbPTM')$modification
mydbPTM

## [1] "S38-p"  "Y184-p" "S261-p" "Y321-p" "T333-p" "S383-p" "T416-p"

mydbPAF <- p.scan('P01009', db = 'dbPAF')$modification
mydbPAF

## [1] "Y184-p" "S261-p" "Y321-p" "T333-p" "S38-p"  "S383-p" "T416-p"
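The two databases list the sites in a different order, so an order-insensitive comparison is the safer way to confirm that they agree. As a small sketch, the vectors below are copied by hand from the output above rather than fetched again:

```r
# Site labels copied from the dbPTM and dbPAF output shown above.
mydbPTM <- c("S38-p", "Y184-p", "S261-p", "Y321-p", "T333-p", "S383-p", "T416-p")
mydbPAF <- c("Y184-p", "S261-p", "Y321-p", "T333-p", "S38-p", "S383-p", "T416-p")

# identical() is order-sensitive, so it reports a difference here.
identical(mydbPTM, mydbPAF)

# setequal() ignores order and compares the vectors as sets.
setequal(mydbPTM, mydbPAF)
```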

However, if we carry out the query using PhosphoSitePlus:

myPSP <- p.scan('P01009', db = 'PSP')$modification
intersect(myPSP, mydbPTM)

## [1] "S38-p"  "Y184-p" "S261-p" "Y321-p" "T333-p" "S383-p" "T416-p"

We get the same 7 phosphosites that we already knew, plus 8 additional phosphosites that were not present in dbPTM or dbPAF.

setdiff(myPSP, mydbPTM)

## [1] "T35-p"  "T37-p"  "S307-p" "S309-p" "S316-p" "T318-p" "S325-p" "S343-p"

Thus, if we want the most comprehensive results, the best option is to set db = 'all':

unique(p.scan('P01009', db = 'all')$modification)

##  [1] "T35-p"  "T37-p"  "S38-p"  "Y184-p" "S261-p" "S307-p" "S309-p" "S316-p" "T318-p"
## [10] "Y321-p" "S325-p" "T333-p" "S343-p" "S383-p" "T416-p"
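The combined result can be cross-checked against the per-database output. The sketch below rebuilds the list from the labels printed earlier (copied by hand, not queried) and orders them by residue position. How p.scan() merges databases internally is not described here, so this only illustrates that the union of the individual results accounts for all fifteen sites.

```r
# Labels copied from the intersect() and setdiff() output shown earlier.
shared   <- c("S38-p", "Y184-p", "S261-p", "Y321-p", "T333-p", "S383-p", "T416-p")
psp_only <- c("T35-p", "T37-p", "S307-p", "S309-p", "S316-p", "T318-p", "S325-p", "S343-p")

# union() drops duplicates, like unique() on the combined result.
all_sites <- union(shared, psp_only)
length(all_sites)

# Extract the numeric residue position from each label and sort by it.
position <- as.integer(gsub("[^0-9]", "", all_sites))
sorted_sites <- all_sites[order(position)]
sorted_sites
```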

Nevertheless, it is worth remembering that more is not always better. Mass spectrometric methods are revealing a very large number of protein phosphorylation sites in cells. However, does each of these phosphorylations have a role in the functioning of a cellular process? We do not know the answer to this question, because the functional effects of the vast majority of these phosphorylations have not yet been investigated. There are, however, reasons to believe that some portion of the reported phosphorylations might have little or no functional importance. Therefore, if you want to apply a stringent criterion, I'd recommend using the ptm function reg.scan(), which scans only for experimentally documented regulatory PTM sites. Using this function with our model protein P01009, it turns out that only one of the fifteen sites previously reported is known to be functionally relevant.

reg <- reg.scan('P01009')
reg[which(grepl('-p', reg$modification)),]

##        up_id     organism modification database
## 10928 P01009 Homo sapiens       T416-p      PSP