abundance()

Description

Provides the protein abundance, in parts per million, of the requested protein.

Usage

abundance(id, ...)

Arguments

id the UniProt identifier of the protein of interest.

... either ‘jurkat’ or ‘hela’, if the abundance in that human cell line is required.

Value

A numeric value for the abundance, expressed as parts per million (ppm), of the requested protein.

References

Wang et al. Proteomics 2015, 15(18):3163-3168.

Details

Protein quantification at a proteome-wide scale is an important aim, since protein abundance is a relevant variable in many biological studies. In particular, protein abundance is a major factor in the detection of PTMs by mass spectrometry, so avoiding abundance bias in the functional annotation of post-translationally modified proteins is a real concern. When analyzing PTM-related data, it is therefore often necessary to account for potential biases due to differential protein abundance, which requires accurate abundance data for the proteins being analyzed.

The function abundance() provides protein abundance data taken from PaxDB, a comprehensive database of absolute protein abundances. Currently, the function provides data for proteins from the following species:

  • Arabidopsis thaliana
  • Bos taurus
  • Dictyostelium discoideum
  • Drosophila melanogaster
  • Equus caballus
  • Escherichia coli
  • Gallus gallus
  • Homo sapiens

For human proteins, in addition to the abundance in the whole organism (by default), the abundance found in Jurkat or HeLa cells can be requested.
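
As a minimal sketch of the two kinds of request described above, the snippet below asks for the whole-organism value and then for the HeLa value of a single human protein. The accession P01009 is only an illustrative choice (it is an assumption that the underlying abundance tables contain it), and the calls are wrapped so that the snippet still runs when the ptm package or the data are unavailable.

```r
# UniProt accession chosen purely for illustration (assumption: it is
# present in the abundance data used by abundance()).
id <- "P01009"

if (requireNamespace("ptm", quietly = TRUE)) {
  # Whole-organism abundance (default) and HeLa-specific abundance, in ppm.
  # Any lookup error is turned into NA so the example does not stop.
  whole <- tryCatch(ptm::abundance(id), error = function(e) NA)
  hela  <- tryCatch(ptm::abundance(id, "hela"), error = function(e) NA)
  print(c(whole = whole, hela = hela))
} else {
  message("Package 'ptm' is not installed; skipping the lookup.")
}
```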

To illustrate the use of abundance(), we will reproduce a recently published result (Antioxidants 2020, 9(10), 987), where we showed that human proteins oxidized within living cells are less abundant than proteins oxidized in vitro in cell extracts.

First, we download the set of relevant proteins from MetOSite.

sites <- ptm::meto.search(organism = 'Homo sapiens',
                     oxidant = 'hydrogen peroxide')
vivo <- unique(sites$prot_id[which(sites$met_vivo_vitro == "vivo")])
vitro <- unique(sites$prot_id[which(sites$met_vivo_vitro == "vitro")])
intersection <- intersect(vivo, vitro)
vivo <- setdiff(vivo, intersection)
vitro <- setdiff(vitro, intersection) 

data <- data.frame(id = c(vivo, vitro), 
                   category = c(rep("vivo", length(vivo)), rep("vitro", length(vitro))))
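
The set operations above keep only proteins that were reported exclusively in vivo or exclusively in vitro. The toy example below, built on a made-up table with hypothetical identifiers (P1 to P4), is a small sketch of that logic and does not use MetOSite data.

```r
# Hypothetical site table mimicking the two columns used above.
sites_toy <- data.frame(
  prot_id = c("P1", "P1", "P2", "P3", "P3", "P4"),
  met_vivo_vitro = c("vivo", "vitro", "vivo", "vitro", "vitro", "vivo"),
  stringsAsFactors = FALSE
)

# Proteins seen in each condition (duplicates removed).
vivo_toy  <- unique(sites_toy$prot_id[sites_toy$met_vivo_vitro == "vivo"])
vitro_toy <- unique(sites_toy$prot_id[sites_toy$met_vivo_vitro == "vitro"])

# Proteins seen in both conditions are ambiguous and are dropped from both sets.
both       <- intersect(vivo_toy, vitro_toy)
vivo_only  <- setdiff(vivo_toy, both)
vitro_only <- setdiff(vitro_toy, both)
```

Here P1 appears in both conditions and is therefore excluded from both groups.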

Now we compute the abundance of each protein in our set.

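data$abu <- NA_real_
for (i in seq_len(nrow(data))){
  # Look up the abundance (ppm) of the i-th protein
  ppm <- abundance(as.character(data$id[i]))
  # Keep NA when nothing is returned for this protein
  if (length(ppm) > 0){
    data$abu[i] <- ppm[1]
  }
}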
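
The same "look up each protein, keep NA when nothing is returned" pattern can also be written without an explicit loop. The sketch below uses vapply() with a hypothetical stand-in function, lookup_ppm(), in place of abundance(), so that it runs without any external data; the identifiers and values are invented.

```r
# Hypothetical stand-in for abundance(): returns a ppm value for known
# identifiers and a zero-length result otherwise.
lookup_ppm <- function(id) {
  known <- c(P1 = 12.5, P2 = 300)
  if (id %in% names(known)) unname(known[id]) else numeric(0)
}

ids <- c("P1", "PX", "P2")

# One numeric value per identifier; missing lookups become NA.
abu <- vapply(ids, function(id) {
  value <- lookup_ppm(id)
  if (length(value) > 0) value[1] else NA_real_
}, numeric(1), USE.NAMES = FALSE)
```

Declaring numeric(1) makes vapply() fail early if a lookup ever returns something other than a single number.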

Afterwards, the MetO-containing proteins are ranked in increasing order according to their abundances. Those rank positions occupied by a protein from the in vivo set are marked in red while rank positions where a protein from the in vitro set is present are in blue.

data <- data[!is.na(data$abu), ]
data <- data[order(data$abu), ]
data$rank <- 1:nrow(data)
data$bar <- 1
data$color <- NA
data$color[which(data$category == 'vivo')] <- 'red'
data$color[which(data$category == 'vitro')] <- 'blue'


plot(1:nrow(data), data$bar, type = 'h', ylim = c(0,1),
      yaxt = 'n', axes = FALSE, ylab = '', xlab = "Abundance rank", col = data$color)
axis(side = 1)

We can observe that the red bars (proteins oxidized in vivo) predominate in the low-rank region, while the blue bars (proteins oxidized in vitro) are abundant in the high-rank region.
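
A visual impression of this kind can be complemented with a rank-based test. The sketch below applies a one-sided Wilcoxon rank-sum test to simulated abundances. The data are invented for illustration only and are not the published results, and this test is an assumption of this example rather than the analysis used in the cited paper.

```r
set.seed(1)

# Simulated abundances (ppm): the "vivo" group is drawn from a
# distribution shifted towards lower values than the "vitro" group.
toy <- data.frame(
  category = rep(c("vivo", "vitro"), each = 30),
  abu = c(rlnorm(30, meanlog = 1), rlnorm(30, meanlog = 3))
)

# One-sided test: are in vivo abundances shifted below in vitro ones?
res <- wilcox.test(toy$abu[toy$category == "vivo"],
                   toy$abu[toy$category == "vitro"],
                   alternative = "less")
res$p.value
```

With real data, the same call could be applied to data$abu split by data$category after removing the NA values.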