S-Aromatic Motifs and Calmodulin

Among the many benefits conferred by the sulfur atom in the side chain of methionine are the effects derived from the so-called S-aromatic motifs. An interaction between methionine and nearby aromatic residues (Phe, Tyr and Trp) was described as early as the mid-eighties. Even earlier, it had been noticed that sulfur atoms and aromatic rings occur in close proximity within proteins more often than expected. Despite their potential importance, these findings were largely overlooked, perhaps because the physicochemical nature of this bond is only poorly understood.

Although the strength of these interactions may depend on their environment, it is accepted that the S-aromatic interaction occurs at a greater distance (5-7 Å) than a salt bridge (< 4 Å), while the energies associated with the two interactions are comparable. More recently, extensive surveys of the Protein Data Bank have revealed the importance of the methionine-aromatic motif for stabilizing protein structures and for protein-protein interactions (see the review "Methionine in proteins: The Cinderella of the proteinogenic amino acids" for further details).

When searching for S-aromatic motifs in proteins, two variables are relevant. The first is the distance between the delta sulfur atom (SD) and the centroid of the aromatic ring, which can be computed as the Euclidean norm of the vector \mathbf{V_S} shown in the figure below:

d = \lVert \mathbf{V_S} \rVert

The other relevant variable is the angle theta, defined as the angle between the sulfur-aromatic vector (V_S) and the normal vector of the aromatic ring (n). This angle is complementary to the angle of elevation of the sulfur above the plane of the ring. In the figure, CG, SD and CE stand for carbon gamma, sulfur delta and carbon epsilon of the methionine residue.
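The definition above can be written as a formula. This is a sketch under one assumption that the text does not state: the absolute value of the dot product is taken so that theta stays between 0° and 90° whichever way the normal n points. The package may follow a different convention.

```latex
\theta = \arccos\left( \frac{\lvert \mathbf{V_S} \cdot \mathbf{n} \rvert}{\lVert \mathbf{V_S} \rVert \, \lVert \mathbf{n} \rVert} \right),
\qquad
\text{elevation} = 90^\circ - \theta
```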

Note that two extreme cases are possible: a face-on interaction (angles around 0°) and an edge-on interaction (angles around 90°).
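Both variables can be computed with a few lines of base R. The sketch below is not the ptm implementation: the coordinates are a hypothetical ideal hexagon, and the helper names cross3() and s_aromatic_geometry() are introduced here for illustration only. It also assumes the absolute-value convention, so theta lies between 0° and 90°.

```r
# Hypothetical coordinates (in Å): six ring atoms of an ideal planar hexagon
# lying in the xy-plane and centred on the origin.
ang  <- seq(0, 300, by = 60) * pi / 180
ring <- cbind(x = 1.39 * cos(ang), y = 1.39 * sin(ang), z = 0)

# Cross product of two 3D vectors (base R has no built-in for this).
cross3 <- function(a, b) {
  c(a[2] * b[3] - a[3] * b[2],
    a[3] * b[1] - a[1] * b[3],
    a[1] * b[2] - a[2] * b[1])
}

# Illustrative helper: distance d and angle theta (degrees) for a sulfur
# atom at s_xyz and a ring given as a 6 x 3 coordinate matrix.
s_aromatic_geometry <- function(s_xyz, ring) {
  centroid <- colMeans(ring)
  v_s <- s_xyz - centroid                      # sulfur-aromatic vector V_S
  # Ring normal n from two in-plane vectors
  n <- cross3(ring[1, ] - centroid, ring[2, ] - centroid)
  d <- sqrt(sum(v_s^2))                        # Euclidean norm of V_S
  # abs() makes the result independent of the direction of n (assumption)
  cos_theta <- abs(sum(v_s * n)) / (d * sqrt(sum(n^2)))
  theta <- acos(min(1, cos_theta)) * 180 / pi  # min() guards against rounding
  c(d = d, theta = theta)
}

face_on <- s_aromatic_geometry(c(0, 0, 5), ring)  # sulfur above the ring
edge_on <- s_aromatic_geometry(c(5, 0, 0), ring)  # sulfur in the ring plane
```

With these toy inputs, face_on corresponds to the face-on extreme (theta close to 0°) and edge_on to the edge-on extreme (theta close to 90°), both at a distance of 5 Å.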

Currently, the ptm package offers three functions that may be useful for the study of these S-aromatic motifs: saro.motif(), saro.dist() and saro.geometry().

To illustrate the use of these functions we are going to use calmodulin (CaM) as a model protein. Calmodulin is a multifunctional calcium-binding protein expressed in all eukaryotic cells. The binding of the second messenger Ca2+ to CaM changes the conformation of CaM and its affinity for its various targets. Methionine residues of CaM play an essential role in the sequence-independent specific binding to its many protein targets. In a Ca2+-free medium, CaM exhibits a spatial conformation corresponding to the so-called apoCaM (PDB ID: 1CFD). On the other hand, when CaM binds four Ca2+ ions, the protein adopts a different spatial conformation corresponding to the holoCaM (PDB ID: 1CLL).

If we pass the PDB ID of the apoCaM to the function saro.motif(), with a threshold of 6, and we place the output in an object we may call apo_cam, we can observe that this form of calmodulin exhibits 8 S-aromatic motifs close to the edge-on type.

apo_cam <- saro.motif(pdb = '1cfd', threshold = 6)


If we repeat the process, but now for the holoCaM (PDB ID 1CLL):


These tables show a dynamic remodeling of the S-aromatic bonds dependent on calcium ions, which is better summarized in the form of a network graph:

S-aromatic network

The information provided by this dynamic network can be complemented using the function ddG.ptm(), which will inform us about the changes in thermodynamic stability caused by modification of the corresponding methionine residues.

To this end, we need the UniProt ID of human calmodulin-1, which is P0DP23. Nevertheless, if all we know is that we are interested in human calmodulin-1, and we don’t want to leave the R IDE to look up the identifier, we can make use of the function meto.list(), typing:

cam <- meto.list('Calmodulin-1') # query once and reuse the result
id <- cam$prot_id[which(cam$prot_sp == 'Homo sapiens')]

Now we get the amino acid sequence of this protein:

cam_seq <- get.seq(id) # CaM sequence (cam_seq avoids masking base::seq)
cam_seq <- substring(cam_seq, 2, nchar(cam_seq)) # Removing the initiation methionine
met <- gregexpr("M", cam_seq)[[1]] # Positions at which methionines are found
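Removing the initiation methionine shifts every downstream position by one, which matters when the positions are later matched against a structure. The sketch below shows the effect on a made-up toy sequence (not the CaM sequence). as.integer() is used to drop the attributes that gregexpr() attaches to its result.

```r
toy_seq <- "MAMKFMGW" # hypothetical sequence, for illustration only

# Methionine positions counted from the initiation methionine
with_init <- as.integer(gregexpr("M", toy_seq, fixed = TRUE)[[1]])

# Methionine positions after removing the first residue
mature <- substring(toy_seq, 2, nchar(toy_seq))
without_init <- as.integer(gregexpr("M", mature, fixed = TRUE)[[1]])
```

Here with_init holds positions 1, 3 and 6, whereas without_init holds 2 and 5. Note that gregexpr() returns -1 when the pattern is not found, so a sequence without methionine should be handled separately.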

Once we know at which positions the protein exhibits a methionine residue, we are in a position to compute the change in free energy after modifying those residues. To this end, the arguments taken by the ddG.ptm() function are the PDB ID of interest, the chain, the position of the residue to be modified, and the type of post-translational modification to be introduced. In our case, we set ptm = 'MetO-Q' which, as explained elsewhere, mimics the sulfoxidation of methionine.

# Chain 'A' is passed positionally, right after the PDB ID
meto_apo <- lapply(met, function(x) ddG.ptm('1cfd', 'A', pos = x, ptm = 'MetO-Q'))
meto_apo <- round(as.numeric(unlist(meto_apo)), 3)

meto_holo <- lapply(met, function(x) ddG.ptm('1cll', 'A', pos = x, ptm = 'MetO-Q'))
meto_holo <- round(as.numeric(unlist(meto_holo)), 3)

The objects meto_apo and meto_holo contain the Gibbs free energy change values in kcal/mol. Now, let’s plot them:

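oldpar <- par(no.readonly = TRUE) # save the current graphical settings
par(mfrow = c(1, 2)) # assumed layout: the two panels side by side
barplot(meto_apo,
        horiz = TRUE, names.arg = met, cex.names = 0.6,
        xlab = expression(paste(Delta, Delta, "G (kcal/mol)", sep = "")),
        ylab = "Modified Met",
        col = "red",
        main = "ApoCaM")

barplot(meto_holo,
        horiz = TRUE, names.arg = met, cex.names = 0.6,
        xlab = expression(paste(Delta, Delta, "G (kcal/mol)", sep = "")),
        ylab = "Modified Met",
        col = "blue",
        main = "HoloCaM")
par(oldpar) # restore the saved settings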
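Besides plotting, the two sets of values can be compared numerically, one row per methionine. The sketch below shows one possible way to do this with base R. The numbers are made up for illustration and are not output of ddG.ptm(). The names toy_met, toy_apo, toy_holo and ddg_table are hypothetical.

```r
# Made-up values for illustration only; NOT results from ddG.ptm()
toy_met  <- c(36, 51, 71)          # residue positions
toy_apo  <- c(0.5, -0.25, 1.5)     # ddG in the apo form (kcal/mol)
toy_holo <- c(1.5, -0.25, 0.25)    # ddG in the holo form (kcal/mol)

ddg_table <- data.frame(
  met = toy_met,
  apo = toy_apo,
  holo = toy_holo,
  holo_minus_apo = toy_holo - toy_apo
)

# Largest absolute change between the two forms first
ddg_table <- ddg_table[order(-abs(ddg_table$holo_minus_apo)), ]
```

With the real data, met, meto_apo and meto_holo would take the place of the toy vectors.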

If we want to obtain more raw data on the inter-residue distances between the sulfur atom and the aromatic ring, we can use the function saro.dist():

raw_data <- saro.dist('1cfd', rawdata = TRUE)
##   Note: Accessing on-line PDB file

The output of this function is a list of two elements. The first is a data frame where each row is a methionine residue. For each methionine residue this data frame provides the identity of the closest tyrosine, phenylalanine and tryptophan, as well as the corresponding distances in ångströms. It also provides the number of S-aromatic bonds that each methionine residue forms with tyrosine, phenylalanine and tryptophan.

closest <- raw_data[[1]]

The second one is a data frame where each row corresponds to a methionine residue and each column provides the distance in ångströms to the indicated aromatic residue. There are as many columns as there are aromatic residues in the given protein.

library(knitr) # provides kable()
all_distances <- raw_data[[2]]
kable(all_distances[, 1:9]) # first nine columns only
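A matrix of distances like this one can be turned into a list of candidate S-aromatic pairs by applying a distance threshold. The sketch below uses a small toy matrix with hypothetical residue labels and values (it is not ptm output, and the real row and column labels may differ). The 7 Å cut-off is taken from the upper end of the 5-7 Å range quoted earlier.

```r
# Toy distance matrix in Å (hypothetical values and labels, not ptm output):
# rows are methionines, columns are aromatic residues.
toy_dist <- matrix(
  c(4.8, 12.3,  6.5,
    9.1,  5.2, 15.0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("M1", "M2"), c("F10", "Y20", "W30"))
)

threshold <- 7 # Å, assumed cut-off for this illustration

# Row/column indices of every pair within the threshold
hits <- which(toy_dist <= threshold, arr.ind = TRUE)
pairs <- data.frame(
  met      = rownames(toy_dist)[hits[, "row"]],
  aromatic = colnames(toy_dist)[hits[, "col"]],
  distance = toy_dist[hits]
)

# Number of aromatic residues within the threshold for each methionine
bonds_per_met <- rowSums(toy_dist <= threshold)
```

In this toy example M1 has two aromatic neighbours within the threshold (F10 and W30) and M2 has one (Y20).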

Finally, if we are interested in a particular pair of residues, let’s say M76 and F12, we can use:

kable(saro.geometry('1cll', rA = 12, rB = 76))
##   Note: Accessing on-line PDB file
##    PDB has ALT records, taking A only, rm.alt=TRUE

where x, y and z are the Cartesian coordinates of either the sulfur delta of methionine or the centroid of the aromatic ring.