Introduction

Point mutations can have a strong impact on protein stability. A change in stability may subsequently lead to dysfunction and finally cause diseases. Moreover, protein engineering approaches aim to deliberately modify protein properties, where stability is a major constraint.

In order to support basic research and protein design tasks we have implemented MAESTRO, a Multi AgEnt STability pRedictiOn tool for changes in unfolding free energy upon point mutation. For details on the method please see: MAESTRO - multi agent stability prediction upon point mutations. Laimer J, Hofer H, Fritz M, Wegenkittl S and Lackner P. BMC Bioinformatics. 2015 Apr 16;16(1):116.

Please note that MAESTRO is designed for experimentally resolved 3D protein structures, although it also accepts modeled 3D structures as input. However, the prediction quality may decrease when using predicted structural models. Also note that MAESTRO can operate on multi-chain complexes. In that case, mutations are always propagated to all identical chains, and the predicted ΔΔG is reported as the total over these chains.

MAESTROweb provides an easy-to-use web interface to the MAESTRO software for four different tasks:

1	Evaluate the impact of a list of specific mutations.
2	Calculate a mutation sensitivity profile.
3	Perform a search for the most (de)stabilizing n-point mutation(s).
4	Evaluate potential disulfide bonds.

The resulting data are visualized as interactive tables and can be exported as a csv data file.

Input fields, selectors and sliders in the Web-GUI are supplemented with mouse over tool-tips. Below we only provide additional information not covered by the tool-tips.

A brief overview of the performance of MAESTROweb for the prediction of stability changes (ΔΔG) as well as the evaluation of disulfide bonds is given at the end of this page.

Workflow

Open project

Every task requires a 3D protein structure, which can be specified by the corresponding 4-letter PDB Id or by uploading coordinates in PDB format. When the protein is specified by PDB Id, either the crystallographic or the biological unit can be selected. For an explanation see here. If the given PDB Id corresponds to an NMR structure, the model can be specified in addition.

Uploaded structures are used as a whole. If the uploaded file contains multiple models (e.g. NMR structures) the analyses are performed on the first model. If the file should be considered as biological assembly, please check the corresponding checkbox.
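To illustrate what "the first model" means for a multi-model file, the following minimal sketch keeps only the coordinate records that appear before the first ENDMDL record of a PDB-format text. It is not MAESTROweb's own code; the function name first_model_lines is hypothetical, and the sketch assumes the standard PDB record layout in which the record name occupies the first six columns.

```python
def first_model_lines(pdb_text: str) -> list[str]:
    """Return the ATOM/HETATM/TER records of the first model in a PDB-format string.

    Files without MODEL/ENDMDL records are returned with all coordinate records.
    """
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name lives in columns 1-6
        if record == "ENDMDL":
            break  # everything after the first ENDMDL belongs to later models
        if record in ("ATOM", "HETATM", "TER"):
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Shortened, made-up records for illustration only.
    sample = (
        "MODEL        1\n"
        "ATOM      1  N   MET A   1\n"
        "ENDMDL\n"
        "MODEL        2\n"
        "ATOM      1  N   MET A   1\n"
        "ENDMDL\n"
    )
    print(len(first_model_lines(sample)))
```

Pre-extracting a different model this way before upload is one option if an analysis should not run on the first model of an uploaded NMR ensemble.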

The input structures are checked for missing residues (chain breaks) and missing atoms. Users are notified of any such gaps, as they may decrease the prediction quality. In case of a few missing atoms or short missing loops, we recommend using modeling tools such as MODELLER or SWISS-MODEL to generate complete structures for manual upload.

Once a PDB Id is submitted or a file is uploaded, MAESTROweb stores the structure under a certain project Id (a 32 digit MD5 hash). Results are stored for seven days on the server and can be retrieved using a project link of the format: https://biwww.che.sbg.ac.at/maestro/web/result/<project Id>
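For scripted retrieval, a project link can be assembled from the project Id. The sketch below is an illustration only: the helper name project_link is hypothetical, it assumes the Id is written as 32 hexadecimal characters (the usual text form of an MD5 hash), and the example Id is generated locally rather than being a real project.

```python
import hashlib
import re

RESULT_BASE = "https://biwww.che.sbg.ac.at/maestro/web/result/"
# Assumption: the project Id is the 32-character hexadecimal form of an MD5 hash.
_PROJECT_ID = re.compile(r"[0-9a-fA-F]{32}")


def project_link(project_id: str) -> str:
    """Build the result URL for a project Id, rejecting malformed Ids."""
    if not _PROJECT_ID.fullmatch(project_id):
        raise ValueError(f"not a 32-character hexadecimal Id: {project_id!r}")
    return RESULT_BASE + project_id


if __name__ == "__main__":
    # Made-up Id for demonstration; how the server derives real Ids is not documented here.
    example_id = hashlib.md5(b"example").hexdigest()
    print(project_link(example_id))
```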

Select task

Select one of the four tasks for your project.

Except for the calculation of the sensitivity profile, each task needs to be specified further.

Evaluate specific mutations

Select the mutation site(s) and replacement amino acids (AA) of interest. Replacement AA types can be specified by residue name or AA class. The list of mutations can be edited. The format for a list entry is:

<wild type AA><pdb residue number>.<chain id(s)>{replacement AA}

where the replacement AA can be a comma separated list of types and classes, e.g. {CHARGED,S,T}, {X} denotes any AA.
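The entry format can be checked before pasting a long list into the form. The following parser is a minimal sketch under several assumptions that the description above does not settle: the wild type is given as a one-letter code, the residue number may carry a sign or a one-letter insertion code, and the chain id(s) are written as a run of letters or digits. The entry "I102.A{CHARGED,S,T}" and the names MutationEntry and parse_entry are made up for illustration.

```python
import re
from dataclasses import dataclass

# <wild type AA><pdb residue number>.<chain id(s)>{replacement AA}
_ENTRY = re.compile(r"([A-Z])(-?\d+[A-Za-z]?)\.([A-Za-z0-9]+)\{([^{}]+)\}")


@dataclass
class MutationEntry:
    wild_type: str           # assumed one-letter code
    residue_number: str      # kept as text, may include an insertion code
    chains: str              # chain id(s) exactly as written
    replacements: list[str]  # AA types and/or class names, e.g. "CHARGED", "X"


def parse_entry(text: str) -> MutationEntry:
    """Parse one mutation list entry, raising ValueError if it does not fit the format."""
    match = _ENTRY.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"malformed mutation entry: {text!r}")
    wild_type, residue_number, chains, body = match.groups()
    replacements = [item.strip() for item in body.split(",") if item.strip()]
    if not replacements:
        raise ValueError(f"no replacement given in: {text!r}")
    return MutationEntry(wild_type, residue_number, chains, replacements)


if __name__ == "__main__":
    print(parse_entry("I102.A{CHARGED,S,T}"))
```

The parser only checks the shape of an entry; whether a class name is one that MAESTROweb recognizes is not verified.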

The list can be processed as independent single point mutations or as combined mutations. The latter mode may lead to a long execution time, depending on the number of mutation sites and the number of replacement AAs.
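A rough count shows why the combined mode can become slow. The sketch below assumes that the combined mode evaluates every combination of one replacement per site, which is a plausible reading of the text but is not stated explicitly; the function name count_variants and the example numbers are hypothetical.

```python
from math import prod


def count_variants(replacements_per_site: list[int], combined: bool) -> int:
    """Estimate how many variants have to be evaluated.

    Independent mode: one single-point mutant per (site, replacement) pair.
    Combined mode (assumed): one mutant per combination across all sites.
    """
    if combined:
        return prod(replacements_per_site)
    return sum(replacements_per_site)


if __name__ == "__main__":
    sites = [19, 19, 19]  # three sites with 19 candidate replacements each
    print(count_variants(sites, combined=False))  # 57
    print(count_variants(sites, combined=True))   # 6859
```

Under this assumption the workload grows multiplicatively with each added site, so restricting the replacement sets per site has a large effect on run time.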

Example result:

Click on a list item to highlight the position in the 3D graphics. You may go back to the "Specify task" tab to perform a new experiment on the same structure. However, on the server only the last result is stored for later retrieval by the Project-Id.

Calculate a mutation sensitivity profile

This plot helps to identify positions where mutations are predicted to have little effect on stability, and positions where they are predominantly stabilizing or destabilizing.

Example result:

Scan for (de)stabilizing mutations

The provided options should cover most application scenarios. Further options, such as advanced residue restriction and a brute force search, are provided by the stand-alone version of MAESTRO.

Example result:

Evaluate potential disulfide bonds

A purely geometric filter first determines positions in the structure, where SS-bonds would fit. Subsequently, the required mutations are evaluated in terms of ΔΔG and overall score S_ss, integrating ΔΔG and geometric penalties.

Example result:

ΔΔG Prediction Performance

For the prediction of stability changes upon point mutations (ΔΔG), MAESTRO was trained on a combination of two data sets derived from the ProTherm database. The first set (SP4) consists of 1765 single point mutants; the second set (MP) provides 479 mutants with multiple mutations. Both sets were restricted to entries with a pH value between 5.5 and 8.5 and can be downloaded from here. For further information about the data sets see here. We performed a 10-fold cross validation on this combined set. To investigate how well the approach generalizes to a protein that was never used in training, we additionally performed a 10-fold protein blind test, in which all mutations of a protein were either exclusively in the training set or exclusively in the test set. The table below shows the results of both experiments for single-point and multi-point mutants.

Number of mutations	Number of entries	ρ (10-fold cross validation)	σ (10-fold cross validation)	ρ (10-fold protein blind test)	σ (10-fold protein blind test)
1	1765	0.68	1.32	0.59	1.46
>1	479	0.71	1.52	0.54	1.87
≥1	2244	0.69	1.36	0.57	1.56

The relatively large drop in prediction performance for multi-point mutations between cross validation and the protein blind test is influenced by the small number of different proteins in the data set, the wide variety of mutants per protein and, with it, the variation in fold size. For further performance results as well as a detailed comparison with competing methods see here.
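The difference between an ordinary cross validation and a protein blind test lies in how entries are assigned to folds. The sketch below shows one simple way to build protein-exclusive folds with a greedy size balancing; how the folds were actually assigned for the results above is not described here, so the balancing strategy and the name protein_blind_folds are assumptions.

```python
from collections import defaultdict


def protein_blind_folds(protein_ids: list[str], n_folds: int = 10) -> list[list[int]]:
    """Split entry indices into folds so that each protein occurs in exactly one fold."""
    by_protein = defaultdict(list)
    for index, protein in enumerate(protein_ids):
        by_protein[protein].append(index)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: largest proteins first, each into the currently smallest fold.
    for protein in sorted(by_protein, key=lambda p: (-len(by_protein[p]), p)):
        smallest = min(folds, key=len)
        smallest.extend(by_protein[protein])
    return folds


if __name__ == "__main__":
    # Made-up protein labels: one label per mutation entry.
    ids = ["1abc"] * 5 + ["2xyz"] * 3 + ["3def"] * 2
    for fold in protein_blind_folds(ids, n_folds=2):
        print(sorted({ids[i] for i in fold}), len(fold))
```

Because whole proteins are moved as units, fold sizes can differ noticeably when a few proteins contribute many mutants, which is the effect on fold size mentioned above.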

Confidence estimation c_pred.

The MAESTRO approach is based on a consensus prediction of several prediction agents. Utilizing the deviation of the agents' predictions, MAESTRO provides a confidence estimation (c_pred., not to be confused with a confidence interval in the statistical sense) for its ΔΔG predictions. The confidence estimation is numerically confined to values between 0.0 and 1.0, where 1.0 corresponds to a perfect consensus of all agents. As shown in the following plot, c_pred. provides a sound estimation of the prediction accuracy.

In both cases, for single-point mutants as well as for multi-point mutants, the prediction error (absolute difference between experimentally determined and predicted ΔΔG) decreases with higher confidence values.
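The idea of deriving a confidence from the agreement between agents can be illustrated with a toy example. The mapping used below, 1 / (1 + standard deviation), is purely hypothetical and is not MAESTRO's formula, which is not given on this page; it merely shares the stated properties that the value lies between 0.0 and 1.0 and reaches 1.0 when all agents agree exactly. The function name and the agent values are made up.

```python
from statistics import mean, pstdev


def consensus_with_confidence(agent_predictions: list[float]) -> tuple[float, float]:
    """Return (mean prediction, toy confidence) for a set of agent outputs.

    Hypothetical mapping, NOT the MAESTRO definition of c_pred.:
    confidence = 1 / (1 + population standard deviation of the agents).
    """
    spread = pstdev(agent_predictions)
    return mean(agent_predictions), 1.0 / (1.0 + spread)


if __name__ == "__main__":
    # Made-up agent outputs in the same unit as ΔΔG.
    print(consensus_with_confidence([-1.0, -1.1, -0.9]))  # agents agree closely
    print(consensus_with_confidence([-3.0, 0.0, 1.0]))    # agents disagree
```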

pH Value

The stability of a protein is affected by the surrounding solvent, especially by its pH value. Therefore, MAESTRO employs the pH associated with each ΔΔG value in the training data. For a prediction, the user can supply a pH value between 5.5 and 8.5 (the range covered by the training set). As shown in the plot below, a correct pH value (blue) improves the prediction accuracy. If the pH is unknown, the default pH=7.0 (gray) is still a good choice.

As mentioned before, the pH is restricted to values between 5.5 and 8.5. Its validity will be checked before a prediction starts. This validation depends on the HTML5 specification, which is still not fully supported by some browsers. In this case, the pH value is validated during the prediction and invalid values will be automatically set to 5.5 or 8.5, respectively.
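The fallback behavior described above amounts to clamping the value to the supported range. A minimal sketch, with the hypothetical helper name effective_ph and the assumption that an omitted value falls back to the default of 7.0:

```python
PH_MIN = 5.5
PH_MAX = 8.5
DEFAULT_PH = 7.0


def effective_ph(ph=None):
    """Return the pH that would be used for a prediction.

    Values below 5.5 are raised to 5.5, values above 8.5 are lowered to 8.5.
    Assumption: a missing value is replaced by the default of 7.0.
    """
    if ph is None:
        return DEFAULT_PH
    return min(max(ph, PH_MIN), PH_MAX)


if __name__ == "__main__":
    for value in (4.0, 6.3, 9.2, None):
        print(value, "->", effective_ph(value))
```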

Disulfide Bonds Prediction

For the disulfide bond prediction, MAESTRO was trained on a data set published by Masso et al., 2008 (SP3) instead of SP4/MP, as SP3 provides the best results for this application. The set includes 1925 stability change measurements in 55 different proteins. In short, for the disulfide scan all residue pairs of a protein with a C_β-C_β distance below 5 Å are considered as potential bonding partners. For these residue pairs, ΔΔG and geometry penalties are calculated and combined into the final disulfide score S_ss.
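The distance criterion of the disulfide scan can be sketched in a few lines. The code below only applies the 5 Å C_β-C_β cutoff to a dictionary of coordinates; any further geometric checks MAESTRO may apply, the ΔΔG evaluation and the score S_ss are not reproduced. The function name candidate_pairs, the residue keys and the coordinates are made up, and residues without a C_β atom (glycine) are assumed to be absent from the input.

```python
from itertools import combinations
from math import dist


def candidate_pairs(cb_coords: dict, cutoff: float = 5.0) -> list:
    """Return (residue_a, residue_b, distance) for all pairs closer than the cutoff.

    cb_coords maps a residue key, e.g. (chain, residue number), to its C-beta (x, y, z) in Å.
    """
    pairs = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(cb_coords.items()), 2):
        distance = dist(xyz_a, xyz_b)
        if distance < cutoff:
            pairs.append((res_a, res_b, distance))
    return pairs


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    coords = {
        ("A", 10): (0.0, 0.0, 0.0),
        ("A", 25): (3.0, 4.0, 0.0),   # exactly 5.0 Å from A10, not below the cutoff
        ("A", 57): (0.0, 0.0, 4.5),   # 4.5 Å from A10
    }
    for pair in candidate_pairs(coords):
        print(pair)
```

The all-against-all loop is quadratic in the number of residues, which is acceptable for a sketch; a spatial grid or k-d tree would be the usual choice for large complexes.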

The performance of the disulfide bond predictor was evaluated on a data set of 15 engineered disulfide bonds (SS2), published by Salam et al. The data set comprises 13 proteins without disulfide bonds in the wild type, for which variants with stabilizing disulfide bonds have been engineered. When performing a prediction on these proteins with MAESTRO, the average relative rank of the engineered SS-bonds was within the top 20% with respect to S_ss. In one third of the cases, the engineered SS-bond was within the top five predictions. For further results see here.
