DATA SETS
Data sets used in the study and detailed validation results. The respective PDB files are accessible as zip archives in the “Structures” column.
| Dataset | Size | Structures | Prediction | Validation | Source/Ref. | 
|---|---|---|---|---|---|
| SP1 | 2648 single point mutations | 131 | ΔΔG | 5-fold cross validation | Dehouck et al., 2009 | 
| SP2 | 350 single point mutations | 67 | ΔΔG | performance test | Dehouck et al., 2009 | 
| SP3 | 1925 single point mutations | 55 | ΔΔG | 20-fold cross validation | Masso et al., 2008; Capriotti et al., 2005 | 
| SP4 | 1765 single point mutations | 98 | ΔΔG | 10-fold cross validation | ProTherm DB | 
| MP | 479 multi point mutations | 57 | ΔΔG | 10-fold cross validation | ProTherm DB | 
| SS1 | 75 disulfide bonds | 75 | S-S bond | performance test | Salam et al., 2014 | 
| SS1 | minimized structures | 75 | S-S bond | performance test | (Salam et al., 2014)* |
| SS2 | 15 engineered disulfide bonds | 13 | S-S bond | performance test | Salam et al., 2014 | 
The sign convention for ΔΔG varies from data set to data set. We define a negative ΔΔG as an increase in protein stability and a positive ΔΔG as a destabilization, and we adapted all data sets to this definition.
*) The models (minimized structures) used in Salam et al., 2014 were not available. Therefore, we applied a comparable procedure and provide the resulting structures here.
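
The sign convention stated above (negative ΔΔG = stabilizing) can be applied mechanically when combining data from sources that use opposite conventions. The following is a minimal sketch, not the procedure used in the study: the function name `to_stabilizing_negative`, the flag `stabilizing_is_negative`, and the sample values are all hypothetical and chosen only for illustration.

```python
def to_stabilizing_negative(ddg_values, stabilizing_is_negative):
    """Return ddG values under the convention: negative = stabilizing.

    ddg_values: iterable of ddG values from one source data set.
    stabilizing_is_negative: True if the source already uses
        negative = stabilizing; False if it uses the opposite sign.
    (Both parameter names are assumptions made for this sketch.)
    """
    # Flip the sign only when the source uses the opposite convention.
    sign = 1.0 if stabilizing_is_negative else -1.0
    return [sign * value for value in ddg_values]


# Hypothetical values from a source where positive = stabilizing.
source_values = [1.5, -0.5, 2.0]
normalized = to_stabilizing_negative(source_values, stabilizing_is_negative=False)
```

Recording the convention of each source as an explicit flag, rather than flipping signs by hand, keeps the conversion auditable when several data sets are merged.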