DATA SETS
Data sets used in the study and detailed validation results. The respective PDB files are accessible as zip archives in the “Structures” column.
Dataset | Size | Structures | Prediction | Validation | Source/Ref. |
---|---|---|---|---|---|
SP1 | 2648 single point mutations | 131 | ΔΔG | 5-fold cross validation | Dehouck et al., 2009 |
SP2 | 350 single point mutations | 67 | ΔΔG | performance test | Dehouck et al., 2009 |
SP3 | 1925 single point mutations | 55 | ΔΔG | 20-fold cross validation | Masso et al., 2008; Capriotti et al., 2005 |
SP4 | 1765 single point mutations | 98 | ΔΔG | 10-fold cross validation | ProTherm DB |
MP | 479 multi point mutations | 57 | ΔΔG | 10-fold cross validation | ProTherm DB |
SS1 | 75 disulfide bonds | 75 | S-S bond | performance test | Salam et al., 2014 |
SS1 | minimized structures | 75 | S-S bond | performance test | (Salam et al., 2014)* |
SS2 | 15 engineered disulfide bonds | 13 | S-S bond | performance test | Salam et al., 2014 |
The meaning of the sign of ΔΔG varies from data set to data set. We define a negative ΔΔG as an increase in the stability of a protein and a positive ΔΔG as a destabilization, and we adapted all data sets to this definition.
*) The models (minimized structures) used in Salam et al., 2014 were not available. Therefore, we applied a comparable procedure and provide the resulting structures here.
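
The sign convention stated above (negative ΔΔG means stabilizing) amounts to a sign flip for any source data set that reports stabilizing mutations as positive. The following is a minimal Python sketch of that step, not the procedure actually used in the study. The function name `to_common_convention`, the flag `negative_is_stabilizing`, and the example values are hypothetical and are not taken from any of the listed data sets.

```python
def to_common_convention(ddg, negative_is_stabilizing):
    """Return ddg under the convention: negative = stabilizing.

    negative_is_stabilizing describes the SOURCE data set (an assumption
    the user must look up per data set): True if the source already treats
    negative values as stabilizing, False if it uses the opposite sign.
    """
    return ddg if negative_is_stabilizing else -ddg


# Hypothetical example values: (reported ddG, source convention flag)
raw = [(1.2, False), (-0.5, True)]
normalized = [to_common_convention(value, flag) for value, flag in raw]
```

After this step, every value in `normalized` can be read the same way regardless of which source it came from: a negative number indicates a stabilizing mutation.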