{"id":44,"date":"2014-04-15T13:15:01","date_gmt":"2014-04-15T13:15:01","guid":{"rendered":"http:\/\/pbwww.che.sbg.ac.at\/wordpress\/?page_id=44"},"modified":"2021-02-13T17:27:00","modified_gmt":"2021-02-13T16:27:00","slug":"reference-structural-alignments","status":"publish","type":"page","link":"https:\/\/pbwww.services.came.sbg.ac.at\/?page_id=44","title":{"rendered":"Reference Pairwise and Multiple Structural Alignments"},"content":{"rendered":"<p style=\"text-align: justify;\">Evaluation of sequence and structure alignment methods requires reference alignments. The data-sets provided here are such reference alignments. Some of these sets were used in &#8220;Comparative analysis of protein structure alignments&#8221; (Mayr et al. <a href=\"http:\/\/www.biomedcentral.com\/1472-6807\/7\/50\" target=\"_blank\" rel=\"noopener noreferrer\">[1]<\/a>), where we evaluated <i>pairwise<\/i> structure alignment methods according to their consistency and accuracy.<\/p>\n<p style=\"text-align: justify;\">Since that work, we have updated the data-sets used in <a href=\"http:\/\/www.biomedcentral.com\/1472-6807\/7\/50\" target=\"_blank\" rel=\"noopener noreferrer\">[1]<\/a> and added new data-sets for multiple alignments.<\/p>\n<p style=\"text-align: justify;\">There are three basic sets:<\/p>\n<menu>\n<li>The RIPC set contains protein pairs exhibiting very difficult structural relations, including <span style=\"color: #ff6600;\"><b id=\"emph\">r<\/b><\/span>epetitions, large <span style=\"color: #ff6600;\"><b id=\"emph\">I<\/b><\/span>nDels, circular <span style=\"color: #ff6600;\"><b id=\"emph\">p<\/b><\/span>ermutations and <span style=\"color: #ff6600;\"><b id=\"emph\">c<\/b><\/span>onformational variability.<\/li>\n<li>The SISY pairwise set contains protein pairs selected from the <a href=\"http:\/\/www.spice-3d.org\/sisyphus\/\" target=\"_blank\" rel=\"noopener noreferrer\">Sisyphus database<\/a>, which provides structural alignments for proteins with non-trivial 
relationships (Andreeva et al. <a href=\"http:\/\/nar.oxfordjournals.org\/content\/35\/suppl_1\/D253.full?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=&amp;fulltext=sisyphus&amp;searchid=1&amp;FIRSTINDEX=0&amp;resourcetype=HWCIT\" target=\"_blank\" rel=\"noopener noreferrer\">[3]<\/a>).<\/li>\n<li>The SISY multiple set contains protein families selected from the <a href=\"http:\/\/www.spice-3d.org\/sisyphus\/\" target=\"_blank\" rel=\"noopener noreferrer\">Sisyphus database<\/a>.<\/li>\n<\/menu>\n<p style=\"text-align: justify;\">The data-sets are up to date with PDB Nov. 2008, SCOP 1.73 and Sisyphus 1.3. We introduced an XML-based file format to specify the reference alignments. Since SCOP and Sisyphus may refer to older PDB entries, we mapped the chain IDs to PDB Nov. 2008. Additionally, we provide PDB-style files, which are referenced in the XML files. If you use the data-sets, you should use the PDB files provided here. For details specific to a particular set, please refer to the set-specific pages.<\/p>\n<p style=\"text-align: justify;\">The XML format is used for both pairwise and multiple alignments. Each alignment may in turn contain alternative solutions. Each alternative alignment is written in a row format. 
Below we show an excerpt of a case from the RIPC set:<\/p>\n<div style=\"background: none repeat scroll 0% 0% #264d3c; width: 700px; text-align: justify;\">\n<pre><span style=\"color: #c0c0c0;\">&lt;?xml version=\"1.0\"?&gt;\r\n&lt;multiple-alignment n=\"2\" altalg=\"1\"&gt;\r\n  &lt;description&gt;\r\n    &lt;source&gt;RIPC v 1.0&lt;\/source&gt;\r\n    &lt;aname&gt;d1an9a1-d1npx_1&lt;\/aname&gt;\r\n  &lt;\/description&gt;\r\n  &lt;members&gt;\r\n    &lt;member&gt;d1an9a1&lt;\/member&gt;\r\n    &lt;member&gt;d1npx_1&lt;\/member&gt;\r\n  &lt;\/members&gt;\r\n  &lt;alternative id=\"1\" eqr=\"11\"&gt;\r\n    &lt;mequivalences n=\"11\"&gt;\r\n       &lt;row&gt;&lt;meq&gt;   6 :I:A&lt;\/meq&gt;&lt;meq&gt;   6 :L: &lt;\/meq&gt;&lt;\/row&gt;\r\n       &lt;row&gt;&lt;meq&gt;  37 :D:A&lt;\/meq&gt;&lt;meq&gt;  33 :K: &lt;\/meq&gt;&lt;\/row&gt;\r\n       &lt;row&gt;&lt;meq&gt;  47 :V:A&lt;\/meq&gt;&lt;meq&gt;  41 :S: &lt;\/meq&gt;&lt;\/row&gt;\r\n       .\r\n       .\r\n       .\r\n    &lt;\/mequivalences&gt;\r\n  &lt;\/alternative&gt;\r\n&lt;\/multiple-alignment&gt;\r\n<\/span><\/pre>\n<\/div>\n<table id=\"ASOverview\" style=\"height: 541px;\" border=\"0\" width=\"701\" cellspacing=\"0\">\n<tbody>\n<tr>\n<td style=\"border: none;\" width=\"15%\"><strong>Entity<\/strong><\/td>\n<td style=\"border: none;\" width=\"85%\"><strong>Meaning<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"border: none;\">multiple-alignment<\/td>\n<td style=\"border: none;\">Contains the alignment of a given set of proteins. For the same set of proteins, alternative solutions may exist (see &lt;alternative&gt;). E.g. 
<tt>&lt;multiple-alignment n=\"2\" altalg=\"1\"&gt; <\/tt>is an alignment of two structures (<tt>n=\"2\"<\/tt>) with a single solution (<tt>altalg=\"1\"<\/tt>).<\/td>\n<\/tr>\n<tr>\n<td style=\"border: none;\">description<\/td>\n<td style=\"border: none;\">Contains general information about the alignment.<\/td>\n<\/tr>\n<tr>\n<td style=\"border: none;\">members<\/td>\n<td style=\"border: none;\">Lists the names of the proteins\/domains used.<\/td>\n<\/tr>\n<tr>\n<td style=\"border: none;\">alternative<\/td>\n<td style=\"border: none;\">Encloses one alternative alignment solution.<\/td>\n<\/tr>\n<tr>\n<td style=\"border: none;\">mequivalences<\/td>\n<td style=\"border: none;\">The alignments are stored in a row format. The attribute <tt>n<\/tt> gives the number of rows. In the example there are 11 rows (<tt>n=\"11\"<\/tt>), three of which are shown.<\/td>\n<\/tr>\n<tr>\n<td style=\"border: none;\">row<\/td>\n<td style=\"border: none;\">Each row consists of as many &lt;meq&gt; entities as there are members in the member section. The order from left to right corresponds to the top-to-bottom order of the molecules in the member section.<\/td>\n<\/tr>\n<tr>\n<td style=\"border: none;\">meq<\/td>\n<td style=\"border: none;\">Each &lt;meq&gt; contains three fields separated by colons ( : ), referring to PDB format ATOM\/HETATM records. The fields are: <i>&#8220;resSeq+iCode&#8221;<\/i>:<i>&#8220;residue type&#8221;<\/i>:<i>&#8220;chainId&#8221;<\/i>. E.g. <tt>&lt;meq&gt; 315 :G:A&lt;\/meq&gt;<\/tt> refers to a glycine residue at position 315 (with blank iCode) in chain A. The field resSeq+iCode corresponds exactly to columns 23-27 in PDB ATOM\/HETATM records. In the provided data sets only structurally equivalent residues are listed. If required, gaps can easily be coded as <tt>&lt;meq&gt;---------&lt;\/meq&gt;<\/tt>.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify;\">We try to improve and extend the data sets and these web pages. 
Changes in the data or new versions will be reported <a title=\"RSA Changes\" href=\"pbwww.services.came.sbg.ac.at\/?page_id=181\">here<\/a>. Feedback is highly appreciated. If you use the data sets, please cite <a href=\"http:\/\/www.biomedcentral.com\/1472-6807\/7\/50\" target=\"_blank\" rel=\"noopener noreferrer\">[1]<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Evaluation of sequence and structure alignment methods requires reference alignments. Data-sets provided here represent such reference alignments. Part of these sets have been used in &#8220;Comparative analysis of protein structure alignments&#8221; (Mayr et al. [1]), where we have evaluated pairwise structure alignment methods according to their consistency and accuracy. Since this work we have updated the &hellip; <a href=\"https:\/\/pbwww.services.came.sbg.ac.at\/?page_id=44\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Reference Pairwise and Multiple Structural Alignments<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/pbwww.services.came.sbg.ac.at\/index.php?rest_route=\/wp\/v2\/pages\/44"}],"collection":[{"href":"https:\/\/pbwww.services.came.sbg.ac.at\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/pbwww.services.came.sbg.ac.at\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/pbwww.services.came.sbg.ac.at\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/pbwww.services.came.sbg.ac.at\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=44"}],"version-history":[{"count":14,"href":"https:\/\/pbwww.services.came.sbg.ac.at\/index.php?rest_route=\/wp\/v2\/pages\/44\/revisions"}],"predecessor-version":[{"id":793,"href":"https:\/\/pbwww.services.came.sbg.ac.at\/index.php?re
st_route=\/wp\/v2\/pages\/44\/revisions\/793"}],"wp:attachment":[{"href":"https:\/\/pbwww.services.came.sbg.ac.at\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=44"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}