Supplementary Data for
Stone and Sidow, 2005. Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Research 15:978-986
Each protein considered in the manuscript names a folder that contains its data and analysis. Files common to most folders are tabulated below.
Protein_Alignment.fa The alignment in FASTA format.
The _Data.txt files are tab-delimited and come in one of two styles, depending on whether each row corresponds to an individual mutation or to one position in the protein. In the per-mutation style, each row has three entries: the first gives the position of the mutation, the second gives the lexicographic index of the mutation among the letters A through Y (A = 1, C = 3, etc.), and the third gives the code for the reported phenotype (see _Predictions.txt for decoding). In the per-position style, each row has 25 entries, one for each of the letters A through Y. For example, the first entry of a row records the coded phenotype of an (A)lanine substitution at the position that the row defines (if any, see _Data.txt). See the manuscript for appropriate references to each dataset.
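The two _Data.txt styles described above can be read with a short script. The following Python is only a minimal sketch and is not part of the original analysis. The names index_to_letter and parse_data_lines are hypothetical. It assumes that, in the per-position style, the order of rows gives the position (first row = position 1), and it keeps every phenotype code as a raw string because the codes are defined in _Predictions.txt. How a missing substitution is encoded is not stated here, so no entries are filtered out.

```python
import string

# Letters A through Y; lexicographic index 1 = A, 2 = B, 3 = C, ..., 25 = Y.
LETTERS = string.ascii_uppercase[:25]


def index_to_letter(index):
    """Map a 1-based lexicographic index to its letter (1 -> 'A', 3 -> 'C')."""
    if not 1 <= index <= len(LETTERS):
        raise ValueError(f"index out of range: {index}")
    return LETTERS[index - 1]


def parse_data_lines(lines):
    """Parse rows of a _Data.txt file in either style.

    Returns a list of (position, letter, phenotype_code) tuples.
    Phenotype codes are returned as raw strings, undecoded.
    """
    records = []
    row_number = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            continue  # skip completely empty lines
        row_number += 1
        fields = line.split("\t")
        if len(fields) == 3:
            # Per-mutation style: position, letter index, phenotype code.
            position, index, code = fields
            records.append((int(position), index_to_letter(int(index)), code))
        elif len(fields) == 25:
            # Per-position style: one entry per letter A through Y.
            # Assumption: the row order gives the position (1-based).
            for letter, code in zip(LETTERS, fields):
                records.append((row_number, letter, code))
        else:
            raise ValueError(
                f"row {row_number}: expected 3 or 25 entries, got {len(fields)}"
            )
    return records
```

To use it on a file, pass an open file handle, for example parse_data_lines(open(path)) where path points to one of the _Data.txt files. Returning the same tuple shape for both styles lets later code ignore which style a given protein folder uses.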