AliquotG: an improved heuristic algorithm for genome aliquoting.
Zelin Chen†, Shengfeng Huang†, Yuxin Li, Anlong Xu*

State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Pharmaceutical Functional Genes, College of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, People’s Republic of China

Abstract

An extant genome can be the descendant of an ancient polyploid genome. The genome aliquoting problem is to reconstruct the latter from the former such that the rearrangement distance (i.e., the number of genome rearrangements necessary to transform the former into the latter) is minimal. Though several heuristic algorithms have been published, here, we sought improved algorithms for the problem with respect to the double cut and join (DCJ) distance. The new algorithm makes use of partial and contracted partial graphs, and locally minimizes the distance. Our test results with simulation data indicate that it reliably recovers gene order of the ancestral polyploid genome even when the ancestor is ancient. We also compared the performance of our method with an earlier method using simulation data sets and found that our algorithm has higher accuracy. It is known that vertebrates had undergone two rounds of whole-genome duplication (2R-WGD) during early vertebrate evolution. We used the new algorithm to calculate the DCJ distance between three modern vertebrate genomes and their 2R-WGD ancestor and found that the rearrangement rate might have slowed down significantly since the 2R-WGD. The software AliquotG implementing the algorithm is available as an open-source package from our website (http://mosas.sysu.edu.cn/genome/download_softwares.php).

Citation: Chen Z, Huang S, Li Y, Xu A (2013) AliquotG: An Improved Heuristic Algorithm for Genome Aliquoting. PLoS ONE 8(5): e64279. doi:10.1371/journal.pone.0064279

Editor: Marc Robinson-Rechavi, University of Lausanne, Switzerland

Received July 5, 2012; Accepted April 15, 2013; Published May 14, 2013

Copyright: © 2013 Chen et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported by project 31171193 (NSF), project 2008AA092601 (863), project 2011CB946101 (973), and projects from the Commission of Science and Technology of Guangdong Province and Guangzhou City. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: email@example.com

† These authors contributed equally to this work.

Some researchers have suggested that WGDs might increase rearrangement rates [13–17], whereas others have proposed the opposite. Despite the debate regarding whether polyploidization complicates the rearrangement process, polyploidization has posed challenges for the computation of rearrangement distance. One of the challenges is the genome aliquoting problem, which is to find the minimum number of rearrangement events required to transform a rearranged r-way polyploidized genome into a non-rearranged form and reconstruct the genome of the latter. The problem with r = 2 represents a special case called genome halving, which has been addressed several times, and for which a linear-time exact algorithm with respect to DCJ distance is available [20–23]. The more generalized case of the problem with r > 2 has also been studied [19,24,25], including the heuristic algorithm developed by Warren and Sankoff.
Their algorithm relies on the occurrence number of gene-gene adjacencies in the extant genome, which is an effective characteristic when the rearrangement distance is small and many gene-gene adjacencies retain more than two copies. However, in reality, many gene-gene adjacencies could appear only once and hence affect the precision of the algorithm. In this study, we introduce a new heuristic algorithm for the genome aliquoting problem with respect to DCJ distance. Our algorithm considers not only the multiplicities of edges in the CPG but also the number of black cycles and their local costs (i.e., Np

Introduction

Whole genome sequencing projects permit easy and accurate detection of genome rearrangement events by direct comparison of two genome sequences. To measure these events, Sankoff proposed the use of the edit distance in 1992, which is defined as the minimum number of rearrangement events necessary to transform one genome into another. Several types of edit distance have been proposed, including the reversal distance and the double cut and join (DCJ) distance. Pevzner et al. introduced reversal distance and developed a breakpoint graph-based, linear-time exact algorithm for computation [2–5]. Later, Yancopoulos et al. proposed the DCJ distance and its corresponding efficient calculation. DCJ differs from other edit distances in that it includes chromosomal fusion, fission, inversion, translocation and block interchange within a single model and allows simpler algorithms for calculation. In many species, such as in vertebrates [7,8], paramecia, yeasts and many plants, the extant rearranged genome descended from an ancient form that underwent r-way polyploidization (or whole genome duplication, WGD, where r-way indicates that polyploidization generated r copies of the original genome).
For example, two almost consecutive rounds of WGD (2R-WGD), or 4-way polyploidization, occurred upon the origin of vertebrates and are suggested to be responsible for the dramatic increase in the morphological complexity of vertebrates.

PLOS ONE | www.plosone.org 1 May 2013 | Volume 8 | Issue 5 | e64279 AliquotG

Figure 1. Representation of a genome, a PG and a CPG. (A) A rearranged duplicated genome with duplicated size of 3 is represented as a sequence of signed integers, where a positive (negative) sign is represented by the direction of the colored arrow. (B) The same genome is represented as a sequence of extremities (i.e., heads, tails or cap genes). (C) PG of the above genome. Each non-cap gene is cut into head and tail, which become two vertices in the partial graph. (D) The PG in (C) with its vertices repositioned. (E) The contracted PG that is converted from the PG shown in (C) and (D). Each vertex corresponds to an extremity family. Numbers on each edge indicate the copy IDs (i.e., subscripts) of the two extremities connected by the corresponding adjacency. Note that the edge (1h, 2t) in (E) corresponds to 2 adjacencies (or edges) in (C), so its multiplicity is 2. doi:10.1371/journal.pone.0064279.g001

paper, a gene family is represented by an unsigned integer that is called a (gene) family ID, while a gene is represented by its signed family ID with an integer subscript, where the sign (positive or negative) indicates the direction of the gene in the genome, while the subscript, named copy ID, distinguishes different genes in the same family. We represent a genome as a set of linear chromosomes, in which each chromosome is represented by a sequence of genes (Figure 1A) beginning and ending with two special genes called cap genes. All cap genes form the cap family with family ID ‘0’.
We can also represent a genome by replacing each non-cap gene +g_i with an ordered pair, g_i^t g_i^h, and by replacing −g_i with g_i^h g_i^t (Figure 1B), where g_i^h and g_i^t are called the head (ID) and tail (ID) of the gene g_i, respectively, and heads, tails and cap genes are all called extremities (extremity IDs). The heads or tails of genes within the same gene family also form a head family or tail family, both of which (along with the cap family) are called extremity families (with the IDs g^t and g^h, for example). The connection between two adjacent genes is defined as an adjacency. Moreover, we define three special genomes: (1) single copy genome, in which the size of each non-cap gene family is exactly one; (2) perfectly duplicated genome, which represents multiple copies of the single copy genome, and may descend from a single copy genome through WGD events; (3) rearranged duplicated genome, in which all non-cap gene families have exactly the same size, named duplicated size. To describe the genome aliquoting problem, a rearrangement operation (or model) is needed. Here we use the DCJ operation. Each DCJ operation cuts two adjacencies and rejoins the four free extremities differently to form two new adjacencies. If both of the original adjacencies connect a non-cap gene and a cap gene, and the two cap genes are from different chromosomes, the operation fuses the two chromosomes; and if one adjacency connects two cap genes and the other connects two non-cap genes, the operation breaks a chromosome into two. The DCJ distance

and Cp in Methods), thus it can handle weak adjacencies. This algorithm performs well on extensive simulation data sets and is better than previous methods. It has been successfully applied to the 2R-WGD of vertebrates.

Methods

Problem Description

Warren and Sankoff have introduced the genome aliquoting problem. Here, we revisit the problem and present the definitions and notations that are commonly used.
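The extremity representation and the DCJ operation described above can be illustrated with a minimal sketch. This is not the AliquotG implementation: the names `to_extremities`, `adjacencies` and `dcj` are hypothetical, and copy IDs are omitted on the assumption that every family has a single copy.

```python
from typing import List, Tuple

CAP = "0"  # cap family ID, as in the text

Adjacency = Tuple[str, str]


def to_extremities(chromosome: List[int]) -> List[str]:
    """Expand signed family IDs into extremities, adding a cap gene at each end.

    +g becomes (tail, head) = "gt", "gh"; -g becomes "gh", "gt".
    """
    ext = [CAP]
    for gene in chromosome:
        fam = abs(gene)
        ext += [f"{fam}t", f"{fam}h"] if gene > 0 else [f"{fam}h", f"{fam}t"]
    ext.append(CAP)
    return ext


def adjacencies(ext: List[str]) -> List[Adjacency]:
    """Pair each extremity with the facing extremity of the neighbouring gene."""
    # Positions 0-1, 2-3, ... are the connections between adjacent genes.
    return [(ext[i], ext[i + 1]) for i in range(0, len(ext), 2)]


def dcj(adjs: List[Adjacency], i: int, j: int, cross: bool = False) -> List[Adjacency]:
    """Cut adjacencies i and j and rejoin the four free extremities differently.

    (a, b), (c, d) become (a, d), (c, b) by default, or (a, c), (b, d) if cross.
    """
    (a, b), (c, d) = adjs[i], adjs[j]
    rejoined = [(a, c), (b, d)] if cross else [(a, d), (c, b)]
    kept = [adj for k, adj in enumerate(adjs) if k not in (i, j)]
    return kept + rejoined


# One chromosome +1 -2 +3; a DCJ on its two inner adjacencies inverts gene 2.
before = adjacencies(to_extremities([1, -2, 3]))
after = dcj(before, 1, 2, cross=True)
```

In this example the adjacencies in `after` are the same set as those of the chromosome +1 +2 +3, so this particular DCJ acts as an inversion.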
The fundamental element in a genome is the gene, and several duplicated (or similar) genes form a gene family. The number of genes in a gene family represents the size of the family. In this

Figure 2. Transforming paths to black cycles. Vertices u, v are matched. The upper path is (u, a, b, c, d, e, f, v), and the bottom path is (u, g, h, e, d, i, j, k, v). Four colored subpaths on the left are contracted into four edges in the right cycles, respectively. An odd path is transformed into an even cycle, while an even path is transformed into an odd cycle. Solid and dashed edges correspond to black and gray edges, respectively. doi:10.1371/journal.pone.0064279.g002

Figure 3. Adjacency inference in H (i.e., gray dashed edges in graph B) from matched vertices u, v in graph A. Vertices a, b are matched vertices that have not been contracted. Vertices c, d are unmatched. Black solid edges are derived from PG(Gobs), CPG(Gobs) or contracted edges. The four different paths found by find_path(u, v) are as follows: (1) (u, v) (green), (2) (u, a, b, v) (blue), (3) (u, b, a, c, v) (orange), and (4) (u, d, v) (pink). In graph A, the first two paths are odd and form two even cycles, (u, v, u) and (u, a, b, v, u), by adding the gray edge (u, v) in the top right panel. The former disappears after contraction, while the latter generates a new black edge (a, b) in the bottom left panel. The last two paths are merged by two same gray edges (u, v) to form an even cycle, (u, b, a, c, v, u, d, v, u), that is contracted and generates two new black edges, (b, d) and (c, d). Two numbers on each edge of graph A indicate the copy IDs of the two corresponding vertices in graph B. doi:10.1371/journal.pone.0064279.g003

between two genomes is defined as the minimum number of DCJ operations necessary to transform one genome into the other. Now, the genome aliquoting problem is defined as follows.
Given a rearranged duplicated genome (Gobs, ‘obs’ stands for observed) with duplicated size r (≥ 2), reconstruct a perfectly duplicated genome (Gdup) such that the DCJ distance between them is minimal. Additionally, we can also obtain the corresponding single copy genome (Ganc) of Gdup. It remains an open question whether an efficient exact solution exists for the genome aliquoting problem with respect to DCJ distance. Here, we present a new heuristic algorithm to solve the problem.

vertex, identified by the corresponding family ID, and (2) each original edge, taking (1h_3, 2t_2) for example, is transformed into a new edge (1h, 2t) with multiplicity 1. In this latter step, if the edge already exists then its multiplicity is increased by 1 each time. Any edge with multiplicity 0 is removed. The new graph is called the contracted partial graph (CPG). Figure panels 1C to 1E show an example of the process transforming a PG into a CPG. We call an edge in CPG(Gobs) (i.e., the CPG of Gobs) strong if its multiplicity is greater than one; otherwise the edge is called weak. The corresponding adjacency of a strong (weak) edge is also called a strong (weak) adjacency. Because every edge in CPG(Gdup) must have multiplicity r (i.e., the duplicated size of Gobs), a strong edge in CPG(Gobs) (multiplicity > 1) is more likely than a weak edge to represent a true original edge in CPG(Gdup). We initialize graph A as CPG(Gobs) and graph B as PG(Gobs) (i.e., the PG of Gobs), with all edges colored black. The solution iterates the following first two steps and then goes to Step 3.

Solutions

Our solution makes use of the partial graph (PG). A PG of a genome is a graph in which each vertex corresponds to an extremity of the genome, and each edge corresponds to an adjacency (Figure 1C). All of the above concepts regarding extremities also apply to vertices, e.g., the subscript of a vertex is also called the ‘copyID’ of the vertex.
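The contraction of a PG into a CPG, and the strong/weak classification of its edges, can be sketched as below. This is an illustrative sketch only: the vertex encoding (extremity family, copy ID) and the names `contract` and `split_strong_weak` are assumptions, not part of the AliquotG software.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Vertex = Tuple[str, int]        # (extremity family, copy ID), e.g. ("1h", 3)
FamilyEdge = Tuple[str, str]    # unordered pair of extremity families, stored sorted


def contract(pg_edges: Iterable[Tuple[Vertex, Vertex]]) -> Dict[FamilyEdge, int]:
    """Contract a partial graph: merge all copies of a family into one vertex.

    Every PG edge adds 1 to the multiplicity of the matching CPG edge.
    """
    cpg: Counter = Counter()
    for (fam_u, _copy_u), (fam_v, _copy_v) in pg_edges:
        cpg[tuple(sorted((fam_u, fam_v)))] += 1
    return dict(cpg)


def split_strong_weak(cpg: Dict[FamilyEdge, int]):
    """Strong edges have multiplicity > 1; weak edges have multiplicity 1."""
    strong = {edge: m for edge, m in cpg.items() if m > 1}
    weak = {edge: m for edge, m in cpg.items() if m == 1}
    return strong, weak


# Three PG edges: two copies of the adjacency (1h, 2t) and one of (1h, 3t).
pg = [(("1h", 3), ("2t", 2)), (("2t", 1), ("1h", 1)), (("1h", 2), ("3t", 1))]
cpg = contract(pg)
strong, weak = split_strong_weak(cpg)
```

Here the edge (1h, 2t) ends up with multiplicity 2 and is strong, while (1h, 3t) has multiplicity 1 and is weak.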
PGs of two genomes with the same gene content can form a bicolor-edge graph (i.e., breakpoint graph), one color for each genome, which is used to calculate the DCJ distance between the two genomes. Minimizing DCJ distance is the same as maximizing the number of alternating (color) cycles in the bicolor graph. A PG can be transformed into a new graph through the following two steps: (1) contract all vertices of the same extremity family into a single new

Step 1: Infer Strong Adjacencies

Taking multiplicity as edge weight and applying the maximum weight-matching algorithm to all strong edges of graph A, we first obtain a matching, which contains a set of pairwise non-adjacent strong edges. For each edge (u, v) in the matching, we add a gray edge (u, v) with the multiplicity r (i.e., the duplicated size of Gobs) into graph A.

Figure 4. Reconstructing Gdup from the genome (Gobs) of Figure 1A. This figure shows how graph A changes during the solution. In this simple example, step 2 is not necessary, but here we still calculate the weights Np, Cp and use step 2 to infer the last two adjacencies (3h_1, 4t_3), (3h_3, 4t_2), (3h_2, 4t_1), (1t_1, 4h_1), (1t_2, 4h_2) and (1t_3, 4h_3) of Gdup (or H) ahead of linearizing circular chromosomes, to show how step 2 works. Different vertex colors indicate different gene families. Edges with multiplicity k are represented as k edges for a clear display. The numbers on each edge are the copy IDs of the two vertices incident to the corresponding edge in graph B. Blue numbers are the weights Np, Cp for the pair of vertices linked by the gray dotted edge. doi:10.1371/journal.pone.0064279.g004

Secondly, for each (strong) edge (u, v) in the matching, we add r pairwise non-adjacent gray edges (u_i, v_j) into graph B. To know which u_i (i = 1, …, r) and v_j (j = 1, …, r) should be paired, we use the following local greedy method.
We search for a set of paths from u to v in graph A with minimum total length (in terms of number of edges) among all path sets that satisfy the following four conditions: (1)

Figure 5. Application of the heuristic algorithm to four simulation datasets. Each light blue point in each panel corresponds to a data point. Green line: x = y. Inner axis labels represent DCJ distance, whereas outer labels show relative DCJ distance. The Y value of the dark blue dot is 0.1 (relative DCJ distance) in each plot, where n is the number of gene families and r is the duplicated size of Gobs, and the blue numbers (DCJ distance and relative DCJ distance) below each blue circle represent the corresponding X values. Note that the inferred and simulated distances are almost the same when the relative simulated distance is smaller than 0.4. The distance is less than 0.1, until the relative simulated distance increases to approximately 0.6. The relative DCJ distance between simulated and inferred Gdup is small when the simulated distance is smaller than 0.25 (r = 3) or 0.33 (r = 4). doi:10.1371/journal.pone.0064279.g005

incident to u_2 and another black edge (v_1, b_2) incident to v_1. These three edges (two black edges and one gray edge) are contracted into a new black edge (a_3, b_2), and the multiplicity of the edge (a, b) is increased by one in graph A (Figure 3, bottom left panel, thick edge). The new graph B is contracted into a new (contracted partial) graph A. Finally, we add all adjacencies (u_i, v_j) into genome H (which is empty initially), corresponding to the above gray edges (u_i, v_j) in graph B.
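The first part of Step 1 (choosing pairwise non-adjacent strong edges of maximum total multiplicity, then adding a gray edge of multiplicity r for each) can be sketched as follows. The text does not say which maximum weight-matching algorithm is used, so this sketch substitutes a plain exhaustive search that is only practical for very small graphs; the names `max_weight_matching` and `infer_strong_adjacencies` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]


def max_weight_matching(edges: Dict[Edge, int]) -> Tuple[int, List[Edge]]:
    """Exhaustive maximum weight matching (exponential time; tiny inputs only)."""
    items = sorted(edges.items())

    def best(k: int, used: FrozenSet[str]) -> Tuple[int, List[Edge]]:
        if k == len(items):
            return 0, []
        skip_w, skip_m = best(k + 1, used)          # option 1: leave edge k out
        (u, v), w = items[k]
        if u == v or u in used or v in used:        # edge k conflicts with the matching
            return skip_w, skip_m
        take_w, take_m = best(k + 1, used | {u, v})  # option 2: put edge k in
        if take_w + w > skip_w:
            return take_w + w, [(u, v)] + take_m
        return skip_w, skip_m

    return best(0, frozenset())


def infer_strong_adjacencies(cpg: Dict[Edge, int], r: int) -> Dict[Edge, int]:
    """Match on strong edges (multiplicity > 1) and return gray edges of multiplicity r."""
    strong = {edge: m for edge, m in cpg.items() if m > 1}
    _weight, matching = max_weight_matching(strong)
    return {edge: r for edge in matching}


# Toy CPG with r = 3: three strong edges forming a path, plus one weak edge.
cpg = {("1h", "2t"): 3, ("2t", "3h"): 2, ("3h", "4t"): 3, ("4h", "5t"): 1}
gray = infer_strong_adjacencies(cpg, r=3)
```

In the toy graph the middle edge (2t, 3h) is dropped because it shares a vertex with both heavier edges, and the weak edge is ignored at this stage.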
each path in the set must start and end with black edges; (2) each path must enter and leave a matched vertex (i.e., a vertex in the matching) through two edges with different colors; (3) an edge cannot be traversed more times than its multiplicity; (4) each unmatched vertex (i.e., a vertex not in the matching) can be visited at most once by at most one path in the set. This routine is declared as find_path(u, v) and is implemented using a modified Suurballe’s algorithm. Each path in the set and the gray edge (u, v) will form a cycle whose length is the length of the path plus one, so an odd path will form an even cycle and vice versa. Because of path conditions (1) and (2), any gray edge in the path is contained in some alternating black-gray subpath whose first and last edges are black; therefore each subpath (e.g., (c, d, e, f) in Figure 2) can be contracted into a single black edge (e.g., (c, f) in Figure 2). Therefore, all gray edges are removed, and all edges in the cycle will be black after being contracted

with probable Meniere’s were selected among patients referred to the outpatient clinic of Amir-Aalam hospital between the years 2012 and 2015. Ninety-one healthy controls were selected from normal healthy subjects from the same region. The control group underwent history analysis and a physical exam to confirm their health status. We matched our control group for sex and age with both groups. Audiological, vestibular, and functional status in each patient was evaluated based on the American Academy of Otolaryngology-Head and Neck Surgery guidelines (14). The diagnosis was confirmed in all patients by ruling out the presence of anatomic lesions using magnetic resonance imaging (MRI). All the participants were requested to sign written informed consent. Patients did not pay for the complementary tests and their identity remained hidden throughout the study.
HLA typing

After taking 5 cc of blood from each patient, samples were collected in EDTA tubes with anticoagulating agent stored at 20°C. The DNA was extracted using the salting-out method and HLA-Cw typing was performed using sequence-specific primers (SSP) polymerase chain reaction [DynalAllset+TM SSP Kit] (15).

Statistical analysis

HLA-Cw allele frequencies were calculated for each group and were compared using Pearson’s chi-square test and Fisher’s exact test. The significance of association between HLA-Cw alleles and Meniere’s disease was estimated by odds ratio (OR) and 95% confidence intervals (95% CI) using the Stata software (version 8). P values less than 0.05 were considered to be statistically significant.

262 Iranian Journal of Otorhinolaryngology, Vol.28(4), Serial No.87, Jul 2016 HLA-CW Alleles in Meniere disease

Results

The clinical features of the patients are shown in Table 1. HLA-Cw allele frequencies were determined in all patients. The distributions of HLA-Cw frequencies were compared among different groups. There was a statistically significant difference in the frequency of HLA-Cw*04 in patients with definite or probable Meniere’s disease (18%) and healthy controls (1%) (P=0.001) (Table 2).

Table 1: Clinical characteristics of the patients.

Characteristic               Definite MD*            Probable MD             Control
Sex (M/F)                    13/10                   17/7                    51/40
Mean age                     34.96±13.047 (13-71)    41.04±16.815 (15-76)    38.53±10.513 (14-78)
Onset time (years)           3.74±3.107 (1-12)       7.63±7.033 (1-24)       N/A
Unilateral/bilateral         21/1                    19/5                    N/A
Level of hearing: Mild       13                      9                       N/A
Level of hearing: Moderate   8                       5                       N/A
Level of hearing: Severe     2                       9                       N/A
Total number of patients     23                      24                      91

*Meniere’s Disease

Table 2: Distribution of HLA-Cw alleles in patients and controls.

HLA-C allele   MD* patients (n=47) (%)#   Control group (n=91) (%)#   P value    OR (CI 95%)
Cw*01          2 (2.1%)                   7 (3.8%)                    0.72       0.543 (0.111-2.669)
Cw*02          0 (0%)                     7 (3.8%)                    0.09       -
Cw*03          2 (2.1%)                   9 (4.9%)                    0.34       0.418 (0.088-1.975)
Cw*04          17 (18.1%)                 2 (1.1%)                    <0.0001    19.870 (4.481-88.101)
Cw*06          10 (10.6%)                 33 (18.1%)                  0.10       0.538 (0.252-1.145)
Cw*07          13 (13.8%)                 25 (13.7%)                  0.98       1.008 (0.490-2.074)
Cw*08          4 (4.3%)                   4 (2.2%)                    0.45       1.978 (0.483-8.092)
Cw*12          18 (19.1%)                 43 (23.6%)                  0.39       0.766 (0.413-1.419)
Cw*13          0 (0%)                     3 (1.6%)                    0.55       -
Cw*14          7 (7.4%)                   17 (9.3%)                   0.59       0.781 (0.312-1.955)
Cw*15          1 (1.1%)                   0 (0%)                      0.34       -
Cw*16          17 (18.1%)                 9 (4.9%)                    <0.0001    4.244 (1.811-9.943)
Cw*17          3 (3.2%)                   7 (3.8%)                    1.00       0.824 (0.208-3.263)
Cw*18          0 (0%)                     16 (8.8%)                   0.002      -
Total          94 (100%)                  182 (100%)

*Meniere’s disease

HLA-Cw*16 allele frequency was also significantly higher in patients with definite or probable Meniere’s disease compared to the controls [P: 0.0001, OR: 4.244, 95% CI (1.81-9.94)] (Table 2). On the other hand, HLA-Cw*18 allele frequency was significantly decreased in patients compared to the control group (P=0.002) (Table 2). Frequencies of HLA-Cw alleles showed significant differences between the probable Meniere’s disease and definite Meniere’s disease groups. There was an increase in the frequency of HLA-Cw*12 in the probable Meniere’s disease group compared to the definite Meniere’s disease group [P=0.046, OR: 0.328, 95% CI (0.107-1.012)] (Table 3).

Iranian Journal of Otorhinolaryngology, Vol.28(4), Serial No.87, Jul 2016 263 Dabiri S, et al

Table 3: Distribution of HLA-Cw alleles in definite MD, probable MD patients, and controls.

HLA-C allele   Definite MD* (n=23) (%)#   Probable MD* (n=24) (%)#   Control (n=91) (%)#   P (definite vs. probable)   P (definite vs. control)   P (probable vs. control)
Cw*01          2 (4.3%)                   0 (0%)                     7 (3.8%)              0.23                        >0.99                      0.35
Cw*02          0 (0%)                     0 (0%)                     7 (3.8%)              N/A                         0.35ǂ                      0.35
Cw*03          1 (2.2%)                   1 (2.1%)                   9 (4.9%)              >0.99ǂ                      0.69ǂ                      0.69ǂ
Cw*04          8 (17.4%)                  9 (18.8%)                  2 (1.1%)              0.86                        10^-4                      10^-4
Cw*06          7 (15.2%)                  3 (6.3%)                   33 (18.1%)            0.15                        0.64                       0.46
Cw*07          8 (17.4%)                  5 (10.4%)                  25 (13.7%)            0.32                        0.52                       0.54
Cw*08          0 (0%)                     4 (8.3%)                   4 (2.2%)              0.11ǂ                       0.58ǂ                      0.06ǂ
Cw*12          5 (10.9%)                  13 (27.1%)                 43 (23.6%)            0.04                        0.05                       0.62
Cw*13          0 (0%)                     0 (0%)                     3 (1.6%)              N/A                         >0.99ǂ                     >0.99ǂ
Cw*14          3 (6.5%)                   4 (8.3%)                   17 (9.3%)             >0.99                       0.77                       >0.99
Cw*15          1 (2.2%)                   0 (0%)                     0 (0%)                0.48ǂ                       0.48ǂ                      N/A
Cw*16          9 (19.6%)                  8 (16.7%)                  9 (4.9%)              0.71                        0.001                      >0.99
Cw*17          2 (4.3%)                   1 (2.1%)                   7 (3.8%)              0.61ǂ                       >0.99ǂ                     0.02ǂ
Cw*18          0 (0%)                     0 (0%)                     16 (8.8%)             N/A                         0.04ǂ                      0.06ǂ
Total#         46 (100%)                  48 (100%)                  182 (100%)

*Meniere’s Disease. ǂ Analyzed with Fisher’s exact test; others were analyzed using the chi-square test. # The number of alleles is twice the number of patients, as each person has two alleles, one maternal and one paternal.

Frequencies of HLA-Cw*04 (P<0.001) and HLA-Cw*16 (P=0.001) were found to be significantly higher in definite Meniere’s disease patients compared to the control subjects. However, the allele frequency of HLA-Cw*18 was significantly lower in definite Meniere’s disease compared to the controls (P=0.047) (Table 3).

Discussion

The frequency of HLA-Cw alleles in three groups (definite Meniere’s disease patients, probable Meniere’s disease patients, and healthy controls) has been appraised in this study. A significant association of the HLA-Cw*04 and HLA-Cw*16 alleles with both probable and definite Meniere’s disease was observed in our patients. This was in agreement with the study of Koyama and colleagues, in which they found a significant association between HLA-Cw4 and HLA-DRB1*1602 in a Japanese sample (16). Khorsandi et al. also found that HLA-Cw4 is found more often in Iranian definite Meniere’s disease patients (13). The frequency of the HLA-Cw*18 allele was significantly higher in the control group.
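The odds ratios in Table 2 can be checked directly from the allele counts. The sketch below assumes the standard Woolf (log-odds) approximation for the 95% CI; the text states only that Stata was used, so the interval method is an assumption, although it closely reproduces the values reported for HLA-Cw*04. The function name `odds_ratio_ci` is illustrative.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Woolf (log-odds) confidence interval.

    a, b: alleles of interest / all other alleles in patients
    c, d: alleles of interest / all other alleles in controls
    All four counts must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# HLA-Cw*04: 17 of 94 patient alleles versus 2 of 182 control alleles.
or_cw04, lo_cw04, hi_cw04 = odds_ratio_ci(17, 94 - 17, 2, 182 - 2)
```

Alleles with a zero count in either group (for example HLA-Cw*18) have no finite odds ratio under this formula, which is consistent with Table 2 reporting no OR for them.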
HLA-Cw*12 allele frequency was significantly higher in the probable Meniere’s disease group compared to the patients with definite Meniere’s disease. The observed association of HLA-Cw alleles in our patients suggests an autoimmune mechanism in both definite and probable Meniere’s disease. This is in agreement with Greco’s study, which discussed Meniere’s autoimmunity in the literature (4). A mechanism-based approach to the disease diagnosis and treatment warrants more accurate diagnostic methods and more straightforward and powerful treatment approaches. Earlier studies have identified an association between the HLA-Cw7 antigen and Meniere’s disease in a British population (17). A significant increase in the frequency of HLA-Cw7 alleles in patients with Meniere’s disease was observed compared to other inner ear diseases and healthy subjects (18). Some studies have found an association between HLA-DRB1 alleles and Meniere’s disease. However, the results have not been replicated in other populations (19,20). Yeo et al. found that the frequencies of the HLA-Cw*0303 and HLA-Cw*0602 alleles were significantly increased in Korean patients with MD compared to healthy controls; on the other hand, HLA-B44 and HLA-Cw*0102 were significantly decreased in their patients (19). This can support the idea of an association between probable Meniere’s disease and the HLA alleles. In another study on 80 Mediterranean patients with bilateral Meniere’s disease, an association between HLA-DRB1*1101 and bilateral Meniere’s disease was observed, but there was no association between HLA-Cw and bilateral Meniere’s disease (20). The number of patients with bilateral Meniere’s disease was very low in our studied population.
The discrepancies in results are probably due to ethnic or geographic background, or might result from differences in the phenotypic or clinical manifestations of patients in different studies, which require careful interpretation. To the best of our knowledge, this is the first study which compares HLA-Cw allele frequencies in definite Meniere’s disease and probable Meniere’s disease patients in an Iranian sample. Some HLA loci are inherited as haplotype blocks. Therefore, the association found between HLA-Cw and Meniere’s disease in our study might be due to an association with another unknown gene in linkage with the

75–85. 4. Bachmann F (1994) Molecular aspects of plasminogen, plasminogen activators and plasmin. In: Bloom AL, Forbes CD, Thomas DP, Tuddenham EGD, eds. Haemostasis and thrombosis. U.K.: Churchill Livingstone. pp 525–600. 5. Castellino FJ, Powell JR (1981) Human plasminogen. Methods Enzymol 80 Pt C: 365–378. 6. Estelles A, Aznar J, Gilabert J (1980) A qualitative study of soluble fibrin monomer complexes in normal labour and abruptio placentae. Thromb Res 18: 513–519. 7. Aznar J, Gilabert J, Estelles A, Parrilla JJ (1980) Evaluation of plasminogen and other fibrinolytic parameters in the amniotic fluid. Thromb Haemost 43: 182. 8. Gonias SL, Hembrough TA, Sankovic M (2001) Cytokeratin 8 functions as a major plasminogen receptor in select epithelial and carcinoma cells. Front Biosci 6: D1403–D1411. 9. Dano K, Behrendt N, Hoyer-Hansen G, Johnsen M, Lund LR, Ploug M, Romer J (2005) Plasminogen activation and cancer. Thromb Haemost 93: 676–681. 10. Pancholi V, Fischetti VA (1998) α-enolase, a novel strong plasmin(ogen) binding protein on the surface of pathogenic streptococci. J Biol Chem 273: 14503–14515. 11.
Bergmann S, Rohde M, Preissner KT, Hammerschmidt S (2005) The nine residue plasminogen-binding motif of the pneumococcal enolase is the major cofactor of plasmin-mediated degradation of extracellular matrix, dissolution of fibrin and transmigration. Thromb Haemost 94: 304–311. 12. Sun H, Ringdahl U, Homeister JW, Fay WP, Engleberg NC, Yang AY, Rozek LS, Wang X, Sjobring U, Ginsburg D (2004) Plasminogen is a critical host pathogenicity factor for group A streptococcal infection. Science 305: 1283–1286. 13. Epple G, Schleuning WD, Kettelgerdes G, Kottgen E, Gessner R, Praus M (2004) Prion protein stimulates tissue-type plasminogen activator-mediated plasmin generation via a lysine-binding site on kringle 2. Journal of Thrombosis and Haemostasis 2: 962–968. 14. Praus M, Kettelgerdes G, Baier M, Holzhutter HG, Jungblut PR, Maissen M, Epple G, Schleuning WD, Kottgen E, Aguzzi A, Gessner R (2003) Stimulation of plasminogen activation by recombinant cellular prion protein is conserved in the NH2-terminal fragment PrP23-110. Thromb Haemost 89: 812–819. 15. Salmona M, Capobianco R, Colombo L, De LA, Rossi G, Mangieri M, Giaccone G, Quaglio E, Chiesa R, Donati MB, Tagliavini F, Forloni G (2005) Role of plasminogen in propagation of scrapie. J Virol 79: 11225–11230. 16. Wiman B, Wallen P (1975) Proceedings: On the primary structure of human plasminogen and plasmin, cyanogen bromide fragments. Thromb Diath Haemorrh 34: 350. 17. Wiman B, Wallen P (1975) On the primary structure of human plasminogen and plasmin. Purification and characterization of cyanogen-bromide fragments. Eur J Biochem 57: 387–394. 18. McClintock DK, Bell PH (1971) The mechanism of activation of human plasminogen by streptokinase. Biochem Biophys Res Commun 43: 694–702. 19. Reddy KN, Markus G (1972) Mechanism of activation of human plasminogen by streptokinase. Presence of active center in streptokinase-plasminogen complex. J Biol Chem 247: 1683–1691. 20. 
Schick LA, Castellino FJ (1974) Direct evidence for the generation of an active site in the plasminogen moiety of the streptokinase-human plasminogen activator complex. Biochem Biophys Res Commun 57: 47–54. 21. Miles LA, Castellino FJ, Gong Y (2003) Critical role for conversion of glu-plasminogen to lys-plasminogen for optimal stimulation of plasminogen activation on cell surfaces. Trends Cardiovasc Med 13: 21–30.

the past. SDS-PAGE, on the precipitated plasminogen, was carried out as previously described [46,47] using 7% gels. The supernates were quite dilute (%0.2 mg/mL) and were not monitored on gels.

Author Contributions

Conceived and designed the experiments: JAK. Performed the experiments: JAK. Analyzed the data: JAK. Contributed reagents/materials/analysis tools: JAK. Wrote the paper: JAK.

22. Horrevoets AJG, Pannekoek H, Nesheim ME (1996) A Steady-state Template Model That Describes the Kinetics of Fibrin-stimulated [Glu1]- and [Lys78]Plasminogen Activation by Native tissue-type Plasminogen Activator and Variants That Lack Either the Finger or Kringle-2 Domain. J Biol Chem 272: 2183–2191. 23. Carter DM, Kornblatt JA (2005) Interactions between canine plasminogen and 8-anilino-1-naphthalene sulfonate: structural insights from a fluorescent probe. Cell Mol Biol (Noisy-le-grand) 51 Suppl: OL755–OL765. 24. Alkjaersig N, Fletcher AP, Sherry S (1959) ε-Aminocaproic acid: an inhibitor of plasminogen activation. J Biol Chem 234: 832–837. 25. Marshall JM, Brown AJ, Ponting CP (1994) Conformational studies of human plasminogen and plasminogen fragments: evidence for a novel third conformation of plasminogen. Biochemistry 33: 3599–3606. 26. Kornblatt JA, Schuck P (2005) Influence of temperature on the conformation of canine plasminogen: an analytical ultracentrifugation and dynamic light scattering study. Biochemistry 44: 13122–13131. 27.
Olivari S, Molinari M (2007) Glycoprotein folding and the role of EDEM1, EDEM2 and EDEM3 in degradation of folding-defective glycoproteins. FEBS Lett 581: 3658–3664. 28. Sawyer JT, Lukaczyk T, Yilla M (1994) Dithiothreitol treatment induces heterotypic aggregation of newly synthesized secretory proteins in HepG2 cells. J Biol Chem 269: 22440–22445. 29. Yilla M, Doyle D, Sawyer JT (1992) Early disulfide bond formation prevents heterotypic aggregation of membrane proteins in a cell-free translation system. J Cell Biol 118: 245–252. 30. Ptitsyn OB (1995) How the molten globule became. Trends Biochem Sci 20: 376–379. 31. Stryer L (1965) The interaction of a naphthalene dye with apomyoglobin and apohemoglobin. A fluorescent probe of non-polar binding sites. J Mol Biol 13: 482–495. 32. Daniel E, Weber G (1966) Cooperative effects in binding by bovine serum albumin. I. The binding of 1-anilino-8-naphthalenesulfonate. Fluorimetric titrations. Biochemistry 5: 1893–1900. 33. Weber G, Daniel E (1966) Cooperative effects in binding by bovine serum albumin. II. The binding of 1-anilino-8-naphthalenesulfonate. Polarization of the ligand fluorescence and quenching of the protein fluorescence. Biochemistry 5: 1900–1907. 34. Kornblatt JA, Rajotte I, Heitz F (2001) Reaction of canine plasminogen with 6-aminohexanoate: a thermodynamic study combining fluorescence, circular dichroism, and isothermal titration calorimetry. Biochemistry 40: 3639–3647. 35. Kahn PC (1979) The interpretation of near-ultraviolet circular dichroism. Methods Enzymol 61: 339–378. 36. Okamoto S, Oshiba S, Mihara H, Okamoto U (1968) Synthetic inhibitors of fibrinolysis: in vitro and in vivo mode of action. Ann N Y Acad Sci 146: 414–429. 37. Castellino FJ, Brockway WJ, Thomas JK, Liano HT, Rawitch AB (1973) Rotational diffusion analysis of the conformational alterations produced in plasminogen by certain antifibrinolytic amino acids. Biochemistry 12: 2787–2791. 38. 
Kornblatt JA (2000) Understanding the fluorescence changes of human plasminogen when it binds the ligand, 6-aminohexanoate: a synthesis. Biochim Biophys Acta 1481: 1–10. 39. Cockell CS, Marshall JM, Dawson KM, Cederholm-Williams SA, Ponting CP (1998) Evidence that the conformation of unliganded human plasminogen is maintained via an intramolecular interaction between the lysine-binding site of kringle 5 and the N-terminal peptide. Biochem J 333(Pt 1): 99–105. 40. Abad MC, Arni RK, Grella DK, Castellino FJ, Tulinsky A, Geiger JH (2002) The X-ray crystallographic structure of the angiogenesis inhibitor angiostatin. J Mol Biol 318: 1009–1017. 41. Chang Y, Mochalkin I, McCance SG, Cheng B, Tulinsky A, Castellino FJ (1998) Structure and ligand binding determinants of the recombinant kringle 5 domain of human plasminogen. Biochemistry 37: 3258–3271. 42. Mulichak AM, Tulinsky A, Ravichandran KG (1991) Crystal and molecular structure of human plasminogen kringle 4 refined at 1.9-A resolution. Biochemistry 30: 10576–10588. 43. Wang X, Terzyan S, Tang J, Loy JA, Lin X, Zhang XC (2000) Human plasminogen catalytic domain undergoes an unusual conformational change upon activation. J Mol Biol 295: 903–914. PLoS ONE | www.plosone.org 7 July 2009 | Volume 4 | Issue 7 | e6196 Reduction of Plasminogen 44. Getz EB, Xiao M, Chakrabarty T, Cooke R, Selvin PR (1999) A comparison between the sulfhydryl reductants tris(2-carboxyethyl)phosphine and dithiothreitol for use in protein biochemistry. Anal Biochem 273: 73–80. 45. Cleland WW (1964) Dithiothreitol, A New protective reagent for SH Groups. Biochemistry 3: 480–482. 46. Kornblatt JA, Marchal S, Rezaei H, Kornblatt MJ, Balny C, Lange R, Debey MP, Hui Bon HG, Marden MC, Grosclaude J (2003) The fate of the prion protein in the prion/plasminogen complex. Biochem Biophys Res Commun 305: 518–522. 47. Laemmli UK (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227: 680–685. 