A second related question concerned the relationship among the recombinant IHR group members. In particular, what could be concluded about the origin of the group given the observation from 2 independent studies that the members appear to form a well-defined cluster of genotypes? Third, we used the sequence data to examine the hypothesis that the introgressed DNA was from X. fastidiosa subsp. fastidiosa, the subspecies that causes Pierce’s disease. X. fastidiosa subsp. fastidiosa is native to Central America, and all known isolates in the United States and northern Mexico can be traced back to a single introduced genotype. IHR would be of limited interest if it simply randomized the genetic differences among the subspecies but had a minimal effect on pathogenesis. For this reason, we were particularly interested in documenting any possible invasion of new plant hosts associated with IHR. The hypothesis is that IHR creates a range of novel genotypes that are far more variable than can arise from a lineage diversifying through point mutations, and this diversity facilitates adaptive evolution of a kind not possible for a clonal lineage. This kind of probabilistic evolutionary hypothesis can rarely be directly proven based on an individual case; however, it makes predictions that, if generally supported, would cause the hypothesis to be accepted. In the case of X. fastidiosa, compelling evidence supporting the hypothesis would be the invasion of a new native host plant that is uniquely associated with IHR. Our data support this hypothesis: in X. fastidiosa subsp. multiplex, IHR is indeed associated with the invasion of at least 2 new native plant hosts, blueberry and blackberry.
To investigate intersubspecific homologous recombination, we analyzed 31 isolates previously identified as IHR-type and 2 isolates previously identified as intermediate-type X. fastidiosa subsp. multiplex, based on the sequences of the 7 housekeeping loci used in the MLST scheme defined by Yuan et al. plus a region of the pilU gene. Together, these 33 isolates made up the recombinant group. Details regarding the isolation and typing of the 33 isolates were provided by Nunney et al., and a summary of salient features is provided in Table S1 in the supplemental material in that article. All sequences used have previously been published and are available both in GenBank and on the MLST website.
To detect IHR, we employed a modified version of the introgression test developed by Nunney et al. In its original form, the test compares a set of target sequences, some of which may have been involved in IHR, to a set of potential donor sequences. Each variable site is classified as F, a fixed difference between the target sequences and the donor sequences, or P, a polymorphic site within the target sequences where at least one variant base is shared with the donor set. In the modified version of the test, the targeted introgression test, the target sequence is known a priori and is compared to two references, the donor group, D, and the ancestral group, A. The minimum number of nucleotide differences between the target and the two references defines a ratio of D to A equivalent to the ratio of F to P and can be tested in the same way. In some cases, there is no breakpoint because the whole locus appears to be an introgressed sequence. Although the signal of introgression across the entire sequenced region may be clear, it is valuable to have a statistical test that documents the strength of the signal.
In this case, the null expectation is the ratio that reflects the pairwise differences between the donor and ancestral groups versus the pairwise differences within the ancestral group.
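As a hedged illustration of the targeted introgression test for a fully introgressed locus, the sketch below counts the minimum differences from a target allele to a donor set (D) and to an ancestral set (A), derives a null D:A split from pairwise differences, and compares the two with a chi-square goodness-of-fit statistic on 1 degree of freedom. The function names and the toy sequences are hypothetical. The use of mean pairwise differences for the null ratio and the closed-form 1-df P value are assumptions made for this sketch, not details taken from the published test.

```python
from itertools import combinations, product
from math import erfc, sqrt


def n_diff(seq_a, seq_b):
    """Number of nucleotide differences between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b))


def min_diff(target, group):
    """Minimum number of differences between the target and any group member."""
    return min(n_diff(target, seq) for seq in group)


def mean_between(group_1, group_2):
    """Mean pairwise differences between two groups (assumed summary statistic)."""
    pairs = list(product(group_1, group_2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def mean_within(group):
    """Mean pairwise differences within one group (needs at least 2 sequences)."""
    pairs = list(combinations(group, 2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def complete_introgression_test(target, donors, ancestrals):
    """Chi-square test of the observed D:A counts against the null ratio."""
    d_obs = min_diff(target, donors)      # differences from the donor group
    a_obs = min_diff(target, ancestrals)  # differences from the ancestral group
    between = mean_between(donors, ancestrals)
    within = mean_within(ancestrals)
    total = d_obs + a_obs
    if total == 0 or between == 0 or within == 0:
        raise ValueError("test is undefined when an expected count is zero")
    # Null: the target is an ordinary ancestral-group allele, so D:A should
    # follow between-group : within-group pairwise differences.
    p_d = between / (between + within)
    exp_d, exp_a = total * p_d, total * (1 - p_d)
    chi2 = (d_obs - exp_d) ** 2 / exp_d + (a_obs - exp_a) ** 2 / exp_a
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function for 1 df
    return {"D": d_obs, "A": a_obs, "chi2": chi2, "p": p_value}


# Toy alignment (hypothetical): the target is identical to the donor allele.
ancestral_set = ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"]
donor_set = ["TCGAACGAACGTTCGTACCT"]
result = complete_introgression_test(donor_set[0], donor_set, ancestral_set)
print(result)
```

With counts this small, the chi-square approximation is rough, so the sketch is best read as a description of the logic and not as a calibrated test.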
We used this null ratio to define the expectation of the D/A ratio for a chi-square test of complete introgression. Gene diversity and distance trees were calculated using MEGA5, and the maximum parsimony tree was created using the PARS program in Phylip. Distance trees and the maximum parsimony tree were used rather than other methods, given the known occurrence of intersubspecific recombination in the data. ClonalFrame was used to provide an independent estimate of the relative importance of recombination versus mutation in the recombinant group.
Based on the 8 loci sequenced, Nunney et al. identified 9 sequence types (STs) belonging to the recombinant group of X. fastidiosa subsp. multiplex. These STs all showed evidence of intersubspecific homologous recombination at one or more of the 8 loci and were characterized by 18 alleles, 10 of which were never found in non-IHR X. fastidiosa subsp. multiplex strains. These 10 alleles were examined for evidence of IHR by comparing them to the previously described non-IHR X. fastidiosa subsp. multiplex alleles and to the known X. fastidiosa subsp. fastidiosa and sandyi alleles. Of these 10, 4 alleles were found to be derived in their entirety from X. fastidiosa subsp. fastidiosa, and 3 were found to be chimeric for X. fastidiosa subsp. multiplex and fastidiosa sequences, with significant evidence of one or more recombination breakpoints. These 7 alleles encompassed 4 loci: leuA, cysG, holC, and pilU. The locus most strongly implicated in IHR was cysG, since all of the 9 recombinant-group STs were characterized at this locus by 1 of 3 cysG alleles unique to the group. The involvement of IHR in the genesis of all 3 of these alleles is illustrated by their close genetic relationship to X. fastidiosa subsp. fastidiosa and sandyi alleles. Allele 12, apart from being found in the recombinant group, is an X. fastidiosa subsp. fastidiosa allele.
The other two alleles were found to be chimeric: allele 18 contains a single recombinant region at the 3′ end of 342 bp, while allele 6 has two short recombinant regions, one at the 5′ end of at least 23 bp and another toward the 5′ end of at least 35 bp.
The DNA sequence variation defining these patterns is shown in Table 2. The patterns seen in the DNA sequences of the 3 cysG alleles are consistent with the hypothesis of a single IHR that introgressed donor allele 12 into X. fastidiosa subsp. multiplex, followed by subsequent intrasubspecific recombination reintroducing X. fastidiosa subsp. multiplex sequence to create alleles 6 and 18. There are no inconsistent sites, provided the 5′ recombination breakpoint in allele 18 starts after position 71.
Introgression of X. fastidiosa subsp. fastidiosa sequence into X. fastidiosa subsp. multiplex was found in alleles at 3 other loci. In the case of pilU, 7 of the 9 recombinant STs carried either an allele identical to a known X. fastidiosa subsp. fastidiosa allele or 1 bp different from it. Allele 1 is an allele that characterizes most U.S. isolates as well as several STs found in Costa Rica, while allele 9 is unique to the recombinant group. The leuA locus has a single statistically significant recombinant allele, allele 4. It differed by 2 bp from the X. fastidiosa subsp. fastidiosa allele 9 but by 8 bp from the most similar non-IHR X. fastidiosa subsp. multiplex allele. X. fastidiosa subsp. fastidiosa allele 9 could be the donor of the allele in its entirety, although if the recombination region started after site 10 but before position 520, then only one site would be unexplained. That remaining site carries a base unique to this allele and is probably a novel mutation. If the recombination breakpoint was 3′ of position 295, then X. fastidiosa subsp. fastidiosa allele 1 provides as good a match as allele 9. Similarly, holC allele 7 was also 8 bp different from the most similar non-IHR X. fastidiosa subsp. multiplex allele, providing clear evidence that the 5′ end was derived from X. fastidiosa subsp. fastidiosa. The pattern can be explained if X. fastidiosa subsp. fastidiosa allele 19 is the donor of the 5′ region ending somewhere between positions 183 and 286, since it leaves no inconsistent bases.
Evaluation of the plausibility of a single initial IHR event is complicated by the possibility of subsequent intrasubspecific recombination both within the recombinant group and between the recombinant group and the dominant non-IHR X. fastidiosa subsp. multiplex strains. Plausible sets of recombination events were determined by creating a tree using maximum parsimony applied to the 10 8-locus genotypes. Using allele numbers as characters, there were 2 equally parsimonious trees, each with 14 steps. They differed only in the precise positioning of 22a; however, assuming a basal introgression of pilU1, only the tree shown in Fig. 3 remained. The hypothetical donor and recipient genotypes were added to root the tree, with the tree dictating gltT3 in the ancestral recipient genotype.
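To make the idea of counting steps with allele numbers as characters concrete, the following minimal sketch scores a fixed, rooted binary tree with Fitch parsimony, treating each locus as an unordered multistate character. It is not the PARS program and does not search tree space; the genotype labels, allele numbers, and topologies are invented for illustration only.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one character on a binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its allele number
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        shared = left_set & right_set
        if shared:
            # Children can agree on a state: no extra change needed here.
            return shared, left_steps + right_steps
        # Children disagree: one change is required on this part of the tree.
        return left_set | right_set, left_steps + right_steps + 1

    return visit(tree)[1]


def tree_length(tree, genotypes):
    """Total steps over all loci; genotypes maps leaf name -> tuple of alleles."""
    n_loci = len(next(iter(genotypes.values())))
    return sum(
        fitch_steps(tree, {leaf: alleles[i] for leaf, alleles in genotypes.items()})
        for i in range(n_loci)
    )


# Hypothetical 3-locus genotypes for four made-up sequence types.
toy_genotypes = {
    "ST_a": (1, 12, 7),
    "ST_b": (1, 18, 7),
    "ST_c": (9, 18, 3),
    "ST_d": (9, 6, 3),
}
tree_1 = (("ST_a", "ST_b"), ("ST_c", "ST_d"))
tree_2 = (("ST_a", "ST_c"), ("ST_b", "ST_d"))

print(tree_length(tree_1, toy_genotypes))  # 4 steps
print(tree_length(tree_2, toy_genotypes))  # 6 steps
```

In this toy case the first topology is shorter and would be preferred; two topologies with the same length would be reported as equally parsimonious.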
The most parsimonious tree showed that the pattern of introgression was more complex than could result from a single IHR. There are four main events that illustrate this complexity. First, based on this tree, the grouping of STs 27, 28, and 40 is defined by the introgression of holC7, a recombinant allele introduced into the tree far from the basal recombination event. Second, although the mutation of pilU1 could explain the appearance of pilU9, a second introgression of pilU1 would be necessary to account for its appearance in STs 28 and 40. Third, a number of events are necessary to account for the evolution of the cysG locus. While cysG12, an X. fastidiosa subsp. fastidiosa allele introduced in the initial recombination event, could give rise to cysG18 by the introgression of X. fastidiosa subsp. multiplex sequence, this allele appears in two places in the tree, necessitating a lateral transfer within the recombinant group. Despite this complexity, the hypothesis of a single primary IHR event creating the founder of the recombinant group is strongly supported by the pattern seen at the cysG locus. As noted above, all members of the recombinant group share one of 3 alleles that appear to be derived from a single introgression of donor allele 12.
Analysis of the X. fastidiosa subsp. fastidiosa donor. The proposed X. fastidiosa subsp. fastidiosa donor is defined at 4 of the 8 loci: leuA9, cysG12, holC19, and pilU1. Of these 4 alleles, only pilU1 was found in an extensive genetic survey of 86 isolates of X. fastidiosa subsp. fastidiosa within the United States and northern Mexico. The results of this survey, combined with similar genetic data from Costa Rica, led to the conclusion that all isolates of X. fastidiosa subsp. fastidiosa found in North America were derived from a single ancestral strain introduced from Central America.
Consistent with this hypothesis was the observation that, in the North American isolates, no allele at the 7 MLST loci or the pilU locus was more than 1 bp different from the most common allele. Given this background, we can examine the hypothesis that the proposed ancestral donor is consistent with the X. fastidiosa subsp. fastidiosa strains currently found in the United States. Similarly, at leuA there is no inconsistency with U.S. allele 1 if the recombination breakpoint in the recombinant allele 4 was after position 295. If the breakpoint is before that point, then Costa Rica allele 9 provides a better fit of only 1 bp, a minor difference. In marked contrast, the alleles cysG12 and holC19 have only been found in Costa Rica, not in the United States, and differ markedly from the U.S. alleles. In particular, within the IHR regions, the U.S. X. fastidiosa subsp. fastidiosa alleles cysG1 and holC1 are 5 and 7 bp different, respectively, from the recombinant group sequence, while the Costa Rica alleles precisely match the donor sequence. These large differences require us to reject the hypothesis that the primary X. fastidiosa subsp. fastidiosa donor was derived from the introduced genotype that was the ancestor of all of the North American X. fastidiosa subsp. fastidiosa isolates that have been typed.
Estimating recombination rates in the recombinant group of X. fastidiosa subsp. multiplex. The prevalence of recombination over mutation in the evolution of the recombinant group was supported by a ClonalFrame analysis: the estimated ratio of recombination events to mutation was 19,310, with a 95% confidence lower bound of 45.3. Addition of the potential ancestor and/or potential donor genotypes to the analysis maintained high estimates of the lower bound of this ratio. These lower bounds are high for a largely clonal organism, and they illustrate the pervasive involvement of recombination in the genesis of the recombinant group.