Journal of Virology, April 2008, p. 3204-3219, Vol. 82, No. 7
0022-538X/08/$08.00+0 doi:10.1128/JVI.02257-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
Laboratory of Clinical and Epidemiological Virology, Department of Microbiology and Immunology, Rega Institute for Medical Research, University of Leuven, Leuven, Belgium,1 Vaccine and Biologics—Clinical Research, Merck and Co. Inc., North Wales, Pennsylvania 19454,2 Laboratory of Infectious Diseases, National Institutes of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892,3 Environment and Biotechnology Centre, Faculty of Life and Social Sciences, Swinburne University of Technology, Hawthorn, Victoria, Australia,4 Enteric Virus Unit, Virus Reference Department, Centre for Infections, Health Protection Agency, London, United Kingdom,5 Laboratory of Virology, ICDDR,B: Mohakhali, Dhaka 1212, Bangladesh6
Received 18 October 2007/ Accepted 8 January 2008
Due to the lack of proper immunological reagents and the increasing ease of sequencing, serotyping is being complemented with genotyping, which is based on identities between sequences of cognate rotavirus gene segments. So far, 15 G genotypes (14 G serotypes) have been identified, and out of 27 different P genotypes, 14 P serotypes (1A, 1B, and 2 to 14) have been identified with available VP4-specific antibodies (5, 30, 34-37, 42, 62, 68). Traditionally, a cutoff value of 89% VP7 amino acid sequence identity has been used to classify G genotypes, yielding a nearly complete concordance with the different G serotypes (16, 57). In contrast, the 89% amino acid identity cutoff value for VP4, established by Gorziglia and colleagues (18), does not result in an absolute concordance between different P genotypes and P serotypes. Specifically, P serotypes have not been defined for approximately half of the P genotypes, which are designated by an Arabic numeral between square brackets (16). Molecular analysis of VP6 has been limited to a 379-bp fragment of the gene, which results in two broad genogroups that do not correlate with the SG specificities (27). The classification of rotavirus nonstructural proteins is limited to NSP4, and six genotypes (A to F) have been recognized based on clustering patterns in amino-acid-based phylogenetic dendrograms (9, 22, 26, 46). To date, no classification for VP1 to VP3, NSP1 to NSP3, or NSP5 has been described.
RNA-RNA hybridization assays have been used to analyze and compare complete genomes of group A rotaviruses. For human strains, three genogroups have been established: two major genogroups represented by the reference strains Wa and DS-1 and one minor genogroup represented by reference strain AU-1 (51). Similar genogroups also have been established for several animal rotavirus strains although they are more complex (50). The hybridization technique has proven to be useful to investigate possible reassortment events between human strains belonging to different genogroups (49, 76) or between human and animal strains (48). Currently, rotavirus strains are being analyzed and compared to one another by partial or complete sequencing of all 11 gene segments as this approach allows direct determination of genetic relationships (38-40, 63, 67). In addition, sequencing of rotavirus genomes is critical to the understanding of phylogenetic analyses and to the elucidation of the patterns of virus evolution. One method that is used to study typical evolutionary distances between virus strains is pairwise sequence identity profiles (2). Specifically, this method illustrates virus genotypes as well-resolved peaks, providing the basis for classification systems. To properly study the evolution of rotaviruses, the establishment of a classification system in which individual genes fall into defined clusters/genotypes based on reliable percentage identity cutoff values is fundamental. Such a classification system could be an important tool to elucidate how rotaviruses evolve over time. In addition, for viruses with segmented genomes, this type of classification system could also be used to determine whether or not certain rotavirus genes cosegregate during reassortment events (gene linkage) or whether certain gene constellations play a role in rotavirus host range restriction or virulence.
In this study, phylogenetic analyses and pairwise sequence identity profiles were constructed for each of the gene segments of group A rotaviruses to develop a uniform classification and nomenclature system for all 11 rotavirus genome segments in a similar fashion to that established for VP4 and VP7. Based on percentage identity cutoff values, this novel classification system illustrates phylogenetic relationships of all 11 rotavirus genome segments and allows the identification of distinct genotypes (which likely followed separate evolutionary paths) and reassortment events. In addition, the comprehensive classification system revealed genetic relationships among rotaviruses from different host species, including evidence that human rotaviruses belonging to the Wa-like genogroup have a common origin with porcine rotaviruses while those belonging to the DS-1-like genogroup have a common origin with bovine rotaviruses.
RNA extraction, reverse transcription-PCR (RT-PCR), and sequencing of human rotavirus strains Wa, DS-1, S2, AU-1, YO, Hosokawa, and B3458 and bovine rotavirus strain NCDV. For each strain 140 µl of cell culture supernatant (or fecal suspension in the case of strain B3458) was used to extract viral RNA using a QIAamp Viral RNA mini kit (Qiagen/Westburg, Leusden, The Netherlands) according to the manufacturer's instructions.
Ten microliters of extracted RNA was denatured at 97°C for 3 min and RT-PCR was carried out using a Qiagen One Step RT-PCR kit (Qiagen/Westburg). The primers used are shown in Data S1 in the supplemental material. The RT-PCR was carried out with an initial reverse transcription step at 45°C for 30 min; PCR activation was at 95°C for 15 min, followed by 40 cycles of amplification (45 s at 94°C, 45 s at 45°C, and 6 min at 68°C), with a final extension of 7 min at 72°C in a GeneAmp PCR System 9700 thermal cycler (Applied Biosystems, Foster City, CA).
The PCR amplicons were purified with a QIAquick PCR purification kit (Qiagen/Westburg) and sequenced using the dideoxynucleotide chain termination method with an ABI Prism BigDye Terminator Cycle Sequencing Reaction kit (Perkin-Elmer) on an ABI Prism 3100 automated sequencer (Perkin-Elmer). The sequencing was performed with the forward and reverse primers used for the RT-PCR. In addition, primer walking sequencing was performed to cover the complete sequence of a segment on both strands.
Determination of the 5' and 3' terminal sequences of human rotavirus strains Wa, DS-1, S2, AU-1, YO, Hosokawa, and B3458 and bovine rotavirus strain NCDV. To obtain the complete nucleotide sequence of each segment, the 5' and 3' terminal sequences of the 11 gene segments were determined using a modified version of the single-primer amplification method, as described previously (38).
RNA and protein sequence analyses of human rotavirus strains Wa, DS-1, S2, AU-1, YO, Hosokawa, and B3458 and bovine rotavirus strain NCDV. The chromatogram sequencing files were analyzed using Chromas 2.23 (Technelysium, Queensland, Australia), and contigs were prepared using SeqMan II (DNAstar, Madison, WI). Nucleotide and protein sequence identity searches were performed using the National Center for Biotechnology Information (National Institutes of Health, Bethesda, MD) BLAST (Basic Local Alignment Search Tool) server on GenBank database release 154.0 (1). Multiple sequence alignments were calculated using ClustalX, version 1.81 (71). Sequences were manually edited in the GeneDoc, version 2.6.002, alignment editor (55).
In vitro transcription, PCR, cloning, and sequencing of the bovine rotavirus strains BRV033 and WC3 and porcine rotavirus strains A131, A411, and A253. Single-stranded RNA transcripts were prepared from purified double-layered BRV033, WC3, A131, A411, or A253 rotavirus particles as described previously (10). Briefly, reverse transcription was used to generate gene 1, 2, 3, 5, 7, 8, or 11 complementary cDNA(s) using the corresponding primers complementary to the 3' end as indicated in Data S1 in the supplemental material. Immediately after synthesis of the first cDNA strand, samples were heated to 95°C for 10 min, and PCR was carried out with the corresponding 3' end primer and a primer complementary to the 5' end (see Data S1 in the supplemental material); PCR was carried out with 40 cycles of amplification (55 s at 94°C, 1 min at 50°C, and 7 min at 68°C) and a final extension of 10 min at 72°C. Amplified DNA was immediately cloned into the TA3pCR2.1 vector (Invitrogen Corp., San Diego, CA) according to the manufacturer's instructions. For accuracy in sequence determination, two to four independent clones of VP1 to VP3, NSP1 to NSP3, or NSP5/NSP6 from individual PCRs using strains BRV033, WC3, A131, A411, and A253 were sequenced by the dideoxynucleotide chain termination method. Confirmation of the DNA sequence was performed by sequencing both DNA strands of each of the different clones using the M13 sense and antisense standard primers and additional primers (available upon request). The 5' and 3' end sequences of the clones are identical or complementary to the corresponding 5' or 3' end PCR primers, respectively.
RNA extraction, RT-PCR, and sequencing of the genomes of human rotavirus strains D, DS-1, P, ST3, IAL28, SE584, 69M, W161, A64, and L26 and primate rotavirus strains TUCH and RRV were performed as described elsewhere (E. Heiman, S. M. McDonald, M. Barro, Z. F. Taraporewala, T. Bar-Magen, and J. T. Patton, unpublished data; also H. Yang, Z. F. Taraporewala, K. Kerstak, and J. T. Patton, unpublished data).
Phylogenetic analysis. Phylogenetic and molecular evolutionary analyses were conducted both at the nucleotide and amino acid levels using MEGA, version 2.1, software. Genetic distances were calculated using the Poisson correction parameter at the amino acid level and the Kimura-2 correction parameter at the nucleotide level. The dendrograms were constructed using the neighbor-joining method (33).
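The two distance corrections named above have simple closed forms, which the following minimal sketch illustrates. It is not the MEGA implementation: the function names are hypothetical, the sequences are assumed to be already aligned and to contain only A, C, G, T/U, or gap characters (for nucleotides), and gap-containing columns are dropped per pair, which is an assumption about gap handling.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}  # U is converted to T before comparison


def _aligned_pairs(a, b):
    """Return aligned site pairs, skipping columns with a gap in either sequence."""
    a, b = a.upper().replace("U", "T"), b.upper().replace("U", "T")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return pairs


def poisson_distance(a, b):
    """Poisson-corrected amino acid distance: d = -ln(1 - p)."""
    # Uppercase only; no U->T conversion, since U is not T in protein sequences.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / len(pairs)
    return -math.log(1.0 - p)


def kimura2_distance(a, b):
    """Kimura two-parameter nucleotide distance.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    where P and Q are the proportions of transitions and transversions.
    """
    pairs = _aligned_pairs(a, b)
    transitions = transversions = 0
    for x, y in pairs:
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    P = transitions / len(pairs)
    Q = transversions / len(pairs)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
```

Both corrections reduce to approximately the uncorrected proportion of differences when sequences are closely related and grow faster than it as divergence increases.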
Construction of pairwise identity frequency graphs. To obtain suitable cutoff values for evolution-based classification of each rotavirus genome segment, the percentages of nucleotide and amino acid identities between the complete open reading frames of a large number of completely sequenced rotavirus genome segments in GenBank as well as the sequences determined in this study were calculated using the pairwise distances program of the MEGA program, version 2.1 (33). The use of pairwise identity frequency graphs for the classification of viruses has been approved by the International Committee on Taxonomy of Viruses (2), and the pairwise identity frequency graphs were constructed by plotting all the calculated pairwise identities in a graph with the percentage identities in the abscissa (x axis) and the frequency of each of the calculated pairwise identities in the ordinate (y axis).
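As an illustration of how such a graph can be tabulated, the sketch below computes every pairwise percentage identity in an alignment and counts how often each (rounded) value occurs. It is a simplified stand-in for the MEGA pairwise distances program, not a reproduction of it: the function names are hypothetical, identity is taken as the percentage of matching sites, and columns with a gap in either sequence are skipped (both assumptions).

```python
from collections import Counter
from itertools import combinations


def percent_identity(a, b):
    """Percentage of identical sites between two aligned sequences (gap columns skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def identity_frequencies(aligned):
    """Count pairwise identities, binned to the nearest whole percent.

    aligned: dict mapping strain name -> aligned sequence.
    Returns a Counter {percent identity (x axis): frequency (y axis)}.
    """
    freq = Counter()
    for (_, s1), (_, s2) in combinations(aligned.items(), 2):
        freq[round(percent_identity(s1, s2))] += 1
    return freq


# Toy alignment with made-up sequences, for illustration only.
toy = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAC",
    "s3": "ACGTACGTAA",
    "s4": "TTTTACGTAC",
}
for pct, count in sorted(identity_frequencies(toy).items()):
    print(pct, count)
```

Plotting the resulting counts against the percentage identity gives the frequency graph; with real data, well-separated peaks correspond to comparisons within and between genotypes.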
Determination of the most appropriate nucleotide/amino acid identity cutoff percentages. For each phylogenetic dendrogram, there were several alternatives to assign certain clusters as genotypes. To determine the most appropriate cutoff value for classification, several possible alternatives were investigated. For each alternative, the nucleotide/amino acid identities between strains belonging to the same cluster/genotype were designated the intragenotype identities, while the nucleotide/amino acid identities between strains belonging to different clusters/genotypes were designated the intergenotype identities. In the ideal case, the inter- and intragenotype identities should not overlap, and the cutoff percentage is defined as the percentage separating the inter- and the intragenotype identities. In practice, the inter- and intragenotype identities do partially overlap in some cases, and the most appropriate percentage cutoff value was chosen as the percentage at which the ratio of the intergenotype identity and the intragenotype identity (intergenotype identity/intragenotype identity) dropped below 1.
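One possible reading of this rule is sketched below: identities are split into intra- and intergenotype sets for a candidate genotype assignment, and the cutoff is taken as the lowest percentage bin at which intergenotype identities become less frequent than intragenotype identities. The function names and the per-bin interpretation of the ratio are assumptions made for illustration; the text does not specify the exact procedure.

```python
from collections import Counter


def split_identities(identities, genotype):
    """Split pairwise identities into intra- and intergenotype frequency tables.

    identities: dict {(strain_a, strain_b): percent identity}
    genotype:   dict {strain: genotype label} for one candidate classification
    """
    intra, inter = Counter(), Counter()
    for (a, b), pct in identities.items():
        table = intra if genotype[a] == genotype[b] else inter
        table[round(pct)] += 1
    return intra, inter


def choose_cutoff(intra, inter):
    """Return the lowest percent bin where inter/intra frequency falls below 1.

    Bins without any intragenotype identity are skipped, since the ratio
    is undefined there. Returns None if no bin qualifies.
    """
    for pct in range(0, 101):
        if intra[pct] > 0 and inter[pct] < intra[pct]:
            return pct
    return None


# Made-up frequencies: the two distributions overlap between 78% and 81%.
intra = Counter({78: 1, 80: 2, 81: 5, 85: 10})
inter = Counter({70: 10, 78: 4, 80: 3, 81: 2})
print(choose_cutoff(intra, inter))
```

A simple scan like this is sensitive to isolated outliers in sparsely populated bins, so in practice the candidate value would still need to be checked against the dendrogram, as done for each segment in the analyses reported here.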
Nucleotide sequence accession numbers. The nucleotide sequence data reported in this paper were deposited in GenBank under the accession numbers found in Data S2 in the supplemental material.
The results of the analyses conducted both at the nucleotide and amino acid levels were in complete accordance, resulting in identical genotypes/clusters (data not shown). However, there were several reasons favoring the use of the nucleotide-based classification over the amino acid-based classification. (i) The resolution of the different "peaks" in the nucleotide-based frequency-identity graphs was generally much better than that in the amino acid-based frequency-identity graphs, resulting in little or no overlap between the peaks of the inter- and intragenotype identities, which resulted in a limited number of deviations from the proposed cutoff values. (ii) Since several of the rotavirus proteins are rather conserved (VP1, VP2, VP6, and NSP5), the resulting cutoff values based on amino acid identity were very high (up to 96%). The high level of relatedness of these proteins also makes it difficult to classify and differentiate new strains based on (partial) amino acid sequences, since there is only a very limited amount of sequence diversity. (iii) The phylogenetic dendrograms and the different genetic clusters which were used to define the different genotypes were supported by higher bootstrap values in the nucleotide-based phylogenetic dendrograms than in the amino acid-based phylogenetic dendrograms. (iv) The genetic diversity observed between different rotavirus strains was more evenly distributed across the gene at the nucleotide level than at the amino acid level. This implies that genotyping of a partial gene sequence, using the proposed cutoff values, would be much more reliable at the nucleotide level than at the amino acid level.
In addition to the use of pairwise distances for the construction of the frequency identity graphs, several other more sophisticated models to estimate the genetic distance between homologous sequences, such as Jukes-Cantor, Kimura-2, and Tamura-Nei (54), were evaluated. These different models yielded slightly lower cutoff percentage values, but, most importantly, the resulting phylogenetic dendrograms demonstrated identical sets of genotypes for all 11 genes, regardless of which of the mathematical genetic distance models was used (data not shown).
VP4. More than 190 complete nucleotide and deduced amino acid sequences for VP4 were used in the analysis. The dendrograms based on the nucleotide and amino acid sequences are shown in Fig. 1A and B, respectively. The vertical dashed lines represent the division into different P genotypes as currently established, and the two resulting identity frequency graphs are shown in Fig. 1C and D. From these graphs, it was not obvious how to determine accurate nucleotide cutoff values or how to confirm that the established cutoff value of 89% at the amino acid level was appropriate since there was considerable overlap between the inter- and intragenotype identities (Fig. 1C and D). To further analyze the overlap between the inter- and intragenotypic identities, details of the boxed areas in Fig. 1C and D are shown magnified in Fig. 1E and H. In addition, Fig. 1F and I show the intergenotype identities between rotaviruses belonging to genotypes P[4] and P[8]; then in Fig. 1G and J, the identities between P[4] and P[8] were omitted to fully interpret the analyses. Figure 1G and J clearly show that an 80% cutoff value at the nucleotide level and the 89% cutoff value at the amino acid level, as currently established, were accurately suited to distinguish between different VP4 genotypes as they may have evolved. Among all P genotypes evaluated, only the identities between the P[4] and P[8] genotypes were not separated by the 80% nucleotide cutoff value (range, 84 to 89% nucleotide identity) or the 89% amino acid cutoff values (range, 85 to 93% amino acid identity), confirming the observation that these P genotypes are closely related. This close relationship between P[4] and P[8] should be kept in mind when newly sequenced VP4 gene segments are analyzed for their P genotype.
FIG. 1. Phylogenetic dendrograms of VP4 at the nucleotide (nt) level (A) and the amino acid (aa) level (B). Bootstrap values (1,000 replicates) are shown. Certain clusters are replaced by triangles, in which the height of the triangle represents the number of sequences, and the width represents the genetic diversity inside that cluster. The dashed lines indicate the current division into P genotypes. Panels C and D show the respective identity frequency graphs. Panels E, F, and G show details of the boxed regions of panel C, and panels H, I, and J show details of the boxed regions of panel D. In panels F and I, the identities between rotavirus strains belonging to the P[4] and P[8] genotypes are shown. In panels G and J, the identities between rotavirus strains belonging to the P[4] and P[8] genotypes are omitted to show that the 80% nucleotide and 89% amino acid cutoff values are the most suitable cutoff values.
VP7. More than 1,000 VP7-encoding nucleotide and deduced amino acid sequences, representing the 15 established G genotypes, were used for the analysis. The recently described equine strain ERV99, identified as a new G genotype (21), was revealed to be a G6 rotavirus strain with a high level of identity to NCDV (data not shown). The nucleotide- and amino acid-based phylogenetic dendrograms are shown in Fig. 2A and B, respectively. Identity frequency graphs were constructed according to the 15 G genotypes, as they are currently defined, and the 89% amino acid cutoff value is shown as a vertical line in Fig. 2B. At the nucleotide level, an 80% cutoff value seemed the most appropriate based on the data shown in Fig. 2C. Details of Fig. 2C and D (boxed regions) are depicted in Fig. 2E and I and show that a considerable degree of intraspecies diversity was found below the nucleotide and amino acid cutoff percentages, and a lower degree of interspecies identity was observed above the indicated cutoff percentages. Figure 2F and J show that a large proportion of the intragenotype identities, below the proposed cutoff values, were the result of identities between murine rotaviruses and G3 rotaviruses isolated from other species. Their sequence identities ranged from 73% to 79% (Fig. 2F, dark green bars) at the nucleotide level, completely below the 80% nucleotide identity cutoff value, and ranged from 80% to 92% at the amino acid level, which is almost entirely below the 89% amino acid identity cutoff value. This result strongly suggests that the murine rotavirus strains should be assigned to a different G genotype, tentatively designated G16. The detailed phylogenetic relationship between G3 rotavirus strains and the murine rotavirus strains is available in Data S3 in the supplemental material.
FIG. 2. Phylogenetic dendrograms of VP7 at the nucleotide (nt) level (A) and the amino acid (aa) level (B). Bootstrap values (1,000 replicates) are shown. Certain clusters are replaced by triangles, in which the height of the triangle represents the number of sequences, and the width represents the genetic diversity inside that cluster. The dashed lines indicate the current division into G genotypes. Panels C and D show the respective identity frequency graphs. Panels E, F, G, and H show details of the boxed regions of panel C, and panels I, J, K, and L show details of the boxed regions of panel D. In panels F and J, identities between rotavirus strains belonging to the murine G3 strains and the remaining G3 strains (dark green), between rotavirus strains belonging to the G5 and G11 strains (red), and among rotavirus strains belonging to the G7 genotype (dark blue) are shown. In panels G and K identities among rotavirus strains belonging to the G3 genotype (light green) and between rotavirus strains belonging to the G3 and G14 genotypes (turquoise) are shown.
Some of the intergenotype identities above the cutoff values (Fig. 2F and J) were caused by identities between G5 and G11 rotavirus strains (red bars), ranging from 80% to 85% at the nucleotide level (completely above the 80% cutoff value) and from 82% to 92% at the amino acid level (partially above the 89% cutoff value). This suggests that strains assigned to G5 and G11 could be considered a single genotype, especially when analyzed at the nucleotide level. The detailed phylogenetic relationship between G5 and G11 rotavirus strains is available in Data S5 in the supplemental material. Figure 2G and K show that most of the remaining intragenotype identities below the cutoff values were due to identities between rotaviruses belonging to the G3 genotype (light green bars). Also the remaining intergenotype identities above the cutoff values were mostly caused by identities between G3 and G14 rotaviruses (turquoise bars). Figure 2H and L show that, after the omission of the above-mentioned identities, an 80% identity cutoff value at the nucleotide level and an 89% identity cutoff value at the amino acid level were the most appropriate. These values allowed the identities between strains belonging to the same genotypes to be mostly above the proposed cutoff values and identities between strains that belong to different genotypes to be mostly below the proposed cutoff values. The very last remaining intergenotype identities above 89% (Fig. 2L) were due to identities between rotavirus strains belonging to the G3 and G9 genotypes. Finally, the remaining intragenotype identities below 89% (Fig. 2L) were due to a high level of intragenotype diversity among rotavirus strains belonging to genotypes G1, G2, G4, and G6.
VP6. A phylogenetic dendrogram constructed with 142 VP6 nucleotide sequences is shown in Fig. 3A. Several alternatives were tested to divide the dendrogram into appropriate genotypes, but only the best alternative is shown in Fig. 3A (vertical dashed line). The corresponding identity frequency graph (Fig. 3B) suggested that 85% was the most appropriate percentage identity cutoff value. A detailed view of the boxed region of Fig. 3B is shown in Fig. 3C. A total of 10 VP6 genotypes were determined and tentatively designated I (for intermediate capsid shell) genotypes, which contrasts with the two broad genogroups identified when only a 379-bp fragment of VP6 was analyzed (27). Further analyses of the origin of the intragenotype identities below the proposed cutoff value demonstrated that the greatest contribution to these identities (Fig. 3D, dark green bars) was the identity among strains assigned to genotype I2. The few remaining intragenotype identities below the 85% cutoff value (Fig. 3E) were caused by identities between the porcine rotavirus strain A411 and other rotavirus strains belonging to the VP6 genotype I5.
FIG. 3. Phylogenetic dendrogram of VP6 at the nucleotide (nt) level (A). Bootstrap values (2,000 replicates) are shown. Designations of species of origin are as follows: Bo, bovine; Hu, human; Rh, rhesus; Eq, equine; Po, porcine; Ov, ovine; La, lapine; Si, simian; Mu, murine; Av, avian. The dashed line indicates the best option to divide the dendrogram into appropriate genotypes. The closing braces on the right side of the dendrograms depict the genotypes as they are proposed in this study. Panel B shows the identity frequency graph. Panels C, D, and E show details of the boxed region of panel B. In panel D, the identities among rotavirus strains belonging to the I2 genotype are shown. VP6 nucleotide accession numbers collected from GenBank are as follows for the indicated strains: BRV033, AF317126; B223, AF317128; WC3, AF411322; NCDV, AF317127; RRV, EF583009; OH4, D82975; PA169, EF554130; RF, K02254; Po, DQ119822; 111/05-27, EF554141; UKtc, X53667; R-22, D82977; OVR762, EF554152; B10925-97, EF554119; HP113, DQ003294; HP140, DQ003295; I321, X94618; 22R, AB040055; US6259, EF426123; US8720, EF426124; US8922, EF426132; US8635, EF426133; N26-02, DQ146686; US1205, AF079357; DRC86, DQ005121; DRC88, DQ005110; IS2, X94617; B1711, EF554086; US5139, EF426130; US8908, EF426140; TK126, AY456528; RV176-00, DQ490555; RV161-00, DQ490549; NR1, AF309652; TK119, AY456527; MG6, EF554097; AY456527; B4106, AY740737; 30/96, DQ205226; SA11-tsG, L15384; SA11-H96, DQ838650; SA11-30/19, DQ838648; SA11-5N, DQ838646; SA11-30/1A, DQ838649; SA11, AY187029; SA11-5S, DQ838647; 1076, D00325; Hun5, EF554108; FI-23, D82971; H-2, D00324; L26, DQ146695; TB-Chen, AY787645; S2, Y00437; Lp14, L11595; TUCH, AY594670; CMH222, ABC41660; T152, DQ146702; CMH134/04, DQ923800; CMH120/04, DQ923796; EO, AY947543; EMcN, AY267007; EDIM, U65988; EW, U36474; RMC321, AF531913; RMC/G60, AAT99008; RMC/G7, AY601551; RU172, DQ204741; 4F, L29184; 4S, L29186; CMP034, DQ534018; CRW-8, U82971; CN86, U10031; A411, AF317125; 
YM, X69487; A253, AF317122; A131, AF317124; H-1, AF242394; OSU, AF317123; HO-5, D82973; R-13, D82976; FI-14, D00323; HI-23, D82972; R-3, D82978; L338, D82974; Gottfried, D00326; CJN, AF461757; US6253, EF426131; KU, AB022768; 116E, U85998; Hu, X57943; Wa, K02086; E210, U36240; 97B53, AF260931; RMC61, AY601549; B4633-03, DQ146642; RV3, U04741; US9810, EF426127; US0408, EF426119; Dhaka12-03, DQ146664; Matlab13-03, DQ146675; Dhaka16-03, DQ492673; Dhaka25-02, DQ146653; RMC437, AY601554; US0468, EF426120; RMC/G66, AY601553; RMC100, AF531912; RMC83, AY601550; US8673, EF426135; US6097, EF426134; US9828, EF426139; US6153, EF426121; US8616, EF426138; US9951, EF426129; US9825, EF426128; US9875, EF426137; US9874, EF426136; US8960, EF426125; US8979, EF426126; US6161, EF426122; Ch-1, D82970; Ch-1, X98870; 02V0002G3, DQ096805; PO-13, D16329; 993-83, L13765; RK3, BAA22523; Ty-1, D82980; Ty-3, D82981; Ty-3, X98872.
VP2. For the gene encoding VP2, no unambiguous nucleotide alignment could be obtained for the 5' end region (nucleotides 1 to 134) since numerous insertions/deletions are present in this region. At present, the reason for, or function of, this sequence heterogeneity is unknown. For this reason, the gene 2 alignments preceding nucleotide 134 (amino acid 40) with respect to human rotavirus reference strain Wa (accession numbers X14942 and CAA33074) were omitted from the analyses. The resulting phylogenetic dendrogram (see Data S7A in the supplemental material) was based on 58 complete VP2 nucleotide sequences. The most appropriate classification of VP2 into five genotypes, tentatively designated C (for core shell protein) genotypes, is shown by the vertical dashed line in Data S7A in the supplemental material. This classification was best reflected by a nucleotide identity cutoff value of 84%, as shown in the identity frequency graph in Data S7B in the supplemental material. The few intragenotype identities below the proposed 84% cutoff value (see Data S7C, D, and E in the supplemental material) are entirely due to the identities between the three strains A131 (porcine), A253 (porcine), and BRV033 (bovine) and the other members of the C2 genotype.
VP3. Sixty-seven VP3 nucleotide sequences were used to construct the phylogenetic dendrogram shown in Data S8A in the supplemental material. The best alternative to classify the dendrogram into six suitable VP3 genotypes, tentatively designated M (for methyltransferase) genotypes, is shown as a vertical dashed line. The corresponding identity frequency graph (see Data S8B in the supplemental material) showed that an 81% nucleotide identity cutoff value was a perfect reflection of these proposed genotypes. Interestingly, the VP3-encoding gene segment of L26 determined in this study belonged to genotype M2. In contrast, the VP3-encoding gene of L26 as described previously by Cook and McCrae belonged to the M1 genotype (11). The reason for this discrepancy is not known.
NSP1. One hundred complete NSP1 nucleotide sequences were used to construct the phylogenetic dendrogram shown in Data S9A in the supplemental material. The most suitable alternative to divide the dendrogram into appropriate genotypes revealed 14 NSP1 genotypes, tentatively designated A (for interferon antagonist) genotypes (indicated by the vertical dashed line in Data S9A in the supplemental material). A 79% nucleotide identity cutoff value accurately reflects the classification as shown in the corresponding identity frequency graph (see Data S9B in the supplemental material). The few intragenotype identities below the 79% cutoff value were all due to identities among strains belonging to the A1 genotype.
NSP2. Seventy-one NSP2 nucleotide sequences were used to construct the phylogenetic dendrogram shown in Data S10A in the supplemental material. The best alternative to classify the dendrogram into five suitable NSP2 genotypes, tentatively designated N (for NTPase activity) genotypes, is shown as a vertical dashed line (see Data S10A in the supplemental material). The corresponding identity frequency graph (see Data S10B in the supplemental material) showed that an 85% nucleotide identity cutoff value was a good reflection of these proposed genotypes. The few intragenotype identities below the 85% cutoff value were mainly due to identities among strains belonging to the N1 genotype, and the few intergenotype identities above the cutoff value were due to identities between N1 and N2 (see Data S10B in the supplemental material).
NSP3. Seventy-seven NSP3 nucleotide sequences were used to construct the phylogenetic dendrogram shown in Data S11A in the supplemental material. The best alternative to classify the dendrogram into seven suitable NSP3 genotypes, tentatively designated T (for translation enhancer) genotypes, is shown as a vertical dashed line (see Data S11A in the supplemental material), and the corresponding identity frequency graph (see Data S11B in the supplemental material) showed that an 85% nucleotide identity cutoff value was the most appropriate for these proposed genotypes. The few intragenotype identities below the 85% cutoff value were due to identities among strains belonging to the T1 genotype.
NSP4. For NSP4, a classification into six genotypes (A to F) is currently used, based on the clustering into amino acid-based phylogenetic dendrograms using approximately 100 sequences (9, 22, 26, 46). In our analysis, a phylogenetic dendrogram was constructed with 430 available NSP4 nucleotide sequences (Fig. 4A). Large clusters have been replaced by triangles, but the full dendrogram is shown in Data S12 in the supplemental material. Several alternatives to divide the dendrogram into appropriate genotypes were analyzed, but only the best alternative is shown in Fig. 4A (vertical dashed line). The corresponding identity frequency graph (Fig. 4D) suggested that 85% was the most appropriate percentage identity cutoff value although a considerable number of intragenotype identities were found below this cutoff value. Although six NSP4 genotypes have been described to date, a total of 11 NSP4 genotypes, largely corresponding with the existing classification, were determined and tentatively designated E (for enterotoxin) genotypes when all available sequences were utilized. NSP4 genotype E1 corresponded completely with the Wa-like (genotype B) NSP4 genotype; genotype E2 largely corresponded to the KUN-like (genotype A) NSP4 genotype and was the most diverse genotype, as the majority of the rotavirus strains isolated from several species of origin (human, bovine, equine, simian, and ovine) clustered together, although the lapine or lapine-like human rotavirus strains were not included; genotype E3 corresponded completely to the AU-1-like (genotype C) NSP4 genotype; genotype E4 (previously NSP4 genotype E) was restricted to avian rotavirus strains (PO-13 and Ty-1); genotype E5 was composed of lapine strains and a human lapine-like rotavirus strain, previously belonging to the KUN-like (genotype A) NSP4 genotype.
The assignment of the lapine NSP4 sequences into a single distinct genotype is also supported by previous observations showing a rather low degree of identity between lapine rotavirus strains and the remaining NSP4 genotype A strains (9); genotype E6 was restricted to newly described unusual human G12 strains (N26-02 and RV176-00) (63); genotype E7 (previously NSP4 genotype D) was composed entirely of murine rotavirus strains; genotype E8 contained a bovine strain (PP-1, which was originally proposed to belong to the NSP4 genotype B) (15) and two canine strains (RV52/96 and RV198/95); genotype E9 contained a human (A_G4_120) and a porcine (CMP034) strain that were described recently (30), while genotype E10 (previously NSP4 genotype F) contained the avian strain Ch-1 only; and genotype E11 was composed of the avian rotavirus strains Ty-3 and AvRV-1. The majority of the newly added NSP4 genotypes are due to the inclusion of new NSP4 sequences found in the GenBank database. A detail of the boxed region of Fig. 4D is shown in Fig. 4E and reveals that the greatest contribution to the intragenotype identities below the proposed cutoff value was generated by the identities among strains assigned to the three largest genotypes, E2, E1, and E3 (Fig. 4F), in descending order.
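The idea behind an identity frequency graph and a percentage identity cutoff can be illustrated with a minimal Python sketch. It is not the software or procedure used in this study: the function names, the toy sequences, the genotype labels X1 and X2, and the choice to skip gapped alignment columns pairwise are all assumptions made for illustration. The sketch computes pairwise nucleotide identities from an existing alignment, bins them separately for intragenotype and intergenotype pairs, and lists the pairs that conflict with a chosen cutoff.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are skipped pairwise
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_frequencies(aligned, genotype_of):
    """Count pairwise identities per whole-percent bin, split into intra/inter pairs."""
    hist = {"intra": {}, "inter": {}}
    for s1, s2 in combinations(sorted(aligned), 2):
        pid = percent_identity(aligned[s1], aligned[s2])
        kind = "intra" if genotype_of[s1] == genotype_of[s2] else "inter"
        bin_start = int(pid)  # e.g. 84.7 falls in the 84% bin
        hist[kind][bin_start] = hist[kind].get(bin_start, 0) + 1
    return hist


def pairs_conflicting_with_cutoff(aligned, genotype_of, cutoff):
    """Intragenotype pairs below the cutoff and intergenotype pairs at or above it."""
    conflicts = []
    for s1, s2 in combinations(sorted(aligned), 2):
        pid = percent_identity(aligned[s1], aligned[s2])
        same = genotype_of[s1] == genotype_of[s2]
        if (same and pid < cutoff) or (not same and pid >= cutoff):
            conflicts.append((s1, s2, pid))
    return conflicts


# Toy aligned sequences (hypothetical, 20 nt each); not real rotavirus data.
toy_alignment = {
    "strain_a": "ACGTACGTACGTACGTACGT",
    "strain_b": "ACGTACGTACGTACGTACGA",
    "strain_c": "ACGTTCGAACGGACGTTCGA",
}
toy_genotypes = {"strain_a": "X1", "strain_b": "X1", "strain_c": "X2"}

hist = identity_frequencies(toy_alignment, toy_genotypes)
conflicts = pairs_conflicting_with_cutoff(toy_alignment, toy_genotypes, cutoff=85.0)
print(hist)
print(conflicts)
```

In this toy case the single intragenotype pair lies above the 85% cutoff and both intergenotype pairs lie below it, so no pair conflicts with the cutoff. With real data, intragenotype pairs that fall below the cutoff would show up in the conflict list, analogous to the bars discussed for E2, E1, and E3.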
FIG. 4. Phylogenetic dendrogram of NSP4 at the nucleotide (nt) level (A). Bootstrap values (2,000 replicates) above 50 are shown. Designations of species of origin are as follows: Bo, bovine; Hu, human; Po, porcine; Ca, canine; Av, avian. The dashed line indicates the best option to divide the dendrogram into appropriate genotypes. Certain clusters are replaced by triangles, in which the height of the triangle represents the number of sequences, and the width represents the genetic diversity inside that cluster. The full NSP4 phylogenetic dendrogram is provided in Data S6 in the supplemental material as well as the accession numbers used to construct the dendrogram. The closing braces on the right side of the dendrograms depict the 11 E genotypes as they are proposed in this study (C). Panel B depicts the genotypes as they were previously used. Panel D shows the identity frequency graph. Panels E, F, and G show details of the boxed region of panel D. In panel F, the identities among rotavirus strains belonging to the E3, E1, and E2 genotypes are shown. In panel G, the above-mentioned bars are omitted.
Recommendations for a rotavirus classification system of all 11 genome segments. A summary of all 11 calculated cutoff values for the rotavirus gene segments is shown in Table 1. With these nucleotide cutoff percentages and the resulting appropriate genotypes for all the rotavirus proteins, a complete nomenclature and classification system can be proposed based on the evidence provided in this study. To designate the complete genetic constellation of a virus, the following schematic nomenclature is suggested: Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx, representing the genotypes of, respectively, the VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5 genes, with x indicating the numbers of the corresponding genotypes. In Table 2, this classification system has been applied to a number of human, bovine, porcine, avian, and simian rotavirus strains, demonstrating the practical use of the system in identifying reassortment events and examples of interspecies transmission and providing evidence on the role of animals as a source of rotavirus infection in humans. An extended table with more strains, including equine, murine, lapine, and ovine strains, is available in Data S13 in the supplemental material.
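The schematic nomenclature lends itself to simple machine handling. The following Python sketch is one possible way to parse and format a constellation string; the function names and the dictionary representation are hypothetical and not part of the proposed system. It assumes that every one of the 11 segments carries a numeric genotype in the order given above, so partially typed strains are rejected.

```python
import re

# Letter code and gene for each position, in the order of the proposed nomenclature.
SEGMENT_ORDER = [
    ("G", "VP7"), ("P", "VP4"), ("I", "VP6"), ("R", "VP1"), ("C", "VP2"),
    ("M", "VP3"), ("A", "NSP1"), ("N", "NSP2"), ("T", "NSP3"), ("E", "NSP4"),
    ("H", "NSP5"),
]


def parse_constellation(text: str) -> dict:
    """Map each gene to its genotype number, e.g. 'G3-P[9]-I3-...' -> {'VP7': 3, ...}."""
    parts = text.strip().split("-")
    if len(parts) != len(SEGMENT_ORDER):
        raise ValueError(f"expected {len(SEGMENT_ORDER)} segments, got {len(parts)}")
    genotypes = {}
    for (letter, gene), part in zip(SEGMENT_ORDER, parts):
        # The P genotype number is written between square brackets.
        pattern = r"P\[(\d+)\]" if letter == "P" else rf"{letter}(\d+)"
        match = re.fullmatch(pattern, part)
        if match is None:
            raise ValueError(f"cannot read {gene} genotype from {part!r}")
        genotypes[gene] = int(match.group(1))
    return genotypes


def format_constellation(genotypes: dict) -> str:
    """Inverse of parse_constellation."""
    pieces = []
    for letter, gene in SEGMENT_ORDER:
        number = genotypes[gene]
        pieces.append(f"P[{number}]" if letter == "P" else f"{letter}{number}")
    return "-".join(pieces)


au1_like = "G3-P[9]-I3-R3-C3-M3-A3-N3-T3-E3-H3"
parsed = parse_constellation(au1_like)
print(parsed["VP4"], parsed["NSP4"])
print(format_constellation(parsed) == au1_like)
```

The example string is the AU-1-like constellation as written in this article; parsing it and formatting the result reproduces the original string.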
TABLE 1. Nucleotide (amino acid) percentage identity cutoff values defining genotypes for 11 rotavirus gene segments
TABLE 2. Application of the proposed classification and nomenclature to the structural and nonstructural protein-encoding genes for known human, bovine, porcine, simian, and avian rotavirus strains
Origin of rotaviruses infecting humans. There has been speculation on the role of animals as a source of rotavirus infection in humans (5). As Table 2 and Data S13 in the supplemental material show, certain naturally occurring animal rotavirus strains may infect humans, but this type of interspecies transmission appears to be a rare event (5, 8, 15, 31, 38, 50, 52, 53).
Colors were added to Table 2 to visualize certain patterns or constellations of genes more easily: green, red, and orange were used for Wa-like, DS-1-like, and AU-1-like genogroup gene segments, respectively. Yellow, blue, and purple were used for the avian PO-13-like rotavirus gene segments; some typical porcine VP4, VP7, and VP6 genotypes; and the SA11-like gene segments, respectively. With the help of the color codes in Table 2, the source or origin of some human rotavirus strains, namely the Wa-like and DS-1-like genogroup human strains, became evident.
First, the most widespread human rotavirus strains belonging to the Wa-like genogroup (G1P[8], G3P[8], G4P[8], and G9P[8]) share the majority, if not all, of the genotypes with porcine rotavirus strains: R1-C1-M1-A1-N1-T1-E1-H1, corresponding to VP1 to VP3 and NSP1 to NSP5 genotypes (Table 2). In terms of the VP4, VP7, and VP6 genotypes, the human Wa-like rotavirus strains are mostly found to possess the G(1, 3, 4, or 9)-P[8]-I1 genotypes, while the porcine rotavirus strains mostly possess the G(5 or 11)-P[7]-I5 genotypes. Genotypes G1, G3, G4, G5, G9, G(5 or 11), and G12 have also been shown to infect pigs and humans (Table 2). The observation that the genes encoding VP4, VP7, and VP6 are partially distinct between the human Wa-like rotavirus strains and porcine rotavirus strains might be explained by the different selective pressures triggered in different host species. The close evolutionary relationship between NSP1, NSP2, and NSP4 of human Wa-like and porcine rotavirus strains has been described previously (9, 31, 61), and RNA-RNA hybridization studies showed at least five to six hybrid bands between porcine rotavirus strain YM and some Wa-like genogroup human rotavirus strains (31). Together, these data strongly suggest a common origin between human Wa-like strains and porcine rotaviruses.
Second, human rotavirus strains belonging to the DS-1-like genogroup shared the majority of their genotypes, namely VP6, VP1, VP2, VP3, NSP2, and NSP4 (I2-R2-C2-M2-N2-E2), with those of bovine rotavirus strains (Table 2). The different G and P genotypes commonly found associated with human DS-1 genogroup rotaviruses (G2 and P[4]) and bovine rotaviruses (G6, G8, and G10 and P[5] and P[11]) might again be due to different selective pressures in these two host species. However, the genotypes of NSP1 (A2 and A3), NSP3 (T2 and T6/T7), and NSP5 (H2 and H3), which are commonly found in human DS-1-like and bovine rotavirus strains, respectively, are not phylogenetically closely related (see Data S9, S11, and S13 in the supplemental material). A close evolutionary relationship between human DS-1-like and bovine rotaviruses has been previously described for several gene segments such as NSP2, NSP4, and VP3 (9, 11, 61). Again, these data point toward a common origin between the human DS-1-like rotavirus strains and bovine rotaviruses, although a few gene segments might have been replaced by reassortment. The third, but minor, human AU-1-like genogroup (G3-P[9]-I3-R3-C3-M3-A3-N3-T3-E3-H3) is believed to have a close evolutionary relationship with canine and feline rotavirus strains, based on RNA-RNA hybridization studies (53). The very few sequence data that are available for feline rotaviruses (Table 2; see also Data S13 in the supplemental material) support this hypothesis, but additional sequence data of feline, and possibly canine, strains are necessary to properly answer this question. These results suggest that human and animal rotavirus strains are linked very closely and emphasize the importance of monitoring rotaviruses in animal populations, as well as following uncommon rotavirus strains in the human population very closely.
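The segment-by-segment comparisons described above can be sketched in a few lines of Python. The four constellations below are illustrative composites assembled from the genotypes named in this section (for example, G1-P[8]-I1 for a human Wa-like strain and G6-P[5] with A3, T6, and H3 for a bovine strain); they are assumptions for demonstration and do not reproduce specific rows of Table 2. The function name is likewise hypothetical.

```python
# Gene order follows the proposed nomenclature (VP7 first, NSP5 last).
GENES = ("VP7", "VP4", "VP6", "VP1", "VP2", "VP3",
         "NSP1", "NSP2", "NSP3", "NSP4", "NSP5")


def constellation(*genotypes: str) -> dict:
    """Pair 11 genotype labels with their genes."""
    if len(genotypes) != len(GENES):
        raise ValueError("one genotype label per gene segment is required")
    return dict(zip(GENES, genotypes))


def shared_segments(a: dict, b: dict) -> list:
    """Genes for which two strains carry the same genotype."""
    return [gene for gene in GENES if a[gene] == b[gene]]


# Illustrative composites based on the text, not actual Table 2 entries.
human_wa_like = constellation("G1", "P[8]", "I1", "R1", "C1", "M1", "A1", "N1", "T1", "E1", "H1")
porcine_typical = constellation("G5", "P[7]", "I5", "R1", "C1", "M1", "A1", "N1", "T1", "E1", "H1")
human_ds1_like = constellation("G2", "P[4]", "I2", "R2", "C2", "M2", "A2", "N2", "T2", "E2", "H2")
bovine_typical = constellation("G6", "P[5]", "I2", "R2", "C2", "M2", "A3", "N2", "T6", "E2", "H3")

wa_vs_porcine = shared_segments(human_wa_like, porcine_typical)
ds1_vs_bovine = shared_segments(human_ds1_like, bovine_typical)
print(wa_vs_porcine)
print(ds1_vs_bovine)
```

Under these assumed constellations, the Wa-like and porcine examples share VP1 to VP3 and NSP1 to NSP5, and the DS-1-like and bovine examples share VP6, VP1, VP2, VP3, NSP2, and NSP4, mirroring the patterns described in the text. Segments absent from the shared list are the candidates for reassortment or host-specific selection.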
Our phylogenetic classification of VP4 into different P genotypes showed that an 80% identity cutoff value was in accordance with the established P genotypes. However, identities between P[8] and P[4] ranged from 84% to 89%, completely above the 80% cutoff value, reinforcing the idea that P[8] and P[4] are not only subtypes P1A and P1B of one serotype (16, 18, 23) but also subtypes of one genotype. The three recently described novel porcine P genotypes, represented by strains 344/04-1, CMP034, and P21-5, isolated in Italy, Thailand, and Slovenia, respectively (30, 35, 68), were all classified as distant members of the P[27] genotype, as was noted before (60).
The phylogenetic analyses of the VP7 gene sequences revealed that 80% and 89% identity cutoff values at the nucleotide and amino acid levels, respectively, provided the most appropriate classification from an evolutionary perspective. However, the classification proposed herein does not fully concur with the current VP7 classification scheme, as outlined below.
(i) Until now, all the VP7 genes of rotaviruses isolated from mice have been designated G3 or G3-like (14, 41, 74). Our phylogenetic analyses indicate that the murine rotaviruses could be assigned to a new G genotype, tentatively designated G16. Serological analyses of five murine strains, EW (also known as EDIM), EB, EC, EL, and EHP, have revealed that strains EW and EHP are not recognized by G3-specific sera or G3-specific MAbs (6, 14, 20). Among the murine rotavirus strains, only strain EB meets traditional criteria as serotype G3, while murine strain EC, and probably EL, are recognized by most, but not all, G3 serotype-specific MAbs (14). The American murine rotavirus strain EMcN showed a diverse pattern of reactivity with G3-specific hyperimmune sera, and the Japanese murine rotavirus strain YR-1 did not react with G3-specific MAbs at all (41, 74). We recognize that some murine strains may exhibit a dual VP7 serotype similar to strains MDR-13 (G3 and G5) and IAL28 (G5 and G11) and escape mutants of strain A253 (G11 and G3) (7, 16, 72).
(ii) All avian rotaviruses have been classified as serotype G7 even though there are several conflicting serotyping reports in the literature (25, 43, 65). The traditional avian genotype G7 and the three proposed novel avian genotypes (G17, G18, and G19) show less than 78% and 84% identity at the nucleotide and amino acid levels, respectively, which are differences very similar to those observed between different mammalian rotavirus genotypes (data not shown).
(iii) As reported previously (7, 72), G5 and G11 rotavirus strains are genotypically closely related. A human rotavirus strain, IAL-28, exhibits both the G5 and G11 serotype specificities (72), further strengthening the observation that G5 and G11 are closely related. With a few exceptions (8, 64, 73), both G5 and G11 rotaviruses have mainly been isolated from pigs. Since genotype G11 is closer to genotype G5 than to genotype G3, it is interesting that escape mutants of porcine strain A253 (G11) exhibited both G11 and G3 serotype specificities instead of G11 and G5 serotype specificities (7). Until the antigenic relationships between G5 and G11 rotavirus strains are fully understood, we propose to keep these two genotypes separate as they are clearly antigenically distinct.
(iv) Finally, the large sequence identity range found for G3 rotaviruses (Fig. 2G, light green bars) is not to be ignored. Given that G3 rotaviruses have the largest spectrum of host species and a higher degree of intergenotype sequence diversity than other genotypes (57), it does raise the level of complexity of the phylogenetic analyses (see discussion of murine rotavirus strains above) (29). In addition, G3 and G14 rotavirus strains are more closely related to each other serologically than different genotypes usually are (10).
The high level of intragenotype sequence diversity within the VP2 and VP6 genotype 2 (C2 and I2, respectively), together with the broad range of host species within these genotypes, resembles features observed in the VP7 serotype G3 (29, 57). A remarkable observation is that except for VP6 genotypes I2 and I5, all the remaining eight VP6 genotypes contain rotaviruses isolated from one or two host species. No correlation was found between our VP6 genotyping system and VP6 SG specificity (data not shown) as several of the VP6 genotypes contain strains with different SG specificities (17, 27, 41, 69).
The identification of distinct genotypes for each rotavirus genome segment yielded multiple examples of related strains (Wa and KU, DS-1 and TB-Chen, and Dhaka25-02 and B4633-03), which cluster closely together in all 11 phylogenetic dendrograms, suggesting similar patterns of evolution. Several of the phylogenetic dendrograms, such as those for VP1, VP2, VP3, VP6, NSP2, and NSP3, show similar branching patterns for most of the established genotypes.
Our classification system concurs with the main group A rotavirus genogroups (Wa, DS-1, and AU-1), as established through RNA-RNA hybridization analyses (51). In line with the established human genogroups Wa, DS-1, and AU-1, the genotypes for each genome segment of these reference rotavirus strains were assigned to genotypes 1, 2, and 3, respectively, for consistency. All genome segments of the avian rotavirus PO-13 strain were assigned to genotype 4. For the genome segments encoding NSP1, NSP2, NSP3, VP2, and VP3, genotype 5 was assigned to the cluster containing the SA11 clones. The remaining genotypes of the structural and nonstructural protein-encoding genes were assigned randomly, but systematically.
The demonstration that Wa-like human rotaviruses possess similar constellations of genes and, hence, a close genetic relationship to porcine rotaviruses while the DS-1-like human rotaviruses have a close relationship with bovine rotaviruses suggests provocative ideas about the role of animals as a source of rotavirus infection in humans. The fact that each of the three main human rotavirus families may have a different animal origin would be instrumental in obtaining a better understanding of rotavirus host-range restriction and rotavirus ecology in nature. It is of interest that the only animals that can be productively infected experimentally by human rotavirus strains are piglets and calves (4). Piglets are monogastric animals with intestinal physiology that resembles that of humans and are susceptible to infection and severe disease by the human rotavirus Wa strain (75) but not the DS-1 strain (24). On the other hand, human DS-1-like rotaviruses have been shown to successfully infect cattle (45, 77), and the human Wa-like strain D can also infect and induce mild disease in calves (44). So far, dogs have been experimentally infected only with canine rotavirus strains (28).
To date, a total of 35 human rotavirus strains have been completely sequenced, while only five and four porcine and bovine strains, respectively, have been completely sequenced (16, 38, 39, 63; also this report). Additional porcine and bovine rotavirus strains will be fully sequenced, allowing a further investigation of whether pigs and cows act as reservoirs for Wa-like and DS-1-like human rotavirus strains, respectively. Likewise, we are currently sequencing all the genome segments of feline and canine strains to determine if these domestic animals act as reservoirs for AU-1-like rotavirus strains in humans.
A further advantage of the establishment of this novel genotyping classification system is that a more systematic approach can be used to investigate possible genetic linkages among rotavirus genome segments. A comprehensive evaluation of the possible genetic linkage between (sets of) genes for all 11 genes is currently being conducted (unpublished data).
In summary, we calculated nucleotide percentage identity cutoff values to define genotypes for the gene segments encoding the 11 proteins of group A rotaviruses. Appropriate genotypes were defined, and a novel nomenclature system is proposed, allowing a rational comparison between different rotavirus strains and international standardization. The application of this classification system to a large number of rotavirus strains identified distinct genotypes in all genes, which probably follow separate evolutionary paths, and allowed the identification of multiple reassortment and interspecies transmission events. The system also revealed the possible animal origin of the most common human rotavirus strains. The data reemphasize the rationale for simultaneous analysis of animal and human rotavirus strains.
J.M. was supported by the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT Vlaanderen). J.T.P. and S.M.M. were supported by the Intramural Research Program of the National Institute of Allergy and Infectious Diseases, NIH. E.H. was supported by an appointment to the Oak Ridge Associated Universities Research Associates/Specialists Program at the NIH.
Published ahead of print on 23 January 2008.
Supplemental material for this article may be found at http://jvi.asm.org/.
a, M. E. Conner, and F. Liprandi. 2001. Antigenic and molecular analyses reveal that the equine rotavirus strain H-1 is closely related to porcine, but not equine, rotaviruses: interspecies transmission from pigs to horses? Virus Genes 22:5-20.
Copyright © 2009 by the American Society for Microbiology.