Distance analysis showed that sequence divergence between patients ranged from 5.5 to 11.9%, with a median of 8.7% (). Within-patient sequence divergence was much lower, ranging from 1.2 to 3.9% with a median of 2.2% (). A t-test showed that sequence divergence between patients was significantly higher than divergence within patient sequence populations (p < 0.0001).
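The percent divergence values above can be illustrated with a minimal sketch. It assumes an uncorrected pairwise distance (percent of differing sites between aligned sequences, skipping gap columns); the text does not state which distance model was used, so this is an illustration and not the authors' method. The function names and the toy sequences in the usage lines are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Percent of differing sites between two aligned sequences.

    Assumption: uncorrected p-distance; columns with a gap ('-') in
    either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


def mean_within(seqs):
    """Mean pairwise distance among the sequences of one patient."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(seqs_a, seqs_b):
    """Mean distance over all cross-patient sequence pairs."""
    total = sum(p_distance(a, b) for a in seqs_a for b in seqs_b)
    return total / (len(seqs_a) * len(seqs_b))


# Hypothetical toy input, not patient data.
print(p_distance("ACGU", "ACGA"))  # 25.0
```

Per-patient within values and per-pair between values computed this way would then be the two samples compared by the t-test.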
Fig. 2 Distance within patient sequence populations. Distances within each patient’s sequences are provided as a percent with standard errors displayed on each bar. The number of sequences used for each patient is shown below the patient designation.
Each patient contained a dominant pre-miRNA sequence for pre-miR-H1 that is shown in . In , the dominant mature HIV-1 miR-H1 sequences are provided with variable positions indicated. In –, we provide a summary of all the pre-miR-H1s for each patient as follows: (1) Group 1 is the dominant group, (2) Groups 2–11 contained three or more sequences, and (3) the final group for each patient contains all sequences with two or fewer representative stem-loop structures. The number of sequences in each group is also shown, along with the number and type of tissues from which the sequences were derived. In the lymphoma patients (AM and IV), tissues are further classified as tumor, non-tumor, or lymph nodes. In the patients where sequences from brain tissues were available (BW, AZ, DY, CX, GA), the tissues are also identified as brain or non-brain.
HIV-1 mir-H1 stem-loop sequences.
Aligned mature miR-H1 sequences.
Patients with brain infection or clinical HAD (GA, CX, DY, BW) shared similar miR-H1 structures (). Additionally, eight of the 10 miR-H1 structures built from molecular clone sequences resembled those of GA, CX, DY, and BW in that each had a similar length, no apparent deletion or insertion, and a similar loop (Supplemental Figure 2). The structure for patient AZ, a patient with cardiovascular disease (CVD), was longer than all other miRNAs examined by 12 nucleotides, most of which were located in the terminal loop and away from the mature miRNA sequence. Two patients with ARL had markedly different miR-H1 structures. Patient AM exhibited a very large bulge in the center of the stem that contained a significant proportion of the mature miRNA sequence. Patient IV no longer had the mature miR-H1 in any sequences. illustrates that despite this deletion, patient IV’s pre-miR structure can be reconstituted if additional sequence data upstream of the deletion is used to complete the structure. A similar approach was taken for two sequences derived from patient CX’s periventricular space that also exhibited deletions within the HIV-1 miR-H1 (). All three structures are consistent with a viable pre-miR structure, especially the two structures derived from the CX deletions.
Fig. 3 Dominant stem-loop structures of the HIV-1 miR-H1 sequence for all patients. The published stem-loop structure is shown in the upper left of the figure. The mature miRNA sequence, which would be excised after dicer processing, is highlighted in blue.
shows substitution patterns within the stem-loop structure for each patient and each of their non-identical sequences. Each nucleic acid substitution is shown along with its associated group number. Substitutions that could alter the Group 1 pre-miR-H1 structure are highlighted in red. A substitution matrix developed for each patient clearly indicates a bias towards G → A transitions within the overall data set (n = 161), followed by U → C (n = 79) and then by nearly equal numbers of A → G (n = 42) and C → U (n = 39) transitions. As would be expected in the paired regions of a stable RNA structure, transversions were rare. Substitution matrices for all patients are provided as Supplemental Figure S2. A substitution matrix compiled from all patients is shown in .
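A substitution tally of the kind described above can be sketched as follows. The sketch assumes each variant is aligned to a reference of equal length and that the reference is the patient's dominant (Group 1) sequence; that choice of reference, the function names, and the toy sequences in the test-style usage are assumptions for illustration only. The four counts in the final lines are the transition totals reported in the text.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def classify(ref, alt):
    """Transition = purine<->purine or pyrimidine<->pyrimidine; else transversion."""
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def substitution_counts(reference, variants):
    """Count (ref, alt) substitutions of each variant against the reference.

    Assumption: sequences are pre-aligned RNA strings of equal length;
    gaps and ambiguous bases are ignored.
    """
    counts = Counter()
    for seq in variants:
        if len(seq) != len(reference):
            raise ValueError("variant is not aligned to the reference")
        for r, v in zip(reference.upper(), seq.upper()):
            if r != v and r in "ACGU" and v in "ACGU":
                counts[(r, v)] += 1
    return counts


def summarize(counts):
    """Collapse a substitution matrix into transition/transversion totals."""
    totals = Counter()
    for (r, v), n in counts.items():
        totals[classify(r, v)] += n
    return totals


# Transition counts reported in the text for the overall data set.
reported = Counter({("G", "A"): 161, ("U", "C"): 79, ("A", "G"): 42, ("C", "U"): 39})
print(summarize(reported)["transition"])  # 321
```

Keeping the matrix keyed by (reference base, variant base) preserves the direction of each change, which is what exposes the G → A bias.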
Fig. 6 Nucleic acid substitution patterns for each patient over all sequence groups. Notation used for structures is identical to that in . Sequence groups were defined in –. Arrows point to substitutions. Substitutions are indicated ...
Alignment of the stem-loop sequences showed three nucleic acid insertions in our data that had not been reported previously in the published sequence. These positions are highlighted in and . The first insertion, either an adenine or guanine, occurred within the mature miRNA sequence. BLAST revealed that this insertion was found in most HIV-1 LTR sequences (>6000 sequences) within GenBank. These results suggest that the published miR-H1 mature sequence is uncommon and not likely to be the majority structure for this miRNA family. The second insertion, an AG, occurred at the 3′ end of the loop structure. The insertion was also highly represented in the GenBank sequences; however, it is unlikely that the insertion of AG nucleotides in the loop would significantly alter the stability of the pre-miRNA structure.
The stability of the miRNA stem-loop secondary structure can be estimated by (1) calculating the energy needed for folding (the lower the energy, the more likely a miRNA structure will form), (2) calculating the GC content of the structure (the higher the GC content, the more likely the region is to be stemmed), (3) observing the number of paired and unpaired nucleotides in the stem, and (4) observing the size of the hairpin loop (loops of fewer than 5 or more than 10 nucleotides form less stable structures) (Griffiths-Jones et al., 2003; Lim et al., 2003; Pfeffer et al., 2004). A comparison of these parameters for each of the dominant sequences is presented in . Three of our sequences had a lower folding energy than the previously published miR-H1 structure (sequences from patients AZ, CX-1B, and CX-2B). Patient AZ’s decreased folding energy is likely due to the increased size of the structure and the increased loop length. It is interesting that the putative H1 miRNA structures constructed using data immediately adjacent to the deletions for patient CX appeared to be quite stable ( and ). One of patient CX’s structures contains the mature miRNA sequence (CX-2B) and the other does not (CX-1B). The new miRNA for patient IV has a low folding energy but contains a large number of unpaired nucleotides; it is therefore unclear how stable this structure would be in nature. The miRNA for patient AM is the least stable miRNA structure found in our data set.
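Three of the four stability parameters listed above (GC content, paired versus unpaired nucleotides, and hairpin loop size) can be computed directly from a sequence and its predicted structure. The sketch below assumes the structure is supplied in dot-bracket notation, as produced by common RNA folding programs; the folding energy itself would come from such a program and is not computed here. The function names and the toy inputs are hypothetical, and the 5 to 10 nucleotide range is the one quoted in the text.

```python
def gc_content(seq):
    """Percent of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def hairpin_stats(structure):
    """Paired/unpaired counts and hairpin loop sizes from dot-bracket notation.

    Assumption: '(' and ')' mark paired bases, '.' marks unpaired bases.
    """
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    # A hairpin loop is closed by a pair that encloses only unpaired bases.
    loop_sizes = [j - i - 1 for i, j in pairs if set(structure[i + 1:j]) <= {"."}]
    paired = 2 * len(pairs)
    return {
        "paired": paired,
        "unpaired": len(structure) - paired,
        "loop_sizes": loop_sizes,
    }


def loop_in_stable_range(size):
    """True if the loop size falls in the 5 to 10 nt range given in the text."""
    return 5 <= size <= 10


# Hypothetical toy hairpin, not a patient structure.
stats = hairpin_stats("((((.....))))")
print(stats["paired"], stats["unpaired"], stats["loop_sizes"])  # 8 5 [5]
```

Bulges and internal loops count toward the unpaired total, so a structure such as the one described for patient AM would show a high unpaired count even if its terminal loop were within range.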
A dominant pre-miR-H1 sequence usually appeared in most, if not all, of the sampled patient’s tissues (–), suggesting that the HIV miRNA structure for each patient generally did not mutate due to tissue-selective immune pressures and that there was selection for maintenance of specific structures within each patient. This hypothesis is supported by the fact that variants differing from the dominant pre-miR-H1 structure in patients were rare. Exceptions included non-tumor stomach sequences from patient IV, which contained many adenine substitutions (highlighted in green in ). A phylodynamic study (Salemi et al., 2009) also determined that this patient’s stomach sequences were highly divergent from tumor and lymph node sequences. Nine liver-derived sequences in patient DY (Group 3) also contained a structure with multiple adenine substitutions (highlighted in green in ). These nine sequences would result in major structural changes in the HIV-1 pre-miR-H1 and generate the highest folding energy of all sequences examined (−25.50 kcal/mol). Temporal lobe sequences from patient GA and frontal lobe white matter from patient BW were not found in the dominant group; however, the changes from the dominant sequence were slight and did not alter the structure.
shows all of the variation in each patient over the new HIV-1 pre-miR-H1 consensus structure. Substitutions in the center of the stem structure opposite the mature miRNA sequence were rare, found only randomly in individual sequences (motif = CGUCGACGAAUAUACGUCGUAGACUCCCGA). Substitutions within the mature HIV-1 miR-H1 sequence were found more frequently in the patients as follows: GA = 5 groups, BW = 3 groups, CX = 5 groups, DY = 4 groups, AZ = 2 groups, AM = 8 groups, IV = not applicable due to deletion. The increased number of substitutions within the AM mature miRNA sequence coincides with the decreased stability within this patient’s miRNA. Overall, most substitutions would not change the structure of the miRNA. However, as Bennasser et al. (2004) noted, the identification of a more stable sequence opposite the known miRNA could indicate that an additional mature miRNA sequence might exist in the structure. As with the published pre-miR-H1 sequence, the conserved motif on the opposite side of the stem-loop had numerous human mRNA targets when scanned against the mRNA database at NCBI (data not shown).