To investigate whether the overall distribution and accumulation of small RNAs is affected by the interaction between different V. vinifera genotypes [Cabernet Sauvignon (CS) and Sangiovese (SG)] and environments [Bolgheri, Montalcino and Riccione], we examined the regions of the grapevine genome from which a high number of small RNAs were produced, applying a proximity-based pipeline to group and quantify clusters of small RNAs as described by Lee et al. The nuclear grapevine genome was divided into 972,413 adjacent, non-overlapping, fixed-size windows, or clusters, of 500 bp. To determine the small RNA cluster abundance, we summed the hits-normalized abundance (HNA) values of all the small RNAs mapping to each 500 bp cluster, for each library. To reduce the number of false positives, we considered a cluster as expressed when its abundance was greater than the threshold for a given library, eliminating regions where few small RNAs were generated, possibly by chance. Libraries from bunch closure, representing green berries, and from 19 °Brix, representing ripened berries, were used in this analysis. Of the 972,413 clusters covering the whole grapevine genome, 4408 were identified as expressed in at least one sample. As shown in Figure 1, CS-derived libraries have a higher number of expressed clusters than SG-derived libraries of the same developmental stage and from the same vineyard. The exceptions were the Sangiovese green berries collected in Riccione and the Sangiovese ripened berries collected in Montalcino, which have a higher number of expressed clusters than the respective CS ones.
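The window-based quantification described above can be illustrated with a minimal Python sketch. It is not the pipeline of Lee et al.; it assumes that each small RNA is assigned to a window by its 0-based start coordinate, and the function names, the example hits and the threshold value of 5.0 are hypothetical, since the per-library threshold is not given here.

```python
from collections import defaultdict

WINDOW_SIZE = 500  # bp, the fixed window size stated in the text


def cluster_abundance(hits, window_size=WINDOW_SIZE):
    """Sum hits-normalized abundance (HNA) per fixed-size genomic window.

    `hits` is an iterable of (chromosome, start, hna) tuples for one library.
    Assumption: a read belongs to the window containing its 0-based start.
    """
    totals = defaultdict(float)
    for chrom, start, hna in hits:
        window_index = start // window_size
        totals[(chrom, window_index)] += hna
    return dict(totals)


def expressed_clusters(totals, threshold):
    """Return the windows whose summed abundance exceeds the library threshold."""
    return {window for window, abundance in totals.items() if abundance > threshold}


# Hypothetical hits for a single library: (chromosome, start, HNA)
example_hits = [
    ("chr1", 10, 4.0),
    ("chr1", 480, 6.5),
    ("chr1", 510, 1.0),
    ("chr2", 1200, 12.0),
]

totals = cluster_abundance(example_hits)
# The threshold of 5.0 is an illustrative value, not the one used in the study.
expressed = expressed_clusters(totals, threshold=5.0)
```

Running the same two functions on every library and taking the union of the expressed windows would give the set of clusters expressed in at least one sample.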
When Cabernet Sauvignon berries were green, a higher number of sRNA-generating regions were active in Bolgheri than in Montalcino and Riccione. In contrast, ripened berries had the highest number of expressed sRNA-producing regions in Riccione, while Bolgheri and Montalcino showed a similar level of expressed clusters. Sangiovese green berries instead showed the highest number of active sRNA-generating regions in Riccione, twice the number found in Bolgheri and Montalcino, which were similar to each other. Ripened berries collected in Montalcino and Riccione showed almost the same high level of sRNA-generating clusters, whereas those collected in Bolgheri presented a lower number. We also noted that, when cultivated in Bolgheri, neither Cabernet Sauvignon nor Sangiovese changed the number of expressed clusters dramatically during ripening, while in Riccione Cabernet Sauvignon showed a 2-fold increase in sRNA-producing clusters, which was not observed in Sangiovese. Next, the small RNA-generating clusters were characterized on the basis of the genomic regions to which they map, i.e., genic, intergenic and transposable elements. In general, when the berries were green, the numbers of sRNA-generating loci located in genic and intergenic regions were roughly equal in all environments and for both cultivars, except for Sangiovese berries collected in Riccione, which showed a slight bias of sRNA-producing regions toward intergenic regions. In ripened berries, by contrast, on average 65% of the sRNA-generating loci were in genic regions, indicating a strong genic bias of the sRNA-producing clusters. The shift of sRNA-producing clusters from intergenic to mostly genic is most pronounced in Cabernet Sauvignon berries collected in Riccione, with an increase of approximately 20% in expressed clusters located in genic regions when berries pass from the green to the ripened stage.
When comparing cluster abundance among libraries, we found that 462 clusters were expressed in all libraries. The remaining 3946 expressed clusters were either shared among groups of libraries or specific to single libraries. Interestingly, 1335 of the 4408 expressed clusters were specific to Riccione-derived libraries. The other two environments showed a much lower number of specific clusters, 263 and 140 in Bolgheri and Montalcino, respectively. Comparing the expressed clusters between cultivars or developmental stages, we did not observe a similar skew of specific clusters toward one cultivar or developmental stage; roughly the same proportion of specific clusters was found for each cultivar and for each developmental stage. Among the 1335 clusters specific to Riccione, 605 were specific to Cabernet Sauvignon ripened berries and 499 to Sangiovese green berries. Other smaller groups of expressed clusters were identified as specific to one cultivar, one developmental stage or one cultivar at a specific developmental stage. When comparing the expressed clusters with the transposable elements (TE) annotated in the grapevine genome, we noticed that approximately 23% of the sRNA-generating regions were TE-associated. Sangiovese green berries from Riccione have the highest proportion of TE-associated expressed clusters, while Cabernet Sauvignon ripened berries, also from Riccione, show the lowest proportion of TE-associated expressed clusters. Sangiovese berries have the highest percentage of expressed clusters located in TE when cultivated in Riccione, compared to the other two vineyards. Interestingly, Cabernet Sauvignon berries show the lowest proportion of TE-associated clusters when growing in Riccione, independently of the berry stage. In all the libraries, Long Terminal Repeat (LTR) retrotransposons were the most represented TE. More specifically, the gypsy family was the LTR class associated with the highest number of sRNA hotspots.
The other classes of TE associated with the sRNA-generating regions can be visualized in Figure 3B.
To determine the global relationship of small RNA-producing loci in the different environments, cultivars and developmental stages, we performed a hierarchical clustering analysis. As shown in Figure 4, the libraries clearly clustered according to developmental stage and cultivar and not according to environment.
The profiles of sRNA-generating loci of ripened and green berries were clearly distinguished from each other. Within each branch of green and ripened samples, Cabernet Sauvignon and Sangiovese were also well separated, indicating that the cultivar and the stage of development at which the berries were sampled modulate the profile of sRNA-producing loci more than the environment. Notwithstanding the evidence that developmental stage and variety have the strongest effect on sample clustering, we were interested in verifying the environmental influence on small RNA loci expression in the two cultivars. Thus, for each sRNA-generating cluster we calculated the ratio between cluster abundance in Cabernet Sauvignon and Sangiovese in each environment and developmental stage, thereby revealing the genomic regions with regulated clusters, considering a 2-fold change threshold, a minimum abundance of 5 HNA in each library and a minimum sum of abundance of 30 HNA. Figure 5 shows how different environments affect the production of small RNAs. In Bolgheri, regardless of the developmental stage, there were many clusters with a very high abundance level in Cabernet Sauvignon. In Montalcino, and even more in Riccione, we also observed differences in cluster expression between the two cultivars, with ripened and green berries showing an almost opposite profile in terms of the number of clusters more expressed in Cabernet Sauvignon or Sangiovese. When the berries were green, Cabernet Sauvignon showed the highest number of up-regulated clusters in Montalcino, while Sangiovese had the highest number of up-regulated clusters in Riccione. The opposite behavior was noticed in ripened berries, with Sangiovese having the highest number of up-regulated clusters in Montalcino and Cabernet Sauvignon in Riccione.
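The criteria used to call a cluster regulated between the two cultivars can be written down as a short Python sketch. This is an illustration under stated assumptions rather than the code used in the study: the function name and the returned labels are hypothetical, and the thresholds (2-fold change, at least 5 HNA in each library, at least 30 HNA in total) are treated as inclusive bounds, which the text does not specify.

```python
def classify_cluster(cs_hna, sg_hna, fold=2.0, min_each=5.0, min_sum=30.0):
    """Classify one cluster in one environment and developmental stage.

    cs_hna, sg_hna: cluster abundance (HNA) in the Cabernet Sauvignon and
    Sangiovese libraries. Returns a label for regulated clusters, else None.
    Assumption: all three thresholds are inclusive.
    """
    # Both libraries must reach the minimum abundance.
    if cs_hna < min_each or sg_hna < min_each:
        return None
    # The summed abundance must reach the minimum total.
    if cs_hna + sg_hna < min_sum:
        return None
    ratio = cs_hna / sg_hna  # safe: sg_hna >= min_each > 0
    if ratio >= fold:
        return "higher_in_CS"
    if ratio <= 1.0 / fold:
        return "higher_in_SG"
    return None


# Hypothetical abundance pairs (CS, SG)
label_cs = classify_cluster(40.0, 10.0)    # 4-fold higher in CS
label_sg = classify_cluster(10.0, 40.0)    # 4-fold higher in SG
label_flat = classify_cluster(20.0, 15.0)  # below the 2-fold threshold
label_low = classify_cluster(4.0, 100.0)   # fails the per-library minimum
```

Counting the two labels per environment and stage would give tallies of the kind summarized in Figure 5.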
Notably, we observed a small percentage of regulated clusters exhibiting at least a 10-fold higher abundance of small RNAs in one cultivar than in the other. An examination of those clusters showed that a substantial difference could exist between the cultivars, depending on the vineyard and the developmental stage. For example, in Riccione, a cluster matching a locus encoding a BURP domain-containing protein showed a fold change of 390 when comparing green berries of Sangiovese with those of Cabernet Sauvignon. The small RNAs mapping to this region were mainly 21-nt and produced from both strands. Similarly, the majority of the highly differentially expressed clusters showed the same profile: a strong bias toward 21-nt sRNAs and a low strand bias. These findings suggest that these small RNAs might be the product of RNA-dependent RNA polymerase (RDR) activity rather than degradation products of mRNAs.
We applied a pipeline adapted from Jeong et al. and Zhai et al. to identify annotated vvi-miRNAs, their variants, novel species-specific candidates and, when possible, the complementary 3p or 5p sequences. Starting from 25,437,525 distinct sequences from all 48 libraries, the first filter of the pipeline removed sequences matching t/r/sn/snoRNAs, as well as those that did not meet the threshold of 30 TP4M in at least one library or that mapped to more than 20 loci of the grapevine genome. Only sequences 18–26 nt in length were retained. Overall, 27,332 sequences, including 56 known vvi-miRNAs, passed this first filter and were subsequently analyzed by a modified version of miREAP as described by Jeong et al. miREAP identified 1819 miRNA precursors producing 1108 unique miRNA candidates, including 47 known vvi-miRNAs.
Next, the sequences were submitted to the third filter, which evaluates single-strand and abundance bias and retains only the one or two most abundant miRNA sequences for each previously identified precursor.
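As an illustration of the first filter of the pipeline described above, the following Python sketch applies the stated criteria to one sequence record. It is not the implementation of Jeong et al. or Zhai et al.; the `SmallRNA` record, its field names and the example values are hypothetical, the match against t/r/sn/snoRNAs is assumed to have been computed beforehand, and the 30 TP4M threshold is treated as inclusive.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SmallRNA:
    """Hypothetical record for one distinct small RNA sequence."""
    sequence: str
    tp4m_by_library: Dict[str, float]  # normalized abundance per library
    genome_hits: int                   # number of loci the sequence maps to
    is_structural_rna: bool            # matches a t/r/sn/snoRNA


def passes_first_filter(srna, min_tp4m=30.0, max_loci=20, min_len=18, max_len=26):
    """Apply the first-filter criteria stated in the text to one sequence."""
    if srna.is_structural_rna:
        return False
    if not (min_len <= len(srna.sequence) <= max_len):
        return False
    if srna.genome_hits > max_loci:
        return False
    # At least one library must reach the abundance threshold.
    return any(value >= min_tp4m for value in srna.tp4m_by_library.values())


# Hypothetical sequences built by repetition so that lengths are unambiguous
kept = SmallRNA("ACGU" * 5 + "A", {"lib1": 35.0, "lib2": 2.0}, 3, False)     # 21 nt
too_rare = SmallRNA("ACGU" * 5 + "A", {"lib1": 12.0, "lib2": 2.0}, 3, False)
repetitive = SmallRNA("ACGU" * 5 + "A", {"lib1": 35.0}, 21, False)
too_long = SmallRNA("ACGU" * 7, {"lib1": 35.0}, 3, False)                    # 28 nt

results = [passes_first_filter(s) for s in (kept, too_rare, repetitive, too_long)]
```

Only the first record passes all four checks; each of the others fails exactly one criterion.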
A total of 150 unique miRNAs corresponding to 209 precursors were identified as candidate miRNAs. Among these 209 candidate precursors, 61 belonged to 31 known vvi-miRNAs that passed all the filters and 148 were identified as putatively novel miRNA candidates. To confirm that they were novel candidates rather than variants of known vvi-miRNAs, we compared their sequences and coordinates with the miRNAs registered in miRBase. In order to reduce false positives and the selection of siRNA-like miRNAs, we considered only 20-, 21- and 22-nt candidates whose stem-loop structures were manually evaluated. Eventually, 26 miRNAs homologous to miRNAs of other plant species were identified with high confidence. Twenty-two were new members of nine known V. vinifera families, whereas the other four belong to three families not yet described in grapevine. For 16 homologs we were also able to retrieve the complementary sequence. Finally, excluding these 26 miRNAs and other siRNA-like miRNAs, we identified 7 completely novel bona fide miRNAs. Apart from the 61 known vvi-miRNAs identified by the pipeline, we searched the dataset for other known vvi-miRNAs eliminated throughout the pipeline, looking for isomiRs that were actually more abundant than the annotated sequences. Their complementary 3p or 5p sequence was also retrieved when possible. Hence, 89 known vvi-miRNAs were identified in at least one of our libraries. Among the known vvi-miRNAs identified, 24 had an isomiR more abundant than the annotated sequence and 4 had the complementary sequence as the most abundant sequence mapping to their precursor. We found 16 vvi-miRNA isomiRs that were either longer or shorter than the annotated sequence, 7 vvi-miRNAs that mapped to the precursor in a position shifted with respect to the annotated one and one miRNA that contains a nucleotide gap when compared to the annotated sequence.
An extreme case of shifted position was found in vvi-miRNA169c, where the annotated sequence had only 5 TP4M when summing its individual abundances in the 48 libraries. Another sequence, shifted by 16 bp relative to the annotated position on the precursor, had an abundance sum of 1921 TP4M; it was retained together with the annotated sequence and named vvi-miRNA169c.1. For 36 of the 48 V. vinifera miRNA families deposited in miRBase we found at least one member. An in silico prediction of miRNA targets was performed for the 191 mature miRNAs identified here. Using the miRferno tool, and considering only targets predicted with high stringency, 1192 targets were predicted for 143 miRNAs, including six completely novel vvi-miRNA candidates. Two novel candidates seem to be involved in regulating the biosynthesis of important secondary metabolites. Among the six targets predicted for grape-m1191, the TT12 gene is known to be involved in the vacuolar accumulation of proanthocyanidins in grapevine. For grape-m1355, 12 targets were predicted and all of them are involved in secondary metabolism pathways. Nine targets encode a bifunctional dihydroflavonol 4-reductase that is responsible for the production of anthocyanins, catalyzing the first step in the conversion of dihydroflavonols to anthocyanins. Another targeted gene encodes a phenylacetaldehyde reductase which, in tomato, was demonstrated to catalyze the last step in the synthesis of the volatile 2-phenylethanol, important for aroma and flavor. The same miRNA candidate was also predicted with high confidence to target a cinnamoyl reductase-like protein that is part of the polyphenol biosynthetic pathway.