Genetic variations and interspecific relationships in Salvia (Lamiaceae) using SCoT molecular markers

The genus Salvia includes an enormous assemblage of nearly 1000 species dispersed around the world. Iran, with 19 endemic species out of 61, is regarded as one of the important regions for Salvia diversity in Southwest Asia. Salvia species are herbaceous, rarely biennial or annual, and often strongly aromatic. These species are of medicinal, commercial and horticultural value. Because of the importance of these plants, we combined morphological and molecular data to study them. We used 145 randomly collected plants from 30 species in 18 provinces. Amplification of genomic DNA using 10 primers produced 134 bands, of which 129 were polymorphic (97.78%). The high average PIC and MI values revealed a high capacity of SCoT primers to detect polymorphic loci among Salvia species. The genetic similarities among the 30 collections ranged from 0.61 to 0.93. According to the SCoT marker analysis, S. tebesana and S. verticillata had the lowest similarity, and S. eremophila and S. santolinifolia had the highest. The present study addressed three questions: 1) can SCoT markers identify Salvia species, 2) what is the genetic structure of these taxa in Iran, and 3) how are the species inter-related? The results revealed that SCoT markers can identify the species.


INTRODUCTION
Accurately identifying the boundaries of a species is critical for a sound perspective in any biological study. Species delimitation is therefore the subject of an extensive body of work in biology (Collard & Mackill 2009, Luo et al. 2011, Wu et al. 2013). However, the criteria used to define species boundaries differ and remain a matter of debate (Jamzad 2012). Among populations, genetic diversity is non-randomly distributed and is affected by factors such as geographic variation, breeding system, dispersal mechanism and life span. Changes in environmental conditions often lead to variation in genetic diversity levels among populations, and populations with low variability are generally considered less able to adapt to adverse circumstances (Falk & Holsinger 1991, Olivieri et al. 2016). Most authors agree that genetic diversity is necessary to preserve the long-term evolutionary potential of a species (Falk & Holsinger 1991). In the last decade, experimental and field investigations have demonstrated that habitat fragmentation and population decline reduce the effective population size. Likewise, most geneticists consider population size an important factor in maintaining genetic variation (Turchetto et al. 2016).
Salvia L. is the largest genus in Lamiaceae (Mentheae-Salviinae), with approximately 1000 species diversified in three regions of the world: Central and South America (500 spp.), Western Asia (200 spp.) and Eastern Asia (100 spp.) (Walker et al. 2004). Iran, with 19 endemic species out of 61, is regarded as one of the important regions for Salvia diversity in Southwest Asia (Jamzad 2012). Salvia species are herbaceous, rarely biennial or annual, and often strongly aromatic. These species are of medicinal, commercial and horticultural value (Safaei et al. 2016). Some Salvia species also have pharmacological properties, including antiplatelet, anti-inflammatory and antithrombotic effects (Hosseinzadeh et al. 2003, Mayer et al. 2007, Fan et al. 2010). Some species of the genus are used in folk medicine, such as S. miltiorrhiza Bunge, which is used for the treatment of cardiovascular diseases (Wang et al. 2007, 2009). Salvia reuterana Boiss. is an endemic species that grows in the highlands of central Iran (Jamzad 2012). Its common name in Persian is "Mariam Goli Esfahani", and the aerial parts of the plant are traditionally used as a sedative and anxiolytic herbal medicine. In addition, the antibacterial, antioxidant, free radical scavenging and anti-anxiety properties of this herb have been demonstrated in recent studies (Erbano et al. 2015). The chemical composition of Salvia strongly indicates that the herb has the potential to become an important raw material for anti-inflammatory compounds, and knowledge of the diversity of wild populations will therefore be important to inform the use and conservation of this genus (Farag et al. 1986, Li & Quiros 2001). Genetic surveys, in particular, are key measures for efficiently accessing the genetic resources of species of pharmacological interest. Several markers have previously been applied to survey genetic variability within the genus Salvia (Song et al. 2010, Wang et al. 2011).
Specifically, there are some important publications addressing S. miltiorrhiza, most of them utilizing dominant markers (Wang et al. 2011).
Accordingly, some researchers have assessed this variability with ISSR and RAPD techniques in different Salvia species (Song et al. 2010, Wang et al. 2011, Sepehry Javan et al. 2012, Zhang et al. 2013, Peng et al. 2014, Erbano et al. 2015). Sepehry Javan et al. (2012) mentioned that three major factors influence genetic variation in Salvia: species, geographical distribution and selection. These factors, along with cross-pollination, make the taxonomy and genetic relationships of Salvia species unclear (Wang et al. 2011). Morphological characteristics are easily affected by the environment, which makes the identification of species more complex (Chen et al. 2013). The conservation and suitable use of plant genetic resources require accurate monitoring of their accessions. Genetic characterization is therefore essential to reveal the extent of plant genetic diversity and to discover better genotypes, especially in a geographically differentiated genus such as Salvia (Song et al. 2010, Peng et al. 2014, Patel et al. 2014, Kharazian et al. 2015).
With progress in plant molecular biology, numerous molecular marker techniques have been developed and widely used to evaluate genetic diversity, population structure and phylogenetic relationships. In recent years, advances in genomic tools have provided a wide range of new marker techniques, such as functional and gene-targeted markers, and have led to many novel DNA-based marker systems (Esfandani-Bozchaloyi et al. 2017a, 2017b, 2017c, 2017d). Start codon targeted (SCoT) polymorphism is one of the novel, simple and reliable gene-targeted marker systems. It offers a simple and reproducible DNA-based technique that targets the short conserved region surrounding the ATG translation start codon in plant genes (Collard & Mackill 2009). The technique is a polymerase chain reaction (PCR) based DNA marker with many advantages, such as low cost, high polymorphism and extensive genetic information (Collard & Mackill 2009, Luo et al. 2011, Wu et al. 2013).
The present investigation was carried out to evaluate the genetic diversity and relationships among Salvia species using a new gene-targeted molecular marker, SCoT. This is the first study on the use of SCoT markers in the genus Salvia; we therefore performed a molecular study of 145 specimens of 30 Salvia species.
We try to answer the following questions: 1) Is there infra- and interspecific genetic diversity among the studied species? 2) Is genetic distance among these species correlated with their geographical distance? 3) What is the genetic structure of populations and taxa? 4) Is there any gene exchange between Salvia species in Iran?

MATERIALS AND METHODS
Plant materials
For morphometric and SCoT analyses, 145 plant accessions (five to twelve samples from each population) belonging to 30 populations with different ecogeographic characteristics were sampled and stored at -20 °C until further use. More information about the geographical distribution of the accessions is given in Fig. 1.

Morphological studies
Five to twelve samples from each species were used for morphometry (for some endemic species, the number of samples collected within this range was limited by their rarity). In total, 22 morphological characters (9 qualitative, 13 quantitative) were studied. The data were standardized (mean = 0, variance = 1) and used to estimate Euclidean distances for clustering and ordination analyses (Podani 2000). The morphological characters studied were: corolla shape, bract shape, seed color, seed shape, bract color, corolla latex, leaf surface, calyx shape, basal leaf shape, pedicel length, calyx length, bract length, filament length, anther length, corolla length, nut length, nut width, basal leaf length, basal leaf width, corolla color, stem leaf length and stem leaf width.
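The standardization and distance step described above can be sketched as follows. This is a minimal illustration only: the measurements are hypothetical, the function names are ours, and the use of the population standard deviation is an assumption (the text specifies only mean = 0 and variance = 1).

```python
import math
import statistics


def standardize(columns):
    """Rescale each character (one column per character) to mean 0, variance 1."""
    out = []
    for col in columns:
        mean = statistics.fmean(col)
        sd = statistics.pstdev(col)  # assumption: population SD
        out.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return out


def euclidean_matrix(rows):
    """Pairwise Euclidean distances between specimens (one row per specimen)."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


# Hypothetical data: 4 specimens x 3 quantitative characters (not from the study)
specimens = [
    [12.0, 3.1, 20.0],
    [11.5, 3.0, 21.0],
    [18.0, 5.2, 35.0],
    [17.5, 5.0, 33.0],
]

z_cols = standardize(list(zip(*specimens)))   # standardize character by character
z_rows = [list(r) for r in zip(*z_cols)]      # back to one row per specimen
dist = euclidean_matrix(z_rows)
```

Standardizing first keeps characters measured on large scales (for example leaf length) from dominating the distance over characters measured on small scales.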

DNA Extraction and SCoT Assay
Fresh leaves were taken randomly from three to twelve plants in each of the studied populations and dried with silica gel powder. The CTAB activated charcoal protocol was used to extract genomic DNA (Esfandani-Bozchaloyi et al. 2019). The quality of the extracted DNA was examined by running it on a 0.8% agarose gel. Of the 25 SCoT primers developed by Collard & Mackill (2009), 10 primers with clear, enlarged and richly polymorphic bands were chosen (Table 1). PCR reactions were carried out in a 25 μl volume containing 10 mM Tris-HCl buffer at pH 8; 50 mM KCl; 1.5 mM MgCl2; 0.2 mM of each dNTP (Bioron, Germany); 0.2 μM of a single primer; 20 ng genomic DNA and 3 U of Taq DNA polymerase (Bioron, Germany). The amplification reactions were performed in a Techne thermocycler (Germany) with the following program: an initial denaturation step of 5 min at 94°C, followed by 40 cycles of 1 min at 94°C, 1 min at 52-57°C and 2 min at 72°C. The reaction was completed by a final extension step of 7-10 min at 72°C. The amplification products were visualized by running them on a 1% agarose gel, followed by ethidium bromide staining. Fragment size was estimated using a 100 bp molecular size ladder (Fermentas, Germany).

Morphological studies
Morphological characters were first standardized (mean = 0, variance = 1) and used to establish Euclidean distances among pairs of taxa (Podani 2000). For grouping of the plant specimens, the UPGMA (unweighted pair group method with arithmetic mean) clustering method and ordination methods were used (Podani 2000). ANOVA (analysis of variance) was performed to show morphological differences among the populations, while a PCA (principal components analysis) biplot was used to identify the most variable morphological characters among the studied populations (Podani 2000). PAST version 2.17 (Hammer et al. 2012) was used for multivariate statistical analyses of the morphological data.
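As an illustration of the UPGMA grouping step named above, the following sketch builds an average-linkage tree from a distance matrix. It is not the PAST implementation; the function name, the nested-tuple tree representation and the toy distance matrix are assumptions made for the example.

```python
def upgma(dist, labels):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples."""
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}      # id -> (subtree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of current clusters
        (a, b), _ = min(d.items(), key=lambda kv: kv[1])
        (tree_a, size_a) = clusters.pop(a)
        (tree_b, size_b) = clusters.pop(b)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical distances among four taxa (not from the study)
labels = ["A", "B", "C", "D"]
toy_dist = [
    [0, 1, 6, 6],
    [1, 0, 6, 6],
    [6, 6, 0, 2],
    [6, 6, 2, 0],
]
tree = upgma(toy_dist, labels)
print(tree)  # (('A', 'B'), ('C', 'D'))
```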

Molecular analyses
SCoT bands were coded as binary characters (presence = 1, absence = 0) and used for genetic diversity analysis. The discriminatory ability of the primers was evaluated by two parameters, polymorphism information content (PIC) and marker index (MI), which characterize the capacity of each primer to detect polymorphic loci among the genotypes (Powell et al. 1996). MI was calculated for each primer as MI = PIC × EMR, where the effective multiplex ratio (EMR) is the product of the number of polymorphic loci per primer (n) and the fraction of polymorphic fragments (β) (Heikrujam et al. 2015). The number of polymorphic bands (NPB) and EMR were calculated for each primer. Parameters such as Nei's gene diversity (H), Shannon information index (I), number of effective alleles and percentage of polymorphism (P% = number of polymorphic loci / number of total loci) were determined (Weising et al. 2005, Freeland et al. 2011). Shannon's index was calculated by the formula H' = -Σ p_i ln p_i. Resolving power (Rp) is defined per primer as Rp = Σ Ib, where Ib is the band informativeness, which takes the value 1 - (2 × |0.5 - p|), p being the proportion of genotypes containing the band. The percentage of polymorphic loci, the mean loci by accession and by population, UHe, H' and PCA were calculated with GenAlEx 6.4 software (Peakall & Smouse 2006). Nei's genetic distance among populations was used for Neighbor Joining (NJ) clustering and Neighbor-Net networking (Huson & Bryant 2006, Freeland et al. 2011). A Mantel test was used to check the correlation between geographical and genetic distances of the studied populations (Podani 2000). These analyses were done with PAST ver. 2.17 (Hammer et al. 2012) and DARwin ver. 5 (2012) software. An AMOVA (analysis of molecular variance) test (with 1000 permutations), as implemented in GenAlEx 6.4 (Peakall & Smouse 2006), was used to show genetic differences among the populations. Gene flow was determined by calculating Nm, an estimate of gene flow from Gst, with PopGene ver. 1.32 (1997) as: Nm = 0.5(1 - Gst)/Gst. This approach assumes an equal amount of gene flow among all populations.
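The per-primer indices defined above can be sketched directly from their formulas. The code below is an illustration under stated assumptions: the band matrix and the PIC value are hypothetical, the function names are ours, and PIC is taken as an input because its formula is not given in the text.

```python
import math


def effective_multiplex_ratio(n_polymorphic, n_total):
    """EMR = n * beta, with beta the fraction of polymorphic fragments."""
    beta = n_polymorphic / n_total
    return n_polymorphic * beta


def marker_index(pic, emr):
    """MI = PIC x EMR."""
    return pic * emr


def band_informativeness(p):
    """Ib = 1 - (2 x |0.5 - p|), p = proportion of genotypes carrying the band."""
    return 1 - 2 * abs(0.5 - p)


def resolving_power(band_matrix):
    """Rp = sum of Ib over all bands; rows are genotypes, columns are bands."""
    n_genotypes = len(band_matrix)
    return sum(band_informativeness(sum(col) / n_genotypes)
               for col in zip(*band_matrix))


def shannon_index(freqs):
    """H' = -sum(p_i ln p_i); zero frequencies contribute nothing."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def gene_flow_nm(gst):
    """Nm = 0.5 (1 - Gst) / Gst."""
    return 0.5 * (1 - gst) / gst


# Hypothetical binary band matrix for one primer: 4 genotypes x 4 bands
bands = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [1, 1, 0, 1],
]
n_total = len(bands[0])
n_poly = sum(1 for col in zip(*bands) if 0 < sum(col) < len(bands))

emr = effective_multiplex_ratio(n_poly, n_total)  # 3 * 3/4
mi = marker_index(0.40, emr)                      # 0.40 is a hypothetical PIC
rp = resolving_power(bands)
```

In this toy matrix the fourth band is present in every genotype, so it is monomorphic and contributes Ib = 0 to Rp.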

RESULTS
Species identification and inter-relationship
ANOVA of the morphometric data showed significant differences (P < 0.01) in quantitative morphological characters among the species studied. PCA was performed to determine the most variable characters among the taxa studied. It revealed that the first three factors comprised over 63% of the total variation. On the first PCA axis, which accounted for 42% of the total variation, characters such as seed shape, calyx shape, calyx length, bract length and basal leaf shape showed the highest correlation (>0.7); seed color, leaf surface, corolla length, filament length, nut width and basal leaf length were the characters influencing PCA axes 2 and 3, respectively. Different clustering and ordination methods produced similar results; therefore, only the PCA plot of morphological characters is presented here (Fig. 2). In general, plant samples of each species were grouped together and formed separate groups. This result shows that both quantitative and qualitative morphological characters separated the studied species into distinct groups. We did not encounter intermediate forms among the studied specimens.

Species Identification and Genetic Diversity
Ten SCoT primers were screened to study genetic relationships among Salvia species; all the primers produced reproducible polymorphic bands in all 30 Salvia species. An image of the SCoT amplification generated by primer SCoT-11 is shown in Figure 3. A total of 129 amplified polymorphic bands were generated across the 30 Salvia species. The size of the amplified fragments ranged from 100 to 2000 bp. The highest and lowest numbers of polymorphic bands were 20 for SCoT-14 and 8 for SCoT-3, with an average of 12.9 polymorphic bands per primer. The PIC of the 10 SCoT primers ranged from 0.36 (SCoT-1) to 0.55 (SCoT-14), with an average of 0.46 per primer. The MI of the primers ranged from 1.65 (SCoT-11) to 5.55 (SCoT-16), with an average of 3.6 per primer. The EMR of the SCoT primers ranged from 6.34 (SCoT-18) to 11.55 (SCoT-6), with an average of 8.4 per primer. Primers with high EMR values were considered more informative in distinguishing the genotypes. The genetic parameters were calculated for all 30 Salvia species amplified with SCoT primers (Table 2). The AMOVA test showed significant genetic differences (P = 0.01) among the studied species. It revealed that 66% of the total variation was among species and 34% was within species (Table 3). Moreover, genetic differentiation of these species was demonstrated by significant Nei's GST (0.21, P = 0.01) and D_est values (0.177, P = 0.01). These results revealed a higher distribution of genetic diversity among Salvia species than within species. Marrubium anisodon and M. cuneatum (out-groups) were separated from the other species. Two major clusters were formed in the WARD tree (Fig. 4). The first major cluster (A) contained two sub-clusters: S. sharifii and S. macrosiphon were separated from the other studied species, joined them at a great distance and comprised the first sub-cluster. The second sub-cluster was formed by S. xanthocheila, S. limbata, S. aethiopis, S. sclarea and S. virgata. The second major cluster also contained two sub-clusters: eight species (S. multicaulis, S. syriaca, S. viridis, S. reuterana, S. palaestina, S. sclareopsis, S. spinosa and S. oligphylla) were placed close to each other, while close genetic affinity was also found among the other species. In general, the relationships obtained from SCoT data agree well with the species relationships obtained from morphological data. This is in agreement with the AMOVA and genetic diversity parameters presented above. The species are genetically well differentiated from each other. These results indicate that SCoT molecular markers can be used in the taxonomy of Salvia species. The Nm analysis by PopGene software also produced a mean Nm = 0.167, which is considered a very low level of gene flow among the studied species.
A Mantel test with 5000 permutations showed a significant correlation (r = 0.13, p = 0.0002) between genetic distance and geographical distance, indicating isolation by distance (IBD) among the Salvia species studied.
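The logic of a permutation-based Mantel test such as the one reported above can be sketched as follows. This is not the PAST implementation: the function names, the one-tailed p-value convention and the toy matrices are assumptions made for illustration.

```python
import math
import random


def upper_triangle(m):
    """Flatten the above-diagonal entries of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(gen, geo, permutations=5000, seed=0):
    """Return (r, p): matrix correlation and one-tailed permutation p-value."""
    rng = random.Random(seed)
    n = len(gen)
    geo_vec = upper_triangle(geo)
    r_obs = pearson(upper_triangle(gen), geo_vec)
    idx = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(idx)
        # Permute rows and columns of one matrix together
        permuted = [[gen[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical example: five populations along a transect, with genetic
# distance exactly proportional to geographic distance
positions = [0, 1, 3, 6, 10]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.05 * v for v in row] for row in geo]
r, p = mantel(gen, geo, permutations=999)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations; shuffling individual cells would break the matrix structure and give a misleading null distribution.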
Nei's genetic identity and genetic distance were determined among the studied species (Table 4). The results showed that the highest degree of genetic similarity (0.93) occurred between S. eremophila and S. santolinifolia. The lowest degree of genetic similarity (0.66) occurred between S. tebesana and S. verticillata. The low Nm value (0.167) indicates limited gene flow, or ancestrally shared alleles, between the species studied and indicates high genetic differentiation among and within Salvia species.

DISCUSSION
Genetic diversity is a basic component of biodiversity, and its conservation is essential for the long-term survival of any species in changing environments (Mills & Schwartz 2005, Tomasello et al., Xu et al. 2021, Zou et al. 2019, Wang et al. 2020). This is very important in fragmented populations because they are more vulnerable due to the loss of allelic richness and increased population differentiation through genetic drift (which decreases heterozygosity and leads to eventual fixation of alleles) and inbreeding depression (which increases homozygosity within populations; Frankham 2005). Therefore, knowledge of the genetic variability and diversity within and among different populations is crucial for their conservation and management (e.g. Esfandani-Bozchaloyi et al. 2018a, 2018b, 2018c, 2018d; Salari et al. 2013; Jahani et al. 2019).
In the present study we used morphological and molecular (SCoT) data to evaluate species relationships in Salvia. Morphological analyses showed that the studied Salvia species are well differentiated from each other both in quantitative measures (the ANOVA result) and in qualitative characters (the PCA plot result). In addition, PCA suggests that characters such as bract length, stipule length, bract shape, calyx shape, petal shape, length and width of stem leaf, and length and width of petal could be used in the delimitation of species groups. This morphological difference was due to both quantitative and qualitative characters.
Genetic structure and gene flow
The PIC and MI characteristics of a primer help in determining its effectiveness in genetic diversity analysis. Sivaprakash et al. (2004) suggested that the ability of a marker technique to resolve genetic diversity may be more directly related to the degree of polymorphism. Generally, a PIC value between zero and 0.25 implies very low genetic diversity among genotypes, a value between 0.25 and 0.50 shows a mid-level of genetic diversity, and a value ≥0.50 suggests a high level of genetic diversity (Tams et al. 2005). In this research, the PIC values of the SCoT primers ranged from 0.36 to 0.55, with a mean value of 0.46, which indicates a moderate ability of SCoT primers to determine genetic diversity among the Salvia species. Comparable but low PIC values have been reported with other markers, such as RAPD and AFLP in African plantain (Ude et al. 2003), ISSR and RAPD in Salvia species (Yousefiazar-Khanian et al. 2016), AFLP in wheat (Bohn et al. 1999) and SCoT markers (Pour-Aboughadareh et al. 2017). In Heikrujam et al. (2015), CBDP markers were found to be more effective than SCoT markers with regard to average PIC, which was higher. In our study, the SCoT markers were found to be effective in estimating the genetic diversity of different Salvia species with regard to average percentage polymorphism (97.78%), average PIC (0.46), average MI (3.6) and average EMR (8.4), which were higher than those of other markers reported for Salvia (Wang et al. 2009, Song et al. 2010, Yousefiazar-Khanian et al. 2016, Gholamin and Khayatnezhad 2020). However, marker techniques differ in their resolution of genome regions and in the number of loci covering the whole genome that they provide for estimating genetic diversity (Souframanien & Gopalakrishna 2004). Diverse levels of polymorphism in Salvia species using ISSR, CoRAP, SRAP, SCoT and RAPD markers have been reported earlier by Wang & Zhang (2009), Song et al. (2010) and Yousefiazar-Khanian et al. (2016). Gene flow is inversely correlated with gene differentiation but is very important for population evolution, and takes place through pollen and seeds between populations (Song et al. 2010). In the current study, the detected gene flow (Nm) among Salvia species was 0.167, indicating limited gene flow and hence high genetic differentiation among Salvia species.
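The PIC thresholds attributed to Tams et al. (2005) in the paragraph above can be written as a small helper. This is only an illustrative sketch; the function name is hypothetical, and assigning exactly 0.25 to the mid-level class is an assumption, since the text gives overlapping bounds.

```python
def classify_pic(pic):
    """Map a PIC value to the diversity class described in the text."""
    if pic < 0.25:
        return "very low"
    if pic < 0.50:
        return "mid-level"
    return "high"


# Lowest, mean and highest PIC values reported for the SCoT primers
for value in (0.36, 0.46, 0.55):
    print(value, classify_pic(value))
```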
As a general rule, insects are the pollinators of Salvia in the Old World (Claßen-Bockhoff et al. 2004, Khayatnezhad and Gholamin 2012a, b). At lower elevations bees, and at higher altitudes insects such as flies, are the dominant pollinators of bilabiate flowers such as those of Salvia (Pellissier et al. 2010).
According to Moein et al. (2019), the genetic structure revealed by SRAP markers showed that, despite the presence of limited gene flow, two distinct ecotypes were formed, which may be the consequence of reproductive isolation caused by an altitude gradient and different niches through parapatric speciation. The heterozygosity (H) and Shannon index (I) reflect diversity and differentiation among and within the germplasm collections, respectively (Que et al. 2014), and the higher the indices, the greater the genetic diversity. The magnitude of variability in the Na, Ne, H and I indices obtained with the studied SCoT markers demonstrated a high level of genetic diversity among and within Salvia species. Similar results were reported for Salvia miltiorrhiza based on ISSRs (Zhang et al. 2013) and for other Salvia species using AFLP markers (Sajadi et al. 2010), with 95% and 99% polymorphism, respectively. Also, the polymorphism index (PI) of RAPD primers was higher, whereas other indices such as PIC, EMR and MI were somewhat higher in ISSRs. On the other hand, the RP index was approximately equal in both techniques. In general, the small differences in the calculated indices showed that both techniques had similar efficiency in differentiating the closely related ecotypes of Salvia. Chen et al. (2013) reported PIC values of about 0.20 in Ocimum species with ISSR and RAPD markers and also reported RP values of 1.39 and 5.13, respectively. PIC analysis can be used to select the most appropriate markers for genetic mapping. Also, a high MI reflects the efficiency of a marker in simultaneously analyzing a large number of bands (Powell et al. 1996, Patel et al. 2014). The high average Simpson's coefficients (about 0.80) also indicate high genetic variability among the studied accessions of Salvia. This finding was similar to that of Manica-Cattani et al. (2009) on accessions of Lippia alba studied with ISSR and RAPD. In their study of Salvia lachnostachys ecotypes with ISSR primers, Erbano et al. (2015) reported a range of 0.66-0.86 for Simpson's index. Comparison of Nei's similarity coefficients between ISSRs and RAPDs showed that both markers had high diagnostic capability. This is consistent with the results of ISSR markers in mint accessions by Kang et al. (2013) and in Salvia miltiorrhiza germplasms studied by Zhang et al. (2013), while the genetic similarity derived from SRAPs and ISSRs indicated high proximity among Salvia miltiorrhiza populations (Song et al. 2010). Cluster analysis could group all 21 ecotypes, and the results showed reasonable congruence between RAPD and ISSR in terms of species topology. Zhang et al. (2013) showed five major clusters for S. miltiorrhiza germplasms based on Nei's similarity coefficient for ISSRs, which did not indicate any clear pattern according to their locations. Patel et al. (2014) reported that, in dendrograms of ISSR and RAPD, the genotypes of each Ocimum species were grouped separately. Similar studies in populations of S. japonica and some other Salvia species (Sudarmono and Okada 2008) did not show a correlation between morphological variation and allozyme and DNA sequence variation. It was concluded that S. japonica is still at an early stage of the speciation process.
Sympatry, or co-occurrence of closely related species, can result either from a sympatric speciation process or from secondary contact due to range expansion after speciation. Under the allopatric scenario, genetic variation tends to be uniform across the genome because a large proportion of the genome changes through a combination of divergent selection, differential response to similar selective pressures and genetic drift (see for example Strasburg et al. 2012).
In contrast, in the extreme case of sympatric speciation, gene flow between the incipient species can homogenize most of the genome, except for loci that experience strong divergent selection pressures or regions that are tightly linked with these loci (see for example Strasburg et al. 2012, Via 2012).
In conclusion, the results of this study showed that SCoT primers were more effective than the other molecular markers for evaluating the genetic diversity of the genus Salvia. Also, Salvia ecotypes/species were clearly separated from each other in the dendrogram and MDS, indicating the higher efficiency of the SCoT technique in Salvia species identification.