The interacting effects of genetic variation in Geranium subg. Geranium (Geraniaceae) using SCoT molecular markers

Genetic diversity is one of the most crucial aspects of biological diversity for conservation strategies, particularly in rare and narrow endemic species. Our study is the first attempt to use SCoT markers to assess the genetic diversity of Geranium species in Iran. We used 115 plant samples. Our objectives were to 1) assess genetic diversity among Geranium species, 2) characterize the genetic structure of Geranium, 3) determine whether the Geranium species exchange genes, and 4) detect isolation by distance among the Geranium species. We used traditional morphological and molecular methods to assess genetic diversity and genetic structure in the Geranium species. A total of 129 amplified polymorphic bands were generated across 13 Geranium species. The size of the amplified fragments ranged from 150 to 3000 bp. G. stepporum showed the highest values for the effective number of alleles (Ne = 1.30) and Shannon information index (I = 0.35). Significant ANOVA results (P < 0.01) showed differences in quantitative morphological characters among the species. G. sylvaticum showed high genetic diversity. The Mantel test showed a significant correlation (r = 0.17, p = 0.0002) between genetic distance and geographical distance, indicating isolation by distance (IBD) among the Geranium species. According to the SCoT marker analysis, G. kotschyi and G. dissectum had the lowest similarity, and G. sylvaticum and G. pratense had the highest similarity. The present study revealed that a combination of morphological and SCoT methods could distinguish the species of Geranium.


INTRODUCTION
Genetic diversity helps to understand species characteristics and adaptation strategy in an ever-changing environment and aids in understanding the evolutionary relationship among species (Erbano et al. 2015). Several programs have been launched to conserve plant diversity while utilizing and preserving plant genetic materials (Gomez et al. 2005). Given the importance of genetic diversity in conservation strategies and programs, it is necessary to study genetic diversity in plant species, particularly threatened and rare species (Cires et al. 2013).
Population size is a pivotal factor in understanding genetic diversity because it helps explain the variation in a gene (Ellegren and Galtier 2016; Turchetto et al. 2016). Genetic variation and diversity are essential parameters for species survival. However, individuals often cannot exchange genetic material because of geographical and genetic barriers, which can produce scattered populations. Since these individuals have limited gene flow, there is a greater chance of a decline in population size (Frankham 2005).
Around 325 species of Geranium L. occur in the world (Aedo et al. 1998). Geranium species have medicinal and horticultural uses; hence, some systematic studies have been conducted to better utilize Geranium species in plant systematics and the plant industry (Aedo 1996). A recent classification system divides Geranium into three subgenera (Yeo 2008). Among them, subgenus Geranium has 300 species (Aedo and Estrella 2006). G. sect. Dissecta occurs in the Eurasian, Mediterranean, and Himalayan regions. The majority of Tuberosa (Boiss.) members are found in Western Europe, Central Asia, and Northwest Africa. Vegetative characters are used to divide Tuberosa into subsections Tuberosa (Boiss.) Yeo and Mediterranea R. Knuth (Yeo 2008). Previous studies identified the center of diversity of G. subsect. Tuberosa in Iran and Turkey (Aedo and Estrella 2006; Aedo et al. 2007; Esfandani-Bozchaloyi et al. 2018a, 2018b, 2018c, 2018d). The genus Geranium has twenty-two to twenty-five species in Iran (Schonbeck-Temesy 1970; Onsori et al. 2010). Leaf and fruit morphology are valid characters for identifying Geranium species (Salimi Moghadam et al. 2015). Nonetheless, advances in molecular science have revolutionized plant systematics and taxonomy by providing more reliable results.
Start codon targeted (SCoT) polymorphism is one of the latest additions to molecular science. SCoT is a simple DNA marker system. It targets the short conserved region in plant genes surrounding the ATG translation start codon (Collard and Mackill 2009). SCoT is affordable and produces reliable results and robust genetic profiles of plant species (Collard and Mackill 2009, Wu et al. 2013, Luo et al. 2011). It is essential to mention that Iran is the center of diversity of Geranium species. However, no study has examined their genetic diversity with the SCoT marker system. Our study is the first attempt to use SCoT markers to assess the genetic diversity of Geranium species in Iran. We used 115 plant samples. Our objectives were to 1) assess genetic diversity among Geranium species, 2) characterize the genetic structure of Geranium, 3) determine whether the Geranium species exchange genes, and 4) detect isolation by distance among the Geranium species.

Plant materials
We collected thirteen Geranium species from different parts of Iran (Table 1, Figure 1). Morphological and molecular methods were used to study the Geranium species. One hundred fifteen plant samples (5-10 per plant species) were examined for morphometric analyses. We collected the following species for our study purpose. Mediterranea R. Knuth). G. columbinum L., G. rotundifolium L., G. collinum Stephan ex Willd., G. sylvaticum L., G. pratense (sect. Geranium). Different occurrence records were checked, and the species were identified by Khayatnezhad in Iran (Davis 1967, Schonbeck-Temesy 1970, Zohary 1972, Aedo et al. 1998b, Janighorban 2009). Details of the sampling sites are given in Table 1. Plant specimen vouchers were deposited in the Herbarium of Azad Islamic University (HAIU).

Morphometry
We studied 21 qualitative and 19 quantitative plant morphology characters. Data were standardized (mean = 0, variance = 1) before ordination (Podani 2000). Euclidean distance was used to cluster and ordinate the plant species.
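The standardization and distance steps described above can be sketched in Python as follows. This is only an illustration: the measurements are hypothetical, the function names are invented for the example, and the use of the population standard deviation is an assumption, since the text does not state which estimator was applied.

```python
import math
import statistics


def standardize(rows):
    """Scale every character (column) to mean 0 and variance 1."""
    scaled_columns = []
    for column in zip(*rows):
        mean = statistics.fmean(column)
        sd = statistics.pstdev(column)  # assumption: population SD
        # A constant character carries no information; map it to 0.
        scaled_columns.append(
            [(value - mean) / sd if sd > 0 else 0.0 for value in column]
        )
    return [list(row) for row in zip(*scaled_columns)]


def euclidean(a, b):
    """Euclidean distance between two standardized specimens."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical data: rows = specimens, columns = quantitative characters.
specimens = [
    [12.0, 3.1, 0.8],
    [15.5, 2.7, 1.1],
    [9.8, 3.9, 0.6],
    [14.2, 3.0, 1.0],
]

z = standardize(specimens)
distances = [[euclidean(r1, r2) for r2 in z] for r1 in z]
```

The resulting distance matrix is the kind of input that clustering (for example UPGMA) and ordination methods operate on; standardizing first keeps characters measured on large scales from dominating the distances.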

DNA extraction and SCoT assay
We isolated DNA from fresh leaves. Leaves were dried. DNA was extracted following a previously described procedure (Esfandani-Bozchaloyi et al. 2019). The purity of the DNA was checked on an agarose gel. Twenty-five SCoT primers were used (Collard and Mackill 2009). Among them, we selected ten primers that produced clear, well-separated, and richly polymorphic bands (Table 2). Each polymerase chain reaction had a total volume of 25 μl and contained 10 mM Tris-HCl buffer, 500 mM KCl, 1.5 mM MgCl2, 0.2 mM of each dNTP, 0.2 μM of a single primer, 20 ng genomic DNA, and three units of Taq DNA polymerase (Bioron, Germany). Amplification began with a five-minute initial denaturation step at 94°C, followed by forty cycles of one minute at 94°C, one minute at 52-57°C, and two minutes at 72°C. A final extension step was performed for seven to ten minutes at 72°C.
We confirmed amplification by visualizing the amplified products on a gel. A 100 base pair molecular ladder (Fermentas, Germany) was used to estimate the size of each band.

Data analyses
We used Ward's method and the unweighted pair group method with arithmetic mean (UPGMA) for clustering. Multidimensional scaling and principal coordinate analysis were also used (Podani 2000). Analysis of variance (ANOVA) was used to determine the morphological differences between species and populations. Principal component analysis (PCA) (Podani 2000) was performed to identify the variation in morphological traits among plant populations. The PAST program, version 2.17, was used for the multivariate analyses and all related calculations (Hammer et al. 2001). We scored SCoT bands as binary characters, with 1 indicating the presence and 0 the absence of a band. We calculated all necessary parameters to study genetic diversity. In addition to the genetic diversity parameters, we assessed the marker index (MI) of the primers because MI detects polymorphic loci (Ismail et al. 2019). The marker index was calculated according to a previous protocol (Heikrujam et al. 2015). The effective multiplex ratio (EMR) and the number of polymorphic bands (NPB) were also calculated. For the plant samples, Nei's gene diversity (H), Shannon information index (I), number of effective alleles (Ne), and percentage of polymorphism (P% = number of polymorphic loci/number of total loci) were measured (Shen et al. 2017). Unbiased expected heterozygosity (UHe) and heterozygosity were assessed with GenAlEx 6.4 software (Peakall and Smouse 2006). Neighbor-joining (NJ) trees and networks were used to examine genetic distances among plant populations (Freeland et al. 2011). The Mantel test was carried out to find the correlation between genetic and geographical distances (Podani 2000). Because our goal was to understand genetic structure and diversity, we also investigated the genetic differences between populations by analysis of molecular variance (AMOVA) in GenAlEx 6.4 (Peakall and Smouse 2006).
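To make the diversity parameters concrete, the following Python sketch derives Ne, H, I, and P% from a 0/1 band matrix. It is a simplified illustration rather than the procedure implemented in GenAlEx or POPGENE. It assumes dominant markers in Hardy-Weinberg equilibrium (the null-allele frequency is taken as the square root of the band-absent fraction), natural logarithms for the Shannon index, and a locus counted as polymorphic when the band is present in some but not all individuals. The scores and function names are hypothetical.

```python
import math


def locus_stats(band_column):
    """Return (Ne, H, I) for one dominant locus scored as 0/1."""
    n = len(band_column)
    absent_fraction = band_column.count(0) / n
    q = math.sqrt(absent_fraction)  # null-allele frequency (HWE assumption)
    p = 1.0 - q                     # band-allele frequency
    homozygosity = p * p + q * q
    ne = 1.0 / homozygosity         # effective number of alleles
    h = 1.0 - homozygosity          # Nei's gene diversity
    shannon = -sum(f * math.log(f) for f in (p, q) if f > 0)
    return ne, h, shannon


def diversity_summary(matrix):
    """Average the per-locus statistics and compute P%."""
    loci = [list(column) for column in zip(*matrix)]
    per_locus = [locus_stats(column) for column in loci]
    k = len(loci)
    polymorphic = sum(1 for column in loci if 0 < sum(column) < len(column))
    return {
        "Ne": sum(s[0] for s in per_locus) / k,
        "H": sum(s[1] for s in per_locus) / k,
        "I": sum(s[2] for s in per_locus) / k,
        "P%": 100.0 * polymorphic / k,
    }


# Hypothetical scores: rows = individuals, columns = SCoT bands.
scores = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]

summary = diversity_summary(scores)
```

In this toy matrix the first band is present in every individual, so three of the four loci are polymorphic and P% is 75.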
Furthermore, gene flow (Nm) was estimated from the genetic differentiation statistic (GST) in POPGENE ver. 1.32 (Yeh et al. 1999). We also performed STRUCTURE analysis to detect the optimum number of groups. For this purpose, the Evanno test was conducted (Evanno et al. 2005). Genetic divergence or genetic distance is commonly measured through pairwise FST and related statistics. The Mantel test detects spatial processes that shape population structure. We used PAST software ver. 2.17 to perform the Mantel test (Hammer et al. 2012). For the Mantel test, SCoT data were used to calculate Nei's genetic distance, whereas geographical data were used to calculate the geographic distances in PAST software. Geographic distance was calculated from the sum of the pairwise differences in both the longitude and latitude coordinates of the studied populations. The Mantel test, as originally formulated in 1967, is given by the following formula:
$$Z_m = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij} \times d_{ij}$$
where $g_{ij}$ and $d_{ij}$ are, respectively, the genetic and geographic distances between populations i and j, considering n populations. Because $Z_m$ is given by the sum of products of distances, its value depends on how many populations are studied, as well as on the magnitude of their distances.
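A permutation version of the Mantel test can be sketched in Python as below. This is a minimal illustration and not the PAST implementation. It assumes symmetric distance matrices, sums the products over the upper triangle only (half of the full double sum, which does not change the permutation outcome), reports Pearson's r as the standardized statistic, and uses a one-tailed test. The matrices, function names, and seed are hypothetical.

```python
import math
import random


def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=5000, seed=0):
    """Return (Z over the upper triangle, Pearson r, one-tailed p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    g = upper_triangle(genetic, identity)
    d = upper_triangle(geographic, identity)
    z_obs = sum(a * b for a, b in zip(g, d))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel populations in the genetic matrix
        g_perm = upper_triangle(genetic, order)
        if sum(a * b for a, b in zip(g_perm, d)) >= z_obs:
            hits += 1
    return z_obs, pearson(g, d), hits / (permutations + 1)


# Hypothetical distances among four populations.
genetic = [
    [0.0, 0.1, 0.3, 0.5],
    [0.1, 0.0, 0.2, 0.4],
    [0.3, 0.2, 0.0, 0.15],
    [0.5, 0.4, 0.15, 0.0],
]
geographic = [
    [0, 10, 40, 80],
    [10, 0, 30, 70],
    [40, 30, 0, 35],
    [80, 70, 35, 0],
]

z_obs, r_obs, p_value = mantel(genetic, geographic, permutations=999)
```

Permuting the population labels of one matrix, rather than shuffling individual distances, preserves the dependence among distances that share a population, which is the reason the test relies on permutations instead of a standard correlation test.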

Morphometry
Significant ANOVA results (P < 0.01) showed differences in quantitative morphological characters among the plant species. Different clustering and ordination methods showed similar patterns. Therefore, only the UPGMA clustering and the PCA plot of morphological characters are presented here (Figs. 2, 3). In general, plant samples of each species, which belong to distinct sections, grouped together and formed a separate cluster. This finding indicates that the morphological characters examined can separate the Geranium species into two main clusters. We did not observe any intermediate types among the specimens. The UPGMA tree produced two large groups (Fig. 2). The PCA plot of morphological characters (Fig. 3) clearly divided the species into distinct groups with no intermixing, which is consistent with the UPGMA tree.

Species identification and genetic diversity
Ten SCoT primers were screened to study genetic relationships among Geranium species; all the primers produced reproducible polymorphic bands in all 13 Geranium species. An image of the amplification generated by the SCoT-17 and SCoT-14 primers is shown in Figure 4. A total of 129 amplified polymorphic bands were generated across 13 Geranium species. The size of the amplified fragments ranged from 150 to 3000 bp. G. stepporum showed the highest values for the effective number of alleles (Ne = 1.30) and Shannon information index (I = 0.35) (Table 3). AMOVA revealed significant genetic differences among the Geranium species (P = 0.01): 65% of the total variation was among species, and 35% was within species. Pairwise FST values showed significant differences among all studied species (Table 4). Moreover, genetic differentiation of these species was demonstrated by significant Nei's GST (0.44, P = 0.01) and D_est values (0.155, P = 0.01).
High genetic diversity was observed within species (Fig. 5). G. sylvaticum (sp5) showed high genetic diversity, as supported by the diversity profiles (Table 3). The PCA plot successfully separated the species into groups, which demonstrates that SCoT molecular markers can differentiate Geranium species. The PCA results strongly support the AMOVA and genetic diversity results. The estimated gene flow was low (Nm = 0.21), indicating limited gene flow among Geranium species.
The Mantel test with 5000 permutations showed a significant correlation (r = 0.17, p = 0.0002) between genetic distance and geographical distance, indicating isolation by distance (IBD) among the Geranium species.
Nei's genetic identity and genetic distance were calculated among the species (Table is not included). G. sylvaticum and G. pratense (sect. Geranium) were genetically the most similar (0.93). The lowest degree of genetic similarity occurred between G. kotschyi and G. dissectum (0.47).

The species genetic structure
To determine the optimum number of genetic groups, we used STRUCTURE analysis followed by the Evanno test. We used the admixture model to reveal interspecific gene flow and ancestrally shared alleles in the Geranium populations. The Evanno ΔK method indicated an optimum of K = 6. The STRUCTURE plot (Fig. 6) revealed more information about the genetic structure of the Geranium species, including shared ancestral alleles and gene flow between species. This plot showed that the genetic affinity between G. sylvaticum and G. pratense (similarly colored) and between G. ibericum and G. gracile (similarly colored) is due to shared common alleles. This is in agreement with the neighbor-joining dendrogram presented before. The other species are distinct in their allele composition and differ genetically from each other.
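The Evanno criterion used above can be illustrated with a short Python sketch. It follows the commonly used form of the statistic, in which ΔK is the absolute second-order difference of the mean log-likelihood across successive K values divided by the standard deviation of the log-likelihood at K. The log-likelihood values below are hypothetical and are not the STRUCTURE output of this study; the function name is likewise invented for the example.

```python
import statistics


def delta_k(log_likelihoods):
    """Map each interior K to its Evanno delta-K value.

    log_likelihoods maps K to a list of Ln P(D) values from replicate runs.
    """
    means = {k: statistics.fmean(v) for k, v in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        # Delta-K is undefined for the smallest and largest K tested.
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            # Sample SD across replicate runs; must be non-zero.
            sd = statistics.stdev(log_likelihoods[k])
            result[k] = second_diff / sd
    return result


# Hypothetical replicate log-likelihoods for K = 1..5.
runs = {
    1: [-5200.0, -5201.0, -5199.0],
    2: [-4800.0, -4803.0, -4797.0],
    3: [-4300.0, -4302.0, -4298.0],
    4: [-4250.0, -4255.0, -4245.0],
    5: [-4230.0, -4236.0, -4224.0],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

With these made-up numbers the likelihood gain levels off after K = 3, so ΔK peaks there; in a real analysis the peak is read from the replicate STRUCTURE runs in the same way.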
The low Nm value (0.21) suggests limited gene flow between the Geranium species and supports the genetic stratification indicated by the K-Means and STRUCTURE analyses. The population assignment test also agreed with the Nm result. We could not detect substantial gene flow among the Geranium species. We also obtained a consensus tree from the SCoT and morphological trees (Figure not included). The STRUCTURE plot showed a high degree of genetic stratification in the Geranium species.

Species identification and taxonomic consideration
Table 3. Genetic diversity parameters in the studied Geranium species (N = number of samples, Na = number of different alleles, Ne = number of effective alleles, I = Shannon's information index, He = gene diversity, UHe = unbiased gene diversity, P% = percentage of polymorphism).
In phylogenetic systematics, ecology, biogeography, and biodiversity, plant species identification is a central theme. Several evolutionary processes operate to form new species. Usually, gene flow occurs between phylogenetically closely related species (Schluter 2001, Duminil and Di Michele 2009, Ji et al. 2020, Sun et al. 2021, Niu et al. 2021, Zou et al. 2019). Genetic diversity and species differentiation are the outcome of isolation by distance, local adaptation, and gene flow (Freeland et al. 2011, Frichot et al. 2013). Geranium is a relatively complex taxonomic group, and several morphological characters make it difficult to identify and classify Geranium species (Wondimu et al. 2017). Given this complexity, it is necessary to explore other methods that could complement the traditional taxonomic approach (Erbano et al. 2015).
We examined genetic diversity in Geranium by morphological and molecular methods. We mainly used SCoT markers to investigate genetic diversity and genetic affinity in Geranium. Our clustering and ordination techniques showed similar patterns. The morphometry results clearly showed the value of morphological characters for distinguishing Geranium species, and the PCA results confirmed that these characters separate the species. The present study also highlighted that morphological characters such as length, bract length, and stipule length could delimit the Geranium group. The Geranium species showed clear morphological differences. We argue that this dissimilarity was due to differences in quantitative and qualitative traits. Interestingly, the STRUCTURE results showed the presence of shared alleles in Geranium species.
The existence of shared alleles is related to self-pollination in Geranium (Williams et al. 2000). Some Geranium members are also pollinated by bees, flies, and honey bees (Lefebvre et al. 2019). The present findings revealed limited gene flow, which is a plausible result. Similarly low gene flow values were recorded with RAPD markers (Fischer et al. 2000). Another probable reason for limited gene flow is geographical isolation among the Geranium species and populations (Fischer et al. 2000). The limited gene flow was consistent with the Mantel test results, which indicated a positive correlation between genetic and geographical distances. Therefore, we conclude that isolation by distance and limited gene flow determine the genetic structure of the Geranium populations.
SCoT data revealed a minimal amount of gene flow among the studied species. It was also supported by STRUCTURE analysis as Geranium species mostly had distinct genetic structures. Reticulation analysis also showed some degree of gene flow in Geranium species. We did not observe any intermediate forms in our extensive plant collection, but morphological variability within each species did occur to some extent.
The current findings showed a significant correlation between genetic and geographical distances, revealing that isolation by distance (IBD) existed between Geranium species (Mantel test results). The magnitude of variability in the Na, Ne, H, and I indices demonstrated a high level of genetic diversity among Geranium species. The dendrogram and principal component analysis results showed clear differences among Geranium species. This shows the high utility of the SCoT technique for identifying Geranium species. Our results have implications for conservation and breeding programs.

CONCLUSIONS
The present study investigated the molecular variation of 13 species. Molecular and morphometric analyses confirmed morphological and genetic differences between Geranium species. This was the first attempt to assess genetic diversity through SCoT molecular markers and morphometric analysis in Iran. The current study reported two major clusters, which were separated on the basis of genetic and morphological characters. The genetic similarity among the 13 species ranged from 0.47 to 0.93. SCoT molecular marker analysis showed that G. kotschyi and G. dissectum had the lowest similarity. The current study also reported a correlation between genetic and geographical distances, which clearly indicates that an isolation mechanism is involved in the ecology of Geranium species. The present results showed the potential of Start Codon Targeted (SCoT) markers to assess genetic diversity and genetic affinity among Geranium species. The current findings have implications for biodiversity and conservation programs. In addition, the present results could pave the way for selecting suitable ecotypes for forage and pasture purposes in Iran.