Genetic diversity and relationships among Hypericum L. species assessed by ISSR markers: a high-value medicinal plant from northern Iran

Hypericum L. species are generally known in Iran by the local name “Hofariqun”, the name used by Ebn Sina (Bo Ali Sina). Plants of the genus Hypericum have traditionally been used as medicinal plants in various parts of the world. Hypericum perforatum L. is the source of one of the most widely manufactured and used herbal preparations in recent years, especially as a mild antidepressant. Given the importance of these species, we carried out a molecular study of the genus. We used 175 randomly collected plants from 17 species in 9 provinces. Amplification of genomic DNA using 10 primers produced 141 bands, of which 127 were polymorphic (95.78%). The high average PIC and MI values obtained showed the high capacity of ISSR primers to detect polymorphic loci among Hypericum species. Genetic similarity among the 17 species ranged from 0.617 to 0.911. According to the inter-simple sequence repeat (ISSR) marker analysis, H. androsaemum and H. hirtellum had the lowest similarity, and H. perforatum and H. triquetrifolium had the highest. The aims of the present study were to determine 1) whether ISSR markers can identify Hypericum species, 2) the genetic structure of these taxa in Iran, and 3) the inter-relationships of the species. The present study revealed that ISSR markers can identify the species.


INTRODUCTION
Identifying the accurate boundaries of a species is critical for a sound perspective in any biological study. Species delimitation is therefore the subject of an extensive body of work in biology (Collard & Mackill 2009, Wu et al. 2013). However, the criteria used to define species boundaries differ and remain a matter of debate (Esfandani-Bozchaloyi et al. 2018a, 2018b, 2018c, 2018d). Wild relatives of crops contain genes with great potential for use in breeding programs and constitute part of their gene pool (Pandey et al. 2008). In addition, the study of intra-specific genetic variation and of the genetic structure of wild populations is crucial for the development of effective conservation strategies.
The genus Hypericum (Guttiferae, Hypericoideae) belongs to the family Hypericaceae and comprises 484 species of perennial trees, shrubs, and herbs, distributed in 36 taxonomic sections (Crockett and Robson 2011). The species of the family are distributed worldwide in temperate zones but are absent from extreme environments such as deserts and the poles. Iranian species of this genus grow mainly in the north, northwest and center of Iran and form floristic elements of the Hyrcanian mountainous areas, together with Irano-Turanian, Mediterranean and Zagros elements. They generally prefer steep slopes of rocky and calcareous cliffs and the margins of mountain forests (Robson 1968; Azadi 1999). Robson (1968) recorded 21 species in the area covered by Flora Iranica. Robson (1977) and Assadi (1984) reported H. fursei N. Robson and H. dogonbadanicum Assadi as two endemics of the north and southwest of Iran. In Flora of Iran, Azadi (1999) identified 19 species and 4 subspecies arranged in 5 sections (Campylosporus (Spach) R. Keller, Hypericum, Hirtella Stef., Taeniocarpum Jaub. & Spach. and Drosanthe (Spach) Endl.), and two doubtful species, H. heterophyllum Vent. and H. olivieri (Spach) Boiss. Hypericum species are generally known in Iran by the local name "Hofariqun", the name used by Ebn Sina (Bo Ali Sina) (Rechinger, 1986). St. John's wort (Hypericum perforatum L.) is the most important medicinal species of the genus, and its main medical uses include the treatment of mild and moderate depression, skin wounds and burns (Barnes et al. 2001). The plant contains a vast array of secondary metabolites, including naphthodianthrones (hypericin and pseudohypericin), acylphloroglucinols (hyperforin and adhyperforin) and essential oil (Morshedloo et al. 2012; Radusiene et al. 2005).
Molecular markers provide a powerful tool for studying genetic diversity. Among advanced genetic markers, Random Amplified Polymorphic DNA (RAPD) and Inter Simple Sequence Repeat (ISSR) markers have been widely used for diversity analyses (Pharmawati et al. 2004). The RAPD technique is quick, easy and requires no prior sequence information. It detects nucleotide sequence polymorphism using a single primer of arbitrary nucleotide sequence (Moreno et al., 1998). The ISSR technique involves PCR amplification of DNA by a single 16-18 bp primer composed of a repeated sequence anchored at the 3' or 5' end by 2-4 arbitrary nucleotides. The technique is rapid, simple, inexpensive and more reproducible than RAPD (Esfandani-Bozchaloyi et al. 2017a, 2017b, 2017c, 2017d; Collard & Mackill 2009; Wu et al. 2013). The present investigation was carried out to evaluate the genetic diversity and relationships among different Hypericum species using ISSR markers. This is the first study on the use of ISSR markers in the genus Hypericum; we therefore performed a molecular study of 175 collected specimens of 17 Hypericum species. We try to answer the following questions: 1) Is there intra- and interspecific genetic diversity among the studied species? 2) Is genetic distance among these species correlated with their geographical distance? 3) What is the genetic structure of populations and taxa? 4) Is there any gene exchange between Hypericum species in Iran?

Plant materials
A total of 175 individuals were sampled, representing 17 geographical populations belonging to 17 Hypericum species, in East Azerbaijan, Lorestan, Kermanshah, Guilan, Mazandaran, Esfahan, Tehran, Hamadan and Kohgiluyeh and Boyer-Ahmad Provinces of Iran during July-August 2016-2019 (Table 1). For ISSR analysis, these 175 plant accessions (five to twelve samples from each population), collected from populations with different eco-geographic characteristics, were stored at -20 °C until further use. More information about the geographical distribution of the accessions is given in Table 1 and Fig. 1.

Morphological studies
Five to twelve samples from each species were used for morphometry. In total, 18 morphological characters (11 qualitative, 7 quantitative) were studied. The data were standardized (mean = 0, variance = 1) and used to estimate Euclidean distances for clustering and ordination analyses (Podani 2000). The morphological characters studied were: corolla shape, bract shape, calyx shape, calyx length, calyx width, calyx apex, calyx margins, bract length, corolla length, corolla width, corolla apex, leaf length, leaf width, leaf apex, leaf margins, leaf shape, leaf gland and bract margins.

DNA Extraction and ISSR Assay
Fresh leaves were collected randomly from one to twelve plants in each of the studied populations and dried in silica gel powder. The CTAB activated charcoal protocol was used to extract genomic DNA (Esfandani-Bozchaloyi et al. 2019). The quality of the extracted DNA was examined by running it on a 0.8% agarose gel. For the ISSR analysis, 22 primers from the UBC (University of British Columbia) series were tested for DNA amplification. Ten primers were chosen for ISSR analysis of genetic variability, based on band reproducibility (Table 2). PCR reactions were carried out in a 25 μl volume containing 10 mM Tris-HCl buffer at pH 8; 50 mM KCl; 1.5 mM MgCl2; 0.2 mM of each dNTP (Bioron, Germany); 0.2 μM of a single primer; 20 ng genomic DNA and 3 U of Taq DNA polymerase (Bioron, Germany). The amplification reactions were performed in a Techne thermocycler (Germany) with the following program: a 5 min initial denaturation step at 94°C, followed by 40 cycles of 1 min at 94°C, 1 min at 52-57°C and 2 min at 72°C. The reaction was completed by a final extension step of 7-10 min at 72°C. The amplification products were visualized by running them on a 1% agarose gel, followed by ethidium bromide staining. Fragment size was estimated using a 100 bp molecular size ladder (Fermentas, Germany).

Morphological Studies
Morphological characters were first standardized (mean = 0, variance = 1) and used to establish Euclidean distances among pairs of taxa (Podani 2000). For grouping of the plant specimens, UPGMA (unweighted pair group method with arithmetic mean) clustering and ordination methods were used (Podani 2000). ANOVA (analysis of variance) was performed to show morphological differences among the populations, while a PCA (principal components analysis) biplot was used to identify the most variable morphological characters among the studied populations (Podani 2000). PAST version 2.17 (Hammer et al. 2012) was used for multivariate statistical analyses of the morphological data.
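The standardization and distance steps described above can be sketched as follows. This is only an illustration in plain Python, not the PAST workflow used in the study; the measurements are hypothetical, and the use of the population variance when scaling is an assumption, since the text does not say which estimator was applied.

```python
import math
from statistics import mean, pstdev


def standardize(matrix):
    """Z-score each column (character) to mean 0 and variance 1.

    ASSUMPTION: the population standard deviation is used for scaling.
    """
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [
        [(value - m) / sd if sd > 0 else 0.0
         for value, m, sd in zip(row, means, sds)]
        for row in matrix
    ]


def euclidean_distances(matrix):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(matrix)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(matrix[i], matrix[j])))
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical measurements (mm): rows are specimens, columns are
# calyx length, corolla length and leaf length.
specimens = [
    [4.0, 10.0, 20.0],
    [4.5, 11.0, 22.0],
    [7.0, 15.0, 35.0],
]
z_scores = standardize(specimens)
distances = euclidean_distances(z_scores)
```

A distance matrix of this kind is what a clustering method such as UPGMA or an ordination method would take as input; standardizing first keeps characters measured on larger scales from dominating the distances.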

Molecular Analyses
ISSR bands were coded as binary characters (presence = 1, absence = 0) and used for genetic diversity analysis. The discriminatory ability of the primers was evaluated by two parameters, polymorphism information content (PIC) and marker index (MI), which characterize the capacity of each primer to detect polymorphic loci among the genotypes (Powell et al. 1996). MI was calculated for each primer as MI = PIC × EMR, where the effective multiplex ratio (EMR) is the product of the number of polymorphic loci per primer (n) and the fraction of polymorphic fragments (β) (Heikrujam et al. 2015). The number of polymorphic bands (NPB) and EMR were calculated for each primer. Parameters such as Nei's gene diversity (H), Shannon information index (I), number of effective alleles, and percentage of polymorphism (P% = number of polymorphic loci / total number of loci) were determined (Weising et al. 2005, Freeland et al. 2011). Shannon's index was calculated by the formula H' = -Σ p_i ln p_i. Resolving power (Rp) is defined per primer as Rp = Σ Ib, where Ib is the band informativeness, Ib = 1 - (2 × |0.5 - p|), and p is the proportion of genotypes containing the band. The percentage of polymorphic loci, the mean loci by accession and by population, UHe, H' and PCA were calculated with GenAlEx 6.4 software (Peakall & Smouse 2006). Nei's genetic distance among populations was used for Neighbor Joining (NJ) clustering and Neighbor-Net networking (Freeland et al. 2011, Huson & Bryant 2006). A Mantel test was used to check the correlation between geographical and genetic distances of the studied populations (Podani 2000). These analyses were done with PAST ver. 2.17 (Hammer et al. 2012) and DARwin ver. 5 (2012) software. An AMOVA (analysis of molecular variance) test (with 1000 permutations), as implemented in GenAlEx 6.4 (Peakall & Smouse 2006), was used to show genetic differences among the populations. Gene flow was estimated by calculating Nm from Gst in PopGene ver. 1.32 (1997) as Nm = 0.5(1 - Gst)/Gst. This approach assumes an equal amount of gene flow among all populations.
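The primer statistics defined above can be illustrated with a short sketch on a 0/1 band matrix. The band matrix and the Gst values are hypothetical. The text does not give the PIC formula, so the sketch assumes the form commonly used for dominant markers, PIC = 2f(1 - f) with f the band frequency, averaged over polymorphic bands; EMR, MI, Rp, H' and Nm follow the definitions in the text.

```python
import math


def band_frequencies(matrix):
    """Fraction of accessions showing each band (column) of a 0/1 matrix."""
    n_accessions = len(matrix)
    return [sum(column) / n_accessions for column in zip(*matrix)]


def primer_summary(matrix):
    """Per-primer statistics from a binary band matrix (rows = accessions)."""
    freqs = band_frequencies(matrix)
    total_bands = len(freqs)
    polymorphic = [f for f in freqs if 0.0 < f < 1.0]
    npb = len(polymorphic)                 # number of polymorphic bands
    beta = npb / total_bands               # fraction of polymorphic bands
    emr = npb * beta                       # EMR = n x beta
    # ASSUMPTION: dominant-marker PIC, 2f(1 - f), averaged over
    # polymorphic bands; the source does not state the formula it used.
    pic = sum(2 * f * (1 - f) for f in polymorphic) / npb if npb else 0.0
    # Rp = sum of Ib, with Ib = 1 - 2 * |0.5 - p| for every band.
    rp = sum(1 - 2 * abs(0.5 - f) for f in freqs)
    return {"NPB": npb, "P%": 100 * beta, "EMR": emr,
            "PIC": pic, "MI": pic * emr, "Rp": rp}


def shannon_index(proportions):
    """H' = -sum(p_i * ln p_i), skipping zero proportions."""
    return -sum(p * math.log(p) for p in proportions if p > 0)


def gene_flow_nm(gst):
    """Nm = 0.5 * (1 - Gst) / Gst, as given in the text."""
    return 0.5 * (1 - gst) / gst


# Hypothetical band matrix for one primer: 4 accessions x 5 bands.
bands = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1],
]
summary = primer_summary(bands)
```

In this toy matrix the first band is present in every accession, so it counts toward the total number of bands but contributes nothing to NPB, PIC or Rp.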

Species identification and inter-relationship
ANOVA of the morphometric data showed significant differences (P < 0.01) in quantitative morphological characters among the species studied. PCA was performed to determine the most variable characters among the taxa studied. It revealed that the first three factors comprised over 73% of the total variation. On the first PCA axis, which accounted for 57% of the total variation, characters such as corolla shape, calyx shape, calyx length, bract length and leaf shape showed the highest correlation (>0.7); leaf apex, corolla length, leaf length and leaf width were the characters influencing PCA axes 2 and 3. Different clustering and ordination methods produced similar results; therefore, only the PCA plot of morphological characters is presented here (Fig. 2). In general, plant samples of each species were grouped together and formed separate groups. This result shows that both quantitative and qualitative morphological characters separated the studied species into distinct groups. We did not encounter intermediate forms among the studied specimens.

Species Identification and Genetic Diversity
Ten ISSR primers were screened to study genetic relationships among Hypericum species; all the primers produced reproducible polymorphic bands in all 17 Hypericum species. An image of the ISSR amplification generated by the ISSR-5 primer is shown in Figure 3. A total of 127 amplified polymorphic bands were generated across the 17 Hypericum species. The size of the amplified fragments ranged from 200 to 3000 bp. The highest and lowest numbers of polymorphic bands were 18 for ISSR-2 and 7 for ISSR-6, with an average of 12.7 polymorphic bands per primer. The PIC of the 10 ISSR primers ranged from 0.23 (ISSR-3) to 0.44 (ISSR-6), with an average of 0.36 per primer. The MI of the primers ranged from 1.37 (ISSR-9) to 4.47 (ISSR-1), with an average of 3.8 per primer. The EMR of the ISSR primers ranged from 4.60 (ISSR-6) to 11.11 (ISSR-9), with an average of 8.9 per primer (Table 2). The primers with high EMR values were considered more informative in distinguishing the genotypes.
The genetic parameters were calculated for all 17 Hypericum species amplified with ISSR primers (Table 3). The AMOVA test showed significant genetic differences (P = 0.001) among the studied species. It revealed that 63% of the total variation was among species and 37% was within species (Table 4). Moreover, genetic differentiation of these species was demonstrated by significant Nei's GST (0.31, P = 0.001) and D_est values (0.167, P = 0.001). These results revealed that more of the genetic diversity is distributed among Hypericum species than within species.
Different clustering and ordination methods produced similar results; therefore, only the UPGMA clustering is presented here (Figure 4). In general, plant samples of each species, each belonging to a distinct section, were grouped together and formed a separate cluster. This result shows that the molecular characters studied can delimit Hypericum species into two major clusters or groups (Figure 4). We did not encounter intermediate forms among the studied specimens. In general, the relationships obtained from the ISSR data agree well with the species relationships obtained from the morphological data. This is in agreement with the AMOVA results. These results indicate that ISSR molecular markers can be used in Hypericum species taxonomy. The Nm analysis by PopGene software produced a mean Nm = 0.123, which is considered a very low level of gene flow among the studied species. A Mantel test with 5000 permutations showed a significant correlation (r = 0.23, p = 0.0002) between genetic distance and geographical distance, so isolation by distance (IBD) occurred among the Hypericum species studied.
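A Mantel test of the kind reported above can be sketched as a permutation test on two distance matrices. This is a minimal pure-Python illustration, not the PAST implementation used in the study: the matrices are hypothetical, the test is one-tailed, and the function names are ours.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """One-tailed Mantel test: returns (r, p) for a positive association."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    hits = 0
    order = list(range(n))
    for _ in range(permutations):
        # Permute rows and columns of one matrix together.
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= observed - 1e-12:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical distances among 4 populations (km and Nei's distance).
geo = [[0, 10, 20, 30], [10, 0, 10, 20], [20, 10, 0, 10], [30, 20, 10, 0]]
gen = [[0, 0.10, 0.20, 0.35], [0.10, 0, 0.12, 0.22],
       [0.20, 0.12, 0, 0.09], [0.35, 0.22, 0.09, 0]]
r, p = mantel(gen, geo)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations. With only four populations there are just 24 distinct permutations, so the p-value in this toy case is coarse; the 17 species of the study allow far more.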
Nei's genetic identity and genetic distance were determined among the studied species (Table 5). The results showed that the highest degree of genetic similarity was between H. perforatum and H. triquetrifolium, and the lowest between H. androsaemum and H. hirtellum.
Abbreviations (AMOVA table): df, degrees of freedom; SS, sum of squared observations; MS, mean of squared observations; EV, estimated variance; ΦPT, proportion of the total genetic variance among individuals within an accession (P < 0.001).

DISCUSSION
Genetic diversity plays an important role in the long-term evolution of a taxon or a population; it is the basis of the existence, growth, and evolution of a taxon. The study of genetic diversity is therefore fundamental to understanding the taxonomy, origin, and evolution of a taxon. Moreover, such research provides a theoretical basis for germplasm conservation, development, utilization, and breeding (Lubbers et al., 1991).
The present research revealed interesting data about the genetic variability, genetic stratification and morphological divergence of Hypericum in the north and west of Iran. The degree of genetic variability within a species is highly correlated with its reproductive mode: a higher degree of open pollination or cross-breeding brings about a higher level of genetic variability in the taxon (Meusel et al., 1965). The PIC and MI values of a primer help to determine its effectiveness in genetic diversity analysis. Sivaprakash et al. (2004) suggested that the ability of a marker technique to resolve genetic variability may be more directly related to the degree of polymorphism. Generally, a PIC value between 0 and 0.25 suggests very low genetic diversity among genotypes, a value between 0.25 and 0.50 a mid-level of genetic diversity, and a value ≥0.50 a high level of genetic diversity (Tams et al., 2005). In this research, the PIC values of the ISSR primers ranged from 0.23 to 0.44, with a mean of 0.36, which indicates a mid-level ability of ISSR primers to detect genetic diversity among the species of Hypericum. All 10 primers showed good polymorphism in Hypericum. A total of 141 bands were recognized for the studied species. The number of polymorphic bands per primer ranged from 7 to 18, with a mean of 12.7.
In most studies, population size is limited to several vegetative accessions (Meusel et al., 1965; Uotila, 1996). Such populations may be subject to genetic drift, the effects of which are seen in high levels of FIS and low levels of genetic diversity. The isolation of populations and the absence of gene flow have led to fragmentation of the Hypericum populations. Genetic diversity parameters were positively correlated with population size, in agreement with various studies (Leimu et al. 2006). There are two explanations for the positive correlation between genetic diversity and population size (Leimu et al., 2006). First, a positive correlation could imply the presence of an extinction vortex, in which a drop in population size lowers genetic diversity, which in turn leads to inbreeding depression. Second, plant fitness differentiates populations according to variations in habitat quality (Vergeer et al., 2003).
According to Booy et al. (2000), low levels of genetic diversity can reduce plant fitness and restrict a population's ability to respond to changing environmental conditions through selection and adaptation. In the present study, 37% of the genetic variation was found within populations, whereas 63% was found between the evaluated populations. One of the key factors determining the distribution of genetic variation in plant species is the breeding system (Duminil, 2007). Couvet (Booy et al., 2000) showed that one migrant per generation may not be sufficient to guarantee the long-term survival of small populations, and that the number of migrants required depends on life-history characters and population genetics (Vergeer et al., 2003).
Genetic variances between the three groups were very similar, but the differences were statistically significant. There are two hypotheses for the absence of differences between isolated populations. The first is that genetic diversity within and between populations reflects gene flow processes associated with the fragmentation of larger populations (Dostálek et al., 2010). The second is that geographically proximate populations are more efficiently connected through gene flow than populations separated by greater distances.
A high level of variation among H. perforatum populations was also reported by Percifield et al. (2007), which supports the results of the present study. Similar results have been reported for this species using RAPD markers by Hazler Pilepic et al. (2008). The high genetic diversity of H. perforatum populations is a result of its mating system. In fact, the propagation method of a plant species is considered one of the most important factors determining its level of genetic diversity (Hamrick 1982; Hamrick and Godt 1989). Self-incompatibility is a widespread phenomenon in the genus Hypericum (Robson 1981), resulting in high levels of genetic variability (Borba et al. 2001). Furthermore, this perennial plant produces a great number of seeds every year, which favors the high diversity found in this species (Zhao et al. 2007).
Since widespread species may possess higher levels of genetic diversity than narrowly distributed plants (Hamrick and Godt 1996; Singh et al. 1998), the wide distribution range of H. perforatum is an important factor in this respect. Given the low rate of gene flow among the studied wild populations of H. perforatum, genetic drift might be inevitable.
In H. perforatum, the low rate of gene flow may be due to factors such as prevailing apomixis and the short distance of seed dispersal, as stated by Hazler Pilepic et al. (2008). Molecular markers have been used to investigate the genetic diversity, population structure, and reproductive biology of H. perforatum (Arnholdt-Schmitt, 2000; Haluŝková and Koŝuth, 2003; Barcaccia et al., 2006; Percifield et al., 2007). However, owing to the lack of a specific marker system for these plants, most of the studies used marker systems such as RAPD and ISSR. In the present work, we took advantage of the ubiquity and abundance of simple sequence repeats in plant genomes, and of their role in genomic diversification, to develop and apply markers based on the ISSR method to Hypericum.
High among-population variation was previously reported in Hypericum species by Percifield et al. (2007), Pilepić et al. (2008), and Farooq et al. (2014). High differentiation among populations is mostly coupled with limited gene flow among them. Low gene flow and high differentiation among populations have been explained mainly by founder events and by factors such as time since colonization (Jacquemyn et al., 2004).
In conclusion, the results of this study showed that ISSR primers are effective for evaluating the genetic diversity of the genus Hypericum. In addition, Hypericum species were clearly separated from each other in the dendrogram and PCA, indicating the efficiency of the ISSR technique for Hypericum species identification.