Authentication, genetic fingerprinting and assessing relatedness of rice (Oryza sativa) genotypes by SSR molecular markers

Rice (Oryza sativa L.) is a staple food and cash crop in many countries, and studies on the genetic structure and differentiation patterns of rice landraces along with cultivated rice provide important data for future rice breeding. The aims of the present investigation were therefore: 1) to study the genetic diversity present within Iranian rice genotypes, 2) to study the genetic relatedness of these rice genotypes, and 3) to provide barcoding of the rice genotypes based on SSR molecular markers and produce data for rice variety authentication. In total, 201 rice samples originating from 10 geographical regions of Iran were studied. All rice samples underwent fragment analysis at each of 64 SSR loci, and different clustering and ordination methods were applied. In general, four major clusters were formed. Both landraces and rice cultivars were distributed in different clusters due to their genetic difference. STRUCTURE analysis of the studied genotypes followed by the Evanno test produced the optimal number of genetic groups K = 2. The mean Nm = 13.6 for the studied genotypes indicates that a high degree of gene flow and/or shared ancestral alleles is present in the rice genotypes studied. The Mantel test indicated a significant positive association between genetic distance and geographic distance of the rice genotypes studied, and the presence of an overall isolation by distance (IBD) model of differentiation across the geographical regions of Iran. Overall, the significant genetic difference observed between rice landraces and rice cultivars of the country may be used in future hybridization and breeding of rice in the country. The landrace rice genotypes may contain useful genes to be transferred to the popular rice cultivars. Moreover, SSR loci that can differentiate rice genotypes were identified and can be used in rice cultivar authentication.


INTRODUCTION
Rice (Oryza sativa) is a diploid annual grass (2n = 24) of the family Poaceae and an important food crop with the highest production after sugarcane and maize. Although it originated in China, it now has several wild and related genotypes, as well as many landraces and cultivated forms, throughout the world (Henga et al. 2018).
Successful breeding strategies for rice require a deep knowledge of the genetic diversity of rice cultivars, landraces and related genotypes within each country. Rice is an important food and cash crop in Iran, with several landraces and cultivars that are grown in different regions of the country. However, only limited data are available on the genetic structure and genetic diversity of Iranian rice (see, for example, Nasabi et al. 2012).
Rice has suffered a great reduction in genetic diversity (about 80%) relative to its wild ancestor during domestication and local artificial selection. This genetic erosion in high-yielding rice varieties results in disease susceptibility and the loss of suitable genes (Cui et al. 2017).
By contrast, a rice landrace is a local variety that has become adapted to the natural and cultural environment in which it grows. Landrace populations contain a relatively high level of genetic variability compared to cultivated rice, and therefore provide a valuable source of potentially useful genes for rice breeding (Cui et al. 2017). Studies on the genetic structure and differentiation patterns of rice landraces along with cultivated rice therefore provide important data for future rice breeding (Henga et al. 2018).
Rice varieties are among the most important human food resources. Different rice varieties have specific agronomic characteristics, cooking properties, local adaptation, marketing demands, and disease and pest resistance. Some varieties are aromatic, for example Thai fragrant rice, Vietnamese fragrant rice and Basmati rice, and therefore authentication of rice is of immediate importance in the rice industry (Henga et al. 2018).
Genetic markers are very useful in managing germplasm and in investigating the genetic variability or genetic fingerprinting of crop plants, including rice (Henga et al. 2018).
Molecular fingerprinting and genetic purity assessment of rice genotypes are vital for seed certification related to genotype distinctness and seed uniformity (Henga et al. 2018).
Therefore, the aims of the present investigation were: 1) to study the genetic diversity present within Iranian rice genotypes, 2) to study the genetic relatedness of these rice genotypes, and 3) to provide barcoding of the rice genotypes based on SSR molecular markers and produce data for rice variety authentication. SSR markers are composed of tandemly repeated nucleotide motifs of 2-6 bp in length, which can be amplified using the unique flanking regions for primer annealing. These molecular markers are highly reproducible and polymorphic, and have been used as ideal markers in rice variety genotyping. SSR markers can be utilized for paternity analysis, population genetics investigation, construction of high-density genome maps, germplasm evaluation, and marker-assisted selection (Ma et al. 2011; Henga et al. 2018).
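To make the definition of an SSR concrete, the following minimal sketch scans a DNA string for motifs of 2-6 bp repeated in tandem. It is an illustration only and not part of the laboratory workflow of this study: the function name `find_ssrs`, the `min_repeats` threshold and the example sequence are all hypothetical.

```python
import re


def find_ssrs(seq, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat of a 2-6 bp motif.

    min_repeats is an illustrative threshold, not a value taken from the study.
    """
    # Lazy {2,6}? tries the shortest motif first, so GAGAGA is reported as (GA)n.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip runs of a single base (e.g. AAAAAAAA), which are not 2-6 bp motifs.
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical sequence: a (GA)6 repeat flanked by unique sequence.
example_hits = find_ssrs("TTCGAGAGAGAGAGACCT")
```

In practice, the unique flanking sequence on either side of such a repeat is where the PCR primers anneal, and length differences among alleles arise from differences in the repeat count.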

Samples
In total, 201 rice samples were studied in this project. Details of the samples and their sources are as follows. One hundred and twenty-one rice samples in the form of panicles were received from the Iran Rice Research Institute of Iran (RRII). Twenty-three rice samples were kindly provided jointly by the Iran Food and Drug Administration (IFAD) and the Iranian Rice Importers Association (IRIA). Thirty-five rice seed samples were donated by the International Rice Research Institute (IRRI), Philippines, and twenty-two parboiled rice grain samples were collected from the market (Table S1).

DNA isolation and PCR amplification
The genetic material was extracted using the QIAampDNeasy Mini Kit (QIAGEN, Germany), which is based on silica-gel membrane technology and allows efficient recovery of complete DNA from plant tissues. DNA was extracted from each sample 3 to 5 times.

Gel electrophoresis
Amplified DNA fragments were electrophoresed on 2% agarose gels containing a safe dye in 1X TAE buffer, and the bands were then visualized with a UV transillumination system.

Fragment analysis using QIAXCEL
All 201 rice samples underwent fragment analysis at each of the 64 SSR loci. A QIAxcel fragment analyzer (QIAGEN, Germany) was used for these verifications. The QIAxcel DNA High Resolution Kit, with an accuracy of 3-5 bp, was used to run the samples in capillaries filled with a gel matrix consisting of a proprietary linear polymer and ethidium bromide intercalating dye. The QIAxcel ScreenGel® software, which is used by the QIAxcel Advanced capillary electrophoresis system, was used to estimate the size of each fragment and to interpret the results.

Data analyses
The SSR bands obtained were treated as binary characters and coded accordingly (presence = 1, absence = 0). The rice genotypes were grouped using different clustering and ordination methods (Podani 2000). For clustering, we used the Nei and Li distance as well as the Jaccard similarity index (Podani 2000). These analyses were performed in PAST version 2.17 (Hammer et al. 2012).
We investigated the genetic structure of the rice samples by model-based clustering, based on the admixture ancestry model under the correlated allele frequency model, as implemented in STRUCTURE software ver. 2.3 (Pritchard et al. 2000). Data were scored as dominant markers and the analysis followed the method suggested by Falush et al. (2007).
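The two pairwise measures named above can be sketched for a pair of binary band profiles as follows. This is a minimal illustration, assuming that the Nei and Li distance is taken as one minus the Dice coefficient 2a/(2a + b + c), which is the usual convention; it does not reproduce PAST's internal implementation, and the function names and example profiles are hypothetical.

```python
def band_counts(x, y):
    """Count shared bands (a) and bands unique to each profile (b, c)."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    return a, b, c


def jaccard_similarity(x, y):
    """Jaccard index: a / (a + b + c); joint absences are ignored."""
    a, b, c = band_counts(x, y)
    total = a + b + c
    return a / total if total else 0.0


def nei_li_distance(x, y):
    """One minus the Dice (Nei and Li) similarity 2a / (2a + b + c)."""
    a, b, c = band_counts(x, y)
    total = 2 * a + b + c
    return 1.0 - (2 * a) / total if total else 0.0


# Hypothetical band profiles for two genotypes (1 = band present, 0 = absent).
genotype_1 = [1, 1, 0, 1, 0]
genotype_2 = [1, 0, 0, 1, 1]
similarity = jaccard_similarity(genotype_1, genotype_2)
distance = nei_li_distance(genotype_1, genotype_2)
```

Both measures ignore bands that are absent from both genotypes, which is generally preferred for presence/absence marker data because a shared absence carries little evidence of relatedness.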
The Markov chain Monte Carlo simulation was run 20 times for each value of K (1-4) for 20 iterations after a burn-in period of 10^5. The STRUCTURE results were then evaluated by the Evanno method (Evanno et al. 2005), as performed by the STRUCTURE Harvester online tool (Earl and vonHoldt 2012). The groups identified by the Evanno method were subjected to AMOVA to reveal the genetic differentiation of these samples. This was done with 1000 permutations as implemented in GenAlex 6.4 (Peakall and Smouse 2006). We also used multi-dimensional scaling (MDS) to investigate the genetic distinctness of these groups, as performed in PAST version 2.17 (Hammer et al. 2012).
The rice samples studied were from 10 geographical regions. We therefore investigated the genetic variability in these regions by estimating different genetic diversity parameters as determined in GenAlex 6.4 (Peakall and Smouse 2006). Moreover, the Mantel test (Podani 2000) was performed to study the association between genetic distance and geographical distance of the studied populations.
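The Evanno criterion selects the K at which the second-order rate of change of the log-likelihood, scaled by its standard deviation across replicate runs, is largest. The sketch below computes this quantity from replicate means, in the manner commonly reported by summary tools; it is an illustration under that assumption, not a reimplementation of STRUCTURE Harvester. The function name and all log-likelihood values are invented for the example.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each K to the list of ln P(D) values from replicate runs.
    deltaK = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), using replicate means.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        # deltaK is undefined at the smallest and largest K tested,
        # and when all replicates at K returned the same likelihood.
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            delta[k] = abs(means[k + 1] - 2 * means[k] + means[k - 1]) / sds[k]
    return delta


# Invented replicate log-likelihoods for K = 1..4 (three runs each).
runs = {
    1: [-1000, -1002, -998],
    2: [-800, -801, -799],
    3: [-790, -792, -788],
    4: [-785, -787, -783],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
```

Note that with K tested from 1 to 4, deltaK can only be evaluated at K = 2 and K = 3, because the statistic requires a neighbouring value on each side.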
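For readers unfamiliar with the Mantel test, the following minimal sketch shows its logic: the correlation between the off-diagonal entries of two distance matrices is compared with the correlations obtained after jointly permuting the rows and columns of one matrix. This is a generic one-sided permutation version written for illustration, not the routine used in this study; the function names, the seed and the toy matrices are assumptions.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (r, p) for a one-sided Mantel test of positive association."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo_vec = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo_vec)
    at_least_as_large = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo_vec) >= r_obs:
            at_least_as_large += 1
    # The observed arrangement is counted as one of the permutations.
    return r_obs, (at_least_as_large + 1) / (permutations + 1)


# Toy example: five sites on a line; genetic distance set equal to geographic distance.
coords = [0, 1, 3, 7, 15]
geo = [[abs(a - b) for b in coords] for a in coords]
r, p = mantel_test(geo, geo)
```

Permuting whole rows and columns, rather than individual matrix entries, preserves the dependence among distances that share a population, which is why an ordinary correlation test is not appropriate for distance matrices.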

RESULTS
Grouping of the rice genotypes by different clustering methods produced similar results; therefore, only the Ward dendrogram is presented (Fig. 1).
In general, four major clusters were formed. The first major cluster comprises two sub-clusters. Mostly cultivated rice genotypes form the first sub-cluster, while landraces and cultivars together comprise the second sub-cluster.
The other major clusters were also formed by a mixture of landraces and cultivated rice genotypes. It is interesting that both landraces and rice cultivars were distributed in different clusters due to their genetic difference. This indicates the presence of genetic diversity in Iranian rice genotypes. STRUCTURE analysis of the studied genotypes followed by the Evanno test produced the optimal number of genetic groups K = 2. The STRUCTURE plot based on K = 2 (Fig. 2) placed the studied genotypes into two genetic groups. Therefore, based on both clustering and Bayesian approaches, the rice genotypes studied contain a good level of genetic diversity.
We then randomly selected some of the rice genotypes from the two genetic groups identified by STRUCTURE for further analyses. AMOVA revealed a significant genetic difference between the two groups (PhiPT = 0.30, P = 0.01). The Fst value of 0.6 obtained by STRUCTURE analysis also supported the AMOVA result in showing a genetic difference between the genotypes.
The MDS plot of these selected genotypes almost separated the two groups (Fig. 3), indicating their genetic difference. Moreover, the spatial distribution of the genotypes within each group shows genetic variability within both groups. Therefore, there is both among-group genetic difference and within-group genetic variability.
The mean Nm = 13.6 for the studied genotypes indicates that a high degree of gene flow and/or shared ancestral alleles is present in the rice genotypes studied.
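For orientation, the estimator by which Nm was obtained is not stated here. A widely used indirect estimate, given as an assumption rather than as the formula applied in this study, follows from Wright's island model:

```latex
Nm \approx \frac{1 - F_{ST}}{4\,F_{ST}}
```

Under this approximation, values of Nm above 1 are conventionally taken to mean that gene flow is sufficient to counteract differentiation by genetic drift.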

Genetic variability and geography of the rice genotypes
The studied rice genotypes were placed in 10 geographical groups (Table S1). The total number of SSR bands and the number of private bands are provided in Table 1.
The highest number of SSR bands occurred in populations 1 and 2 (Mazandaran and Gilan, respectively). Most of the studied geographical regions contained private bands, with the highest numbers in populations 1, 2, and 8 (Mazandaran, Gilan, and the Philippines, respectively).
These private SSR bands are specific bands that arose during the genetic differentiation of rice varieties.
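A private band is one observed in a single population and in no other. The idea can be sketched as a set difference, as below; the function name, population labels and band labels are hypothetical and do not correspond to the entries of Table 1.

```python
def private_bands(bands_by_population):
    """Return, for each population, the bands found in no other population.

    bands_by_population maps a population label to the set of band labels
    scored as present in at least one genotype of that population.
    """
    private = {}
    for pop, bands in bands_by_population.items():
        elsewhere = set()
        for other_pop, other_bands in bands_by_population.items():
            if other_pop != pop:
                elsewhere |= other_bands
        private[pop] = bands - elsewhere
    return private


# Hypothetical band labels of the form locus_fragmentSize.
observed = {
    "PopA": {"locus1_100", "locus1_110", "locus2_200"},
    "PopB": {"locus1_100", "locus2_210"},
    "PopC": {"locus1_100"},
}
result = private_bands(observed)
```

Because the count of private bands tends to rise with the number of genotypes sampled, comparisons among regions with very different sample sizes should be read with that in mind.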
Genetic diversity analyses of these populations are presented in Table 2.
The two geographical regions of Mazandaran (Pop1) and Gilan (Pop2) contain the highest number of rice genotypes, as rice is mostly cultivated in northern Iran. These regions had the highest values for genetic polymorphism (71 and 60%, respectively), followed by Khuzestan (37%). However, the mean values for Nei's gene diversity, the Shannon information index (I), and the number of effective alleles (Ne) were close to each other in most of the geographical populations. AMOVA produced a significant difference among the studied geographical populations (PhiPT = 0.035, P = 0.02). It revealed that 3% of the total genetic variance is due to among-population genetic difference, while 97% is due to within-population genetic variability. These results indicate that most of the genetic diversity resides within geographical populations, with a small but significant component among them. Paired-sample AMOVA (Table 3) revealed that populations 4, 5, 7, 8, and 9 differed significantly from the other studied populations.
In spite of significant Fst/Phi-st values between most of the geographic populations, these populations have a high genetic similarity (>0.92, Table 4). This is due to the extensive sharing of common alleles among the rice genotypes studied.
A Mantel test with 999 permutations, performed between the genetic distance and geographical distance of the rice genotypes, produced a significant association (r = 0.21, P = 0.01, Fig. 4). This result indicates that as the geographical distance between rice genotypes increases, they become more genetically differentiated. Ward clustering was performed on the studied geographical populations after removing the Philippine, unknown, and Ilam (single genotype) samples and combining the samples from the neighboring, closely located provinces of Mazandaran and Gilan (Fig. 5).
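The per-population diversity parameters reported above can be illustrated for binary band data with the sketch below. It assumes the usual treatment of dominant markers in a diploid, in which the frequency of the null (band-absent) allele is estimated as the square root of the proportion of genotypes lacking the band; this Hardy-Weinberg assumption is only approximate for a predominantly inbreeding crop such as rice. The function names and the small example matrix are hypothetical, and the code is not a reproduction of GenAlex.

```python
from math import log, sqrt


def locus_stats(band_freq):
    """Return (Ne, He, I) for one locus given the proportion of genotypes with the band."""
    q = sqrt(1.0 - band_freq)  # null-allele frequency under the HWE assumption
    p = 1.0 - q
    ne = 1.0 / (p * p + q * q)                             # effective number of alleles
    he = 2.0 * p * q                                       # expected heterozygosity
    shannon = -sum(f * log(f) for f in (p, q) if f > 0.0)  # Shannon information index
    return ne, he, shannon


def population_summary(matrix):
    """Summarise a genotype-by-locus 0/1 matrix for a single population."""
    n_ind = len(matrix)
    n_loci = len(matrix[0])
    freqs = [sum(row[j] for row in matrix) / n_ind for j in range(n_loci)]
    stats = [locus_stats(f) for f in freqs]
    polymorphic = sum(1 for f in freqs if 0.0 < f < 1.0)
    return {
        "percent_polymorphic": 100.0 * polymorphic / n_loci,
        "Ne": sum(s[0] for s in stats) / n_loci,
        "He": sum(s[1] for s in stats) / n_loci,
        "I": sum(s[2] for s in stats) / n_loci,
    }


# Hypothetical population of four genotypes scored at three loci.
summary = population_summary([[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0]])
```

A locus at which every genotype shows the band, or none does, contributes Ne = 1 and He = I = 0, which is why populations with few genotypes tend to show lower percentages of polymorphic loci.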

Rice genotypes SSR barcoding
Based on allele frequency analysis, the following SSR alleles are specific to the studied rice genotypes. These SSR barcodes can be used in rice cultivar authentication (Table S5).

CONCLUSION
The present study revealed genetic variability both within and among rice varieties cultivated in different geographical regions of Iran. These cultivars differed genetically from each other. Moreover, STRUCTURE analysis divided these cultivars and landraces into two major genetic groups. We observed an extensive degree of genetic admixture, possibly due to gene flow and gene exchange among the studied rice genotypes. In a similar investigation, Wang et al. (2018) recognized several geographical subpopulations and reported nucleotide polymorphisms, small indels and structural variations that result in within- and between-population variation. They also noticed complex patterns of introgression in domestication genes.
We observed a high degree of genetic similarity, ranging from 0.91 to 0.99, in the studied rice genotypes of the country. However, a previous study that used 58 SSR markers for rice fingerprinting noted a moderate genetic polymorphism that ranged from 0.01 to 0.35 with a mean value of 0.23. That study reported a genetic similarity coefficient ranging from 0.65 to 0.92 with the mean value = 0.314.
We noticed a significant genetic difference between rice landraces and rice cultivars of the country. This genetic difference may be used in future hybridization and breeding of rice in the country. The landrace rice genotypes may contain useful genes to be transferred to the popular rice cultivars. Lee et al. (2015) estimated genetic diversity in Korean rice landraces using SSR markers and reported polymorphism information content (PIC) ranging from 0.11 to 0.93, and average observed heterozygosity ranging from 0.12 to 0.39. These landraces were divided into two major genetic groups by STRUCTURE analysis, while clustering divided them into three genetic groups. A landrace is a geographically or ecologically distinctive population; landraces differ genetically from each other and also from rice cultivars. Therefore, rice breeders must pay specific attention to these ecological variants, which may carry new suitable traits for rice improvement.
Such genotypes have been shown to be excellent sources of novel alleles (McCouch et al. 1997; Jackson 1999; Guevarra et al. 2001). For example, Rao et al. (2018) used SSR markers for association mapping and identified 12 genomic regions for yield and yield-associated traits under low nitrogen. Somnath et al. (2016) studied the genetic structure of 64 hill rice landraces in India using microsatellite markers. These landrace genotypes were separated into two groups: umte (large-grained, late maturing) and tening (small-grained, early maturing). Kernel length and plant height were the main discriminatory characters between these cultivar groups. They showed high genetic diversity within rice genotypes.
The genetic variability of rice landraces in Brazil was investigated using SSR markers (Borba et al. 2009). The study was performed on 417 landraces collected in 1986, 1987 and 2003. These researchers noticed that the number of landraces with long and thin grain type increased over the evaluated period, probably due to market demand. Moreover, the genetic variability increased during this period, and most of the landraces were grouped according to the year of collection. Therefore, it was suggested that selection performed by farmers is the most probable factor responsible for the increase in landrace genetic variability during the evaluated period.
AMOVA in the present study revealed a higher degree of genetic variability within geographical populations (97%) and a lower, though significant, degree of difference among the regions (3%). In a similar study using SSRs, Rao et al. (2018) also reported 9.66% of genetic variation among subgroups and 90.34% of variation within these subgroups. Pusadeea et al. (2019) studied the population genetic structure of a single variety of landrace rice, Bue Chomee, cultivated by the Karen people of Thailand, using SSR markers. They observed a high level of genetic variation within the studied villages despite predominant inbreeding in this crop. Bue Chomee rice showed significant genetic differentiation among Karen villages for both molecular content and genetically determined traits such as flowering time.
Similarly, Sabori et al. (2008) compared Iranian rice genotypes, including landraces, improved cultivars, and a few exotic cultivars, for their salinity tolerance at the seedling stage, and determined tolerance indices based on biomass, genotypic code and Na+/K+ ratio. The characteristics studied were root and shoot length, root and shoot dry weight, Na+ and K+ concentrations, and the genetic score of the genotypes. These characters showed variable degrees of heritability.
The genetic score under salinity stress showed that Tarom-mahalli, Gharib, Shahpasand Mazandaran and Ahlami-Tarom, with higher biological yield, greater root and shoot lengths, and a low Na+/K+ ratio, were tolerant.
We observed a significant positive association between genetic distance and geographic distance of the rice genotypes studied, and the presence of an overall isolation by distance (IBD) model of differentiation across the geographical regions of Iran. A similar observation was reported by Pusadeea et al. (2019) while studying the population genetic structure of the landrace rice Bue Chomee, which is cultivated by the Karen people. They concluded that landraces serve as reservoirs of genetic variation, which is influenced by natural processes such as selection and drift, and by the agricultural practices of local farmers. This is also supported by the investigation performed by Wang et al. (2016). They used SSR markers and reported that the genetic diversity parameters were significantly higher in landraces under on-farm conservation than in those under ex-situ conservation, in 12 villages of Guizhou, Yunnan and Guangxi provinces of China. Therefore, rice landraces under on-farm conservation programs had a higher genetic diversity than those under ex-situ conservation. This is affected by local on-farm cultivation and conservation practice.
Previous genetic studies of Iranian rice genotypes using different molecular markers also indicated significant differences among the studied genotypes, including landraces and cultivars. For instance, Nasabi et al. (2012) studied the genetic diversity of 20 Iranian rice (Oryza sativa L.) varieties using SSR markers linked to the genes controlling drought tolerance. They observed significant differences between varieties in drought resistance index. They reported a total of 142 alleles, with an average of 7.47 alleles per locus. The average PIC value was 0.817, and the rice genotypes were divided into 6 groups.
Similarly, Abootalebi et al. (2014) used SSR markers in 50 rice genotypes. They reported a significant genetic difference among these genotypes, which were divided into two genetic groups by neighbor-net networking and STRUCTURE analysis. Moreover, Tabkhkar et al. (2012) studied the genetic diversity of 48 rice genotypes using SSR molecular markers that were tightly linked to major QTLs controlling three major components of rice cooking and eating quality (i.e. amylose content, gelatinization temperature and gel consistency). They reported a good level of genetic diversity in the rice genotypes studied, with a mean Nei's gene diversity of 0.72. Cluster analysis divided the genotypes into four groups and separated the landrace cultivars with good cooking and eating quality (based on Iranian taste) from the others.
In conclusion, the present investigation indicated genetic variability both within and among rice genotypes cultivated and grown in different geographical regions of Iran. Moreover, SSR loci that can differentiate rice genotypes were identified and can be used in rice cultivar authentication.

NS collected the samples and performed the PCR tests, QIAxcel data normalization and QIAxcel ScreenGel® software analyses. ZN conceptualized the project and analyzed and interpreted the data. MSh analyzed and interpreted the data. HRZ was co-advisor with regard to plant biology. All authors contributed to writing the manuscript and read and approved the final manuscript.

We also thank the Iran Rice Research Institute of Iran (RRII), the Iran Food and Drug Administration (IFAD), the Iranian Rice Importers Association (IRIA) and the International Rice Research Institute (IRRI), Philippines, for providing rice samples.