Muscular Coloration Diversity and Microsatellite Genetic Differentiation for the Clam Paphia amabilis (Philippi, 1847)

Paphia amabilis is an important fishery shellfish widely distributing along Southeast Asia coast. The muscles of its foot and siphons are commonly colored bright red-orange due to the presence of C37 skeletal carotenoids. However, 10.5%-18.6% of yellow-muscled individuals (YMs) were found among the four P. amabilis wild samples from Vietnam (Co To island, CT) and China (Beihai, BH/BHA, Guangxi Province; and Zhanjiang, ZJ, Guangdong Province), the north of Beibu Gulf, South China Sea. Twenty genomic microsatellite markers were developed and 12 were employed to analyze the population genetic differentiation of three geographical populations (CT, BH, and ZJ, n=32/population), as well as YM and RM (red-muscled) subpopulations. Medium levels of genetic diversity were detected in all three populations with the number of alleles ranging from 4-21 (8.4±4.5) for CT, 4-16 (7.6±3.3) for BH and 4-18 (7.5±3.7) for ZJ. The average observed/expected heterozygosities were 0.66/0.66 for CT, 0.62/0.63 for BH, and 0.55/0.62 for ZJ, respectively. Total 4 out of 36 tests of microsatellites deviated from Hardy–Weinberg equilibrium (HWE, P <0.05 after Bonferroni correction), all in ZJ population due to the presence of null alleles suggested by MICRO-CHECK. Exact test indicated highly significant differentiations ( F ST ) between Vietnamese and two Chinese populations (0.099/0.123 between CT and BH/ZJ). No allele of any microsatellite marker was found significantly associated with muscular coloration of P. amabilis , consequently no convergent microsatellite genetic differentiation presents between YM and RM subpopulations, supporting that epigenetic factor might play an important role in carotenoids accumulation of P. amabilis .

The color variance in marine bivalve is probably due to the differences in total carotenoid content (Zheng et al., 2010) caused by genetic factor or environmental factors (Zheng et al., 2012a;Borodina and Soldatov, 2016). Recent studies showed that there are significant differences in the qualitative composition and seasonal variation of carotenoids in shellfishes with similar habitats and filter feeding patterns (Borodina and Zadorozhny, 2020), indicate that genetic and metabolic characteristics play important roles in the concentration of carotenoids. In order to establish effective molecular markers for selective breeding of the red muscle strains, the distribution of muscle color in natural populations and in male and female, as well as the genetic differentiation of neighboring geographical populations and subpopulations of red/yellow muscle, should be explored. In our earlier efforts to develop genetic markers for conservation and aquaculture of this species, 12 microsatellite markers have been identified based on one wild population (Li et al., 2014). In this study, (1) we developed more genomic microsatellite markers and analyzed the genetic structure of three geographic populations, and the genetic differentiation of YMs and RMs subpopulations; (2) the distribution of different muscular coloration in the natural populations, as well as female and male subpopulations of P. amabilis in the north of Beibu Bay, were investigated. Our aim is to assist aquatic products breeding in the selection of parental founder population, and to explore the mode of inheritance of muscular coloration in P. amabilis.

Muscle coloration
The muscles of P. amabilis's feet and siphons mostly have bright red color (Figure 1-1; Table 1). The variety from 10.5% to 18.6% of YMs was, however, found in CT, ZJ and BH/BHA populations (Figure 1-3; Table 1), and, 5 clams (8.3%) with intermediate muscle colour (between red and yellow, Figure 1-2; Table 1) were observed only in the spring BH population. Figure 1 The P. amabilis with various colors of foot and siphon muscles Note: 1: Red muscles clam. 2: Intermediate color muscles clam. 3: Yellow muscles clam. f: foot, s: inhalant and exhalant siphons, am: adductor muscle; Scale bar, 10 mm Chi-Square tests revealed that no significant difference of the sex ratio (about 1:1) and the percentage of YMs among those four populations (p>0.05, Table 1). And, although the ratio of YM females was lower than that of YM males, but not significant in spring populations (BH,CT and ZJ,p>0.05), whereas significant in autumn BHA specimens (p<0.05, Table 1). Note: *The muscle color of 5 individuals in BH were between red and yellow ( Figure 1); # Data not tested. Values in bold represented significant differences at p<0.05

Microsatellite markers of P. amabilis
In total, twenty primer pairs amplified consistently microsatellites in the 3 individual DNA pools were selected for further population genotyping ( Table 2). The GenBank accession numbers, primer sequences, motifs, the optimized annealing temperatures, and total allele number based on all samples from three geographical populations were listed in Table 3. For the three populations, 19 out of 20 loci (excluding Pam5) were polymorphic with the total number of alleles ranging from 3 for Pam03 to 24 for Pam30 (Table 2). Since Pam23 and Pam38 were monomorphic in at least one population, and Pam33, Pam37 and Pam38 were only amplified in less than half clams, these 4 loci were excluded from further population genetics analysis (Table 2).
The global multilocus FST based on 12 loci was estimated to be 0.026 (p < 0.001, Table 4), indicating genetic differentiation among three populations. The pairwise FST value (-0.002) and Nei's genetic distance (0.067) were low between BH and ZJ populations (Table 4), but high between CT and the two Chinese populations (FST, 0.036/0.032 for BH /ZJ, and Nei's genetic distance, 0.099/0.123 for BH/ZJ). The exact tests showed the same genetic pattern, suggesting significantly genetic differences between CT and BH/ZJ populations ( Table 4). The consequent Neighbour-joining dendrogram (Figure 2), constructed based on Nei's DA genetic distance (Nei et al., 1983), recognized two clusters. The two Chinese populations (BH and ZJ) clustered together to form cluster I, and the Vietnamese CT population separately formed cluster II. The Nei's genetic distance matix for muscular coloration YM groups (coded as ZJy, BHy and CTy, respectively) and RM groups (coded as ZJr, BHr and CTr, respectively, Table 5) of three geographic populations revealed that, CTy kept the largest genetic distance between all other groups (from 0.216 for CTr to 0.331 for BHy), while the smallest genetic distances were found between BHr and ZJr (0.076) /CTr (0.108). The Neighbour-joining phylogenetic tree for those muscular coloration groups ( Figure 3) indicated that YM (ZJy, BHy) and RM (ZJr, BHr) groups from Chinese populations do not cluster any clade characterized by geographic population or muscular coloration, although the CTy and CTr groups of Vietnamese population remain outgroups due to geographic isolation. Consistently, paired FST ( PCoA provided a visual representation of genetic similarity between six muscular coloration groups of three populations (Figure 4). The first and second principal coordinates explained 91.51% and 7.60% of the total variation, respecitvely. The two groups from BH (BHy and BHr) and the two groups from ZJ (ZJr and ZJr) were mainly spread over a large proportion of coord. 2, respectively, and interlacing with each other ( Figure 5). For the two groups from CT population (CTr and CTy), they spread over a large proportion of the right side of coord. 1 in other two-dimensional space, while CTy on the far side away from Chinese color-subpopulations. This result was consistent with the Neighbour-joining tree and the genetic differentiation data shown in Table 4.
No association was found between the frequencies of the genotypes and alleles of the 32 microsatellites (20 in this study, and other 12 developed previously, Li et al., 2014) and muscular color (p>0.05 with Bonferroni's correction).

Discussion
General causes for microsatellites markers deviated from HWE include presence of substructures of the sampled population, inbreeding, or the presence of null alleles (Zouros and Foltz, 1984). In this study, null alleles seem to play the major role and 8 out of 11 tests can be explained by the presence of null alleles. Such deviations appear to be quite common in microsatellite markers developed for bivalves (e.g., Hedgecock et al., 2004;Wang et al., 2010). High frequency of null alleles may misestimate many types of population genetic analyses that rely on HWE (Waples, 2014), thus we exclude those loci deviated from HWE in more than one population for population genetic analyses. In addition, in ZJ population more loci were significantly deviated from HWE, probably because the population was located in the northeast corner of Beibu Gulf and the poor flow exchange restricted its panmixia, thus forming facility substrates. The level of genetic diversity of population largely depends on the mating system of a species, and the population evolutionary history, and on the level of environmental heterogeneity (Booy et al., 2000). It is critical for us to discover the genetic diversity level of P. amabilis in order to ensure the sustainable use of this aquatic resource. In this study, we found medium levels of genetic diversity in three P. amabilis natural populations in the north of Beibu Bay, South China Sea. The level of diversity was similar to that of other in vitro fertilization mollusks who also naturally live in this region, e.g., hard clam Meretrix meretrix (Ye et al., 2020), pearl oyster Pinctada martensii (Qiu et al., 2013), silver-lipped pear oyster Pinctada maxima (Wu et al., 2014) and moon scallop Amusium pleuronectes (Mahidol et al., 2007;Su et al., 2015). For three geographic populations in this study, no significant difference of diversity level was observed between any population pair, even though the highest observed values were found in Vietnamese population, which possibly because there is less fishing pressure in Co To Island. In addition, the habitat of P. amabilis in Co To Island is mud-sandy sediment, so the environmental conditions are more complex than that of the sandy sediment of Beihai and Zhanjiang.
The average inbreeding coefficient FIS based on all 12 loci were not high (-0.02 for CT to 0.12 for ZJ), suggesting that overall, the loss of heterozygosity in the three groups was not obvious, and the genetic composition of the population was normal and healthy (Carlsson et al., 2006). This study investigated the genetic population structure of the P. amabilis along the northern Beibu Bay coastline, covering a distance of nearly 400 km. The global FST estimate (0.026) and pairwise FST between CT and BJ/ZJ, revealed significant genetic variation between those populations, indicating the gene flow was limited at a large geographical scale (142/199 km, between CT and BH/ZJ), while it was strong at a small geographical scale (75 km, between BH and ZJ). Superficially, the genetic structure of P. amabilis was consistent with a model of isolation by distance, where all populations are connected by continuous migration but the rate of gene flow is greatest between neighbouring populations (Arnaud et al., 2000). In fact, no significant positive correlation between genetic differentiation (FST/genetic distance) and geographical distance was observed (Mantel test: p=0.49/0.15). That could be attributed to the gene flow between population BH and ZJ by not only the planktotrophic larvae migration but also human commercial transportation.
Patterns of larval dispersal and connectivity among marine invertebrate population are often estimated from life-history characteristics and the oceanographic conditions (Colson and Hughes, 2004;Cowen et al., 2006). The limited larval dispersal ability of the P. amabilis results from several life-history characteristics of this species, such as short planktonic larva stage (only 14~18 days, Li et al., 2012) and short spawning period (from April to May and from September to October, in total less than two months, Li et al., 2013). Furthermore, in the middle of the productive season of P. amabilis, Beibu Gulf currents alternate its direction (Yang et al., 2003), thus likely affecting the migration of P. amabilis larvae. In addition, the direction of coastal current near western Guangdong turns to the northeast (Yang et al., 2003) when the southwest summer monsoon becomes strong, which might promote the migration of P. amabilis larvae between ZJ and BH. Besides, larval settlement and metamorphosis of P. amabilis were influenced by many environmental factors, such as temperature and salinity (Li et al., 2012). All of which must affect the population dynamics of this species. Cowen et al. (2006) reported that typical larval dispersal distances for a variety of reef fish species are on the scale of only 10 to 100 kilometres using a high-resolution biophysical model for the Caribbean region, this distance scale is consistent with that of P. amabilis larval dispersal detected in this study.
Carotenoids are powerful antioxidants that affect many physiological functions (Soldatov et al., 2017), so they are essential nutrients for animals and humans (Zheng et al. 2012a). More than one-third of carotenoids found in nature are of marine origin (Maoka, 2011) and marine mollusks are receiving increasing attentions as important producers of natural carotenoids (Zhang et al., 2014). The muscles of P. amabilis' foot and siphons are colored bright red because they contain a special C37 skeletal carotenoids (Maoka et al., 2008), making this clam not only has an attractive appearance but also be considered as healthy food. Female bivalves have greater carotene enrichment during either rapid growth stage or gonadal maturation (Zheng et al., 2012b;Borodina, 2018), and individuals with higher total carotenoid content (TCC) or carotenoid enrichment show greater resistance to stress (e.g., hypoxia, Zheng et al., 2012b;Gostyukhina and Andreenko, 2020). These are accordant with our findings in this study that, the percentages of YMs varied in four populations and in male/female groups, and the ratio of YM females is significantly lower than that of male only in autumn instead of spring samples, and, RMs showed stronger hypoxic tolerance during transportation. The structures of RMs' foot might facilitate the exchange of gases, as in similar cases in the Black Sea invasive bivalve species Anadara kagoshimensis (Tokunaga, 1906), who is highly adaptable to changing environments and its foot not only ensures its active lifestyle but also has some physiological and metabolic features (Shulman, 2014).
In our earlier efforts to develop genetic markers for conservation and aquaculture of this clam, we found that Pam13, a microsatellite marker showing evidence of null allele, had significantly higher observed heterozygosity and higher frequency of allele 308 in YMs than in RMs (7 YMs and 25 RMs, one population, Li et al., 2014). However, in this study, this distinctness disappeared when it was genotyped with other geographic populations and more samples. It is speculated that this accidental association with muscle colour may be due in part to the presence of null allele. Overall, no specific alleles or genotype of total 32 microsatellite marker (20 in this study, and other 12 developed previously, Li et al., 2014) were found associated with muscular coloration, and no stable convergence of genetic differentiations were detected in the same color-muscle P. amabilis. Wang et al. (2015) identified a gene lysophosphatidylcholine acyltransferase 1 (LPCAT1) that is potentially crucial for carotenoid accumulation in Yesso scallop (Patinopecten yessoensis) adductor muscle. This gene shows significantly higher expression and lower methylation level at the 5' end of the gene body in orange adductor muscle individual than in white ones. This epigenetic switch that controls accumulation of carotenoids in scallop adductor muscle may explain that carotenoid accumulation naturally occurs in female gonads of Yesso scallops , and implied that the strength of carotenoid accumulation maybe associated with the DNA methylation level. This might also be the reason why in our study we found 5 (8.3%) individuals with intermediate muscle color in BH population on spring. Probably, in P. amabilis there are individual/population differences in carotenoid accumulation and metabolic intensity, which are simultaneously affected by gene expression and environmental factors (e.g., the qualitative composition and annual pattern of succession of diatomic microalgae, Borodina and Soldatov 2016), or even by the gut microbiota (Liu et al., 2020).
This study provided the first molecular evidence of genetic divergence of P. amabilis in South Chinese Sea, which is helpful for the effective conservation and management of the species. These results will also be useful for the seeding production and for the genetic improvement of this clam. The significant genetic differentiation between the Chinese and Vietnamese populations implied the probability to develop particular strains for the benefit of aquaculture. In addition, our results also suggest that P. amabilis could be considered as an important animal model for studying the ways epigenetics and the environmental factors in controlling carotenoids accumulation in sea animals.

Materials and Methods
3.1 Animals and measure 57, 43 and 60 P. amabilis clams were collected in Co To Island, Vietnam (Coded for CT), the west of Zhanjiang, Guangdong province (ZJ) and Beihai, Guangxi province (BH) in China, in March 2013, respectively ( Figure 5). Another 60 clams were sampled in Beihai in Oct. 2013 (BHA). Shell length, shell height, and shell width were measured for each clam with vernier caliper. The whole weight and shell weight were measured by electronic balance, the gonadal tissues were observed under a microscope to determine the sex, and the foot and siphon's color were recorded. The number of individuals, sex ratio, and muscle color differentiation of each population were listed in Table 1. Finally, 32 individuals were randomly selected for further genetic analysis from each population of CT, ZJ, and BH+BHA.

Isolation of microsatellites and primer validation
Microsatellite markers were developed from an enriched genomic library deriving from 6 clams' DNA. The genomic library enriched for CA-repeats was constructed using 200 ng DNA digested with RsaI (Bio-Red), a biotin labeled (CA)10 probe and a magnetic field . Clones with inserts between 400 and 1000 bp were sequenced in both directions using an ABI3730 sequencer. After removing the sequences of linkers and T-vectors, DNA sequences were searched against each other (no other microsatellite sequence of P. amabilis was found in GenBank till 31 st Jan. 2021) using Vector NTI Advance 11.0.0 (http://www.invitrogen.com) to check for duplicates. MISA software (http://pgrc.ipk-gatersleben.de/misa/) was used to screen for sequences containing at least 6 di-, 5 tri-, 5 tetra-, 4 penta-, and 3 hexa-, hepta-and octa-nucleotide repeats. Qualified sequences with sufficient flanking regions were selected for primer design with Primer 3 (http://biotools.umassmed.edu/bioapps/primer3_www.cgi) . Of the 60 sequenced clones, 48 yielded microsatellite sequences for primer design. The M13 (-21) universal leading sequence (5'-TGTAAAACGACGGCCAGT-3') was added to the 5' end of each forward primer for economic fluorescent labelling of PCR fragments (Schuelke, 2000). Primer validation was performed following Yu et al. (2015), using 3 DNA pools, Premix Taq TM (TaKaRa), and an Eppendorf Master Cycler Gradient (Germany). Primer pairs amplified consistently in 3 DNA pools were selected for further genetic analysis (Table 2).

PCR conditions and genotyping
In total 96 clams from three geographic populations (CT, ZJ, and BH) were screened for variation by the selected microsatellite loci. PCR was conducted in a 10 μL solution contained~50 ng template DNA and 1×Premix Taq TM (TaKaRa), and 0.2 pmol M13 tailed forward primer, 0.8 pmol reverse primer, 0.6 pmol M13 (-21) primer labelled with fluorescent dyes (FAM, VIC, NED, and PET; Applied Biosystems) . PCR was conducted on an Eppendorf Master Cycler (Germany). Cycling parameters were 94℃ for 5 min followed by 30 cycles of 94℃ for 30 s, annealing (temperatures indicated in Table 3) for 45 s, 72℃ for 45 s, followed by 10 cycles of 94℃ for 30 s, 53℃ for 45 s, 72℃ for 45 s, and a final extension at 72℃ for 10 min. Amplified products were detected and sized on an ABI 3730 Prism Genetic Analyzer (Applied Biosystems) using GS-500LIZ as size standard. Allele scoring was performed with GeneMapper v3.7 (Yu et al., 2015). Samples failing to amplify the first time were reamplified at least twice and we re-designed the primer pairs of 5 loci, which improved the amplification rate (Pam38) and avoided the multiple-loci amplification in two loci (Pam11 and Pam24).

Statistical analysis
GENEPOP on the web (http://genepop.curtin.edu.au/) was used to test for deviations from Hardy-Weinberg equilibrium (HWE) for each locus, and for genotypic linkage disequilibrium (LD, 1000 iterations) between all pairs of loci (exact tests, 1000 iterations), as well as to calculate the FST for all populations and population pairs (10 100 permutations, Yu et al., 2015). Exact tests of population differentiation (Raymond and Rousset, 1995) and Mantel test (matrix of pairwise FST/genetic distance and geographical distances, the shortest distance by sea between two populations) were also used GENEPOP on web. The ARLEQUIN 3.0 software (Excoffier et al., 2005) was used to calculate observed (HO) and expected (HE) heterozygosities. The MICRO-CHECKER 2.2.1 software (van Oosterhout et al. 2004) was used for identifying possible genotyping errors and null alleles (1000 randomizations). FSTAT 2.9.3.2 (Goudet, 1995) was used to calculate allelic richness and FIS per locus and sample. All tests were corrected for multiple comparisons by Bonferroni's correction (Rice, 1989). Poptree 2 (Takezaki et al., 2010) was used to construct the Neighbour-joining trees based on the Nei's DA genetic distances (Nei et al., 1983). GENALEX 6 (Peakall and Smouse, 2006) was used to do AMOVA (Analysis of Molecular Variance, Excoffier et al., 1992), estimating and comparing the genetic differentiation of muscular coloration groups among each geographic population.
In order to compare and visualize the genetic differentiations of RMs and Yms group in three geographical populations, each population was divided into two groups by muscular coloration. Neighbour-joining tree for muscular coloration groups was constructed and paired FST were calculated as aforesaid. A principal coordinate analysis (PCoA) was performed to display any pattern in the group genetic distances, using GENALEX 6 (Peakall and Smouse, 2006).
Chi-square tests were used to estimate the significances for the association of the frequencies of the genotypes and alleles of the microsatellites, as well as the sex ratio between RMs and YMs.
Authors' contributions SW, XW and FY contributed equally to this study. YW and BL were responsible for conceptualization, funding acquisition and resources. SW, XW, FY, TL and QL were responsible for experimental investigation. SW, XW and FY were mainly responsible for data processing and picture modification. YW and BL were responsible for the draft preparation. YW proofread the final manuscript before submission. All authors read and approved the final manuscript.