Introduction

Investigating biodiversity, population size, and time-course changes associated with environmental change is important for the conservation of biodiversity. Conventionally, population monitoring in the natural environment has required experts to classify and count species. Although the direct monitoring of species is a reliable approach, it is laborious and relies on intense monitoring and sampling efforts. Moreover, certain species, such as nocturnal species, are difficult to investigate, and direct monitoring is sometimes invasive to the environment and individual organisms.

Environmental DNA (eDNA) is a mixture of DNA released into the environment from many different species in the form of mucus, saliva, faeces, urine, gametes, and skin1,2,3. The use of eDNA offers the potential to monitor species present in nature without sampling effort or visual identification expertise. In the past decade, eDNA has been used to investigate the biodiversity of various species including fish, plants, fungi, birds, and mammals4. The applications of eDNA analysis are not limited to the present environment, but also include studies of past biodiversity in ice cores and sediments5,6.

For ecological community eDNA analysis, marker genes (e.g. mtDNA and nuclear rRNA genes) used for taxonomic identification are amplified by PCR followed by high-throughput sequencing (HTS). Species abundances in the community are often determined according to the relative abundance of the sequences of each taxon in a sequence library. However, the quantification accuracy of these methods is compromised by various factors that can affect the PCR efficiency, such as the amplified DNA sequences (e.g. GC content or the base adjacent to primers) and primer sequences7,8,9,10.

In recent years, there have been growing expectations for the application of environmental DNA data from each species in the natural environment to estimate species biomass/abundance and monitor the effects of environmental destruction. To apply environmental DNA to the monitoring of species in the natural environment, it is very important to understand the correlation between quantitative environmental DNA data and species abundance. DNA quantification has been conventionally performed by quantitative PCR (qPCR)11,12 and more recently by digital PCR (dPCR), which is more accurate than the former because of tolerance to PCR inhibitory substances13. Quantification by qPCR or dPCR requires an assay to be established for each target, including the design of PCR primers, preparation of standards (if qPCR is used), and optimisation of PCR conditions. Therefore, parallel quantitative analysis of numerous species is not straightforward, although it has been recently attempted14.

The recently developed quantitative sequencing (qSeq) technique15,16 enables simultaneous sequencing and quantification of DNA from many species in a single HTS run (Fig. 1). In qSeq analysis, a random sequence tag is added to the 5′ end of the target sequence during single primer extension (SPE) prior to PCR amplification to prepare the sequence library. If the variety of random tag sequences is sufficiently large relative to the number of targeting DNA molecules, the distribution of the random tag to DNA molecules follows the Poisson statistic17. Therefore, after HTS, the number of DNA molecules in a sample can be estimated by counting the variety of random tags at the 5′ end of the targeted sequence with minimal effect from PCR bias.

Figure 1
figure 1

Schematic of quantitative sequencing (qSeq) procedure.

In this study, we applied qSeq to eDNA extracted from aquarium experiments comprising five fish species. The qSeq results were compared with the quantitative values obtained by dPCR to demonstrate the potential of qSeq for the comprehensive quantitative analysis of eDNA from natural environments.

Methods

Aquarium experiment and sampling

To examine the effect of changes in species composition on the behaviour of eDNA, we conducted aquarium experiments using two mock fish communities comprising H. neglectus, C. temminckii, O. latipes, R. flumineus, and M. anguillicaudatus. Mock community 1 (MC1) consisted of one individual of each of the five fish species, whereas mock community 2 (MC2) consisted of three H. neglectus individuals and one individual of each of the other four fish species (Fig. 2). We used two aquaria (A and B). Each aquarium was used four times, twice for each mock community, giving two replicates (R1 and R2). This resulted in eight experimental units (2 mock fish communities × 2 aquaria × 2 replicates). Figure 2 shows the experimental setup used in this study.

Figure 2
figure 2

Experimental setup of the aquarium experiments.

To set up the aquaria, 20 L of tap water was added into each aquarium (GEX Co. Ltd., Osaka, Japan) and heated with a heater (Spectrum Brands, Wisconsin, US) until the water temperature reached 25 °C. Water in the two aquaria was maintained at 25 °C and constantly circulated with an aeration device. Before adding fish to the aquaria, the water was sampled for the negative control. The first experimental samples (day 0) were taken 1 h after adding the fish and subsequent samples were taken each day until day 4. At each sampling, two 1-L samples of surface water were collected from each aquarium and then 2 L of tap water was added to each aquarium to maintain the volume of water. The weight of individual fish species was measured using an electronic balance immediately after the final sampling. After each experiment, the two aquaria were bleached before being reused.

In Japan, experiments on fish do not require any legal procedures or permission. However, in order to avoid causing pain to the specimens, the experiments in this study were conducted in accordance with the ARRIVE guidelines, Japanese laws and guidelines for mammals, birds, and reptiles as below; Act on Welfare and Management of Animals (Notice of the Ministry of the Environment No. 105 of October 1, 1973), Standards relating to the Care and Keeping and Reducing Pain of Laboratory Animals (Notice of the Ministry of the Environment No. 88 of 2006), Fundamental Guidelines for Proper Conduct of Animal Experiment and Related Activities in Academic Research Institutions under the jurisdiction of the Ministry of Education (Notice of Ministry of Education No. 71, 2006), and Guidelines for Proper Conduct of Animal Experiments (established by the Science Council of Japan on June 1, 2006).

DNA extraction

Each 1-L water sample was filtered immediately through a GF/F glass fibre filter (nominal pore size = 0.7 μm, diameter = 47 mm; GE Healthcare Japan Corporation, Tokyo, Japan). Filter funnels and measuring cups were bleached after filtration to prevent cross-contamination among the water samples. All filters were stored separately at − 20 °C until DNA extraction. Total eDNA was extracted from each filter using a DNeasy Blood and Tissue Kit (QIAGEN, Hilden, Germany) and Salivette tubes (Sarstedt AG & Co. KG, Nümbrecht, Germany). Extraction methods were as previously described18 with modifications. A filter sample was placed in the upper part of the Salivette tube and 220 μL of solution containing Buffer AL (200 μL) and Proteinase K (20 μL) was added. The tube containing the filter was incubated at 56 °C for 30 min, then centrifuged at 5000 × g for 3 min, and the solution was collected in the base of the tube. To increase eDNA yield, 220 μL Tris-EDTA (TE) buffer was added to the filter sample and centrifuged at 5000 × g for 1 min. Then, ethanol (200 μL) was added to the collected solution, and the mixture was transferred to a spin column. Total eDNA was eluted in buffer AE (100 μL), following the manufacturer’s instructions. All eDNA samples were stored at − 20 °C prior to qSeq and dPCR.

Quantitative sequencing

Simultaneous quantification and sequencing of the extracted eDNA were performed by qSeq as previously described15,16. First, SPE was performed. The SPE reaction mixture (20 µL) consisted of 1 × PrimeSTAR Max premix (Takara Bio Inc., Kusatsu, Japan), 300 nM of the primer qSeq-MiFish-U-F (Table 1), and extracted DNA (2 µL). The SPE primer qSeq-MiFish-U-F contains an eight-base length random sequence tag, which creates 65,536 different variations, enabling the quantification of up to approximately 1.0 × 105 copies of DNA15. This amount of variation was sufficient to quantify the abundance of eDNA in this study. SPE was initiated by denaturation at 94 °C for 1 min, followed by cooling to 60 °C at 0.3 °C/s, incubation at 60 °C for 1 min, and final extension at 70 °C for 10 min. Subsequently, the excess primer was completely digested by adding exonuclease I (4 µL, 5 U/µL; Takara Bio Inc.) to the SPE mixture. The digestion was performed at 37 °C for 120 min, followed by inactivation of the exonuclease I at 80 °C for 30 min. The first-round PCR mixture (25 µL) contained PrimeSTAR Max premix (12.5 µL), primers qSeq-MiFish-U-R and F2 (300 nM each; Table 1), and the SPE product (2 µL). Following 40 cycles of amplification at 98 °C for 10 s, 55 °C for 5 s, and 72 °C for 5 s, the amplification product was subjected to agarose gel electrophoresis, and the band of the expected size was removed and purified using Nucleospin Gel and PCR Clean-up column (Takara Bio Inc.). The qSeq-MiFish-U-R primer also contains eight N bases to increase the complexity, which improves the sequencing quality, and thus PhiX was not added in this study. Finally, a 2nd-round PCR was performed to add an index for Illumina sequencing as described elsewhere15. The indexed PCR amplicon was purified using AMPure XP beads (Beckman Coulter, Indianapolis, IN) followed by sequencing using a MiSeq platform with MiSeq Reagent Kit v3 for 600 cycles (Illumina). The sequence data obtained in this study were deposited in the DDBJ database under accession numbers SAMD00219124–SAMD00219214.

Table 1 Oligonucleotide sequences used in this study.

Data analysis

First, all sequences were assembled and screened by length and quality of reads using the mothur software package (v1.39.5)22. The processed sequence reads were classified using the MiFish pipeline (http://mitofish.aori.u-tokyo.ac.jp/mifish/), with the parameters as previously described23. Subsequently, the representative sequences of individual operational taxonomic units (OTUs) were extracted using the Usearch program (http://www.drive5.com/usearch/). The random sequence tags (RST) at the end of sequences in the OTUs were counted to quantify the environmental DNA from each fish species as described elsewhere16. For comparison, the relative proportion of eDNA from individual species in each sample was calculated from the composition of the sequences of the fish species obtained by qSeq.

Microfluidic digital PCR

Quantification of eDNA was also performed by microfluidic dPCR using the BioMark Real-time System and 12.765 Digital Array (Fluidigm Corporation, South San Francisco, CA, United States) as previously described13. For each sample, the PCR mixture (6 µL) contained 2 × Probe qPCR mix (3.0 µL; Takara Bio Inc.), 20 × binding dye sample loading reagent (0.6 µL; Fluidigm Corporation), forward and reverse primers (900 nM), TaqMan probe (125 nM), ROX solution (0.015 µL), and sample DNA (1.0 µL). We used three sets of primers and probes to quantify the eDNA of H. neglectus, O. latipes, and M. anguillicaudatus (Table 1). PCR was initiated at 98 °C for 2 min, followed by 50 cycles of 98 °C for 10 s and 60 °C for 1 min. The amplification curves obtained from individual reaction chambers of the microfluidic chip were analysed using Fluidigm Digital PCR analysis software (Fluidigm Corporation) to obtain abundance of DNA molecules.

Statistical analysis

We employed Gaussian Type II regression models with the standardised major axis method to determine the relationship between the log10 eDNA abundances obtained from qSeq and dPCR analyses with the “sma” function of the “smatr” ver. 3.4.8 package in R ver. 3.6.024. Zero values were disregarded for the modelling. We employed the Gaussian Type II model because our preliminary evaluation showed higher R2 values for Type II regression models with a Gaussian distribution than for those with a logarithmic distribution in all cases. We compared the differences in the coefficient values by overlapping the 95% confidence interval (CI) ranges.

Results

In this study, the abundance of eDNA (i.e. the fish mitochondrial 12S rRNA gene copy number) in the extracted DNA from the aquarium experiments using two mock fish communities was quantified by dPCR and qSeq. We used qPCR to quantify the eDNA of only three fish species, Hemigrammocypris neglectus, Misgurnus anguillicaudatus, and Oryzias latipes, whereas qSeq was used to quantify the eDNA of all five fish species used in this study. We also calculated the relative abundances of the sequences of each fish species in the sequence library obtained by HTS using MiFish primers for comparison with the dPCR and qSeq results.

For MC1, both dPCR and qSeq showed a trend of decreasing abundance of fish eDNA with time. The quantified values by dPCR were 5.0 × 103–7.2 × 105 copies, 1.0 × 103–3.0 × 105 copies, and 4.0 × 103–5.9 × 105 copies/L for O. latipes, H. neglectus, and M. anguillicaudatus, respectively (Fig. 3, left, top row). The abundance of eDNA quantified by qSeq ranged from 9.8 × 102–5.4 × 105 copies, 8.2 × 102–5.8 × 105 copies, and 8.3 × 102–2.5 × 105 copies/L for O. latipes, H. neglectus, and M. anguillicaudatus, respectively (Fig. 3, left, middle row). In addition to these three fish species whose eDNA could be quantified using previously established assays (i.e. with specific primers and probes), qSeq can quantify the eDNA of two further fish species, Candidia temminckii and Rhinogobius flumineus, without establishing species-specific assays. The latter two species showed eDNA abundances within the range of the other three species, and a similar trend of decreasing eDNA over time.

Figure 3
figure 3

Quantification of eDNA released from 5 fish species in the aquarium experiments. The 12 plots on the left show results from the aquaria using mock fish community 1 while the 12 plots on the right show the results from mock fish community 2.

MC2 included three H. neglectus individuals compared with one in MC1. In MC2, the eDNA abundance was consistently highest on day 1 and, the abundance tended to decrease over time after day 1. The quantified values by dPCR were 1.0 × 103–1.7 × 105 copies, 3.0 × 103–3.5 × 105 copies, and 1.0 × 103–1.1 × 105 copies/L for O. latipes, H. neglectus, and M. anguillicaudatus, respectively (Fig. 3, right, top row). The time-series profiles of eDNA abundance obtained by qSeq are consisted with dPCR (Fig. 3, right, middle row) and ranged from 2.1 × 103–6.9 × 104, 1.6 × 103–2.9 × 105, and 2.1 × 103–6.9 × 104 for O. latipes, H. neglectus, and M. anguillicaudatus, respectively. The other two species (C. temminckii and R. flumineus), which were only quantified using qSeq, showed a similar trend, with eDNA abundances peaking on day 2 and then decreasing over time.

Although the relative proportions obtained in MiFish are not comparable with the absolute quantitative values obtained by dPCR or qSeq, the relative proportions in the MiFish sequence library (Fig. 3, bottom row) followed different trends to the results from the two quantitative methods. In MC1, the MiFish results were similar to the dPCR and qSeq results for AR1 and AR2. However, BR1 and BR2 generated completely different MiFish results; the relative abundance of eDNA increased with time for three of the five fish. More specifically, in MC1BR2, the relative abundance of H. neglectus eDNA was 1% on day 0, and increased to 4–11% by day 4. The relative abundances of O. latipes and M. anguillicaudatus in the sequence library also increased over time. However, these results were concurrent with a decrease in the relative abundance of C. temminckii eDNA, and this might create a false impression that the eDNA abundances of other fish species increased over time. Furthermore, the relative abundances in the sequence library were not consistent with the results of the two quantitative methods. For instance, in MC2AR2, the lowest relative abundance was observed on day 2 and the trend in relative abundance of O. latipes indicated by the MiFish results was opposite to that inferred with the dPCR and qSeq methods. These results confirm that relative abundances alone cannot be used to quantitatively discuss the abundance of eDNA.

Previous studies have demonstrated a correlation between biomass and the density of eDNA in natural or laboratory environments25,26. In this study, however, no significant correlation was observed between biomass and abundance of eDNA (Fig. 4). The mean weight of M. anguillicaudatus was 4.2 g, which was approximately tenfold higher than that of O. latipes at 0.47 g (Table 2). However, the abundance of eDNA from M. anguillicaudatus was generally lower than that from O. latipes, regardless of the quantification method (Fig. 3). This discrepancy might be attributed to differences in the discharge rate of eDNA between fish species. MC2 contained three individuals of H. neglectus compared with one in MC1; however, the abundance of H. neglectus eDNA was lower in MC2 than in MC1 (Fig. 4).

Figure 4
figure 4

Correlation between biomass (fish weight) and the abundance of eDNA at all five time points quantified by qSeq.

Table 2 Composition and weight of the species used in the aquaria.

The abundance of eDNA from each fish species quantified by dPCR was strongly correlated with that quantified using qSeq (Fig. 5). The correlations were significant (P < 0.001), with R2 values of 0.643, 0.859, and 0.786 for H. neglectus, M. anguillicaudatus, and O. latipes, respectively. The relationships between qSeq and dPCR results had slopes of ~ 1 and were not significantly different to 95% CIs. However, the O. latipes eDNA abundance values were higher with dPCR than with qSeq for most of the samples. The clear significant linear correlation between quantified abundances obtained by dPCR and qSeq indicates that using qSeq instead of the standard HTS can add quantitative information to species composition data based on obtained sequences without establishing a specific assay for each fish species of interest.

Figure 5
figure 5

Correlation between eDNA copy number as quantified by digital PCR (dPCR) and quantitative sequencing (qSeq). Lines indicate regressions for each species, where pink is H. neglectus: [log(qSeq) = 1.052(dPCR) − 0.043, R2 = 0.643, P < 0.0001, Gaussian Type II regression]; grey is M. anguillicaudatus: [log(qSeq) = 1.114(dPCR) − 0.112, R2 = 0.859, P < 0.0001]; and orange is O. latipes: [log(qSeq) = 1.074(dPCR) − 0.058, R2 = 0.786, P < 0.0001].

Discussion

In the present study, we employed the qSeq technique to quantify eDNA in mesocosm experiments, demonstrating that eDNA from 5 fish species can be sequenced and quantified simultaneously by qSeq. The values quantified by dPCR tended to be slightly higher than those obtained by quantitative sequencing (qSeq). These results might be due to non-specific binding of the probes and subsequent digestion, resulting in overestimation by dPCR. Alternatively, the qSeq-MiFish-U-F primer might not have hybridised to all target eDNA during single primer extension, causing underestimation by qSeq. Since the amplification efficiency of PCR is generally around 80–90%27, depending on the DNA synthase and amplicon length, this may have led to a slight underestimation of the efficiency of the reaction, even though, in SPE, the reaction mixture was gradually cooled from 94 °C to the annealing temperature (60 °C) over about 5 min and then annealed for 1 min to increase the annealing efficiency. Compared to dPCR, where DNA can be quantified whenever amplification is confirmed in 50 cycles of PCR, quantitative sequencing, where single-stranded DNA must be synthesised in the SPE reaction for quantification, may be more influenced by differences in PCR efficiency.

Among the three fish species for which dPCR and qSeq were compared, the abundance obtained with qSeq was lower than that obtained with dPCR for O. latipes, with most of the plots falling below the 1:1 line (Fig. 5). The mtDNA sequence of O. latipes has two mismatches to qSeq-MiFish-U-F, may have resulted in lower abundance obtained by qSeq. The primer was designed to be universal for all fishes; however, the amplification of O. latipes eDNA may not have been optimal owing to these mismatches. Therefore, improvements such as the introduction of degenerate bases into the primer would be necessary for accurate quantification of O. latipes eDNA. In addition, increased stringency of the SPE reactions may be necessary to prevent off-target sequence amplification. Since increasing the annealing temperature is likely to reduce the efficiency of the reaction, it may be preferable to use artificial nucleic acids such as LNA and PNA to increase the annealing specificity28. Despite not being the target of the primers, O. latipes was detected in both qSeq and the conventional sequencing. The sensitivity this off-target sequence is thought to be higher with qSeq than with the conventional sequencing. This is because in the conventional PCR the regions complementary to the primers in PCR-replicated DNA would not have a mismatch, and so the effect of the primer sequence mismatch diminishes as the PCR proceeds, whereas in qSeq, DNA would not be quantified if the SPE do not occur due to the mismatch. Related to this, conventional qPCR may be more sensitive than qSeq when the number of target DNA is small15, as it may be difficult to hybridise all target DNA molecules in one step of SPE.

We found that the abundance of fish eDNA in aquaria was highest just after the fish were added and tended to decrease over time. This is consistent with previous studies that detected the highest abundance of fish eDNA at the start of experiments, and attributed this to the struggles during the acclimatisation of introduced fish to the environment11,29,30,31,32. However, we found that different mock fish communities had different peaks in eDNA abundance, with MC1 peaking on day 0, and MC2 peaking on day 1. The reason for this is not known, indicating that further studies of eDNA and fish behaviour are needed. The abundances of fish species obtained using HTS were very different from the absolute copy numbers of eDNA quantified with dPCR and qSeq. In general, PCR efficiency can affect the proportions of sequences in sequence libraries. A range of factors known to affect PCR efficiency, such as primer-target mismatches, sequence composition (e.g. GC content), and the bases adjacent to primers7,8,9,10 may have influenced the proportions of sequences in our fish sequence library, resulting in different proportions compared with the original eDNA samples.

Quantitative sequencing can simultaneously quantify DNA from multiple species by adding a single step to the standard HTS procedure. This eliminates the need to establish specific assays for individual target species, which are required with dPCR or qPCR, that involve the design of primers and probes and optimisation of conditions. While qSeq could quantify eDNA, the aquarium experiments in this study did not show a significant correlation between biomass and abundance of eDNA. Future research using qSeq for comprehensive quantification of eDNA in the natural environment will provide a better understanding of the dynamics of eDNA released from organisms by accumulating data on the relationship between eDNA abundance and population numbers.