Here, I would make the link to metagenome/metatranscriptome analyses, which also provide hints about absolute abundances and allow cross-kingdom comparisons without PCR amplification bias. Maybe also mention metaproteome profiling: it still suffers from teething problems such as high costs and low throughput, but it can provide independent insights that are not based on nucleic acids (some new literature on that: doi: 10.3390/microorganisms8111694, doi: 10.1016/j.jprot.2020.103791, doi: 10.1016/j.apsoil.2019.103480).
Are there any other quantitative approaches that could potentially help? For example, using soil respiration with biocides to estimate the total size/activity of fungal or bacterial communities, or of cellulose-degrading communities?

5. Persistent challenges in linking sequences to ecology

Inferring function from taxonomic affiliation/phylogeny

Because amplicon sequencing detects a section of a single gene, the taxonomic resolution and ecological insight that can be extracted remain limited. Taxonomic classifications are influenced by the reference database selected, and many databases remain incomplete because of bias in the types of organisms for which reference sequences exist (51). ASVs within a given study can often be matched to a taxon at the phylum level but cannot be classified at finer taxonomic levels. Some studies apply functional predictions using packages such as PICRUSt2, Tax4Fun or, in the case of fungi, FUNGuild, which assume that metagenomes (and therefore the functional potential of organisms) can be extrapolated from the sequenced amplicon. However, function is not conserved at the phylum level (or even at the genus level), and processes therefore cannot always be predicted and assigned to taxa from amplicon sequencing in a way that is meaningful for ecological investigations (52, 53). For example, classifying taxa as r-strategists because they belong to a phylum that soil microbiologists generally assume to be fast-growing (e.g. Proteobacteria), and using this assumption to explain processes in soil samples, should be avoided (Jeewani et al. 2020). Nevertheless, these prediction-based software packages can be used to generate valuable hypotheses for further investigation. In such cases, we recommend following up with FISH-based counting of the identified species, functional gene-targeted sequencing, or SIP experiments to learn more about the species or community hypothesized to perform an ecosystem process.

Inference from co-occurrence analysis / Inferring interaction from co-occurrence analysis

All the issues described so far also affect studies that use amplicon sequencing data for co-occurrence analysis. This analysis consists of identifying which species tend to occur together and which tend to exclude each other across a large number of environmental samples. For microbiome datasets, associations are most often assigned through the detection of significant correlations between relative abundances, and spurious links can be detected if compositional data are not handled appropriately, as explained in the previous section on quantitative sequencing studies. Here as well, log-ratio transformations can be applied, as done by several popular network construction tools, e.g. SparCC (log ratios) and SPIEC-EASI (centered log-ratio, clr) \cite{Kurtz2015,Friedman2012}. Another option is to convert relative abundances into absolute values using the total gene copy numbers obtained from qPCR. As output, these tools generally produce networks with biological species as nodes and edges representing the associations between them.
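To make the log-ratio idea concrete, the following is a minimal sketch of a centered log-ratio (clr) transform for a single sample. It is not the implementation used by SparCC or SPIEC-EASI; the function name `clr`, the example counts and the pseudocount of 0.5 used to handle zeros are assumptions chosen for illustration only.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount is an illustrative assumption to avoid log(0);
    with pseudocount=0 every count must be strictly positive.
    """
    shifted = [c + pseudocount for c in counts]
    logs = [math.log(v) for v in shifted]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]


# Hypothetical read counts for four taxa in one sample.
sample_counts = [120, 30, 0, 850]
clr_values = clr(sample_counts)

# Without a pseudocount, the transform ignores sequencing depth:
# the same composition at 10x the depth gives the same clr values.
shallow = clr([1, 2, 4], pseudocount=0)
deep = clr([10, 20, 40], pseudocount=0)
```

The clr values of a sample sum to zero, and they depend only on ratios between taxa, which is why correlations computed on them are less prone to the artefacts of relative abundances.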
Recently, the number of research papers that construct association networks for microbial communities has increased strongly. However, a large part of them use networks merely as a form of data visualization. These descriptive works have been criticized because they neither analyse nor propose an ecological interpretation of the detected patterns. The difficulty lies in detecting causal relations among species from co-occurrence patterns, which is a long-standing topic of discussion in ecology \cite{Blanchet_2020,Barner_2018}. Especially for soil, it is important to keep in mind that the data contained in each environmental sample are only a snapshot of complex spatio-temporal dynamics. Each sample captures a noisy signal that reflects several biological processes, including birth, death, dispersal, and intra- and inter-specific interactions, all subject to environmental filtering. Moreover, while interactions occur at the level of individual microorganisms, abundance patterns can only be measured on relatively large and possibly highly heterogeneous soil samples, as mentioned in the section on the spatial structure of soil. This is an additional confounding effect that can introduce many spurious associations and poses challenges unique to the study of soil ecosystems.
Given that associations lack a direct interpretation, merely descriptive studies can rarely bring new insights into microbial communities. Nevertheless, network analysis can be very useful to isolate possible interactions between species and to analyze species coexistence. In this context, tools from the field of complex systems can be central to formulating and testing hypotheses about how the structure of the microbial community is linked to its function \cite{Faust_2012,Röttjers2018}, identifying important/keystone species \cite{Banerjee2018}, and even predicting the stability of the system under environmental perturbations \cite{de2018,Shi2016}. To improve this analysis, we suggest a careful comparison with null models \cite{Connor_2017}, complemented with environmental information \cite{Goberna2019,Lima_Mendez_2015}, to help interpret the results and eliminate some indirect associations between species. In summary, network inference is a rapidly evolving field, and new alternatives are constantly proposed to solve outstanding issues. Nevertheless, we still lack a definitive framework for generating co-occurrence networks with a straightforward interpretation.
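As an illustration of what a null-model comparison can look like, the sketch below tests whether two taxa co-occur in more samples than expected when the occurrences of one taxon are shuffled across samples. This is one of the simplest possible null models and is not the specific procedure of the cited works; the function names and the presence/absence vectors are hypothetical.

```python
import random


def cooccurrence(a, b):
    """Number of samples in which both taxa are present."""
    return sum(1 for x, y in zip(a, b) if x and y)


def null_model_p_value(a, b, n_perm=999, seed=0):
    """Compare the observed co-occurrence with a permutation null.

    The null shuffles taxon b across samples, which keeps the number of
    samples occupied by each taxon but breaks any association with a.
    """
    rng = random.Random(seed)
    observed = cooccurrence(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cooccurrence(a, shuffled) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical presence (1) / absence (0) of two taxa in ten samples.
taxon_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
taxon_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
observed, p_value = null_model_p_value(taxon_a, taxon_b)
```

A small p-value only indicates that the pattern is unlikely under this particular null; shared habitat preferences can produce the same signal, which is why environmental information remains necessary for interpretation.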

6. Suggestions for more robust statistical analyses in sequencing studies

Data generated from amplicon sequencing are inherently compositional and provide relative abundances, which are generally independent of the total microbial load of the original sample. It has previously been shown that analyzing compositional datasets with standard statistical techniques (including Pearson correlations or t tests on proportions) can lead to very high (up to 100%) false positive discovery rates (56, 57). As a result, and because microbiome data comprise thousands of individual variables, almost any data set will show significant correlations with microbiome data. The ease of obtaining significant results may also lead to an “abuse” of statistical significance (also referred to as “p-hacking”). While exploratory analysis is useful, researchers should remember that an effect or association does not exist just because it is statistically significant and, even more importantly, that inference should be scientific and not merely statistical. In recent years, the discussion around the misuse of p-values and their importance has intensified (58–60), and some alternatives have been proposed (60), including the use of more stringent p-value thresholds for claims of new discoveries (61, 62). Clearly, the issue is more complicated than a simple critique of the p-value and involves scientific research at all levels, including the publish-or-perish culture ingrained in academia; we therefore refer the reader to the above-mentioned citations to explore this topic further.
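The effect of testing thousands of variables can be illustrated with a small simulation. The sketch below correlates one random "environmental" variable with 2000 random "taxa" that are, by construction, unrelated to it, and counts how many correlations pass an uncorrected threshold of 0.05. All names and numbers (20 samples, 2000 taxa, Gaussian noise) are illustrative assumptions, and the critical correlation is obtained from the Fisher z approximation rather than from an exact t test.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_chance_hits(n_samples=20, n_taxa=2000, alpha=0.05, seed=1):
    """Count 'significant' correlations among purely random variables."""
    rng = random.Random(seed)
    # Approximate two-sided critical |r| via the Fisher z-transform.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    r_crit = math.tanh(z_crit / math.sqrt(n_samples - 3))
    env = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = 0
    for _ in range(n_taxa):
        taxon = [rng.gauss(0, 1) for _ in range(n_samples)]
        if abs(pearson(env, taxon)) > r_crit:
            hits += 1
    return hits


hits = count_chance_hits()
```

With these settings, on the order of 5% of the taxa (about one hundred) are expected to appear "significant" even though no real association exists, before any compositional effect is considered.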
Nevertheless, the risk of drawing false conclusions from spurious correlations remains, and it is compounded by the variability inherent in amplicon sequencing data. When adopting a “let’s sequence and see” approach, many correlations (including false positives) will be generated. Given that exploratory research often leads to follow-up research, increasing our confidence in initial results will reduce the chance of building further research on unsubstantiated findings. Adopting a more stringent p-value threshold will reduce the false positive rate, at the cost of more type II errors. To adopt a more stringent threshold while maintaining statistical power, it has been shown that a 70% increase in sample size is required. We understand that this is often unrealistic, but we also recognize that it could save future efforts built on unsubstantiated research. Instead, current research often focuses on expanding the depth of analyses on the same few samples at the expense of replication.
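The order of magnitude of the required increase in sample size can be reproduced with a short calculation. The sketch below assumes a two-sided z test, a power of 0.8, and a change in threshold from 0.05 to 0.005; none of these settings are stated in the text above, so they should be read as illustrative assumptions. Under them, the required sample size scales with the square of the sum of the two normal quantiles.

```python
from statistics import NormalDist


def relative_sample_size(alpha_old, alpha_new, power=0.8):
    """Ratio of sample sizes needed to keep the same power for a
    two-sided z test when the threshold changes (assumed setting)."""
    quantile = NormalDist().inv_cdf

    def required(alpha):
        # n is proportional to (z_{1 - alpha/2} + z_{power})^2
        # for a fixed standardized effect size.
        return (quantile(1 - alpha / 2) + quantile(power)) ** 2

    return required(alpha_new) / required(alpha_old)


ratio = relative_sample_size(0.05, 0.005)
```

Under these assumptions the ratio is close to 1.7, in line with the 70% increase mentioned above; other tests, power levels or thresholds would give a different figure.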
Intraplot variability and the number of replicates used to analyze similarities/dissimilarities between microbial communities directly affect the ability to detect differences. To explore how increasing sample size can increase statistical power in soil microbiome analyses, we calculated how the statistical power of permutational multivariate analysis of variance (PERMANOVA) depends on effect size for different numbers of replicates. Although the data set chosen \cite{Zheng_2019} captures a wide range of possible microbial communities, it may not be representative of all possible soil environments. Therefore, we encourage the reader to interpret the data shown only as an example. We used the R package micropower \cite{Kelly_2015}, which simulates distance matrices from a set of parameters to estimate the available PERMANOVA power or the sample size needed for a planned microbiome analysis. We used data from both the 16S rRNA gene and the ITS1 region, filtered to include only bacteria and archaea (16S) and fungi (ITS). We calculated the Jaccard similarity index (Supplementary Fig. 1a,b) and used the average and standard deviation across all samples as parameters in the micropower package to simulate OTU/ASV tables with similar parameters. We also calculated the average statistical power for a range of effect sizes ($\omega^2$) for the 16S data, defined as 'Low' (0.001-0.04), 'Medium' (0.04-0.08) and 'High' (0.08-0.12). Our analysis indicates that for strong differences between microbial communities the number of replicates does not affect the statistical power. By contrast, by increasing the replicate number from 4 to 5 we were able to almost double the statistical power for small effect sizes ('Low') and to achieve a power above 0.8 for medium effect sizes (Figure 4a). These effects were more pronounced when the number of replicates was doubled (4 to 8; Figure 4b). Similar effects were obtained for the fungal data set (Supplementary Fig. 1c).
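For readers unfamiliar with the quantities involved, the following minimal sketch computes a PERMANOVA pseudo-F, an ANOVA-style $\omega^2$ effect size and a permutation p-value from Jaccard distances. It is not the micropower implementation and does not perform a power analysis; a power simulation would repeat such a test over many simulated data sets and report the fraction of significant results. The toy communities, the function names and the $\omega^2$ estimator used here are assumptions for illustration.

```python
import random


def jaccard_distance(a, b):
    """Jaccard distance between two sets of detected taxa."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def sums_of_squares(dist, groups):
    """Total and within-group sums of squares from a distance matrix."""
    n = len(groups)
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in set(groups):
        idx = [i for i, lab in enumerate(groups) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for k, i in enumerate(idx)
                         for j in idx[k + 1:]) / len(idx)
    return ss_total, ss_within


def pseudo_f(dist, groups):
    n, a = len(groups), len(set(groups))
    ss_t, ss_w = sums_of_squares(dist, groups)
    return ((ss_t - ss_w) / (a - 1)) / (ss_w / (n - a))


def permanova(dist, groups, n_perm=999, seed=0):
    """Return pseudo-F, omega squared and a permutation p-value."""
    n, a = len(groups), len(set(groups))
    ss_t, ss_w = sums_of_squares(dist, groups)
    ms_w = ss_w / (n - a)
    f_obs = ((ss_t - ss_w) / (a - 1)) / ms_w
    # ANOVA-style omega squared (assumed estimator, see lead-in).
    omega2 = ((ss_t - ss_w) - (a - 1) * ms_w) / (ss_t + ms_w)
    rng = random.Random(seed)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        # Small tolerance guards against floating-point ties.
        if pseudo_f(dist, labels) >= f_obs - 1e-12:
            hits += 1
    return f_obs, omega2, (hits + 1) / (n_perm + 1)


# Hypothetical data: two treatments, four replicates each.
samples = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 4, 5}, {2, 3, 4, 5},
           {6, 7, 8, 9}, {6, 7, 8, 10}, {6, 7, 9, 10}, {7, 8, 9, 10}]
groups = ["A"] * 4 + ["B"] * 4
dist = [[jaccard_distance(s, t) for t in samples] for s in samples]
f_obs, omega2, p_value = permanova(dist, groups)
```

The toy groups share no taxa, so the effect is far stronger than any of the 'Low' to 'High' ranges discussed above; the example is meant only to show how the statistic and the effect size are obtained from a distance matrix.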