5.4. Interpreting co-occurrence data and networks

Challenges associated with amplicon sequencing analysis and interpretation also complicate the use of co-occurrence network analysis for soil samples. Generally, co-occurrence analysis generates networks with biological species as nodes and edges representing associations between them. Network construction is based on the detection of significant correlations between taxa, and can be used to investigate properties of microbial communities including organismal co-existence (e.g., \citealt{Barber_n_2011}), identification of keystone species (e.g., \citealt{Banerjee2018}) and the stability of community structure (e.g., \citealt{de2018,Shi2016}). There has been a recent upsurge in the number of studies that construct association networks for soil microbial communities. However, many of these studies have been criticized for their highly descriptive use of networks, which does not allow for an ecological interpretation of detected patterns.
The difficulty in interpretation stems from inferring causal relationships between taxa based on correlations, which is a long-standing topic of discussion in ecology \citep{Blanchet_2020,Barner_2018}. Particularly for soil, it is important to keep in mind that the data contained in each environmental sample are only a snapshot of complex spatio-temporal dynamics (see sections 5.1 and 5.2). As interactions occur at the level of individual microorganisms, inferring interactions among microorganisms in soil is facilitated if samples are taken at the microscale or aggregate scale, rather than at the bulk or horizon scale (see Fig. 4). Regardless of scale, any sequencing data from soil capture a noisy signal that reflects several biological processes, including reproduction, death, dispersal, environmental filtering, and intra- and inter-specific interactions. The heterogeneity (and resulting sparsity) of amplicon datasets represents an additional confounding effect that may introduce spurious associations, posing additional challenges unique to the study of soil ecosystems.
For microbiome data, associations are most often assigned through the detection of significant correlations between relative abundances, where spurious links can be detected if compositional data are not appropriately handled (as explained in section 4). Several popular network construction tools, including SparCC (log ratios) and SPIEC-EASI (clr), apply log-ratio transformations to address compositionality in the process of network construction \citep{Kurtz2015,Friedman2012}. Another option is to convert relative abundances into absolute values by using the total gene copy numbers obtained from qPCR (see section 4). To improve this analysis, we suggest a careful comparison of the data with null models to help interpret the results and eliminate some indirect associations between species \citep{Connor_2017}. Additionally, the use of complementary environmental measurement data can improve ecological insights from networks \citep{Goberna2019,Lima_Mendez_2015}. We also recommend performing follow-up experiments to test the potential interactions inferred through network analysis. In summary, the field of network inference is rapidly evolving and alternatives are emerging to address outstanding issues. Nevertheless, we still lack a definitive framework that allows for a straightforward interpretation of generated co-occurrence networks.
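
For reference, the compositional corrections mentioned in this section can be written out explicitly. The notation here ($D$, $x_i$, $g(\mathbf{x})$, $t_{ij}$, $N$, $a_i$) is introduced for illustration only and is not taken from the cited tools; the expressions are a sketch of the general idea rather than a description of any specific implementation. For a sample with $D$ taxa and strictly positive relative abundances $\mathbf{x} = (x_1, \dots, x_D)$ (zeros must first be replaced, e.g. by a pseudocount), the centered log-ratio (clr) transformation is
\[
  \mathrm{clr}(\mathbf{x})_i = \ln \frac{x_i}{g(\mathbf{x})},
  \qquad
  g(\mathbf{x}) = \Bigl( \prod_{j=1}^{D} x_j \Bigr)^{1/D},
\]
so that the transformed values of each sample sum to zero. Pairwise log-ratio approaches instead start from the variance of the log ratio of two taxa across samples,
\[
  t_{ij} = \mathrm{Var}\!\left[ \ln \frac{x_i}{x_j} \right],
\]
which is small when taxa $i$ and $j$ vary proportionally and does not depend on the abundances of the remaining taxa. Finally, if $N$ denotes the total gene copy number of a sample estimated by qPCR, a simple estimate of the absolute abundance of taxon $i$ is
\[
  a_i = x_i \, N ,
\]
under the assumption that the qPCR assay and the amplicon library target the same gene and are subject to comparable amplification biases.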