SIP experiments are usually relatively complex, laborious and time-consuming, and can therefore fail for various reasons and at different stages. The design of a SIP experiment should thus be carefully considered in advance and cover all aspects and phases: preliminary knowledge of the environment and the targeted process, the nature and duration of the incubation, possible pitfalls, and the desired method of data analysis. Before deciding on a SIP experiment, it is important to gain some preliminary knowledge of the system in question and the microbial guild to be targeted. For SIP to be successful, sufficient substrate needs to be processed and assimilated by the microbes during the incubation period. Therefore, one of the first and most important preliminary tests is to measure the rate and dynamics of the process in question in order to estimate the length of incubation that is needed. Although the relationship between substrate consumption and level of labelling depends on the assimilation efficiency and the size of the active microbial guild, and is therefore difficult to establish, some insights and ballpark estimates can nevertheless be made. It is also advisable to measure the enrichment level of the total DNA or RNA extracted from the sample to assess whether detection of labelled microbes will be feasible \cite{Angel_2017,Manefield_2002,Blazewicz_2011}. Again, it is impossible to draw a general, direct relation between the level of enrichment of nucleic acids and the outcome of the SIP, because this depends on whether the label is concentrated within a small group of highly labelled microbes or shared amongst many members. A qualitative relationship can nevertheless easily be drawn for specific environments and microbial guilds.
Which bio-molecule to target
SIP was first designed to identify labelled microbes through the incorporation of a stable isotope into their DNA \cite{Radajewski_2000}. While this is still the most commonly used 'flavour' of SIP, other types quickly followed, since in essence nearly every stable bio-molecule in the cell can be used as a target for SIP. Targeting DNA is advantageous because DNA is the gold standard for taxonomic classification of organisms and for hypothesising about potential functions. It is also popular because DNA amplification and sequencing technologies are affordable and widespread in most molecular and microbiological labs. A protocol for targeting RNA instead of DNA in a SIP experiment \cite{Manefield_2002} followed soon after. RNA-SIP offers the same taxonomic resolution as DNA-SIP, but because RNA synthesis is uncoupled from cell replication it offers higher sensitivity, though at the cost of somewhat more laborious and delicate lab work. A further advantage of RNA-SIP is that, unlike DNA, RNA does not migrate in a density gradient according to its G+C content, so the potential for detecting false positives is theoretically lower (see Sections \ref{292144}, \ref{776350} and \ref{933876}, and Chapter 9 of this book). Targeting PLFA \cite{Boschker_1998} is another popular way of running SIP, one that even predates the use of DNA-SIP for detecting active microbes in the environment. Because it relies on an isotope-ratio mass spectrometer (IRMS), which is capable of a much finer mass separation than a density gradient, PLFA-SIP offers significantly higher sensitivity than DNA- or RNA-SIP. This can be important when studying organisms with very low specific activity, such as deep subsurface microorganisms \cite{Pelz_2001} or bacteria that oxidise atmospheric methane \cite{Knief_2003}.
However, in addition to excluding the use of 15N-labelled substrates, PLFA-SIP inherently offers a much more limited capacity for taxonomic affiliation of microbes than DNA or RNA and can only differentiate between groups at a broad level \cite{Evershed_2006}. Targeting proteins and metabolites is also an option (e.g. \citealt{Baran_2017,Jehmlich_2008}), which provides direct and unquestionable proof that a labelled substrate was processed. However, these methods are very laborious and low-throughput, and require significant in-house experience in sample processing and in analysis of the output data. Lastly, identification of isotopically labelled microbes at the single-cell level, using tools such as NanoSIMS \cite{Li_2008} and SIP-Raman microspectroscopy \cite{Wang_2016}, has also been gaining interest. Their application is still limited, however, because they are costly, low-throughput and rely on equipment that is found in only a handful of labs around the world.
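The G+C effect on DNA migration mentioned above can be illustrated with a rough calculation. The sketch below is not taken from this chapter: it assumes the linear relation between G+C content and buoyant density in CsCl that is commonly quoted in the wider literature (an intercept of 1.660 g/ml and a slope of 0.098 g/ml), approximate maximal shifts of 0.036 g/ml for fully 13C-labelled DNA and 0.016 g/ml for fully 15N-labelled DNA, and a shift that scales linearly with enrichment. All names are hypothetical.

```python
# Assumed literature values (not from this chapter): a linear relation between
# G+C content and DNA buoyant density (BD) in CsCl, and approximate maximal BD
# shifts for fully labelled DNA.
BD_INTERCEPT = 1.660  # g/ml, unlabelled DNA at a G+C fraction of 0
BD_SLOPE = 0.098      # g/ml per unit G+C fraction
MAX_SHIFT = {"13C": 0.036, "15N": 0.016}  # g/ml at full labelling (approximate)


def dna_buoyant_density(gc_fraction, enrichment=0.0, isotope="13C"):
    """Approximate BD (g/ml) of DNA for a given G+C fraction and label level.

    The isotope shift is assumed to scale linearly with the atom fraction
    of the heavy isotope in the DNA.
    """
    if not 0.0 <= gc_fraction <= 1.0 or not 0.0 <= enrichment <= 1.0:
        raise ValueError("gc_fraction and enrichment must lie in [0, 1]")
    return BD_INTERCEPT + BD_SLOPE * gc_fraction + MAX_SHIFT[isotope] * enrichment


# An unlabelled high-G+C genome versus a half-labelled low-G+C genome.
unlabelled_high_gc = dna_buoyant_density(0.70)
labelled_low_gc = dna_buoyant_density(0.35, enrichment=0.5, isotope="13C")
print(f"unlabelled, G+C 0.70:     {unlabelled_high_gc:.4f} g/ml")
print(f"half 13C-labelled, G+C 0.35: {labelled_low_gc:.4f} g/ml")
```

Under these assumptions the unlabelled high-G+C DNA sits at a higher density than the partially labelled low-G+C DNA, which is why DNA from a heavy fraction cannot be taken as labelled without comparison to a control.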
Duration of incubation
As mentioned, the required incubation length depends, on the one hand, on the rate at which the process in question proceeds and on its specific assimilation efficiency. Incubation in the presence of the labelled substrate should allow enough time for the nucleic acids to become sufficiently labelled to be detected above the background. For very fast processes such as water uptake, incubation time can be as short as a few hours \cite{Blazewicz_2014,Angel_2013}, while for very slow processes, such as nitrogen fixation, incubation can last several days to weeks \cite{Angel_2017,Buckley_2007,Pepe_Ranney_2015}. Incubation time should also differ depending on whether DNA or RNA is targeted. Labelling of RNA can be detected earlier because it does not require cell replication and because RNA synthesis, unlike DNA replication, is not semi-conservative (although this does not preclude a significant dilution of newly synthesised RNA with the light isotope as a result of recycling of building blocks within the cell). In general, it is assumed that DNA or RNA molecules should be labelled to at least 30 atom \% to differentiate them from unlabelled molecules in a BD gradient \cite{Buckley_2007a,Cadisch_2005}. On the other hand, long incubation times bear the risk of labelling community members that do not perform the metabolic activity in question but were labelled through cross-feeding. Because microbes are interlinked through a network of trophic interactions, any labelled element will eventually be spread amongst many members of the community, regardless of how specific the process in question is. Cross-feeding in isotope-labelling experiments has been acknowledged from the start and has been shown for nitrogen as well as carbon (e.g., \citealt{McDonald_2005,Adam_2015}).
Although typically considered an unwanted side effect in SIP experiments, cross-feeding has also been exploited many times to study substrate flow patterns and microbial interactions on a temporal scale \cite{DeRito_2005,Pepe_Ranney_2016}. Since cross-feeding in a microbial community cannot simply be brought to a halt, the typical way of dealing with this issue is to sample at several time points, limit the incubation time to the minimum necessary for labelling, and combine complementary lines of evidence when concluding that a specific taxon indeed performs the metabolism in question.
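A ballpark estimate of the minimum incubation time can be sketched from a measured process rate. The example below is only a simple isotope-dilution model under strong assumptions that are not from this chapter: the guild's pool of the element is fixed at the start, all assimilated substrate stays in that pool, the uptake rate and assimilation efficiency are constant, and turnover and cross-feeding are ignored. The function names, parameters and input values are hypothetical, and the default natural abundance of 0.011 applies to 13C.

```python
def enrichment_after(days, guild_pool, uptake_rate, efficiency,
                     substrate_enrichment=0.99, natural_abundance=0.011):
    """Atom fraction of heavy isotope in the guild pool after `days`.

    guild_pool  -- initial amount of the element in the guild (e.g. umol C)
    uptake_rate -- substrate consumed per day, in the same units
    efficiency  -- fraction of consumed substrate that is assimilated
    """
    assimilated = uptake_rate * efficiency * days
    heavy = guild_pool * natural_abundance + assimilated * substrate_enrichment
    return heavy / (guild_pool + assimilated)


def days_to_reach(target, guild_pool, uptake_rate, efficiency,
                  substrate_enrichment=0.99, natural_abundance=0.011):
    """Incubation time (days) needed to reach the `target` atom fraction."""
    if not natural_abundance < target < substrate_enrichment:
        raise ValueError("target must lie between natural abundance "
                         "and substrate enrichment")
    # Solve target = (pool * nat + x * sub) / (pool + x) for x.
    needed = guild_pool * (target - natural_abundance) / (substrate_enrichment - target)
    return needed / (uptake_rate * efficiency)


# Hypothetical inputs: 100 umol guild C, 10 umol C consumed per day,
# 40 % assimilation efficiency, and a target of 30 atom %.
t = days_to_reach(0.30, guild_pool=100, uptake_rate=10, efficiency=0.4)
print(f"estimated minimum incubation: {t:.1f} days")
```

Because the inputs are rarely known precisely, such a figure is best treated as a starting point for choosing sampling time points rather than as a prediction.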
Substrate enrichment level and concentration
Substrates used in SIP experiments are in almost all cases "fully" labelled, i.e., all positions are enriched with the labelled isotope to the highest level possible ($>$97 atom \%). This, of course, stems from the need to achieve high levels of labelling in nucleic acids to detect labelled microbes. However, labelling of carbon only at specific positions could also be employed, for example, to study microbial guilds that attack the substrate at a specific position of interest, while excluding others. The substrate concentration can also affect the rate and strength of labelling. However, presenting a sample with unrealistically high concentrations can lead to undesired consequences such as drastic community changes or a rapid enrichment of a fast-growing sub-population with low substrate affinity. Therefore, it is best to remain within the range (typically on the higher end) of substrate concentrations that are expected to be found in the environment.
Amount of nucleic acids to load
Typical DNA-SIP gradients are prepared with 0.5--5 µg of DNA, but there does not seem to be a hard limit on the amount of DNA that can be loaded on a gradient. For PCR purposes, this amount should be more than enough to target the rRNA gene or any functional gene. For metagenomic or metatranscriptomic sequencing of the fractions, larger amounts of template will be needed. This can be achieved either by pooling fractions from several different gradients or by multiple displacement amplification (e.g., \citealt{Chen_2008}). In RNA-SIP gradients, overloading with RNA will cause aggregation that prevents efficient separation. The typical recommended amount is around 500 ng for a 5.5 ml gradient \cite{Lueders_2003}. However, this issue has never been studied systematically.
Number of fractions to collect, and sequencing depth
Regardless of which method is used for analysing the data, success in a SIP experiment is determined by the ability to detect microbial phylotypes that are present in the denser fractions of a labelled gradient and are either absent from, or less abundant in, the lighter fractions of the same gradient or the denser fractions of a control gradient. The detection limit in SIP experiments is itself not a fixed value but depends on the sequencing depth, the number of fractions collected from each gradient, and the method used to analyse the data (see Section \ref{316470}). Using state-of-the-art sequencing technologies, it is now easy to obtain thousands of sequences per fraction. However, this comes at a cost that might not be necessary. It is therefore advisable, if possible, to first obtain an estimate of the size of the microbial guild in question in relation to the total microbial population, using for example qPCR with primers targeting a functional gene, or fluorescence microscopy. The smaller the size of the target community, the harder it will be to detect its labelling above the detection limit. Naturally, this will almost inevitably be an overestimate, since only part of the population will be active during the experiment and will eventually incorporate the substrate, but it will at least give a minimum threshold for the sequencing depth needed. The number of fractions collected can also affect the detection limit. While a higher number of fractions will most likely increase the sensitivity, it also entails higher sample processing efforts and costs. In addition, more fractions also mean less template per fraction, and thus greater difficulty in amplifying the target and a higher chance of contamination with foreign nucleic acids from the environment.
Typically 12--20 fractions are collected, of which about 10--16 end up being analysed because the lightest and heaviest fractions contain little to no nucleic acids.
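The link between guild size and sequencing depth can be made concrete with a simple sampling calculation. The sketch below assumes that reads are drawn independently from a fraction (binomial sampling) and that a phylotype counts as detected once a chosen minimum number of reads is observed. It ignores PCR and extraction biases, and all names and example values are hypothetical.

```python
from math import comb


def detection_probability(rel_abundance, depth, min_reads=1):
    """Probability of seeing at least `min_reads` reads of a phylotype.

    Assumes binomial sampling of `depth` reads (depth >= min_reads) from a
    fraction in which the phylotype has relative abundance `rel_abundance`.
    """
    p_fewer = sum(
        comb(depth, i) * rel_abundance**i * (1 - rel_abundance) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


def minimum_depth(rel_abundance, min_reads=1, confidence=0.95):
    """Smallest read depth at which detection reaches `confidence`."""
    if not 0 < rel_abundance < 1 or not 0 < confidence < 1 or min_reads < 1:
        raise ValueError("need 0 < rel_abundance < 1, 0 < confidence < 1, "
                         "min_reads >= 1")
    low = high = min_reads
    # Double the depth until the target confidence is reached ...
    while detection_probability(rel_abundance, high, min_reads) < confidence:
        low, high = high, high * 2
    # ... then bisect, since detection probability only rises with depth.
    while low < high:
        mid = (low + high) // 2
        if detection_probability(rel_abundance, mid, min_reads) >= confidence:
            high = mid
        else:
            low = mid + 1
    return low


# Hypothetical guild making up 0.1 % of the reads in a heavy fraction.
print(minimum_depth(0.001, min_reads=1))
print(minimum_depth(0.001, min_reads=10))
```

Since only part of a guild is usually active and labelled, the relative abundance entered here is best treated as an upper bound, and the resulting depth as a minimum.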
Unlabelled controls
As in any lab experiment, appropriate no-label controls should be set up in parallel to minimise the detection of false positives. Many of the older published works included only one or two controls, usually at the last time point or at the highest amendment level. Recently, however, particularly with the growing use of high-throughput sequencing and statistical models to detect labelled OTUs, the need to include more no-label controls in order to correctly detect labelled phylotypes has grown, while at the same time doing so has become easier. The exact number and type of no-label controls will depend on the statistical method used to analyse the data, but also on the type of SIP being performed, since DNA-SIP is more prone to detecting false positives than RNA-SIP because of the effect of G+C content on DNA BD (see Section \ref{316470}). Ideally, every labelled sample will have its parallel no-label control. However, this is very laborious and costly, and might not be needed. Since RNA-SIP does not suffer from the bias caused by G+C-based migration as DNA-SIP does, it is possible to compare fractions within a gradient, rather than between gradients, and thus reduce the number of controls (see Section \ref{316470}). Similarly, methods that aim only to identify labelling of a phylotype (e.g., differential abundance) but not necessarily to quantify it (e.g., qSIP) remain robust even when some controls are omitted (see Section \ref{316470} and Chapter 11).
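The role of the no-label control can be illustrated with a deliberately simplified comparison of heavy fractions. The sketch below computes, for each phylotype, the log2 ratio of its relative abundance in the heavy fraction of a labelled gradient to that in the matching fraction of a control gradient. It is not one of the statistical methods referred to above, which rely on replicated gradients and formal models. The pseudocount, function name and read counts are all hypothetical.

```python
from math import log2


def heavy_fraction_enrichment(labelled_counts, control_counts, pseudocount=0.5):
    """log2 ratio of relative abundance, labelled vs. control heavy fraction.

    Both arguments map phylotype names to read counts. A pseudocount is
    added to every phylotype so that zero counts do not break the ratio.
    """
    taxa = sorted(set(labelled_counts) | set(control_counts))
    lab_total = sum(labelled_counts.values()) + pseudocount * len(taxa)
    ctl_total = sum(control_counts.values()) + pseudocount * len(taxa)
    scores = {}
    for taxon in taxa:
        lab = (labelled_counts.get(taxon, 0) + pseudocount) / lab_total
        ctl = (control_counts.get(taxon, 0) + pseudocount) / ctl_total
        scores[taxon] = log2(lab / ctl)
    return scores


# Hypothetical read counts from matching heavy fractions.
labelled = {"OTU_1": 400, "OTU_2": 50, "OTU_3": 550}
control = {"OTU_1": 20, "OTU_2": 60, "OTU_3": 920}
example_scores = heavy_fraction_enrichment(labelled, control)
for taxon, score in example_scores.items():
    print(f"{taxon}: {score:+.2f}")
```

A phylotype that is abundant in the heavy fraction of both gradients, for example because of a high G+C content, scores near zero here, whereas one that is enriched only in the labelled gradient scores clearly positive.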
Type of rotor
Traditionally, a vertical rotor was preferred over a fixed-angle one for SIP experiments because it provides a shallower gradient and therefore a higher degree of separation between densities. Recent modelling work suggests, however, that this comes at the cost of greater diffusion of nucleic acids throughout the gradient, and thus a higher background \cite{Youngblut_2018}. Both rotor types have been used successfully for 15N-SIP, but to date no experimental comparison has been published.