This figure (Figure 2) suggests that as the concentration of fine particles in the air increases, morbidity (illness) due to ischaemic heart disease (IHD) in Beijing also increases. The numbers for this graph were generated from an ecological study in which aggregated measures of fine particulate matter and IHD-related hospitalisations were analysed. Based on these data, suppose someone already has ischaemic heart disease. Is that person more or less likely to be hospitalised due to IHD on a day when particulate matter levels are very high? We cannot say with certainty, because findings from studies based on aggregated data cannot be extrapolated to individuals. Making that extrapolation is known as the ecological fallacy.
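The ecological fallacy can be made concrete with a small sketch. The data below are entirely invented for illustration (they are not from the Beijing study), and the district names and the `pearson` helper are assumptions. Across districts, average exposure and average outcome are perfectly positively correlated, yet within every district the individuals with higher exposure have lower outcome scores.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical individual-level data: (exposure, outcome score) per person,
# grouped by district. All values are invented for illustration.
districts = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Aggregate (ecological) view: one mean exposure and one mean outcome per district.
agg_x = [mean(x for x, _ in people) for people in districts.values()]
agg_y = [mean(y for _, y in people) for people in districts.values()]
ecological_r = pearson(agg_x, agg_y)

# Individual view: correlation among the people within each district.
within_r = {
    name: pearson([x for x, _ in people], [y for _, y in people])
    for name, people in districts.items()
}

print(f"district-level r = {ecological_r:.2f}")
for name, r in within_r.items():
    print(f"within district {name}: r = {r:.2f}")
```

In this toy data set the district-level correlation is +1 while every within-district correlation is -1, which is why an aggregate association says nothing certain about any one individual.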
So spurious correlations and the ecological fallacy both point to the fact that not all correlations are causal and that, for causal linkages, the associations must be valid for individual cases. A valid association must account for three situations:
- The association that we observe must rule out the play of chance. -- We have seen how we can start with theories and set up hypotheses based on the theory and then use the hypotheses to identify the alpha and beta errors so that we can deal with adequate sample size and power estimations that will allow us to rule out the play of chance.
- The association must eliminate all possible biases. -- Biases are systematic errors in how the comparison groups in an association study are selected or measured. For example, suppose we plan to study the association between cigarette smoking and lung cancer by comparing the prevalence of smoking among people with lung cancer and people without it, and we decide to measure the extent of smoking with a questionnaire. A setting like this is open to many different biases. One depends on where we source our participants. If we select lung cancer patients from cancer wards and non-cancer participants from, say, a gymnasium attended by relatively fit, younger, and health-conscious people, we introduce a selection bias: by selecting the groups to be compared in different ways, the investigator has built a bias into the study. To avoid selection bias, the people we select for the comparison groups must be as similar as possible. One rule of thumb for a case control study is that the controls should be so similar to the cases that, if a control were to develop the disease of interest, he or she would have been included as a case. So, for the smoking-lung cancer study, we could select our controls from another inpatient ward in the same hospital, among people with similar age and social profiles. Because we have not used an objective measurement of exposure, other forms of bias would still remain. For example, people with and without cancer may report their extent of smoking differently. Perhaps smokers with cancer remember their smoking histories better than non-cancer patients do. This form of bias is referred to as "response bias" (differential recall of this kind is also called recall bias).
Biases must be eliminated at the stage of planning the study, as following data collection, the biases may have already occurred, and one can do little about elimination of biases after data collection.
- The association, to be valid, must control for confounding variables. -- A confounding variable is a third variable that (1) is related BOTH to the exposure (or intervention) AND to the outcome, and (2) does not lie on the causal pathway between the exposure and the outcome. For example, imagine we are studying the association between smoking and the risk of heart disease. A review by Karen Matthews and colleagues suggests that premenopausal women have a lower risk of heart disease than men \cite{matthews2009changes}; and as Ockene (1993) suggests, the prevalence of smoking among women is lower than that among men \cite{ockene1993smoking}. Gender is thus related to both the exposure and the outcome, but it does not lie on the causal pathway that may connect smoking with heart disease. Therefore, gender is a confounding variable and must be treated as one in any research connecting smoking with heart disease. You can control for confounding variables at the planning stage of your research, at the stage of data collection, and during data analysis.
In the planning stage, before data are collected, and depending on the type of study, you can (1) randomly allocate your participants to the intervention and control groups (in intervention research such as a randomised controlled trial); (2) restrict your participants to one specific level of the confounding variable (for example, if you were to study the association between smoking and heart disease, you could work only with men); or (3) match the participants on the confounding variable (for example, you could include women in a 1:1 ratio across the two comparison groups of the smoking study). At the stage of data analysis, after the study is completed, you can (4) stratify the participants according to the categories of the confounding variable and then pool the stratum-specific results in the final analysis, or (5) conduct a multivariable analysis in which the confounding variables are entered as variables in the statistical model.
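Stratifying on a confounder and then pooling can be sketched as follows. The Mantel-Haenszel pooled odds ratio used here is one common pooling method and is an assumption of this sketch (the text does not name a method); the counts are invented so that gender is related to both smoking and heart disease while smoking has no effect within either stratum.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b = exposed with / without the outcome
    c, d = unexposed with / without the outcome
    """
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Pool stratum-specific 2x2 tables into one Mantel-Haenszel odds ratio."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Hypothetical counts (a, b, c, d) per stratum of the confounder.
strata = {
    "men": (40, 60, 20, 30),
    "women": (5, 45, 15, 135),
}

# Crude table: ignore the confounder and add the strata together.
crude_table = tuple(sum(table[i] for table in strata.values()) for i in range(4))

crude_or = odds_ratio(*crude_table)
stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}
pooled_or = mantel_haenszel_or(list(strata.values()))

print(f"crude OR = {crude_or:.2f}")
print(f"stratum ORs = {stratum_ors}")
print(f"Mantel-Haenszel pooled OR = {pooled_or:.2f}")
```

With these invented counts the crude odds ratio is about 2.0, while the odds ratio within each stratum and the pooled estimate are both 1.0: the crude association is produced entirely by the confounder.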
Thus, only after you have ruled out the play of chance by framing the hypotheses, conducting a sample size estimation and power analysis, after eliminating all biases, and after controlling for the effects of confounding variables, if you find that an association persists that is both substantively (that is theoretically acceptable) and statistically significant, then you can claim that a true and independent association exists between the exposure/intervention and the outcome variable. But it still does not answer the question whether such an association is causal or non-causal.
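The sample size estimation mentioned above can be sketched for the simple case of comparing two proportions. The formula used here (normal approximation, unpooled variance, two-sided alpha) is one common textbook choice and an assumption of this sketch, as are the function name `n_per_group` and the example proportions.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect p1 versus p2.

    alpha is the two-sided type I error rate; power is 1 minus beta
    (the type II error rate).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical scenario: outcome risk of 30% in the exposed, 20% in the non-exposed.
n_80 = n_per_group(0.30, 0.20, alpha=0.05, power=0.80)
n_90 = n_per_group(0.30, 0.20, alpha=0.05, power=0.90)
print(f"per group at 80% power: {n_80}")
print(f"per group at 90% power: {n_90}")
```

Asking for more power, a smaller alpha, or a smaller difference between the two proportions all increase the required sample size.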
Not all validated associations are causal in nature: as Rothman and Greenland (2005) have argued, most causal models are multifactorial \cite{rothman2005causation}. This means that a disease or an outcome has more than one cause and that the causes interact with each other. When we add together the fractions of an outcome that we attribute to each specific cause, the sum can be lower or higher than 100%, because the different causes interact. Some causes are necessary causes: in their absence, the disease will not occur. For example, for tuberculosis to occur, infection with the tubercle bacillus (Mycobacterium tuberculosis) is necessary; without the bacilli, the signs and symptoms of tuberculosis will not manifest. Even then, many people who have been infected with TB bacilli in the past do not manifest the disease unless they have compromised immunity or other risk factors that enable tuberculosis to develop; see, for example, Faustini's systematic review of the risk factors for tuberculosis \cite{faustini2006risk}. This suggests that a causal variable can be necessary but is rarely sufficient by itself to cause disease. On the other hand, when the necessary causes "team up" with other variables that contribute to the emergence of the disease, a "sufficient causal model" emerges. You can create several sufficient causal models of disease conditions, referred to as "causal pie" models \cite{vineis2006causal}.
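The point that attributable fractions can add up to more than 100% can be illustrated with a minimal sketch of a single causal pie. Everything here is hypothetical: the component causes A and B, the population counts, and the helper names are invented for illustration. Because the disease needs both components, removing either one would have prevented every case.

```python
# Hypothetical causal pie: the disease occurs only when BOTH A and B are present.
SUFFICIENT_CAUSES = [frozenset({"A", "B"})]

def diseased(factors):
    """True if the person's factors complete at least one sufficient cause."""
    return any(pie <= factors for pie in SUFFICIENT_CAUSES)

# Invented population: 60 people with both factors, 20 with A only, 20 with B only.
population = (
    [frozenset({"A", "B"})] * 60
    + [frozenset({"A"})] * 20
    + [frozenset({"B"})] * 20
)

def attributable_fraction(component):
    """Share of cases that would not have occurred had `component` been absent."""
    cases = [person for person in population if diseased(person)]
    prevented = [person for person in cases if not diseased(person - {component})]
    return len(prevented) / len(cases)

af_a = attributable_fraction("A")
af_b = attributable_fraction("B")
print(f"attributable to A: {af_a:.0%}, to B: {af_b:.0%}, sum: {af_a + af_b:.0%}")
```

Each component accounts for all of the cases, so the two fractions sum to 200%. The fractions describe interacting causes and are not shares of a fixed total.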
So in the context of health sciences, which factor may be considered a cause of an outcome is often not clear, and making that call is fraught with subjective limitations. In 1965, at a conference of occupational hygienists in London, Sir Austin Bradford Hill (1965) proposed a set of nine criteria \cite{bradford1965environment}. Although Hill meant these not as "criteria" but as considerations, they have come to be known as "Hill's Criteria". These considerations help us to assess the nature of an association between an exposure and a disease outcome. These include:
- Strength of Association. -- If an exposure or an intervention is associated with an outcome, how strong is that association? By strength of association we mean the magnitude of the odds ratio, of the relative risk (the risk of the outcome among those who were exposed or received the intervention, divided by the risk of the outcome among those who were not exposed or did not receive the intervention), or of the absolute risk difference (the risk of the outcome in the exposed minus the risk in the non-exposed). Hill discussed strength in the context of smoking and lung cancer or heart disease, where the relative risks were of the order of 10 or higher. The reason to use strength as a measure for judging whether an association is one of cause and effect is that the stronger the association between an exposure and an outcome, the tighter the link: to explain away so strong an association, you would need to think of another highly prevalent exposure or an even stronger factor. Also, a strong association indicates that a substantial part of the outcome can be attributed to the exposure.
- Consistency of association. -- If we study the linkage in different populations and under different circumstances, do we get similar results? If we see a very similar pattern of association in different population groups, that consistency suggests that the association is not spurious but substantive.
- Specificity of association. -- This means that if there are specific contexts in which the association becomes prominent, there could be a cause and effect linkage. For example, consider a workplace where some workers are exposed to high concentrations of environmental tobacco smoke (ETS) while other areas of the same workplace are relatively free of ETS. If, after studying these groups of employees, we see that those who worked in the ETS areas were more likely to suffer from chronic lung disease, this raises a suspicion that ETS is a causal factor for chronic lung disease.
- Temporality of association. -- If X is a cause of Y, then it makes sense to think that in the chain of causation, X has to happen earlier in time than Y. If that is not the case, then it is hard to justify X as a causal variable. Hill thought that this was like "putting the horse before the cart", and Ken Rothman (2005) contends that this is the strongest criterion for cause and effect estimation \cite{Rothman2005}.
- Biological Gradient of the association. -- We know biological gradient as dose-response assessment. This means that as the dose of the exposure increases, or as the intensity or frequency of exposure increases, there will be a corresponding change in the outcome. We can argue that this may not hold true in all cases; particularly for exposures that have a ceiling effect (that is, where the effect of the exposure reaches a plateau), you may not see a corresponding rise in the outcome. Read the work by Phillips et al. (2006) for a comprehensive review of the causal criteria \cite{phillips2006causal}.
- Plausibility as a criterion. -- In his lecture Hill mentioned that it would be "helpful" if the cause would biologically account for the outcome. As we know from past cases, this is not always possible. Quite often we may not know in advance the biological basis of an association, and nevertheless there can be a causal linkage. Think of John Snow's investigation of the London cholera outbreak. While he blamed the water supply of a specific company, at the time of his investigation no one knew about Vibrio cholerae and its role in the London outbreak. See, for example, Vandenbroucke's account of John Snow's cholera outbreak investigation \cite{vandenbroucke1991made}.
- Coherence as a causal criterion. -- This asks whether there are other instances where similar associations are found. This follows Mill's canons of induction; read Ducheyne's discussion on the matter \cite{ducheyne2008js}. If we see similar associations, albeit in other domains, that supports the idea that we may be observing the action of a causal agent. For example, knowing that inorganic arsenic was a risk factor for bladder cancer, Smith et al. (1998) proposed that exposure to inorganic arsenic in drinking water could also be a risk factor for lung cancer \cite{smith1998marked}.
- Experiment as a causal criterion. -- If we believe that X is a cause of Y, can we set up an experiment in which we introduce or control X and see a corresponding reduction or change in the status of Y? Experimental validation is not always possible in the health sciences, but in clinical contexts we can think of randomised trials in which one condition is deliberately controlled to test whether the association is one of cause and effect.
- Analogy as a causal criterion. -- Do analogous exposures, or the same exposure in other organisms or species, produce similar outcomes with respect to disease causation?
None of these are hard and fast rules for setting up a cause and effect assessment. Each of these nine criteria, except perhaps temporality, is open to refutation and criticism. Nevertheless, they give us guidance for assessing whether some factor is associated with something else in the manner of cause and effect. The assessment is judgemental and qualitative, but the criteria provide us with ways to think about the association. You will indeed find that in many articles and presentations, authors/researchers refer to these criteria when they assess the cause and effect nature of associations. Indeed, in 'reasoning by abduction', where our goal is to assess "why we get to see the patterns", explanatory models, theories, hypotheses, and discussions that differentiate cause and effect associations from other types of associations are beneficial for the assessment of the literature.
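The effect measures named under "Strength of Association" above can be computed from a simple exposed/non-exposed table. This is a minimal sketch: the function name, the counts, and the inclusion of the attributable fraction among the exposed (the share of the risk in the exposed that is attributed to the exposure) are assumptions made for illustration.

```python
def strength_of_association(exposed_events, exposed_total,
                            unexposed_events, unexposed_total):
    """Return relative and absolute measures of association from cohort counts."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    risk_difference = risk_exposed - risk_unexposed
    return {
        "relative_risk": risk_exposed / risk_unexposed,
        "risk_difference": risk_difference,
        "attributable_fraction_exposed": risk_difference / risk_exposed,
    }

# Hypothetical counts: 30 events among 1000 exposed, 10 among 1000 non-exposed.
measures = strength_of_association(30, 1000, 10, 1000)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Here the relative risk is about 3, the risk difference about 0.02, and roughly two thirds of the risk among the exposed is attributed to the exposure. The relative and absolute measures answer different questions, so it helps to report both.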
So this brings us to the beginning of the final topic on our journey to research methods in health: that of what study designs shall we adopt for our study questions. Also, when we assess studies and when we propose our own studies, what study designs might be appropriate and feasible. Let's take a look at the different study designs we can use and their pros, cons, and indications. We will discuss basic features of each study design, where they can be used, what are their advantages and what are their pitfalls that we must be aware of as we either assess in studies where they appear or decide to use them in the context of our own research.
Study designs in Health
It helps to think in terms of either intervention research or observational research. Alvan Feinstein (1997) has argued that this dichotomy ("observational" versus "interventional") is a useless construct because everything is observational; for instance, do we not conduct observations even in the context of experiments or randomised trials? Here we will therefore use the distinction only in the spirit of separating the notions of "what we do", not in the spirit of a container for two different kinds of studies \cite{feinstein1997problems}.
Study designs that are neither strictly interventional studies nor observational but something else
In this class of studies we put systematic reviews, meta analyses, secondary data analyses, and ecological study designs such as time series or spatial studies. These are as follows:
Systematic reviews. -- These refer to a process of (1) framing an answerable question; (2) constructing a search algorithm to identify relevant studies; (3) selecting and rejecting studies based on criteria that the authors/researchers set up before the beginning of the study; (4) abstracting information from individual studies using a plan on a spreadsheet or tables; (5) assessing the methodological quality of the studies; and (6) summarising the results of the individual studies into a series of answers to the questions that the authors set out to study. Systematic reviews are best used when you are starting out, or when some investigations have already been done by others and you want to summarise "what is out there". It is always a good idea to systematically summarise the key messages, and this is a starting point for most research. However, a systematic review (SR) is only as good as its constituent studies. When you conduct SRs, you are also limited by publication bias: studies that are published tend to be those with positive results, so a review based only on published studies is biased. For a robust systematic review, you will need to identify other studies and fugitive literature (studies that are not published). See the Campbell Collaboration website (URL: https://campbellcollaboration.org/) to learn more about systematic reviews \cite{collaboration}.
Meta analyses. -- This class of studies is virtually identical to systematic reviews and follows the same steps up to step 5. The summary of results in a meta analysis then follows statistical procedures in which the studies are first assessed for whether they are "homogeneous or heterogeneous"; these terms describe whether the studies are similar to one another in their populations or their results. Depending on that assessment, the summarised numerical results can be presented in a number of different ways. The results of individual studies are combined statistically to arrive at a summary estimate. Like systematic reviews, these studies are open to publication bias. Meta analyses are typically conducted when you have data from randomised controlled trials of interventions for specific health outcomes. See, for example, the Cochrane Collaboration (URL: http://www.cochrane.org/) to learn how meta analyses are used \cite{collaborationa}.
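The statistical combination step of a meta analysis can be sketched as follows. The inverse-variance fixed-effect model, Cochran's Q, and the I-squared statistic are standard choices but are assumptions of this sketch (the text does not name a specific method), and the study estimates are invented log relative risks.

```python
from math import exp, sqrt

def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance fixed-effect pooling with heterogeneity statistics.

    Returns (pooled estimate, pooled standard error, Cochran's Q, I-squared).
    """
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    # Cochran's Q: weighted squared deviations of each study from the pooled value.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of the variation beyond what chance alone would produce.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared

# Hypothetical trials: log relative risks and their standard errors.
log_rr = [-0.30, -0.10, -0.25, -0.40]
se = [0.15, 0.20, 0.10, 0.25]

pooled, pooled_se, q, i_squared = fixed_effect_pool(log_rr, se)
print(f"pooled RR = {exp(pooled):.2f}")
print(f"Q = {q:.2f}, I-squared = {i_squared:.0%}")
```

When heterogeneity is substantial, a random-effects model or an exploration of why the studies differ is usually considered in place of a single fixed-effect summary.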
Secondary analysis of data. -- Large data repositories are increasingly becoming available, where you can access data sets that other researchers or governments have collected and made available to researchers all over the world. You can use the tools of biostatistics and data analysis to mine those data bases. Kaggle, for example, provides data sets that you can analyse (see URL: https://www.kaggle.com/). In the early stages of your thinking about a topic, it is a good idea to delve into secondary data sources and analyse them. It is also a good way to test theories and hypotheses when you cannot collect your own data or find data collection very expensive. Secondary analyses of data are excellent sources for developing and validating theories.
Ecological studies. -- These are a subset of secondary analyses. Government data bases allow you to analyse associations between various factors: for example, you can study area level deprivation and health outcomes in New Zealand, which will allow you to understand and test theories on the association between poverty and health issues. The limitation of ecological studies, and of secondary data analyses that are based on aggregated data, is the ecological fallacy: you cannot extrapolate findings based on aggregates to individual cases.
Primary interventional study designs in health care
Randomised controlled trials. -- These are primary studies in the sense that you collect data from individual participants; they are intervention studies in which you randomly allocate participants to an intervention arm and a control arm. The studies can be blinded so that the participants do not know whether they received the intervention or the control condition (single blind), or so that neither the participants nor you as the investigator know (double blind). Randomisation helps to control confounding, and blinding helps to reduce bias. The effect estimate is in the form of an absolute risk difference or risk reduction (the risk of the outcome among those who received the intervention minus the risk of the outcome among those who received the control condition), or in the form of a relative risk. RCTs are conducted in clinical contexts, or even as population level or community level interventions, to test the efficacy of interventions under "controlled conditions". These studies are as close as we can get to experiments on humans. They are limited by the fact that they are expensive and that the results are not generalisable to all members of the public: the results of RCTs apply to individuals whose profiles closely match the profiles of those who were included in the studies. This is why RCTs and intervention trials have limited "external validity" while they have excellent internal validity.
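Random allocation into two arms can be sketched in a few lines. This is a simplified illustration (shuffle the participant list and split it in half), with a hypothetical function name and hypothetical participant identifiers; real trials add safeguards such as allocation concealment and often use block or stratified randomisation.

```python
import random

def allocate(participant_ids, seed=None):
    """Randomly split participants into two arms of (nearly) equal size."""
    rng = random.Random(seed)  # a fixed seed makes the allocation list reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"intervention": ids[:half], "control": ids[half:]}

# Hypothetical trial with 20 participants numbered 1 to 20.
arms = allocate(range(1, 21), seed=2024)
print("intervention:", sorted(arms["intervention"]))
print("control:     ", sorted(arms["control"]))
```

Because chance alone decides who goes into which arm, known and unknown confounders tend to be balanced between the arms as the number of participants grows.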
Study designs that are mainly observational in nature
Prospective cohort studies. -- These are a class of observational epidemiological studies in which the investigator classifies the participants into groups according to whether they are exposed or not exposed to an exposure of interest; the investigator does not assign the exposure. The investigators then follow up the participants over a period of time until the participants develop the outcome of interest. The rates at which the exposed and the non-exposed develop the outcome of interest are compared, and these comparisons form the effect estimate. The effect estimates are either the absolute difference in the rates at which the outcomes occur (ARR or absolute risk reduction, or Attributable Risk), or Relative Risks (the ratio of the rate of occurrence of the outcome in the exposed versus the non-exposed), or Hazard Ratios (hazards are defined as instantaneous risks of occurrence of health events, and hazard ratios are therefore the ratios of the hazard of the outcome among the exposed to that among the non-exposed). In a prospective cohort study, the participants are identified and followed up in present time and the outcomes may occur at a later date. For example, suppose a researcher is interested in studying the incidence of acute respiratory tract infection (ARI) in children who are born in regions close to, or away from, gold mining areas in a country (the idea being to test whether exposure to a mining environment is associated with ARI in infants). The researchers would follow children who were born and lived for the first year of their lives in gold mining areas (the exposed cohort) and children who were born and lived for the first year of their lives in non-gold mining areas (the non-exposed cohort), and note the relative incidence of acute respiratory tract infections in the two cohorts.
Prospective cohort studies are useful for studying multiple exposures and commonly occurring illnesses or health conditions; they are useful study designs to assess causal linkages between exposure and outcomes; however, they are time consuming and expensive and not suited for studying rare disease occurrences.
Retrospective cohort studies. -- Retrospective cohort studies are similar to prospective cohort studies in that, in both designs, the investigators follow up individuals with defined exposure status (these groups of individuals are referred to as "cohorts", so there are exposed and non-exposed cohorts). In retrospective cohort studies, however, the exposure and the outcome have already occurred in a past time frame. The investigators identify from pre-existing records who belongs to the exposed cohort and who belongs to the non-exposed cohort, and can recreate the emergence of health effects. Retrospective cohort studies are used in industrial, workplace, and occupational epidemiological settings to study the emergence of health outcomes among individuals exposed to different levels of a toxin in the workplace. Like prospective cohort studies, this is a good study design for causal inference; however, as in the case of prospective cohort studies, these studies can be expensive and you can only study common outcomes (so, for instance, occupational cancers are not good candidate research topics for retrospective cohort studies).
Case control study design. -- Case control study designs are used to study the likelihood of exposure for individuals with known states of health outcomes. In case control studies, researchers first identify individuals with a disease condition ("cases") and similar individuals, selected on the basis of potential confounding variables, who are free of the disease under investigation ("controls"). For both cases and controls, the investigators then study the likelihood of past exposure. The likelihood of past exposure is expressed as the odds of exposure; the odds of exposure for cases are compared with the odds of exposure for controls, and hence the effect estimate is the odds ratio of exposure for cases versus controls. Case control studies are used to study rare diseases (where you already know which individuals have the disease and which do not), and they can be used to study multiple exposures for a single outcome. Case control study designs are perhaps the most widely used epidemiological study designs, because the investigators can identify individuals with and without a disease state and then estimate the likelihood of exposure for each group. Case control studies are used in cancer epidemiology, as cancers are relatively rare disease conditions (rare diseases are assumed to be diseases that occur in 1 per 10,000 individuals or fewer). Case control study designs are open to response bias, and it is difficult to assess causal linkages using only case control study designs. For example, if you set up a case control study to assess the association between smoking and heart disease, and find that individuals with heart disease have lower rates of smoking, you will not be able to know for sure whether they reduced their smoking before or after they found out that they had heart disease.
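The odds ratio of exposure described above can be sketched with an approximate confidence interval. The log-based (Woolf) interval is one standard choice and an assumption of this sketch, as are the function name and the counts, which are invented for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def exposure_odds_ratio(case_exposed, case_unexposed,
                        control_exposed, control_unexposed, level=0.95):
    """Odds ratio of exposure (cases versus controls) with a Woolf interval."""
    odds_cases = case_exposed / case_unexposed
    odds_controls = control_exposed / control_unexposed
    estimate = odds_cases / odds_controls
    # Standard error of the log odds ratio.
    se_log = sqrt(1 / case_exposed + 1 / case_unexposed
                  + 1 / control_exposed + 1 / control_unexposed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical study: 60 of 100 cases and 30 of 100 controls were exposed.
estimate, lower, upper = exposure_odds_ratio(60, 40, 30, 70)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests the association is unlikely to be due to chance alone, although it says nothing about bias or confounding.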
Cross sectional surveys. -- Cross sectional surveys are also referred to as prevalence studies. In cross sectional surveys, investigators study a group of individuals (or a sample of individuals) at a single point in time, or over a single time period, with respect to disease outcomes and exposure status. The study design is suited to estimating the prevalence of a health state. For example, you can use a cross sectional survey of a group of individuals to estimate the prevalence of diabetes in Christchurch, in Canterbury, or even in the whole of New Zealand. Cross sectional studies can also be used to model prevalence odds ratios comparing individuals with a defined disease state and those without the disease. However, cross sectional surveys are limited by response and measurement errors on the part of the investigators and the respondents or participants. Cross sectional surveys are also poor study designs if you want to assess cause and effect associations between exposure and outcome, because you obtain the exposure and outcome information from the study participants at the same time. As temporality is a major determinant of causation, you cannot be certain that the exposure preceded the outcome. Other than that, cross sectional studies are relatively inexpensive in terms of the resources they consume and the time you will take to set up such a study.
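Estimating a prevalence from a cross sectional sample can be sketched as a proportion with a confidence interval. The Wilson score interval is one reasonable choice and an assumption of this sketch, as are the function name and the survey counts, which are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def prevalence_wilson(cases, sample_size, level=0.95):
    """Point prevalence with a Wilson score confidence interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = cases / sample_size
    denominator = 1 + z ** 2 / sample_size
    centre = (p + z ** 2 / (2 * sample_size)) / denominator
    half_width = (z / denominator) * sqrt(
        p * (1 - p) / sample_size + z ** 2 / (4 * sample_size ** 2)
    )
    return p, centre - half_width, centre + half_width

# Hypothetical survey: 50 people with diabetes in a random sample of 1000.
prevalence, lower, upper = prevalence_wilson(50, 1000)
print(f"prevalence = {prevalence:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

The interval reflects sampling error only; it does not correct for the response and measurement errors noted above.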
Case series. -- The previous study designs can be used for studying cause and effect relationships between an exposure and an outcome (or an intervention and an outcome); this is because each of these study designs includes a valid comparison group in one way or another: controls who receive standard interventions or placebos in RCTs, non-exposed cohorts in cohort studies, controls in case control studies, and individuals with and without specific health effects and exposures in cross sectional studies. A case series, unlike these study designs, is a description of individuals with specific exposures and/or health effects. In a case series, a number of individuals are described with respect to their disease conditions, either at one point in time or over time periods. Surveillance conducted by governments and health departments to study the emergence of disease conditions (both infectious disease surveillance and non-infectious and chronic disease surveillance) is an example of case series in action for public health. Case series help to generate hypotheses, and they form the beginning of the abductive reasoning that leads to further questions about why we observe the patterns we observe. Case series are not good study designs if our aim is to answer the "why" or "how" of a disease or of a disease-exposure association. They are, however, relatively inexpensive and the least resource intensive of all the study designs we have covered so far.
So abductive reasoning is about finding an explanation for a series of events we observe: the explanation is first offered on the basis of intuition or conjecture. Based on the explanation, the researcher frames a theory, and based on the theory, the researcher then sets up rival hypotheses. The hypothesis that immediately follows from the theory is termed the "alternative hypothesis"; the hypothesis that challenges the alternative hypothesis and relates to the status quo is termed the "null hypothesis". The rule for framing a theory is that all relevant observations must be taken into account and the theory must be able to explain EVERY fact that is observed. More than one theory is put up for examination; the theory that fits all observations and is also the simplest, or involves the fewest parameters, conforms to the Ockham's razor principle and wins, but this is not hard and fast. As a researcher, another goal is to find a fact or a piece of evidence that refutes the theory, that is, an observation that contradicts the predictions of the theory: this principle is therefore referred to as "conjecture and refutation". Using abductive logic, a researcher will claim that he or she cannot prove a theory but can only fail to disprove it.
Now that we have covered the three classes of reasoning (deductive, inductive and abductive), let us put everything we have learned so far into practical use. We will do this (1) first by outlining ways in which we can ask questions and abstract information from the papers we read, and (2) then by showing how we can use this approach to develop research proposals.
First: How to read and assess research
The first thing you should do is to think of a research idea. What is a research idea? Here we describe a research idea as one that needs an explanation. To arrive at one, we need to take stock of the existing knowledge base, and we should have a way to do so. Hence I recommend the following steps.
Step 1. Run a search and do forward and backward citation tracing, reach a concept saturation point, and review the articles
We would like to study the association between heat waves and morbidity and mortality. We ran a search on Google Scholar (URL: http://scholar.google.com) without restriction on any date. Note the following two figures: