Strengths and Limitations
Our study has multiple strengths. Rather than using a single fetal weight estimate per participant to construct the growth curve as Hadlock did,11 our sex-specific standard is based on longitudinal assessments, with the first EFWs obtained starting at 16 weeks, which is earlier than in the Hadlock standard. Our inclusion of only term births in the derivation of the sex-specific equation removed bias that would be introduced by the association of preterm birth with poor growth. As a result, our sex-specific standard is more representative of expected fetal growth in ongoing pregnancies. A final strength is the assessment of differences in clinical management and outcomes for newborns who were classified differently by the sex-specific standard than by the sex-neutral standard, which provided empiric substantiation of the clinical relevance of the differences between sex-neutral and sex-specific curves.
Our use of a nested cohort to derive and then an expanded cohort to assess the sex-specific standard is valid because this study is different from a traditional derivation-validation approach. In such an approach, separate cohorts are needed because the primary outcome is used to derive the model, making it invalid to test the model’s prediction of the same outcome in the same cohort. In our case, this would be analogous to deriving a fetal growth equation based on its prediction of morbidity and then testing its prediction of morbidity. However, our approach was to derive an equation for fetal growth based on how well it represents available fetal measurements and then assess how designations based on this fetal growth equation are associated with clinical outcomes in the parent cohort. Even so, our analyses of clinical outcomes and management should be interpreted as exploratory and hypothesis-generating rather than as validating.
The primary limitation of our study is that ultrasound EFWs were not collected uniformly across gestation but were instead concentrated around nuMoM2b study visits; EFWs collected throughout pregnancy may better represent expected fetal growth. Additionally, sex was ascertained at birth, so our sex-specific curve needs to be validated using a cohort with prenatally identified fetal sex. Further, we cannot rule out that clinical management based on prenatal suspicion of FGR may have introduced bias by lowering clinicians’ thresholds for cesarean delivery. This is plausible, since the group of male newborns considered SGA by the sex-neutral standard had higher cesarean rates for fetal compromise but did not experience concrete morbidity more often than the AGA group. Conversely, clinical action based on prenatal suspicion of FGR may have prevented morbidity, potentially leading us to underestimate the true rates of perinatal morbidity among newborns considered SGA by the sex-neutral standard but AGA by the sex-specific standard. This is a less likely explanation for our findings, however, since it is implausible that growth-restricted fetuses who undergo delivery for FGR would have lower rates of morbidity than the AGA group, which is what we found among female newborns. Unfortunately, information on prenatal suspicion of LGA or macrosomia was not collected in the nuMoM2b study, so we are unable to determine whether this may have also altered clinical decisions related to mode of delivery.