Methods

Protocol registration

The detailed protocol was prospectively registered with COMET (Registration Id: 1489) (http://www.comet-initiative.org/) and submitted to a peer reviewed journal. The study is reported according to the PRISMA statement (Figure S1).10

Study selection

Studies describing completed core outcome sets specific to maternal or neonatal health were included if they developed or applied methodology for determining which outcomes should be measured, or are important to measure in clinical trials, research or clinical practice. Studies were eligible if they related to pregnant or postpartum participants (up to 12 months postpartum), or neonates/infants (where the first outcome measurement is recommended within the first 28 days of birth), with any health condition and in any setting. Published conference papers were included if they provided adequate information about the completed COS or where a primary paper provided further information.

Excluded studies

Consistent with Gargon et al., papers were excluded if they (i) reported a single study; (ii) related to pre-clinical or early phase trials only; (iii) reported the use of a COS; (iv) were systematic reviews of clinical trials; (v) were studies of prognosis; (vi) were studies of outcomes measured in clinical trials or quantitative descriptions of outcomes; (vii) were based on the opinion of a single author; or (viii) focused on one domain/outcome only.5 We excluded papers relating to early pregnancy loss (prior to 20 weeks' gestation) as these were considered gynaecological rather than maternity.

Information sources

We searched the COMET11 and CROWN4 registers and the ICHOM (International Consortium for Health Outcomes Measurement) list of standard sets.12 We conducted an electronic database search of MEDLINE (via Ovid), EMBASE and CINAHL Complete (via EBSCOhost) in January 2020. Studies reported in English were included from inception to January 2020. All ongoing COS identified in the COMET register and in previous reviews2,5,6,13-17 were searched for updates of progress. Hand-searching of reference lists complemented the search.

Search

A university health librarian helped to develop and pilot the search strategy. Our search terms combined three concepts (‘core outcome set’, ‘methodology’ and ‘population’). All terms within each concept were combined with the Boolean operator OR and then the three concepts with AND. Search terms are outlined in Appendix S1.
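The Boolean structure described above can be sketched as follows. This is a minimal illustration only: the function name `build_query` and the terms in `concepts` are hypothetical placeholders, not the strategy used in the review (the actual terms are those in Appendix S1), and real database syntax (for example Ovid field tags) would differ.

```python
def build_query(concepts):
    """Join the terms within each concept with OR, then join the concepts with AND."""
    groups = []
    for terms in concepts:
        # Quote multi-word terms so they are searched as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Placeholder terms for the three concepts (hypothetical, for illustration).
concepts = [
    ["core outcome set", "COS"],   # concept 1: core outcome set
    ["consensus", "Delphi"],       # concept 2: methodology
    ["pregnancy", "neonate"],      # concept 3: population
]

query = build_query(concepts)
print(query)
# ("core outcome set" OR COS) AND (consensus OR Delphi) AND (pregnancy OR neonate)
```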

Study selection and data management

EndNote X8 software was used to screen all citations. Duplicates were identified and removed. Titles and abstracts were screened by one author (VS). Full-text papers were reviewed for all studies meeting the inclusion criteria. Papers not meeting the eligibility criteria were excluded and the reason was recorded. Full-text screening was conducted by one researcher (VS). Ten percent of included and excluded papers were assessed by a second reviewer (DC). Any disagreement between reviewers or uncertainty was resolved by consensus or by arbitration using a third reviewer (JG).
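Drawing a ten percent subset for second review could be sketched as below. The text does not state how the subset was chosen, so the seeded random sampling here is an assumption, and the function name, the paper identifiers and the rounding-up rule are all hypothetical.

```python
import math
import random


def sample_for_second_review(paper_ids, percent=10, seed=0):
    """Return a reproducible random subset of papers for a second reviewer.

    Assumption: random sampling with rounding up; the source does not
    specify how the ten percent was selected.
    """
    ids = list(paper_ids)
    k = math.ceil(len(ids) * percent / 100)
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    return sorted(rng.sample(ids, k))


# Hypothetical identifiers for 100 screened papers.
papers = [f"paper_{i:03d}" for i in range(1, 101)]
subset = sample_for_second_review(papers)
print(len(subset))  # 10
```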

Data extraction

Data were extracted by one author (VS) using extraction forms guided by criteria outlined in previous reviews:2,6 author(s), year of publication, COMET registration number, disease category, disease name, related papers, funder, CROWN registration, publication type, each item as defined on the COS-STAR statement18, scope, stakeholder involvement, geographical location of stakeholders, patient participation, consensus process, final list of outcomes/domains, and measurement recommendations.

Sources of information

Supporting data were collected from primary COS papers, relevant project papers (systematic reviews and protocols) and from the COMET and CROWN registers.

Assessment of study against minimum standards

Included studies were assessed against COS-STAD minimum standards.8 COS-STAD contains 11 standards covering three key domains (scope, stakeholders, and consensus process). Consistent with others,9 item 9 (‘A scoring process and consensus definition is described a priori’) was modified to include two assessment items for scoring process (termed 9a) and consensus definition (termed 9b). Each item was assessed as standard met, unclear, or not met using the assessment criteria outlined by Gargon and colleagues.9
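Splitting item 9 into 9a and 9b turns the 11 standards into 12 assessment items, each rated as met, unclear, or not met. A minimal sketch of that bookkeeping follows; the function names and the example ratings are hypothetical and are not results from the review.

```python
from collections import Counter

RATINGS = ("met", "unclear", "not met")


def assessment_items(n_standards=11, split_item=9):
    """List item labels 1..11, with item 9 replaced by 9a and 9b."""
    items = []
    for i in range(1, n_standards + 1):
        if i == split_item:
            items.extend([f"{i}a", f"{i}b"])
        else:
            items.append(str(i))
    return items


ITEMS = assessment_items()


def summarise(study_ratings):
    """Count how many items a study met, left unclear, or did not meet."""
    missing = [i for i in ITEMS if i not in study_ratings]
    if missing:
        raise ValueError(f"unrated items: {missing}")
    invalid = [i for i in ITEMS if study_ratings[i] not in RATINGS]
    if invalid:
        raise ValueError(f"invalid rating for items: {invalid}")
    counts = Counter(study_ratings[i] for i in ITEMS)
    return {rating: counts.get(rating, 0) for rating in RATINGS}


# Hypothetical study: every item met except the consensus definition (9b).
example = {item: "met" for item in ITEMS}
example["9b"] = "unclear"
print(len(ITEMS))          # 12
print(summarise(example))  # {'met': 11, 'unclear': 1, 'not met': 0}
```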

Synthesis of results

Findings were described using text and tables and summarised as frequencies and percentages.
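Summarising a categorical variable as frequencies and percentages can be sketched as below. The function name, the rounding to one decimal place and the example values are assumptions made for illustration, not data from the review.

```python
from collections import Counter


def frequency_table(values):
    """Map each category to (count, percentage of all values)."""
    counts = Counter(values)
    total = len(values)
    # Percentages rounded to one decimal place (an assumed convention).
    return {k: (n, round(100 * n / total, 1)) for k, n in counts.items()}


# Hypothetical ratings for one standard across four studies.
ratings = ["met", "met", "unclear", "not met"]
table = frequency_table(ratings)
print(table)
# {'met': (2, 50.0), 'unclear': (1, 25.0), 'not met': (1, 25.0)}
```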