Sequence data generation
DNA extraction and sequencing are as stated in Brandt et al. (2023). All sequences generated were assembled in CLC Bio Main Workbench Version 6.9 (http://www.clcbio.com). All gene regions were submitted to GenBank and the accession numbers are recorded in Table S1. The CO1, 16S and EF-1ɣ datasets were concatenated using FASconCAT v1.11 (Kück and Meusemann, 2010). The edited sequences for each gene region as well as the concatenated dataset, were aligned using MAFFT online (Katoh, 2005; Katoh & Toh, 2008). The ‘Auto’ strategy for alignment was used in MAFFT, this inspects the direction of the sequences and adjust the alignment in correlation to the first sequence.
Divergence dating
Dating the phylogeny was done using BEAST v1.8.4 . All identical sequences were removed from the analysis. In order to determine the most appropriate fossil calibration points, numerous combinations were tested varying the fossil calibrations used. The final fossil calibration points used were as follows: the Hexathelidae fossil,Rosamygale grauvogely (Gresa-Voltzia formation, France, Triassic) . Rosamygale grauvogely is thus, the first mygalomorph appearance in the fossil record, dating to 250 – 240 MYA . The Nemesiidae fossil,Cretamygale chasei (Isle of Wight, Cretaceous) , two Ctenizidae/Halonoproctidae fossils, Ummidia damzeni and U. malonowskii (both from Baltic amber, Palaeogene) . Lastly, a fossil from the family Cyrtaucheniidae, Bolostromus destructus(Dominican amber, Neogene) . Each family was represented by available sequence data from GenBank for 16S rDNA, cytochrome oxidase 1 (CO1) and elongation factor 1 gamma (EF-1ɣ) (Table S5). Each calibration point was set by setting a hard minimum bound (the youngest possible age of the fossil based on the deposit in which it was found) and a soft upper bound (the oldest possible age).
The nucleotide substitution rate was determined using jModelTest v2.1.7 . This was done for each gene region and the best model to account for varying base pair substitution rates was selected based on the Bayesian information criterion (BIC) . A summary of the number of samples for each gene region, sequence length and substitution model are given in Table S6. This was used to set the prior in BEAUTi v1.8.4 . Other priors set included, the rate of molecular evolution to a relaxed clock with a lognormal distribution (this allows mutational rates to vary over the tree) . As the dataset includes intraspecific and interspecific sampling the study follows the protocol of by testing four different tree priors. All analyses were thus repeated using ‘Speciation: Birth-Death Incomplete Sampling’, ‘Speciation: Birth-Death’, ‘Speciation: Yule Process’ and ‘Coalescent: Constant Size’ at tree priors. This allowed testing at both the population level of diversity and species level diversity. An uncorrelated relaxed clock was used, as this allows for each branch of the phylogeny to have a different evolutionary rate . BEAST was then run for 200,000,000 (250,000,000 for the yule model) generations and sampled every 5000 generations. This was repeated twice for each analysis to ensure convergence. BEAST was run in conjunction with BEAGLE .
Tracer v1.7.1 was used to confirm convergence . This was tested by checking the log file for each BEAST run and ensuring the ESS values were above 200. The individual runs were then compiled to ensure a normal distribution. Log Combiner v1.8.4  was used to combine the multiple tree files into one file. The subsampling number was set to 50,000 and the burn-in for each tree to 50,000,000.
The tree files viewed, annotated and edited in FigTree .