2.3 Genome sequencing and assembly
Full methodological details of the genome and transcriptome sequencing
and assembly are provided in Supplementary file S2. In summary,
high-molecular-weight DNA was sent for PacBio HiFi library preparation with
Pippin Prep and sequencing on one single-molecule real-time (SMRT) cell
of the PacBio Sequel II (Australian Genome Research Facility, Brisbane,
Australia). Total RNA was sequenced as 100 bp paired-end reads using
Illumina NovaSeq 6000 with Illumina Stranded mRNA library preparation at
the Ramaciotti Centre for Genomics (University of New South Wales,
Sydney, Australia). Genome assembly was conducted on Galaxy Australia
(The Galaxy Community, 2022) following the genome assembly guide (Price
& Farquharson, 2022) using hifiasm v0.16.1 with default parameters
(Cheng et al., 2021, 2022). Transcriptome assembly was
conducted on the University of Sydney High Performance Computer,
Artemis. Genome annotation was performed using FGENESH++ v7.2.2
(Softberry; Solovyev et al., 2006) on a Pawsey Supercomputing Centre
Nimbus cloud machine (256 GB RAM, 64 vCPU, 3 TB storage) with the
longest open reading frame predicted from the global transcriptome,
non-mammalian settings, and optimised parameters supplied with the
Corvus brachyrhynchos (American crow) gene-finding matrix. The
mitochondrial genome was assembled using MitoHiFi v3 (Uliano-Silva et
al., 2023). Benchmarking Universal Single-Copy Orthologs (BUSCO) was
used to assess genome, transcriptome and annotation completeness (Manni
et al., 2021).