Figure legends
Figure 1. Origin of P. vivax genomes per country included in the
analysis. The size of each dot is proportional to the number of samples in
the genome dataset, and the colours indicate the country. Dots are
plotted at the centre of the country (as defined by the ggmap package in
R).
Figure 2. Global P. vivax phylogeny, admixture
and population structure. (A) Principal component analysis (PCA) of the
LD-pruned biallelic SNPs, performed with PLINK2, showing the first two principal
components. The samples (dots) are coloured according to their population of
origin (here, region). (B) Phylogenetic tree based on the
LD-pruned biallelic SNPs, constructed with RAxML. (C) Admixture
proportions for K=10 populations estimated with the ADMIXTURE software. The
small bar on top indicates the region of origin (AFR = Africa, ESEA = Eastern
South East Asia, LAM = Latin America, MSEA = Middle South East Asia, OCE =
Oceania, WAS = Western Asia).
Figure 3. Spatio-temporal population dynamics in Latin America. Admixture
analysis of P. vivax samples from LAM, using K=11 populations. (A) Bar plot
of the admixture proportions of each sample for each ancestry cluster. The
small bar on top indicates the country of origin of each sample.
(B) Each sample is assigned to one ancestry
cluster based on its highest membership probability in the admixture
analysis. Pie charts represent the number of samples from each cluster in
each country and year.
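
The cluster assignment described for panel (B) can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the function names, the toy admixture proportions, and the country and year labels are all hypothetical, and ties are resolved by taking the first cluster with the maximum proportion, which the legend does not specify.

```python
from collections import Counter

def assign_clusters(q_rows):
    """For each sample, return the index of the ancestry cluster with the
    highest admixture proportion (first index wins on ties; an assumption)."""
    return [max(range(len(row)), key=row.__getitem__) for row in q_rows]

def pie_counts(clusters, countries, years):
    """Count samples per assigned cluster for every (country, year) pair."""
    counts = {}
    for cluster, country, year in zip(clusters, countries, years):
        counts.setdefault((country, year), Counter())[cluster] += 1
    return counts

# Hypothetical admixture proportions: one row per sample, one column per cluster.
q = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
]
countries = ["Peru", "Peru", "Brazil"]  # hypothetical metadata
years = [2015, 2015, 2018]              # hypothetical metadata

clusters = assign_clusters(q)
pies = pie_counts(clusters, countries, years)
```

Each entry of `pies` then holds the counts that would be drawn as one pie chart for that country and year.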
Figure 4. P. vivax IBD-based connectivity between countries in
Latin America. Connectivity network of inferred IBD between P. vivax
samples from Latin American countries. Edges connecting parasite pairs
indicate that at least 10% of their genomes descended from a common
ancestor without intervening recombination, indicating distant to close
relatedness. Node colours indicate the country of origin of the P. vivax
genomes. Nodes are plotted on the map at the latitude and longitude of the
collection site (by district) or, where this is unknown, at the
respective country’s capital (for example, in Guyana).
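
As a rough illustration of how such a network could be assembled, the sketch below keeps only pairs sharing at least 10% of their genome in IBD and places each node at its collection site or, failing that, at the country's capital. The function names, sample identifiers, IBD fractions, and coordinates are hypothetical placeholders, not values from the study.

```python
def ibd_edges(pairwise_ibd, min_fraction=0.10):
    """Keep sample pairs whose shared IBD fraction is at least min_fraction."""
    return sorted(pair for pair, fraction in pairwise_ibd.items()
                  if fraction >= min_fraction)

def node_position(sample, site_coords, sample_country, capital_coords):
    """Use the collection-site coordinates when known, else the capital's."""
    if sample in site_coords:
        return site_coords[sample]
    return capital_coords[sample_country[sample]]

# Hypothetical pairwise IBD fractions (fraction of the genome shared in IBD).
pairwise_ibd = {("S1", "S2"): 0.45, ("S1", "S3"): 0.04, ("S2", "S3"): 0.10}

# Hypothetical metadata; coordinates are approximate and purely illustrative.
site_coords = {"S1": (-3.7, -73.2), "S2": (-8.8, -63.9)}
sample_country = {"S1": "Peru", "S2": "Brazil", "S3": "Guyana"}
capital_coords = {"Guyana": (6.8, -58.2)}

edges = ibd_edges(pairwise_ibd)
positions = {s: node_position(s, site_coords, sample_country, capital_coords)
             for s in sample_country}
```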
Figure 5. Molecular markers for transmission intensity. (A) Number of
P. vivax cases per year (2010-2021) in the studied countries; case numbers
were highest in Brazil in all years. Data were extracted from the
World Malaria Report 2022. (B) Violin plot of nucleotide
diversity (pi) measured across the genome in 5000 bp windows. (C) Violin plot
of within-host infection complexity assessed using the within-sample F
statistic (FWS).
FWS ≥ 0.95 was considered a proxy for a monoclonal
infection.
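
The FWS cut-off can be applied as in the minimal sketch below, which computes the fraction of samples per country that would be treated as monoclonal. The function name and the FWS values are hypothetical; only the 0.95 threshold comes from the legend.

```python
def monoclonal_fraction(fws_values, threshold=0.95):
    """Fraction of samples with FWS >= threshold (proxy for monoclonal)."""
    if not fws_values:
        raise ValueError("at least one FWS value is required")
    monoclonal = sum(1 for value in fws_values if value >= threshold)
    return monoclonal / len(fws_values)

# Hypothetical FWS values per country, for illustration only.
fws_by_country = {
    "Brazil": [0.99, 0.97, 0.80, 0.95],
    "Peru": [0.60, 0.98],
}

fractions = {country: monoclonal_fraction(values)
             for country, values in fws_by_country.items()}
```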
Figure 6. Pairwise IBD between isolates across the five
populations in LAM. (A) Line plot of median IBD shared between
pairs of P. vivax samples along the chromosomes. Segments with highly
significant IBD sharing are indicated in orange, and annotated genes of
interest at peaks of IBD sharing are labelled. The top genes that share
significant IBD in the populations are listed in Supplementary Table 2.
Labelled dots in red indicate the positions and level of IBD sharing of
putative drug-resistance-associated genes (list from ). (B) Heatmap of
significant pairwise IBD between populations in LAM, with rows clustered by
similarity of patterns between populations. Low to high -log10 p-values,
indicating significance levels of IBD sharing, are colour
graded from blue to yellow. Significant IBD-sharing is seen at a -log10
p-value greater than 1.3 (i.e. p<0.05), and a threshold of
-log10 p-value >10 was used to identify highly significant
areas of IBD-sharing.
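
The two significance thresholds can be illustrated with the short sketch below, which converts a p-value to -log10 scale and applies the cut-offs of 1.3 and 10 stated in the legend. The function name, the category labels, and the example p-values are hypothetical.

```python
import math

def classify_ibd_significance(p_value, significant=1.3, highly_significant=10.0):
    """Label a p-value using the -log10 thresholds given in the legend."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must lie in (0, 1]")
    score = -math.log10(p_value)
    if score > highly_significant:
        return "highly significant"
    if score > significant:
        return "significant"
    return "not significant"

# Hypothetical p-values for three genomic windows.
labels = [classify_ibd_significance(p) for p in (0.5, 0.01, 1e-12)]
```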