Sci Adv. 2020 December; 6(50): eabc5162.

Topology-dependent asymmetry in systematic errors affects phylogenetic placement of Ctenophora and Xenacoelomorpha

Received 2020 April 28; Accepted 2020 October 27.

Supplementary Materials: http://advances.sciencemag.org/cgi/content/total/6/50/eabc5162/DC1

Abstract

The evolutionary relationships of two animal phyla, Ctenophora and Xenacoelomorpha, have proved highly contentious. Ctenophora have been proposed as the most distant relatives of all other animals (Ctenophora-first rather than the traditional Porifera-first). Xenacoelomorpha may be primitively simple relatives of all other bilaterally symmetrical animals (Nephrozoa) or simplified relatives of echinoderms and hemichordates (Xenambulacraria). In both cases, one of the alternative topologies must be a result of errors in tree reconstruction. Here, using empirical data and simulations, we show that the Ctenophora-first and Nephrozoa topologies (but not Porifera-first and Ambulacraria topologies) are strongly supported by analyses affected by systematic errors. Accommodating this finding suggests that empirical studies supporting Ctenophora-first and Nephrozoa trees are likely to be explained by systematic error. This would imply that the alternative Porifera-first and Xenambulacraria topologies, which are supported by analyses designed to minimize systematic error, are the most credible current alternatives.

INTRODUCTION

Knowing the relationships between major groups of animals is essential for understanding the earliest events in animal evolution. While the use of molecular data has led to considerable progress in understanding animal relationships, some aspects remain highly disputed. The positions of two animal phyla, Ctenophora (sea gooseberries) and Xenacoelomorpha (simple marine worms including xenoturbellids and acoelomorph worms), have proved particularly contentious. Ctenophora have been proposed as the most distant relatives of all other animals (Ctenophora-first topology), although they have muscles, nerves, and other characters absent in sponges (classically the sister group of other animals: Porifera-first topology). Xenacoelomorphs may be the sister group of all other bilaterians (they are simple and lack characteristics of other Bilateria: Nephrozoa topology); alternatively, they have been linked to the complex Ambulacraria (Hemichordata and Echinodermata: Xenambulacraria topology), implying loss of complexity.

While the Ctenophora-first and the Nephrozoa topologies (Fig. 1) have gained support in independent analyses of large datasets (1–6), supporters of the alternative topologies (Porifera-first and Xenambulacraria; Fig. 1) have suggested that there is an expectation of a long-branch attraction (LBA) artifact (7–9). LBA is a systematic error that falsely groups long branches (10), such as those leading to the outgroups and to both the Ctenophora and Xenacoelomorpha. LBA can be exacerbated by the use of substitution models that do not account for heterogeneities in sequence evolution such as nonhomogeneous rates of substitution between alignment sites or heterogeneities in the frequencies of amino acids across the alignment (11–14). Attraction between the long branches leading to the Ctenophora and Xenacoelomorpha and to their respective outgroups could result in the Ctenophora-first and Nephrozoa trees. The support seen for Ctenophora-first and Nephrozoa topologies, however, is generally stronger than for the alternatives (8, 9).

Fig. 1. Phylogenetic tree of main animal groups highlighting alternative hypotheses for the positions of the Ctenophora and Xenacoelomorpha.

The dotted lines show alternative positions for the Ctenophora and Xenacoelomorpha. The sister group of all other Metazoa could be Ctenophora (Ctenophora-first) or the Porifera (Porifera-first). Xenacoelomorpha could be the sister group of the Ambulacraria (Xenambulacraria hypothesis), or the Xenacoelomorpha could be the sister group of all other Bilateria (Nephrozoa hypothesis). Branch lengths are approximately proportional to the average branch lengths leading to the clades indicated. Long branches leading to Ctenophora and Xenacoelomorpha are evident. The Chordata are shown as a sister group of the Protostomia: a topology supported by the dataset used in our analyses (9).

We have used recently published phylogenomic datasets that were designed to place either the ctenophores [Simion et al. (8); dataset "Simion-all"] or the xenacoelomorphs [Philippe et al. (9) and Cannon et al. (5); datasets "Philippe-all" and "Cannon"] in the animal tree. We ask whether unaccounted-for across-site heterogeneity in amino acid composition might have resulted in model violations that could lead to the underestimation of the prevalence of convergent evolution. We then ask whether ignoring such heterogeneity could result in LBA and incorrect support for the Ctenophora-first and Nephrozoa trees.

RESULTS

Effect of model misspecification on accuracy of branch length estimation

We first used Bayesian inference (15) to assess the difference in branch length estimates under site-heterogeneous (i.e., CAT + LG + G) and site-homogeneous (i.e., LG + G) models. We used a fixed topology to reduce the computational burden and performed the calculations on a subset of the original data (30,000 sites). We find that branch lengths estimated under a site-homogeneous model are consistently shorter than those estimated under a site-heterogeneous model for both Simion-all and Philippe-all datasets. Notably, the discrepancy between branch lengths estimated using homogeneous and heterogeneous models was proportionally larger for longer branches (Fig. 2).

Fig. 2. Site-homogeneous models consistently underestimate branch lengths.

(A) Tree showing the clades (names) and branches (letters) for which lengths were estimated using Philippe-all data. (B) Estimates of clade and branch lengths using the site-heterogeneous model (blue) and the site-homogeneous model (brown) based on empirical data. (C) Tree showing the clades and branches for which lengths were estimated using Simion-all data. (D) Estimates of clade and branch lengths using the site-heterogeneous model (blue) and the site-homogeneous model (brown) based on empirical data. Site-homogeneous models consistently estimate shorter branch lengths.

It is possible that site-homogeneous models have underestimated or that site-heterogeneous models have overestimated branch lengths (or both could be true). To evaluate which of these possibilities is responsible for the observed differences, we simulated amino acid sequences using empirically calibrated parameter values (i.e., values estimated from empirical data) under both site-homogeneous and site-heterogeneous models. We assessed the efficiency of both model types in estimating the known branch lengths correctly. Both models give accurate estimates of branch lengths for data that have evolved homogeneously, showing that site-heterogeneous models do not systematically overestimate branch lengths. For data that have evolved heterogeneously, however, site-homogeneous models consistently underestimate branch lengths (Fig. 3). This result shows that using site-homogeneous models will consistently underestimate branch lengths for data that have evolved heterogeneously. The discrepancy between the branch length estimates using the two models for site-heterogeneous data resembles the discrepancy that we observe for the empirical data (Fig. 2). The need to accommodate site heterogeneity is also supported by cross-validation tests [this study and see (9)], which show that a site-heterogeneous model fits these data significantly better than a site-homogeneous model (Fig. 3).
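The direction of this bias can be illustrated with a toy calculation. The sketch below is not the CAT model and is not taken from the analyses described here. It assumes a symmetric (Poisson-type) substitution process in which a site can occupy only `k_site` of the 20 amino acids, and then converts the expected proportion of differences back into a distance while wrongly assuming that all 20 states are available. The function names and the choice of four permitted states are illustrative assumptions.

```python
import math


def p_distance(d, k):
    """Expected fraction of differing sites after d substitutions per site
    under a symmetric k-state (Poisson-type) model."""
    return (k - 1) / k * (1.0 - math.exp(-k * d / (k - 1)))


def estimated_distance(p, k):
    """Inverse of p_distance: the distance implied by p if k states are assumed."""
    return -(k - 1) / k * math.log(1.0 - k * p / (k - 1))


def homogeneous_estimate(d_true, k_site=4, k_assumed=20):
    """Distance recovered when a site restricted to k_site states (assumption)
    is analyzed as if all k_assumed states were equally available."""
    p = p_distance(d_true, k_site)
    return estimated_distance(p, k_assumed)


if __name__ == "__main__":
    for d_true in (0.5, 1.0, 2.0):
        d_hat = homogeneous_estimate(d_true)
        print(f"true = {d_true:.2f}  estimated = {d_hat:.3f}  "
              f"ratio = {d_hat / d_true:.2f}")
```

Under these assumptions, a true length of 1 substitution per site is estimated as roughly 0.83 and a true length of 2 as roughly 1.26, so the proportional shortfall grows with branch length, in the same direction as the pattern seen in Fig. 2.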

Fig. 3. Site-heterogeneous models estimate accurate branch lengths for both site-homogeneous and site-heterogeneous data.

(A) The guide tree that was used for simulating the data, showing the clades (names) and branches (letters) used for comparing the estimates across models. (B) Estimates of clade and branch lengths for data simulated under the site-homogeneous model and inferred using the site-heterogeneous model (blue) and the site-homogeneous model (brown). Both models give similarly accurate branch lengths for data simulated with the site-homogeneous model; the true branch lengths are shown with the black lines. (C) Equivalent estimates of clade and branch lengths for data simulated under the site-heterogeneous model. Site-heterogeneous models give accurate estimates, whereas site-homogeneous models consistently underestimate the amount of change/branch length.

Effect of model misspecification on topology

While the branch lengths in both datasets are shown to be consistently underestimated under a homogeneous model, it need not follow that this will affect our ability to reconstruct the tree topology correctly. To see the effect of model violations on topology, we simulated data (100 replicates) using a site-heterogeneous model under each of the two alternative topologies for each dataset (i.e., Nephrozoa/Xenambulacraria and Ctenophora-first/Porifera-first). We performed the simulations with PhyloBayes (15) using parameters estimated under the site-heterogeneous model and the relevant topological hypothesis. For each simulated dataset, we inferred two maximum likelihood phylogenies using IQ-TREE (16): (i) under an empirical site-heterogeneous model (C60 + LG + G) that serves as an approximation of the CAT model (17) and (ii) under a site-homogeneous model (LG + G). We also performed bootstrap analyses using the site-homogeneous model (LG + G) and the more complex site-heterogeneous model (C60 + LG + G) to assess the robustness of the inferred phylogenies with respect to the nodes of interest.

For datasets simulated under the potential LBA topologies (Nephrozoa and Ctenophora-first), we find that the correct tree was reconstructed in 100% of cases even under model violation (Fig. 4) and with 100% bootstrap support (Fig. 5). In contrast, the data simulated under the alternative Xenambulacraria and Porifera-first topologies yielded incorrect topologies under the site-homogeneous model (in 90 and 98% of simulations, respectively). In all cases, the incorrect topology was the putative LBA topology (Fig. 4 and table S1), and this topology was almost always supported by high bootstrap values (Fig. 5). The same data analyzed under the site-heterogeneous C60 + LG + G model recovered the correct tree 95% of the time for Xenambulacraria and 88% of the time for Porifera-first, with high bootstrap support (Fig. 5).

Fig. 4. Topology-dependent asymmetry of the ability of model-violating site-homogeneous models to reconstruct the correct tree.

(A) A total of 100 datasets were simulated using a site-heterogeneous model for each of the topologies shown (orange/gray boxes). (B) For the datasets based on the whole alignment, site-heterogeneous (top) and site-homogeneous models (bottom) were used to reconstruct a maximum likelihood tree. The proportion of times the orange or black tree was reconstructed is shown in the bar charts. Data simulated under the Nephrozoa and the Ctenophora-first trees always yield the correct topology regardless of the model. Data simulated under the Xenambulacraria and Porifera-first topologies mostly yield the correct topology under the site-heterogeneous model but an incorrect topology under the site-homogeneous model. The incorrect tree is always Nephrozoa or Ctenophora-first, respectively. (C) The experiments were repeated for the datasets based on the sets of genes best and worst at reconstructing known clades. For the best genes under both models, the inference is improved for data simulated under Xenambulacraria and Porifera-first topologies. A decrease in the performance of both models is observed using the worst data.

Fig. 5. Topology-dependent asymmetry of tree reconstruction analyses shown using bootstrap.

(A) A total of 100 datasets were simulated using a site-heterogeneous model for each of the topologies shown in the respective box plots (orange/gray boxes). (B) Site-heterogeneous (top) and site-homogeneous models (bottom) were used to reconstruct a maximum likelihood tree, with the bootstrap support measured. The bootstrap support values showing the support of either gray or orange topologies are shown in the bar charts. Data simulated under the Nephrozoa and the Ctenophora-first trees always yield the correct topology regardless of the model with 100% bootstrap support. Data simulated under the Xenambulacraria and Porifera-first topologies mostly yield the correct topology under the site-heterogeneous model but an incorrect topology under the site-homogeneous model. The incorrect tree is always Nephrozoa or Ctenophora-first, respectively.

This marked asymmetry of the effects of model misspecification on our ability to recover the two topologies implies that, using these data, we are highly unlikely to reconstruct the Porifera-first or Xenambulacraria trees in error if the alternative topologies are true. We show, however, that model misspecification leading to branch length underestimation is highly likely to result in a failure to reconstruct the Porifera-first or Xenambulacraria topologies correctly. We repeated these experiments constraining the deuterostomes to be monophyletic [a topology not supported by the Philippe data (9)] as well as using the Cannon data (5), which do support monophyletic deuterostomes. While Xenambulacraria is more easily recovered under these conditions (the alternative Nephrozoa is less easily recovered in error), we still find that for data simulated under the Nephrozoa topology with monophyletic deuterostomes, we never recover the Xenambulacraria tree in error (table S1).

For data simulated under a homogeneous model, all four topologies were always correctly reconstructed using either model. If the state frequencies were homogeneous across the alignment, we should expect no errors in phylogenetic inference regardless of the true topology (table S1).

Measuring the degree of asymmetry

To measure the asymmetry in the ease with which the Nephrozoa and Xenambulacraria topologies can be reconstructed, we made datasets composed of different proportions of data simulated under the two topologies. If there were no asymmetry, datasets composed of 50% from each simulation should support each topology ~50% of the time. We find instead that reconstructing trees based on balanced (50/50) datasets and using the site-heterogeneous models results in support for Nephrozoa 100% of the time. Only when we increase the proportion of data coming from simulations based on the Xenambulacraria tree to reach 90% Xenambulacraria versus 10% Nephrozoa does support for Xenambulacraria outweigh support for Nephrozoa (table S2). The bias favoring Nephrozoa is even stronger for datasets constraining the deuterostomes to be monophyletic (table S2).
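One way such mixed datasets could be assembled is sketched below. This is an illustration only: how the columns were chosen and ordered is not specified here, so the sketch assumes that columns are drawn without replacement from each simulated alignment and simply concatenated. The function name `mix_alignments` and the dictionary representation (taxon name mapped to sequence string) are hypothetical.

```python
import random


def mix_alignments(aln_a, aln_b, frac_a, n_sites, seed=0):
    """Build an alignment of n_sites columns, a fraction frac_a of which come
    from aln_a and the remainder from aln_b.

    aln_a, aln_b: dict mapping taxon name -> aligned sequence (same taxa).
    Columns are sampled without replacement (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n_a = round(frac_a * n_sites)
    n_b = n_sites - n_a
    len_a = len(next(iter(aln_a.values())))
    len_b = len(next(iter(aln_b.values())))
    cols_a = rng.sample(range(len_a), n_a)
    cols_b = rng.sample(range(len_b), n_b)
    mixed = {}
    for taxon in aln_a:
        part_a = "".join(aln_a[taxon][i] for i in cols_a)
        part_b = "".join(aln_b[taxon][j] for j in cols_b)
        mixed[taxon] = part_a + part_b
    return mixed


if __name__ == "__main__":
    # Toy stand-ins for alignments simulated under the two topologies.
    xenambulacraria = {"t1": "A" * 50, "t2": "A" * 50}
    nephrozoa = {"t1": "B" * 50, "t2": "B" * 50}
    mixed = mix_alignments(xenambulacraria, nephrozoa, frac_a=0.9, n_sites=10)
    print(mixed["t1"])
```

Repeating the tree inference over a range of `frac_a` values would then show at which proportion support switches from one topology to the other.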

Effect of long terminal branches on topology

Long terminal branches and unaccommodated heterogeneities are important conditions contributing to the existence of LBA. To investigate the impact of long branches in the Xenacoelomorpha case, we repeated our simulation experiment having first removed the long-branched Acoelomorpha, leaving only the relatively short-branched Xenoturbella. This is common practice for reducing LBA artifacts in empirical studies (18–20). All sampled ctenophores are found at the end of a long branch, meaning that this taxon trimming experiment could not be done in this case. Consistent with the importance of a contribution from extreme long branches to the LBA error, once this had been reduced by removing acoelomorphs, the Xenambulacraria topology was recovered for 52% of the simulated alignments even under model violation. Under the site-heterogeneous model, the correct topology was recovered more often (89%), although two of the simulated datasets yielded an alternative erroneous placement of Xenacoelomorpha, i.e., as sister to Protostomia and Chordata. In simulations using the Nephrozoa topology, removing the long-branched acoelomorphs had no effect on our ability to reconstruct the correct tree using either model. Overall, this experiment suggests that, if the Xenambulacraria tree is correct, then the observed long branches leading to the Acoelomorpha would contribute to artifactual support for the Nephrozoa tree, and this recapitulates findings with empirical data (table S1) (9).

Effect of short internal branches on topology

Other conditions can also exacerbate the phenomenon of LBA, which is expected to be more prevalent for trees with few informative changes along the internal branches separating the clades of interest (short internal branches). We tested this by examining the ability of subsets of genes containing shorter/longer internal branches to resist LBA artifacts.

Philippe et al. ranked their 1173 genes from those most able to reconstruct known clades ("best") to those least able ("worst") on the basis of the ability of each gene to reconstruct known clades as monophyletic (monophyly score). We made estimates of terminal and internal branch lengths from 30,000 alignment positions randomly drawn from the sets of 25% best and 25% worst genes and compared these to the estimates that we have described for 30,000 positions drawn randomly from the whole alignment. We show that the reduction in tree length in the best quarter of genes (9) comes from a reduction of the terminal branches but not of the internal branches. We find that the internal branches are longer when estimated using the best genes than when using the whole dataset or the worst genes (Fig. 6). The situation observed in the best genes describes the ideal situation if we wish to lessen LBA (12, 21, 22).

Fig. 6. Best genes have short terminal branches and longer internal branches.

(A) A tree showing the clades (names) and branches (letters) for which lengths were estimated for the Philippe data. (B) Estimates of clade and branch lengths for empirical data using a site-heterogeneous model for three data samples: best genes (green, highest monophyly scores), all dataset (gray), and worst genes (black, lowest monophyly scores). Best genes have shorter terminal branches within clades than all or worst. Best genes have longer branches separating clades than all or worst. (C) A tree showing the clades (names) and branches (letters) for which lengths were estimated for the Simion data. (D) Estimates of clade and branch lengths for the Simion-best, Simion-all, and Simion-worst genes. Best genes have shorter terminal branches within clades than all or worst. For the best genes, most internal branches are the same or longer than for all or worst genes with the exception of the internal branch leading to the Ctenophora clade.

To test the effects of this gene selection on our ability to infer the correct topology, we simulated site-heterogeneous data under the Xenambulacraria and Porifera-first trees using parameters derived from the worst and best genes. Trees reconstructed from the worst data using the site-heterogeneous C60 + LG + G model show a clear reduction in support for the correct tree compared to the full dataset (Xenambulacraria: all genes, 95% correct and worst, 25% correct; Porifera-first: all genes, 88% correct and worst, 65% correct). For these worst data, we also occasionally recovered Xenacoelomorpha as sister to Protostomia and Chordata (in 2% of the simulated datasets consisting of all taxa and in 5% of the simulated datasets without acoels). For datasets simulated using parameters estimated from the best genes, the correct tree is reconstructed 100% of the time (Xenambulacraria) and 89% of the time (Porifera-first). However, even under the site-homogeneous model, the correct tree is reconstructed 26% of the time (Xenambulacraria) and 7% of the time (Porifera-first). When long-branched Acoelomorpha are removed, the correct Xenambulacraria topology is reconstructed 92% of the time for simulated data based on the best genes even with a site-homogeneous model.

In contrast to these results, using data simulated according to the putative LBA topologies (Nephrozoa and Ctenophora-first), the correct topology is always recovered regardless of the dataset or model used (Fig. 4). The Ctenophora-first/Nephrozoa topologies are trivial to reconstruct correctly even with poor data or under model violation.

Accuracy correlates with the complexity of site-heterogeneous models

To test the effects of models of intermediate complexity, we repeated our analyses using a second site-heterogeneous model with fewer site frequency profiles (C10 + LG + G) on data simulated using a site-heterogeneous model under the Xenacoelomorpha hypothesis. We found that, using this model that has a complexity (and fit) intermediate between the site-homogeneous models and the relatively complex C60 model, we recovered intermediate results for data simulated under the Porifera-first and Xenambulacraria hypotheses for all datasets (best, all, and worst). Trees were reconstructed correctly more often than when using site-homogeneous models but less often than with C60 (table S1). For the data simulated under the alternative Ctenophora-first and Nephrozoa hypotheses, the correct topology was recovered in all cases.

Building on this, we repeated the experiment for a subset of datasets using the more complex CAT + LG + G model in the PhyloBayes software. For all datasets simulated under the Nephrozoa (10 datasets) or Ctenophora-first topologies (10 datasets), we reconstructed the correct tree. For datasets simulated under the Xenambulacraria or Porifera-first topologies, we considered 10 datasets that C60 correctly reconstructed and 10 for which C60 incorrectly reconstructed the Nephrozoa and Ctenophora-first trees, respectively. For the former, the CAT + LG + G model also always reconstructed the correct tree with posterior probability (pp) = 1.0. For the latter, the CAT + LG + G model succeeded where the C60 model had failed, albeit with pp < 1 in some replicates, and in two datasets, when using CAT, the Xenacoelomorpha were recovered as sister to Protostomes and Chordates. Empirical data may suffer from additional heterogeneous processes (23, 24) not captured by the CAT model and therefore also absent in our simulations. Failing to adequately model any such additional heterogeneity can have a similar effect to failing to model site frequency heterogeneity. It should be expected, therefore, that in empirical studies, even the CAT model may fail to overcome LBA artifacts. These experiments suggest, nevertheless, that better fitting models are better able to overcome these LBA artifacts.

Last, we wanted to examine concerns (25) that the support for the Porifera-first and Xenambulacraria trees that has been observed when analyzing empirical data using the CAT-F81 model may be the result of an error stemming from the radical assumptions about amino acid exchangeabilities in CAT-F81. We considered 10 datasets simulated using a site-heterogeneous model under each of the Ctenophora-first and Nephrozoa topologies and reanalyzed these using the CAT-F81 model. We find that, with these data, the use of the CAT-F81 model always gave the correct tree, never resulting in incorrect support for Porifera-first or for Xenambulacraria.

DISCUSSION

There have already been many analyses attempting to resolve the positions of the Ctenophora and Xenacoelomorpha (1–9, 26, 27). These have used increasingly large datasets and various complex analyses and data filtering schemes. In each of the two cases, there are two entrenched camps and little apparent progress. Implicit in the recent arguments in the scientific literature is the idea that there is a fine balance between the two proposed positions (28). One implication of this is that it will take even more data or more sophisticated analysis to tease out some elusive signal and to nudge opinion in one direction or another. What we find, in contrast, is that there is a major and so far underappreciated asymmetry between the two possible solutions. They are not equivalent, and our results show that the resulting change to sensible prior expectations should have a major influence on interpretation of published results.

Supporters of the Porifera-first and Xenambulacraria topologies have long suggested that the alternatives, Ctenophora-first and Nephrozoa, result from an LBA artifact. The long branches leading to the Xenacoelomorpha and Ctenophora and the short branches relating them to other phyla suggested that their placement might be especially susceptible to systematic errors. Our simulations, using realistic parameters drawn from empirical data, show just how important this artifact is.

Using realistic simulations under the Ctenophora-first and Nephrozoa topologies, we show that we never recover the alternatives (Porifera-first and Xenambulacraria) in error. We show that Ctenophora-first and Nephrozoa trees, if they were true, would gain exaggerated artifactual support due to the effects of LBA. This effect is well known as the "Farris zone" (29) or "inverse Felsenstein zone" (30): the artificial reinforcement of a close relationship between two long branches by LBA. If the Ctenophora-first and Nephrozoa trees were true, then there is a very low likelihood that the support that empirical studies have shown for the alternatives would ever be observed (14).

In contrast, the Porifera-first and Xenambulacraria trees are highly susceptible to LBA effects. We have shown that the Xenambulacraria and Porifera-first trees are both strongly affected by the long terminal branches leading to the corresponding phyla, by short internal branches, and by unaccounted-for site heterogeneity, making these trees hard to recover. Under conditions that emphasize LBA, we very frequently recover the wrong topologies. The wrong topologies that we observe are the Ctenophora-first and Nephrozoa trees in almost every case. More generally, our simulations based on Porifera-first and Xenambulacraria topologies accurately predict the effects of long branches and inadequate models that have been observed using real data.

Our findings suggest that the Ctenophora-first and Nephrozoa trees are plausibly interpreted as artifacts. Support seen in empirical studies for the alternative Porifera-first and Xenambulacraria trees is unlikely to be an artifact, and the implication is that these trees are likely to be correct. This would suggest that the most plausible sister group of all other animals is the Porifera and not the Ctenophora and that the Xenacoelomorpha is likely to be the sister group of the Ambulacraria and not a branch intermediate between Cnidaria and the rest of the Bilateria.

MATERIALS AND METHODS

Experimental design

To test the effects of model misspecification on our ability to reconstruct the position of the Xenacoelomorpha and Ctenophora correctly, we used empirical and simulated data. We first used empirical data to assess the effects of site-heterogeneous and site-homogeneous substitution models on branch length estimation.

We then simulated data under the two models and the conflicting topologies based on parameters learned from the empirical data. On the basis of the simulated data, we evaluated the performance of site-homogeneous and site-heterogeneous models in branch length estimation for both site-homogeneous and site-heterogeneous data.

We used the simulated data to estimate tree topologies using different models to see the effect of model misspecification on our ability to reconstruct the correct tree. We used different data samples (removing certain species or genes) to see the effects of this on our ability to reconstruct the correct tree.

Data

For the majority of the analyses, we used two recently published phylogenomic datasets: (i) one focusing on the placement of Ctenophora ["Simion" (8)], which comprises 97 taxa of which 72 are metazoans and 25 are nonmetazoans, and (ii) a dataset aimed at resolving the placement of Xenacoelomorpha ["Philippe" (9)] consisting of 59 taxa of which 45 are bilaterians and 14 are outgroups.

Both datasets had been filtered for potential contaminants, paralogs, and other outlier sequences (more details are provided in the original papers). Following this filtering, the two datasets consist of 401,632 (Simion) and 353,607 (Philippe) amino acid positions. In both cases, these datasets are impractically large for repeated phylogenetic inference, particularly in a Bayesian inference framework. To ease the computational burden, we randomly selected, from each complete dataset, a subset of 30,000 amino acid positions for the downstream analyses. We refer to the two resulting datasets as Simion-all and Philippe-all.
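The random subsampling of alignment positions can be sketched as follows. This is a minimal illustration rather than the code used for the analyses: it assumes sampling without replacement, keeps the sampled columns in their original order, and represents an alignment as a dictionary from taxon name to sequence string. The function name `subsample_columns` is hypothetical.

```python
import random


def subsample_columns(alignment, n_sites, seed=0):
    """Return a new alignment containing n_sites randomly chosen columns.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    The same columns are taken from every taxon so homology is preserved.
    """
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    if n_sites > length:
        raise ValueError("cannot sample more columns than the alignment has")
    cols = sorted(rng.sample(range(length), n_sites))
    return {taxon: "".join(seq[i] for i in cols)
            for taxon, seq in alignment.items()}


if __name__ == "__main__":
    toy = {
        "Xenoturbella": "MKTAYIAKQRQISFVKSHFSRQ",
        "Homo": "MKTAYLAKQRHISFVKAHFSRQ",
    }
    print(subsample_columns(toy, n_sites=8))
```

For the datasets described above, `n_sites` would be 30,000.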

To study the effect of different gene sets on the placement of Xenacoelomorpha and Ctenophora, we also created two additional subsets of 30,000 randomly selected positions per dataset. For both sets of genes (Simion and Philippe), we scored each individual gene on the basis of its ability to reconstruct uncontested clades [sensu (9)]. We then ranked the genes on the basis of this monophyly score and concatenated genes according to their rank from best to worst. For the Simion et al. data, we considered as uncontested clades the Homoscleromorpha, Calcarea, Hexactinellida, Demospongiae, Ctenophora, Bilateria, Medusozoa, Anthozoa, and Metazoa. For the Philippe et al. data, we assumed the same groups as in the original paper (9). After concatenating the genes from best scoring to worst scoring, we randomly sampled 30,000 alignment sites from the first quarter of the alignment, i.e., the highest scoring ("Simion-best" and "Philippe-best"), and 30,000 from the final quarter of the alignment, i.e., the lowest scoring ("Simion-worst" and "Philippe-worst"). Last, we examined the placement of Xenacoelomorpha with respect to two more factors: (i) the exclusion of the fast-evolving Acoelomorpha and (ii) the monophyly of Deuterostomia that represents the traditional view of the relationships relating the Chordata and Ambulacraria.
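The ranking and quarter-based sampling described above can be sketched as follows. The sketch assumes that a higher monophyly score means a better gene, that each gene is summarized as a hypothetical `(name, score, n_columns)` tuple, and that the "first" and "final" quarters are quarters of the concatenated columns. It returns column indices in the rank-ordered concatenation rather than sequences, and it does not compute the monophyly scores themselves.

```python
import random


def sample_best_and_worst(genes, n_sites, seed=0):
    """Rank genes by score, concatenate them best to worst, and sample
    n_sites column indices from the first and from the last quarter.

    genes: list of (name, monophyly_score, n_columns) tuples (assumed layout).
    Returns (ranked_names, best_columns, worst_columns), where the column
    indices refer to the concatenated, rank-ordered alignment.
    """
    rng = random.Random(seed)
    ranked = sorted(genes, key=lambda g: g[1], reverse=True)
    total = sum(g[2] for g in ranked)
    quarter = total // 4
    best_cols = sorted(rng.sample(range(0, quarter), n_sites))
    worst_cols = sorted(rng.sample(range(total - quarter, total), n_sites))
    return [g[0] for g in ranked], best_cols, worst_cols


if __name__ == "__main__":
    toy_genes = [("g1", 0.2, 40), ("g2", 0.9, 40),
                 ("g3", 0.5, 40), ("g4", 0.7, 40)]
    order, best, worst = sample_best_and_worst(toy_genes, n_sites=10)
    print(order)
    print(best)
    print(worst)
```

With real data, `n_sites` would be 30,000 and the returned indices would be used to extract the "best" and "worst" sub-alignments.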

For the first case, we removed all acoelomorph taxa from the Philippe data and kept only the slowly evolving xenoturbellids. After removing these taxa, we performed the same subsampling as before, i.e., we randomly sampled 30,000 amino acids from the entire alignment ("Philippe-no_acoels-all"), the best-scoring genes ("Philippe-no_acoels-best"), and the worst-scoring genes ("Philippe-no_acoels-worst").

Overall, we considered eight topologies summarized in fig. S1. Two of these reflect the conflicting placement of Ctenophora (topologies A and B in fig. S1). For both of these trees, the rest of the phylogeny was the same as the one published in (8) (their "tree_97sp_CAT.tre"). For the placement of Xenacoelomorpha, there are six alternative topologies (C to H in fig. S1), for which the phylum is placed as sister either to Nephrozoa or to Ambulacraria. In four of them, we assumed that deuterostomes are paraphyletic, either with all the Xenacoelomorpha included (C and D in fig. S1) or with only Xenoturbellida (E and F in fig. S1), while for two, deuterostomes were assumed monophyletic (G and H in fig. S1). For all six scenarios, the rest of the tree followed the topology published in (9) (figure 1 in the original article). For the last two hypotheses (G and H in fig. S1), we also examined the additional dataset of Cannon et al. (Cannon), for which the phylogeny was based on the topology from (5) (figure 2 in the original report). As before, we randomly selected 30,000 sites from the original Cannon alignment.

Branch length estimation: Empirical data

Our first goal was to test whether the site-frequency-heterogeneous (CAT + LG + G) and site-frequency-homogeneous (LG + G) models yield different branch length estimates for either of the two main empirical datasets (i.e., Simion-all and Philippe-all). To achieve this, we used PhyloBayes-MPI version 1.8 (15) to estimate a posterior sample of the branch lengths for the two datasets under the LG + G and the CAT + LG + G models (other priors were kept to default values). Each of the four combinations (i.e., Simion-all with LG + G, Simion-all with CAT + LG + G, Philippe-all with LG + G, and Philippe-all with CAT + LG + G) was run twice for 10,000 Markov chain Monte Carlo (MCMC) cycles with a sampling frequency of 1. At 10,000 MCMC cycles, the two runs were assessed for signs of convergence [i.e., effective sample size (ESS) > 100 for each of the runs and for the combined pairs]. The runs that had ESS values lower than 100 were run for 10,000 or 20,000 additional MCMC cycles (table S1).
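To make the ESS criterion concrete, the following Python sketch estimates the effective sample size of a single parameter trace from its autocorrelations and applies the ESS > 100 rule to two runs and to their pooled samples. This is a generic textbook estimator, not the PhyloBayes implementation; the function names, the truncation of the autocorrelation sum at the first non-positive lag, and the pooling of the two runs by simple concatenation are assumptions.

```python
import random


def effective_sample_size(trace):
    """ESS = N / (1 + 2 * sum of autocorrelations at positive lags).

    The sum is truncated at the first non-positive autocorrelation,
    a simple and common heuristic (assumed here).
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def converged(run1, run2, threshold=100):
    """Apply the ESS > threshold rule to each run and to the pooled pair."""
    return all(
        effective_sample_size(t) > threshold for t in (run1, run2, run1 + run2)
    )


# Synthetic traces: independent draws versus a strongly autocorrelated chain.
rng = random.Random(42)
iid_a = [rng.gauss(0, 1) for _ in range(2000)]
iid_b = [rng.gauss(0, 1) for _ in range(2000)]
ar = [0.0]
for _ in range(1999):
    ar.append(0.9 * ar[-1] + rng.gauss(0, 1))

ess_iid = effective_sample_size(iid_a)
ess_ar = effective_sample_size(ar)
```

An autocorrelated chain carries far fewer effectively independent samples than its length suggests, which is why runs falling below the threshold were extended.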

We also examined the branch length estimates for the different gene sets (i.e., Philippe-best, Philippe-worst, Simion-best, and Simion-worst) under the site-heterogeneous model using the same procedure. In all cases, we used the last 5000 posterior samples of the relevant runs and measured two sets of quantities in the two phylogenies: the lengths of specific internal branches and the average branch lengths of major clades. We collected the branch length values using a custom Python script available at https://github.com/MaxTelford/XenoCtenoSims. The distributions of all branch length estimates are provided in the form of box plots in Figs. 2 and 4.

Internal branches

For the Philippe data, the internal branches measured were those leading to each of the Xenacoelomorpha, Ambulacraria, Xenambulacraria, Chordata, Protostomia, Chordata + Protostomia, Porifera, Bilateria + Cnidaria, Cnidaria, and Bilateria. For the Simion data, internal branches were those leading to each of Cnidaria, Bilateria, Cnidaria + Bilateria, Cnidaria + Bilateria + Placozoa, Ctenophora, Cnidaria + Bilateria + Placozoa + Ctenophora, Porifera, and Metazoa.

Average lengths of the major clades

For the Philippe data, we calculated the average distance from each species to the common ancestor of all species of that clade within the following clades: Xenacoelomorpha, Ambulacraria, Chordata, Protostomia, Porifera, Placozoa, and Cnidaria. For the Simion data, we measured the average branch lengths within Bilateria, Cnidaria, Placozoa, Ctenophora, Porifera, and non-Metazoa.
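The clade-average measure described above (mean distance from each species to the common ancestor of its clade) can be sketched in Python as below. This is not the authors' script from the repository; the minimal Newick parser (names and branch lengths only, no quoted labels or comments) and all function names are assumptions made for illustration.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts (hypothetical helper)."""
    text = text.strip()
    if not text.endswith(";"):
        text += ";"
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(node())
            while text[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # skip ")"
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        length = 0.0
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    return node()


def leaves_with_depth(node, depth=0.0):
    """Yield (leaf name, distance from the starting node) pairs."""
    if not node["children"]:
        yield node["name"], depth
    for child in node["children"]:
        yield from leaves_with_depth(child, depth + child["length"])


def find_mrca(node, targets):
    """Return the smallest subtree that contains every target leaf."""
    for child in node["children"]:
        if targets <= {name for name, _ in leaves_with_depth(child)}:
            return find_mrca(child, targets)
    return node


def mean_tip_distance(tree, clade_taxa):
    """Average distance from each listed species to their common ancestor."""
    clade_taxa = set(clade_taxa)
    mrca = find_mrca(tree, clade_taxa)
    dists = [d for name, d in leaves_with_depth(mrca) if name in clade_taxa]
    return sum(dists) / len(dists)


# Toy tree with made-up branch lengths.
toy_tree = parse_newick("((A:1.0,B:3.0):0.5,(C:2.0,D:2.0):1.0,E:4.0);")
avg_ab = mean_tip_distance(toy_tree, {"A", "B"})
avg_abc = mean_tip_distance(toy_tree, {"A", "B", "C"})
```

Applied to each of the posterior trees in turn, a function of this kind would give one value per sample and clade, which is the sort of distribution summarized in the box plots.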

Branch length estimation: Simulated data

The site-homogeneous and site-heterogeneous models yielded different branch length estimates, and to find which was producing the discrepancy, we performed a test using simulated data. We used two simulated datasets (see the "Simulations" section for details) for which the true tree topology was Xenambulacraria (topology A in fig. S1) and the parameters and branch lengths were estimated on the basis of the Philippe-all dataset. The two datasets differed only in the substitution model used: for one of them, we assumed a site-frequency-homogeneous substitution process ("Sim-LG + G"; i.e., the LG + G model), and for the other, we assumed a heterogeneous process ("Sim-CAT + LG + G").

For each of our simulated datasets, we next estimated the same branch lengths as before using both heterogeneous and homogeneous models with PhyloBayes, following the same procedure as for the empirical data, i.e., we performed four runs: (i) data, Sim-LG + G and model for inference, LG + G; (ii) data, Sim-LG + G and model for inference, CAT + LG + G; (iii) data, Sim-CAT + LG + G and model for inference, LG + G; and (iv) data, Sim-CAT + LG + G and model for inference, CAT + LG + G.

Model fitting

To determine whether the site-homogeneous (LG + G) or site-heterogeneous (CAT + LG + G) model was a better fit to the Simion-all and Philippe-all datasets, we compared the models using cross-validation (31) as implemented in PhyloBayes-MPI. The test was performed in five steps according to the instruction manual: (i) The original 30,000-amino acid alignment for each of the datasets was randomly subsampled to create two subsets, i.e., the training dataset (10,000 sites) and the test dataset (2000 sites). (ii) The parameters of one of the competing models were estimated on the basis of the training dataset by performing 5000 MCMC steps. (iii) Using the estimated model parameters and a given topology (the Porifera-first and the Xenambulacraria, respectively), we calculated the likelihood for the test dataset using the "readpb_mpi -cv" option available in PhyloBayes-MPI. (iv) Using the same training and test datasets, the process was repeated for the other model. (v) The model yielding the highest likelihood was considered the best fitting. We repeated the test 10 times for each dataset and pair of models, and for all repetitions, the site-heterogeneous model was found to be better than the site-homogeneous model (ΔlogL = 3036.14 ± 96 for Philippe and ΔlogL = 3956.02 ± 199 for Simion).
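The bookkeeping around the cross-validation (drawing training and test sites, then summarizing the log-likelihood difference across replicates) can be sketched in Python. The likelihood calculations themselves are done by PhyloBayes and are not reproduced here. The function names are hypothetical, the training and test sites are assumed to be disjoint, the ± value is assumed to be a standard deviation, and the log-likelihood numbers in the example are invented placeholders.

```python
import random
import statistics


def split_sites(n_total, n_train, n_test, seed):
    """Pick disjoint training and test column indices at random (assumed disjoint)."""
    rng = random.Random(seed)
    picked = rng.sample(range(n_total), n_train + n_test)
    return sorted(picked[:n_train]), sorted(picked[n_train:])


def summarize_delta(loglik_het, loglik_hom):
    """Mean and standard deviation of per-replicate log-likelihood differences."""
    deltas = [het - hom for het, hom in zip(loglik_het, loglik_hom)]
    return statistics.mean(deltas), statistics.stdev(deltas)


# One replicate of the site split, using the sizes given in the text.
train, test = split_sites(30_000, 10_000, 2_000, seed=1)

# Placeholder test-set log-likelihoods for three replicates (not real results).
het = [-100.0, -100.0, -100.0]
hom = [-110.0, -112.0, -114.0]
mean_delta, sd_delta = summarize_delta(het, hom)
```

A positive mean difference, consistent across replicates, is what identifies the site-heterogeneous model as the better-fitting one.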

Simulations

Our next goal was to assess whether the differences in branch length estimates from different models and datasets could result in topological differences. The empirical data alone make this difficult to test, as we do not know the true phylogeny. Instead, we simulated data using parameters that match those measured from empirical sequences using the different topological hypotheses and models. All topological hypotheses (A to H) used for the simulations are provided in fig. S1. The simulations were performed using PhyloBayes in two steps:

1) Initially, we estimated the posteriors of branch lengths and model parameters as described above, assuming the following combinations of fixed topology, substitution model, and alignment: (i) topology, A; data, Simion-all; and model, CAT + LG + G; (ii) topology, A; data, Simion-best; and model, CAT + LG + G; (iii) topology, A; data, Simion-worst; and model, CAT + LG + G; (iv) topology, B; data, Simion-all; and model, CAT + LG + G; (v) topology, B; data, Simion-best; and model, CAT + LG + G; (vi) topology, B; data, Simion-worst; and model, CAT + LG + G; (vii) topology, A; data, Simion-all; and model, LG + G; (viii) topology, B; data, Simion-all; and model, LG + G; (ix) topology, C; data, Philippe-all; and model, CAT + LG + G; (x) topology, C; data, Philippe-best; and model, CAT + LG + G; (xi) topology, C; data, Philippe-worst; and model, CAT + LG + G; (xii) topology, D; data, Philippe-all; and model, CAT + LG + G; (xiii) topology, D; data, Philippe-best; and model, CAT + LG + G; (xiv) topology, D; data, Philippe-worst; and model, CAT + LG + G; (xv) topology, C; data, Philippe-all; and model, LG + G; (xvi) topology, D; data, Philippe-all; and model, LG + G; (xvii) topology, E; data, Philippe-all (no acoels); and model, CAT + LG + G; (xviii) topology, E; data, Philippe-best (no acoels); and model, CAT + LG + G; (xix) topology, E; data, Philippe-worst (no acoels); and model, CAT + LG + G; (xx) topology, F; data, Philippe-all (no acoels); and model, CAT + LG + G; (xxi) topology, G; data, Philippe-all; and model, CAT + LG + G; (xxii) topology, H; data, Philippe-all; and model, CAT + LG + G; (xxiii) topology, G; data, Cannon-all; and model, CAT + LG + G; and (xxiv) topology, H; data, Cannon-all; and model, CAT + LG + G. The total number of MCMC cycles required for each combination to reach convergence is provided in table S1.

2) Using the last 5000 posterior samples, we subsampled with a frequency of one in 50, which gave us a subset of 100 posterior samples. Using these combinations of branch lengths and model parameters, we simulated data with the "readpb_mpi" tool under the "ppred" option.
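The thinning in step 2 can be checked with a few lines of Python. Reducing the last 5000 posterior samples to 100 implies keeping one sample in every 50, and 100 samples for each of the 24 combinations listed in step 1 gives the 2400 simulated datasets mentioned later in the text. The function name and the use of integers as stand-ins for saved MCMC cycles are assumptions of this sketch.

```python
def thin_posterior(samples, keep_last=5000, step=50):
    """Keep the last `keep_last` samples and then every `step`-th one."""
    tail = samples[-keep_last:]
    return tail[::step]


# Integers stand in for 10,000 saved MCMC cycles of one run.
chain = list(range(10_000))
draws = thin_posterior(chain)

n_combinations = 24  # combinations (i) to (xxiv) in step 1
n_simulated = n_combinations * len(draws)
```

Each retained sample supplies one set of branch lengths and model parameters, so each yields one simulated alignment.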

For each of the simulated datasets, we inferred the phylogenetic relationships using both a site-frequency-homogeneous and a heterogeneous model (see below) to determine whether using an approximately correct versus a misspecified model results in the recovery of the correct or the alternative topology in each of the simulated scenarios. Given the large number of simulated datasets (i.e., 2400 in total), it would be challenging to perform all the intended phylogenetic inferences (i.e., 12,000) in a Bayesian context, particularly under the CAT model. As a more practical alternative, we chose a maximum likelihood approach and an approximation of the site-heterogeneous model. We used IQ-TREE version 1.6.11 (16) to infer the phylogeny under LG + G (with empirical state frequencies) as the site-frequency-homogeneous model, while for the heterogeneous model, we used the C60 + LG + G + F [Le et al. (17)]. The C60 model has 60 categories of sites as opposed to the (potentially) infinite number of categories of the CAT model; however, this simplified model constitutes a good approximation to a full site-heterogeneous model (17, 32) and one that is fast enough (33) to process hundreds of simulation replicates. We used the posterior mean site frequency (pmsf) approximation (33), as suggested by the IQ-TREE manual, which requires an input tree for the calculation of the weights for each of the frequency vectors, for which we used the phylogeny estimated by the LG + G model.
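The two-stage inference described above (an LG + G tree first, then a PMSF analysis that uses that tree as its guide) could be scripted along the following lines. The sketch only assembles command lines for IQ-TREE 1.6 and does not run them. The exact commands the authors used are not given in the text, so the binary name `iqtree`, the file names, the prefixes, and the function name are assumptions; the options `-s`, `-m`, `-ft`, and `-pre` are standard IQ-TREE 1.x options.

```python
def iqtree_commands(alignment, prefix):
    """Build IQ-TREE 1.6-style command lines for one simulated alignment.

    Returns two argument lists: the site-homogeneous run, and the PMSF run
    that reuses the first run's tree as the guide tree (passed via -ft).
    """
    # IQ-TREE writes its best tree to <prefix>.treefile.
    guide_tree = f"{prefix}.lg.treefile"
    homogeneous = [
        "iqtree", "-s", alignment,
        "-m", "LG+F+G",  # LG with empirical frequencies and gamma rates
        "-pre", f"{prefix}.lg",
    ]
    heterogeneous = [
        "iqtree", "-s", alignment,
        "-m", "LG+C60+F+G",  # C60 profile mixture
        "-ft", guide_tree,   # guide tree for the PMSF approximation
        "-pre", f"{prefix}.c60_pmsf",
    ]
    return [homogeneous, heterogeneous]


# Hypothetical file name for one simulated replicate.
commands = iqtree_commands("sim_001.phy", "sim_001")
```

The second command can only start after the first has finished, because it reads the tree file the first one produces.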

For a subset of the simulated data, we performed bootstrap analyses to evaluate the strength of support for the inferred topology under both the site-homogeneous and site-heterogeneous model. Specifically, we performed the analyses for the datasets simulated under the Porifera-first and Ctenophora-first hypotheses with the Simion-all dataset as well as under the Xenambulacraria and Nephrozoa hypotheses with the Philippe-all dataset. For each dataset, we performed 1000 ultrafast bootstrap replicates with IQ-TREE (34). We did two bootstrap runs: once under the site-homogeneous model and once under the site-heterogeneous model. The analyses were performed with the "-wbt" option that stores the bootstrap trees. Afterward, using the resulting bootstrap files, we calculated the frequency of occurrence of the conflicting splits for each dataset, i.e., in the Simion-all datasets, we searched for the splits "Ctenophora, Outgroup | Porifera, Remaining Metazoa" and "Porifera, Outgroup | Ctenophora, Remaining Metazoa," and in the Philippe-all datasets, we searched for the "Xenacoelomorpha, Outgroup | Ambulacraria, Remaining Bilateria" and the "Outgroup, Remaining Bilateria | Xenacoelomorpha, Ambulacraria" splits.
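Counting how often a given split occurs among bootstrap trees can be sketched in Python as follows. This is an illustrative reimplementation, not the authors' parsing script: the function names are hypothetical, the Newick handling is minimal (no quoted labels), and the five-taxon toy trees use one placeholder name per group, whereas in practice each side of a split would be the full set of species names in that group.

```python
def clade_leafsets(newick):
    """Return the leaf set of every parenthesised group in a Newick string."""
    stack, clades = [], []
    token, skip = "", False  # skip=True while reading a length or internal label
    for ch in newick.strip():
        if ch in "(),:;":
            if token and not skip:  # the token just read was a leaf name
                for group in stack:
                    group.add(token)
            token = ""
            # Text following ")" or ":" is a support value or a length.
            skip = ch in "):"
            if ch == "(":
                stack.append(set())
            elif ch == ")":
                clades.append(frozenset(stack.pop()))
        else:
            token += ch
    return clades


def has_split(newick, side):
    """True if the unrooted tree contains the split side | all other taxa."""
    clades = clade_leafsets(newick)
    all_taxa = max(clades, key=len)  # the outermost group holds every leaf
    side = frozenset(side)
    other = all_taxa - side
    return any(c == side or c == other for c in clades)


def split_frequency(trees, side):
    """Fraction of trees that contain the given split."""
    return sum(has_split(t, side) for t in trees) / len(trees)


# Toy bootstrap sample: three Ctenophora-first trees and one Porifera-first tree.
cten_first = "(Out,Cten,(Pori,(Cnid,Bilat)));"
pori_first = "(Out,Pori,(Cten:0.1,(Cnid,Bilat)95:0.2));"
boot = [cten_first, cten_first, pori_first, cten_first]
freq_cten = split_frequency(boot, {"Cten", "Out"})
freq_pori = split_frequency(boot, {"Pori", "Out"})
```

Checking both a clade and its complement is what makes the test independent of where the tree happens to be rooted in the file.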

Last, we performed three further tests. First, we tested whether the computationally faster site-heterogeneous model "C10" (with 10 distinct categories of sites) (17) would perform similarly to the more complex "C60" (with 60 distinct categories of sites). We used the data simulated under site-heterogeneous models and (i) the Porifera-first and the Ctenophora-first hypotheses (topologies A and B; fig. S1) with the Simion-best, Simion-all, and Simion-worst and (ii) the Xenambulacraria and Nephrozoa hypotheses (topologies C and D; fig. S1) with the Philippe-best, Philippe-all, and Philippe-worst. Second, to assess whether PhyloBayes under the CAT model produces similar results, we used the CAT-LG model to infer the topology of 20 datasets simulated under the Porifera-first, 10 under the Ctenophora-first, 20 under the Xenambulacraria, and 10 under the Nephrozoa topologies. For the Porifera-first and the Xenambulacraria datasets, we selected two sets of simulated data, 10 that recovered the true topology under the C60 model and 10 that recovered the wrong topology (i.e., the Ctenophora-first and the Nephrozoa). Third, we used the CAT-F81 model to infer the phylogeny of 10 datasets simulated under the Ctenophora-first hypothesis and 10 datasets simulated under the Nephrozoa hypothesis. These tests would help us assess whether the simplistic but commonly used Poisson model could cause the erroneous recovery of the Xenambulacraria or Porifera-first topologies. All the datasets used for the CAT inferences (either CAT-LG or CAT-F81) were simulated using parameters learned from the worst data, which were the most challenging ones. The resulting topologies for each set of simulations were summarized into a consensus tree using RAxML (35) and are provided in https://github.com/MaxTelford/XenoCtenoSims.

Composite dataset analyses

To estimate the strength of the bias toward reconstructing the Nephrozoa versus Xenambulacraria topologies, we synthesized datasets with increasing proportions of positions derived from simulations based on the Xenambulacraria tree compared to Nephrozoa. We created 20 pairs of simulated datasets; in each pair, one dataset was simulated under the Xenambulacraria hypothesis and the other under the Nephrozoa. For each pair, we combined the two datasets into a composite alignment with nine different proportions: 10% Xenambulacraria–90% Nephrozoa, 20% Xenambulacraria–80% Nephrozoa, 30% Xenambulacraria–70% Nephrozoa, 40% Xenambulacraria–60% Nephrozoa, 50% Xenambulacraria–50% Nephrozoa, 60% Xenambulacraria–40% Nephrozoa, 70% Xenambulacraria–30% Nephrozoa, 80% Xenambulacraria–20% Nephrozoa, and 90% Xenambulacraria–10% Nephrozoa.

We followed this procedure for two cases: once assuming deuterostomes to be monophyletic and once assuming deuterostomes to be paraphyletic. Overall, this gave us 180 composite alignments, and for each of them, we inferred the phylogenetic relationships using IQ-TREE under the site-heterogeneous model C60 + LG + G + F (table S2). The inference was performed with the pmsf approximation as described before.
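The assembly of composite alignments can be illustrated with a short Python sketch. How the positions were chosen from each simulated dataset is not stated in the text, so taking the leading columns from the Xenambulacraria simulation and the remaining columns from the Nephrozoa simulation is an assumption, as are the function names and the toy sequences.

```python
def composite(aln_xen, aln_neph, pct_xen):
    """Combine two equal-length alignments column-wise.

    The first pct_xen percent of columns come from aln_xen and the rest
    from aln_neph (the choice of which columns to take is an assumption).
    """
    taxa = sorted(aln_xen)
    n_sites = len(aln_xen[taxa[0]])
    cut = n_sites * pct_xen // 100
    return {t: aln_xen[t][:cut] + aln_neph[t][cut:] for t in taxa}


def all_composites(aln_xen, aln_neph):
    """Build the nine mixtures from 10% to 90% in steps of 10%."""
    return {pct: composite(aln_xen, aln_neph, pct) for pct in range(10, 100, 10)}


# Toy pair: "A" marks Xenambulacraria-simulated sites, "G" Nephrozoa-simulated ones.
toy_xen = {"sp1": "A" * 10, "sp2": "A" * 10}
toy_neph = {"sp1": "G" * 10, "sp2": "G" * 10}
mixes = all_composites(toy_xen, toy_neph)

n_pairs = 20
n_alignments = n_pairs * len(mixes)
```

Twenty pairs with nine mixtures each account for the 180 composite alignments reported.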

Acknowledgments

We are grateful to Tomáš Flouri and Ziheng Yang for discussions and suggestions for the improvement of the manuscript and to Nicolas Lartillot for very helpful feedback and advice. Funding: This work was funded by BBSRC grant BB/R016240/1. Author contributions: Initial concept: M.J.T. and P.K. Analyses: P.K. Initial draft of manuscript: M.J.T. Figures: P.K. Final draft of manuscript: M.J.T. and P.K. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. The simulation results and a Python script that was used for parsing the Newick trees are available in the GitHub repository: https://github.com/MaxTelford/XenoCtenoSims. Additional data related to this paper may be requested from the authors.

REFERENCES AND NOTES

1. Dunn C. W., Hejnol A., Matus D. Q., Pang K., Browne W. E., Smith S. A., Seaver E., Rouse G. W., Obst M., Edgecombe G. D., Sørensen M. V., Haddock S. H. D., Schmidt-Rhaesa A., Okusu A., Kristensen R. M., Wheeler W. C., Martindale M. Q., Giribet G., Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452, 745–749 (2008).

2. Hejnol A., Obst M., Stamatakis A., Ott M., Rouse G. W., Edgecombe G. D., Martinez P., Baguñà J., Bailly X., Jondelius U., Wiens M., Müller W. E. G., Seaver E., Wheeler W. C., Martindale M. Q., Giribet G., Dunn C. W., Assessing the root of bilaterian animals with scalable phylogenomic methods. Proc. R. Soc. B Biol. Sci. 276, 4261–4270 (2009).

3. Whelan N. V., Kocot K. M., Moroz L. L., Halanych K. M., Error, signal, and the placement of Ctenophora sister to all other animals. Proc. Natl. Acad. Sci. U.S.A. 112, 5773–5778 (2015).

4. Ryan J. F., Pang K., Schnitzler C. E., Nguyen A. D., Moreland R. T., Simmons D. K., Koch B. J., Francis W. R., Havlak P., Smith S. A., Putnam N. H., Haddock S. H. D., Dunn C. W., Wolfsberg T. G., Mullikin J. C., Martindale M. Q., Baxevanis A. D., The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342, 1242592 (2013).

5. Cannon J. T., Vellutini B. C., Smith J., Ronquist F., Jondelius U., Hejnol A., Xenacoelomorpha is the sister group to Nephrozoa. Nature 530, 89–93 (2016).

6. Rouse G. W., Wilson N. G., Carvajal J. I., Vrijenhoek R. C., New deep-sea species of Xenoturbella and the position of Xenacoelomorpha. Nature 530, 94–97 (2016).

7. Pisani D., Pett W., Dohrmann M., Feuda R., Rota-Stabelli O., Philippe H., Lartillot N., Wörheide G., Genomic data do not support comb jellies as the sister group to all other animals. Proc. Natl. Acad. Sci. U.S.A. 112, 15402–15407 (2015).

8. Simion P., Philippe H., Baurain D., Jager M., Richter D. J., Di Franco A., Roure B., Satoh N., Quéinnec É., Ereskovsky A., Lapébie P., Corre E., Delsuc F., King N., Wörheide G., Manuel M., A large and consistent phylogenomic dataset supports sponges as the sister group to all other animals. Curr. Biol. 27, 958–967 (2017).

9. Philippe H., Poustka A. J., Chiodin M., Hoff K. J., Dessimoz C., Tomiczek B., Schiffer P. H., Müller S., Domman D., Horn M., Kuhl H., Timmermann B., Satoh N., Hikosaka-Katayama T., Nakano H., Rowe M. L., Elphick M. R., Thomas-Chollier M., Hankeln T., Mertes F., Wallberg A., Rast J. P., Copley R. R., Martinez P., Telford M. J., Mitigating anticipated effects of systematic errors supports sister-group relationship between Xenacoelomorpha and Ambulacraria. Curr. Biol. 29, 1818–1826.e6 (2019).

10. Felsenstein J., Cases in which parsimony or compatibility methods will be positively misleading. Syst. Biol. 27, 401–410 (1978).

11. Yang Z., Evaluation of several methods for estimating phylogenetic trees when substitution rates differ over nucleotide sites. J. Mol. Evol. 40, 689–697 (1995).

12. Huelsenbeck J. P., Performance of phylogenetic methods in simulation. Syst. Biol. 44, 17–48 (1995).

13. Lartillot N., Brinkmann H., Philippe H., Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC Evol. Biol. 7, S4 (2007).

14. Kapli P., Yang Z., Telford M. J., Phylogenetic tree building in the genomic age. Nat. Rev. Genet. 21, 428–444 (2020).

15. Lartillot N., Rodrigue N., Stubbs D., Richer J., PhyloBayes MPI: Phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615 (2013).

16. Nguyen L. T., Schmidt H. A., Von Haeseler A., Minh B. Q., IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).

17. Le S. Q., Gascuel O., Lartillot N., Empirical profile mixture models for phylogenetic reconstruction. Bioinformatics 24, 2317–2323 (2008).

18. Aguinaldo A. M. A., Turbeville J. M., Linford L. S., Rivera M. C., Garey J. R., Raff R. A., Lake J. A., Evidence for a clade of nematodes, arthropods and other moulting animals. Nature 387, 489–493 (1997).

19. Kim J., Kim W., Cunningham C. W., A new perspective on lower metazoan relationships from 18S rDNA sequences. Mol. Biol. Evol. 16, 423–427 (1999).

20. Bergsten J., A review of long-branch attraction. Cladistics 21, 163–193 (2005).

21. Huelsenbeck J. P., Hillis D. M., Success of phylogenetic methods in the four-taxon case. Syst. Biol. 42, 247–264 (1993).

22. Hillis D. M., Huelsenbeck J. P., Cunningham C. W., Application and accuracy of molecular phylogenies. Science 264, 671–677 (1994).

23. Roure B., Philippe H., Site-specific time heterogeneity of the substitution process and its impact on phylogenetic inference. BMC Evol. Biol. 11, 17 (2011).

24. Zhengting Z., Jianzhi Z., Amino acid exchangeabilities vary across the tree of life. Sci. Adv. 5, eaax3124 (2019).

25. Whelan N. V., Halanych K. M., Who let the CAT out of the bag? Accurately dealing with substitutional heterogeneity in phylogenomic analyses. Syst. Biol. 66, 232–255 (2017).

26. Bourlat S. J., Juliusdottir T., Lowe C. J., Freeman R., Aronowicz J., Kirschner M., Lander E. S., Thorndyke M., Nakano H., Kohn A. B., Heyland A., Moroz L. L., Copley R. R., Telford M. J., Deuterostome phylogeny reveals monophyletic chordates and the new phylum Xenoturbellida. Nature 444, 85–88 (2006).

27. Philippe H., Brinkmann H., Copley R. R., Moroz L. L., Nakano H., Poustka A. J., Wallberg A., Peterson K. J., Telford M. J., Acoelomorph flatworms are deuterostomes related to Xenoturbella. Nature 470, 255–258 (2011).

28. King N., Rokas A., Embracing uncertainty in reconstructing early animal evolution. Curr. Biol. 27, R1081–R1088 (2017).

29. Siddall M. E., Success of parsimony in the four-taxon case: Long-branch repulsion by likelihood in the Farris zone. Cladistics 14, 209–220 (1998).

30. Swofford D. L., Waddell P. J., Huelsenbeck J. P., Foster P. G., Lewis P. O., Rogers J. S., Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Syst. Biol. 50, 525–539 (2001).

31. Stone M., Cross-validatory choice and assessment of statistical predictions. J. R. Stat. Soc. Ser. B 36, 111–133 (1974).

32. Wang H. C., Li K., Susko E., Roger A. J., A class frequency mixture model that adjusts for site-specific amino acid frequencies and improves inference of protein phylogeny. BMC Evol. Biol. 8, 331 (2008).

33. Wang H. C., Minh B. Q., Susko E., Roger A. J., Modeling site heterogeneity with posterior mean site frequency profiles accelerates accurate phylogenomic estimation. Syst. Biol. 67, 216–235 (2018).

34. Hoang D. T., Chernomor O., Von Haeseler A., Minh B. Q., Vinh L. S., UFBoot2: Improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).

35. Stamatakis A., RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).

Articles from Science Advances are provided here courtesy of American Association for the Advancement of Science

Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7732190/