|
| 1 | +Benchmark |
| 2 | +========= |
| 3 | + |
| 4 | +The `torchtree` software was evaluated alongside other phylogenetic tools in a published `benchmark study <https://github.com/4ment/gradient-benchmark>`_ [#Fourment2022]_. |
| 5 | + |
| 6 | +This benchmark assesses the memory usage and speed of various gradient implementations for phylogenetic models, including tree likelihood and coalescent models. |
| 7 | +The study's aim was to compare the efficiency of automatic differentiation (AD) and analytic gradient methods. |
| 8 | +The gradient of the tree likelihood can be computed by `BITO <https://github.com/phylovi/bito>`_ or `physher <https://github.com/4ment/physher>`_ [#Fourment2014]_, efficient C++ and C libraries, respectively, that calculate the gradient analytically. |
| 9 | +`torchtree` integrates with these libraries through the `torchtree-bito <https://github.com/4ment/torchtree-bito>`_ and `torchtree-physher <https://github.com/4ment/torchtree-physher>`_ plug-ins. |
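Measuring the speed and memory usage of a gradient implementation can be sketched with the Python standard library alone. The block below is an illustrative sketch, not the harness used in the study: `toy_gradient`, `time_call`, and `peak_memory` are hypothetical names, and the toy function stands in for a real gradient call. Note that `tracemalloc` only sees allocations made through Python's allocator, so it would not capture memory used inside native libraries such as BITO or physher.

```python
import time
import tracemalloc


def toy_gradient(x):
    """Analytic gradient of f(x) = sum(x_i ** 2); stands in for a real gradient call."""
    return [2.0 * xi for xi in x]


def time_call(fn, *args, repeats=100):
    """Mean wall-clock seconds per call over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats


def peak_memory(fn, *args):
    """Peak bytes allocated by Python objects during a single call."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Time and memory are measured in separate runs because tracing slows execution.
x = [float(i) for i in range(1000)]
mean_seconds = time_call(toy_gradient, x)
peak_bytes = peak_memory(toy_gradient, x)
print(f"{mean_seconds:.2e} s per call, peak {peak_bytes} bytes")
```

Swapping `toy_gradient` for an AD-based and an analytic implementation of the same function would give a like-for-like comparison on a single task.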
| 10 | + |
| 11 | +.. list-table:: Programs compared in the benchmark |
| 12 | + :header-rows: 1 |
| 13 | + |
| 14 | + * - Program |
| 15 | + - Language |
| 16 | + - Framework |
| 17 | + - Gradient |
| 18 | + - Libraries |
| 19 | + * - `physher <https://github.com/4ment/physher>`_ |
| 20 | + - C |
| 21 | + - |
| 22 | + - analytic |
| 23 | + - |
| 24 | + * - `phylostan <https://github.com/4ment/phylostan>`_ [#Fourment2019]_ |
| 25 | + - Stan |
| 26 | + - Stan |
| 27 | + - AD |
| 28 | + - |
| 29 | + * - `phylojax <https://github.com/4ment/phylojax>`_ |
| 30 | + - Python |
| 31 | + - JAX |
| 32 | + - AD |
| 33 | + - |
| 34 | + * - `torchtree <https://github.com/4ment/torchtree>`_ |
| 35 | + - Python |
| 36 | + - PyTorch |
| 37 | + - AD |
| 38 | + - BITO and physher |
| 39 | + * - `treeflow <https://github.com/christiaanjs/treeflow>`_ |
| 40 | + - Python |
| 41 | + - TensorFlow |
| 42 | + - AD |
| 43 | + - |
| 44 | + |
| 45 | +In this study, we compared six gradient implementations of the phylogenetic likelihood functions, in isolation and also as part of a variational inference procedure. |
| 46 | +The data consisted of a collection of influenza A datasets ranging in size from 20 to 2,000 sequences, sampled from 2011 to 2013. |
| 47 | + |
| 48 | +This macrobenchmark simulates the core steps of a real phylogenetic inference algorithm but simplifies the model to make it easier to implement across different frameworks. |
| 49 | +In this setup, we estimate the parameters of a time tree under a strict clock with a constant-size coalescent model. |
| 50 | +Each implementation relies on automatic differentiation variational inference (ADVI) to maximize the evidence lower bound (ELBO) over 5,000 iterations. |
| 51 | +We specify an exponential prior (mean = 0.001) on the substitution rate and the Jeffreys prior on the unknown population size. |
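The two priors described above can be written out directly. This is a minimal sketch in plain Python, not torchtree code: the function names are hypothetical, and the Jeffreys prior is assumed to take the usual scale-invariant form, with density proportional to 1/theta for population size theta. It also shows the kind of closed-form gradient that an analytic implementation would supply in place of AD.

```python
import math

RATE_PRIOR_MEAN = 0.001  # mean of the exponential prior, as stated in the text


def log_prior_rate(rate, mean=RATE_PRIOR_MEAN):
    """Log density of an exponential prior with the given mean."""
    if rate < 0:
        return -math.inf
    lam = 1.0 / mean  # an exponential with mean m has rate parameter 1/m
    return math.log(lam) - lam * rate


def log_prior_pop_size(theta):
    """Unnormalized log density of a prior proportional to 1/theta (assumed form)."""
    if theta <= 0:
        return -math.inf
    return -math.log(theta)


def grad_log_prior(rate, theta):
    """Analytic gradient of the joint log prior, valid for rate > 0 and theta > 0."""
    d_rate = -1.0 / RATE_PRIOR_MEAN  # derivative of log(lam) - lam * rate
    d_theta = -1.0 / theta           # derivative of -log(theta)
    return d_rate, d_theta


print(grad_log_prior(0.002, 2.5))
```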
| 52 | + |
| 53 | +.. figure:: images/benchmark-macro-time.png |
| 54 | + :align: center |
| 55 | + :alt: Speed of implementations for 5,000 iterations of variational time-tree inference with a strict clock |
| 56 | + |
| 57 | + Speed of implementations for 5,000 iterations of variational time-tree inference with a strict clock. |
| 58 | + |
| 59 | +As shown in the next figure, the relative performance of AD depends on the task. |
| 60 | + |
| 61 | +.. figure:: images/benchmark-micro-time.png |
| 62 | + :align: center |
| 63 | + :alt: Speed of implementations for the gradient of various tasks needed for inference |
| 64 | + |
| 65 | + Speed of implementations for the gradient of various tasks needed for inference. |
| 66 | + |
| 67 | +.. [#Fourment2022] Fourment M, Swanepoel CJ, Galloway JG, Ji X, Gangavarapu K, Suchard MA, Matsen IV FA. Automatic differentiation is no panacea for phylogenetic gradient computation. *Genome Biology and Evolution*, 2023. doi:`10.1093/gbe/evad099 <https://doi.org/10.1093/gbe/evad099>`_ `arXiv:2211.02168 <https://arxiv.org/abs/2211.02168>`_ |
| 68 | + |
| 69 | +.. [#Fourment2014] Fourment M and Holmes EC. Novel non-parametric models to estimate evolutionary rates and divergence times from heterochronous sequence data. *BMC Evolutionary Biology*, 2014. doi:`10.1186/s12862-014-0163-6 <https://doi.org/10.1186/s12862-014-0163-6>`_ |
| 70 | + |
| 71 | +.. [#Fourment2019] Fourment M and Darling AE. Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics. *PeerJ*, 2019. doi:`10.7717/peerj.8272 <https://doi.org/10.7717/peerj.8272>`_ |