Commit bb22ba7

Improve documentation
1 parent 8071cd3 commit bb22ba7

File tree

9 files changed

+525
-3
lines changed

README.md

Lines changed: 11 additions & 2 deletions
@@ -7,12 +7,13 @@
77
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/torchtree)
88

99

10-
torchtree is a program designed for inferring phylogenetic trees from molecular sequences. Implemented in Python, it leverages [PyTorch] for automatic differentiation. The suite of inference algorithms encompasses variational inference, Hamiltonian Monte Carlo, maximum *a posteriori*, and Markov chain Monte Carlo.
10+
torchtree is a program designed for developing and inferring phylogenetic models. Implemented in Python, it leverages [PyTorch] for automatic differentiation. The suite of inference algorithms encompasses variational inference, Hamiltonian Monte Carlo, maximum *a posteriori*, and Markov chain Monte Carlo.
1111

1212
- [Getting Started](#getting-started)
1313
- [Dependencies](#dependencies)
1414
- [Installation](#installation)
1515
- [Quick start](#quick-start)
16+
- [Documentation](#documentation)
1617
- [Plug-ins](#torchtree-plug-in)
1718

1819
## Getting Started
@@ -44,6 +45,11 @@ Check install
4445
torchtree --help
4546
```
4647

48+
## Documentation
49+
For detailed information on how to use `torchtree` and its features, please refer to the official documentation and API reference.
50+
- [Documentation](https://4ment.github.io/torchtree)
51+
- [API Reference](https://4ment.github.io/torchtree/autoapi/torchtree/index.html)
52+
4753
## Quick start
4854
`torchtree` requires a JSON file containing models and algorithms. A configuration file can be generated using `torchtree-cli`, a command line-based tool. This two-step process allows the user to adjust values in the configuration file, such as hyperparameters.
4955

@@ -69,12 +75,15 @@ torchtree fluA.json
6975
```
7076

7177
## torchtree plug-in
72-
torchtree can be easily extended without modifying the code base thanks its modular implementation. Some examples of external packages
78+
torchtree can be easily extended without modifying the code base thanks to its modular implementation. Some examples of plug-ins:
79+
7380
- [torchtree-bito]
7481
- [torchtree-physher]
7582
- [torchtree-scipy]
7683
- [torchtree-tensorflow]
7784

85+
A GitHub [template](https://github.com/4ment/torchtree-plugin-template) is available to assist in the development of a plug-in, and it is highly recommended to use it. This template provides a structured starting point, ensuring consistency and compatibility with `torchtree` while streamlining the development process.
86+
7887
## License
7988

8089
Distributed under the GPLv3 License. See [LICENSE](LICENSE) for more information.

docs/_static/custom.css

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
/* Custom color for inline code with the "keycode" role */
2+
.keycode {
3+
color: #005b82; /* Change this to your desired color */
4+
background-color: #f0f8ff; /* Optional: Add background color */
5+
}

docs/advanced/concepts.rst

Lines changed: 95 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,95 @@
1+
Building blocks
2+
===============
3+
4+
Parameter object
5+
----------------
6+
:py:class:`~torchtree.core.parameter.Parameter` objects play a central role in torchtree. They are used to define the parameters of models and distributions, and are involved in any kind of optimization or inference.
7+
A Parameter object contains a reference to a PyTorch tensor object, which can be accessed using the :py:attr:`~torchtree.core.parameter.Parameter.tensor` property.
8+
There are different ways to define a Parameter object in a JSON file. The most common way is to define a :keycode:`tensor` key associated with a list of real numbers as shown below:
9+
10+
.. code-block:: JSON
11+
12+
{
13+
"id": "gtr_frequencies",
14+
"type": "Parameter",
15+
"tensor": [0.25, 0.25, 0.25, 0.25]
16+
}
17+
18+
Inside torchtree, this JSON object will be converted to a python object:
19+
20+
.. code-block:: python
21+
22+
Parameter("gtr_frequencies", torch.tensor([0.25, 0.25, 0.25, 0.25]))
23+
24+
Another way to define the same object using a different initialization method is:
25+
26+
.. code-block:: JSON
27+
28+
{
29+
"id": "gtr_frequencies",
30+
"type": "Parameter",
31+
"full": [4],
32+
"value": 0.25
33+
}
34+
35+
which in python will be converted to:
36+
37+
.. code-block:: python
38+
:linenos:
39+
40+
Parameter("gtr_frequencies", torch.full([4], 0.25))
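As a quick check that the two initialization styles describe the same values, the sketch below compares the underlying tensors directly in PyTorch. It uses only ``torch.tensor``, ``torch.full`` and ``torch.equal`` and deliberately leaves the torchtree ``Parameter`` wrapper out.

```python
import torch

# Explicit list of values, as in the "tensor" form
explicit = torch.tensor([0.25, 0.25, 0.25, 0.25])

# Shape plus a single fill value, as in the "full" form
filled = torch.full([4], 0.25)

# Both tensors hold four copies of 0.25 with the same dtype and shape
same = torch.equal(explicit, filled)
print(same)
```

The second form is mostly a convenience when a long vector starts with the same value everywhere.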
41+
42+
43+
TransformedParameter object
44+
---------------------------
45+
In torchtree, :py:class:`~torchtree.core.parameter.Parameter` objects are typically considered to be unconstrained.
46+
Optimizers (such as those used in ADVI and MAP) and samplers (e.g. HMC) will change the value of the tensor they encapsulate without checking if the new value is within the parameter's domain.
47+
However, in many cases, phylogenetic models contain constrained parameters such as branch lengths (positive real numbers) or equilibrium base frequencies (positive real numbers that sum to 1).
48+
For example, the GTR model expects the equilibrium base frequencies to be positive real numbers that sum to 1, and a standard optimizer will ignore such constraints.
49+
:py:class:`~torchtree.core.parameter.TransformedParameter` objects allow moving from unconstrained to constrained spaces using `transform <https://pytorch.org/docs/stable/distributions.html#torch.distributions.transforms.Transform>`_ objects available in pytorch.
50+
51+
We can replace the JSON object defining the GTR equilibrium base frequencies with a TransformedParameter object as shown below:
52+
53+
.. code-block:: JSON
54+
55+
{
56+
"id": "gtr_frequencies",
57+
"type": "TransformedParameter",
58+
"transform": "torch.distributions.StickBreakingTransform",
59+
"x":{
60+
"id": "gtr_frequencies_unconstrained",
62+
"type": "Parameter",
63+
"zeros": [3]
64+
}
65+
}
66+
67+
This is equivalent to the following python code:
68+
69+
.. code-block:: python
70+
71+
import torch
72+
from torchtree import Parameter, TransformedParameter
73+
74+
unconstrained = Parameter("gtr_frequencies_unconstrained", torch.zeros([3]))
75+
transform = torch.distributions.StickBreakingTransform()
76+
constrained = TransformedParameter("gtr_frequencies", unconstrained, transform)
77+
78+
An optimizer changes the value of the **gtr_frequencies_unconstrained** Parameter object. The **gtr_frequencies** (transformed) parameter then applies the StickBreakingTransform to that value, and the resulting frequencies are used to update the transition rate matrix.
79+
80+
In this example, we are using the `StickBreakingTransform <https://pytorch.org/docs/stable/distributions.html#torch.distributions.transforms.StickBreakingTransform>`_ object that will transform the unconstrained parameter **gtr_frequencies_unconstrained** to a constrained parameter **gtr_frequencies**.
81+
Note the value of the :keycode:`transform` key is a string containing the full path to the pytorch class that implements the transformation.
82+
Specifically, ``torch`` is the package name, ``distributions`` is the module name, and ``StickBreakingTransform`` is the class name.
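To see why three unconstrained values are enough for four frequencies, the sketch below applies the transform with PyTorch alone (no torchtree objects involved). The variable names are chosen for this illustration only. It shows that a vector of zeros maps to equal frequencies, and that any real-valued input maps to positive values summing to 1.

```python
import torch
from torch.distributions.transforms import StickBreakingTransform

transform = StickBreakingTransform()

# Three unconstrained values map to four values on the simplex
unconstrained = torch.zeros(3)
frequencies = transform(unconstrained)
print(frequencies)

# An arbitrary point in unconstrained space still lands on the simplex
raw = torch.tensor([1.5, -2.0, 0.3])
moved = transform(raw)
print(moved, moved.sum())

# The inverse transform recovers the unconstrained values
recovered = transform.inv(moved)
print(recovered)
```

Because every unconstrained value corresponds to a valid set of frequencies, an optimizer can take unrestricted steps without ever leaving the parameter's domain.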
83+
84+
85+
Models and CallableModels
86+
-------------------------
87+
Virtually every torchtree object that performs some kind of computation inherits from the :py:class:`~torchtree.core.model.Model` class.
88+
Computations can involve Parameter and/or other Model objects.
89+
The Distribution class we described earlier is derived from the Model class since it defines a probability distribution and returns a log probability.
90+
The GTR substitution model is also a Model object since its role is to calculate a transition probability matrix.
91+
92+
A model that returns a value when called is said to be *callable* and it extends the :py:class:`~torchtree.core.model.CallableModel` abstract class.
93+
A distribution is a callable model since it returns the log probability of a sample.
94+
The class representing a tree likelihood model is also callable since it calculates the log likelihood and we will describe it further in the next section.
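The distinction between a model and a callable model can be illustrated with a toy sketch in plain Python. The classes ``ToyModel``, ``ToyCallableModel`` and ``ToyExponential`` are hypothetical stand-ins written for this example; they are not torchtree classes and do not reproduce the torchtree API.

```python
import math
from abc import ABC, abstractmethod


class ToyModel(ABC):
    """Hypothetical base class: an identified object that computes something."""

    def __init__(self, id_: str) -> None:
        self.id = id_


class ToyCallableModel(ToyModel):
    """Hypothetical callable model: calling the object returns a value."""

    @abstractmethod
    def __call__(self) -> float:
        ...


class ToyExponential(ToyCallableModel):
    """Log density of an exponential distribution at a stored sample."""

    def __init__(self, id_: str, rate: float, x: float) -> None:
        super().__init__(id_)
        self.rate = rate
        self.x = x

    def __call__(self) -> float:
        # log p(x | rate) = log(rate) - rate * x, for x >= 0
        return math.log(self.rate) - self.rate * self.x


prior = ToyExponential("prior", rate=10.0, x=0.1)
log_p = prior()  # calling the model returns its log probability
print(log_p)
```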
95+

docs/advanced/tree_likelihood.rst

Lines changed: 128 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,128 @@
1+
Tree likelihood model
2+
=====================
3+
4+
**torchtree** is designed to infer parameters of phylogenetic models, with every analysis containing at least one tree likelihood object.
5+
This object is responsible for calculating the probability of an alignment given a tree and its associated parameters.
6+
Below, we describe the structure of a tree likelihood object through its JSON representation.
7+
8+
.. code-block:: JSON
9+
:linenos:
10+
11+
[
12+
{
13+
"id": "taxa",
14+
"type": "Taxa",
15+
"taxa": [
16+
{
17+
"id": "A",
18+
"type": "Taxon"
19+
},
20+
{
21+
"id": "B",
22+
"type": "Taxon"
23+
},
24+
{
25+
"id": "C",
26+
"type": "Taxon"
27+
}
28+
]
29+
},
30+
{
31+
"id": "alignment",
32+
"type": "Alignment",
33+
"datatype": {
34+
"id": "data_type",
35+
"type": "NucleotideDataType"
36+
},
37+
"taxa": "taxa",
38+
"sequences": [
39+
{
40+
"taxon": "A",
41+
"sequence": "ACGT"
42+
},
43+
{
44+
"taxon": "B",
45+
"sequence": "AATT"
46+
},
47+
{
48+
"taxon": "C",
49+
"sequence": "ACTT"
50+
}
51+
]
52+
},
53+
{
54+
"id": "like",
55+
"type": "TreeLikelihoodModel",
56+
"tree_model": {
57+
"id": "tree",
58+
"type": "UnrootedTreeModel",
59+
"newick": "((A:0.1,B:0.2):0.3,C:0.4);",
60+
"branch_lengths": {
61+
"id": "branch_lengths",
62+
"type": "Parameter",
63+
"tensor": 0.1,
64+
"full": [
65+
3
66+
]
67+
}
},
68+
"site_model": {
69+
"id": "sitemodel",
70+
"type": "ConstantSiteModel"
71+
},
72+
"substitution_model": {
73+
"id": "substmodel",
74+
"type": "GTR",
75+
"rates": {
76+
"id": "gtr_rates",
77+
"type": "Parameter",
78+
"tensor": 0.16666,
79+
"full": [
80+
6
81+
]
82+
},
83+
"frequencies": {
84+
"id": "gtr_frequencies",
85+
"type": "Parameter",
86+
"tensor": 0.25,
87+
"full": [
88+
4
89+
]
90+
}
91+
},
92+
"site_pattern": {
93+
"id": "patterns",
94+
"type": "SitePattern",
95+
"alignment": "alignment"
96+
}
98+
}
99+
]
100+
101+
The first object with type ``Taxa`` defines the taxa in the alignment. Each taxon is defined by an object with type ``Taxon`` and it might contain additional information such as sampling date and geographic location.
102+
The second object is an alignment object with type :py:class:`~torchtree.evolution.alignment.Alignment` which contains the sequences of the taxa defined in the previous object.
103+
The third object is a tree likelihood model with type :py:class:`~torchtree.evolution.tree_likelihood.TreeLikelihoodModel` and is composed of four sub-models:
104+
105+
* :keycode:`tree_model`: A tree model extending the :py:class:`~torchtree.evolution.tree_model.TreeModel` class which contains the tree topology and its associated parameters.
106+
* :keycode:`site_model`: A site model extending the :py:class:`~torchtree.evolution.site_model.SiteModel` class which contains rate heterogeneity parameters across sites, if any.
107+
* :keycode:`substitution_model`: A substitution model extending the :py:class:`~torchtree.evolution.substitution_model.abstract.SubstitutionModel` class which contains the parameters that parameterize the substitution process.
108+
* :keycode:`site_pattern`: A site pattern model extending the :py:class:`~torchtree.evolution.site_pattern.SitePattern` class which contains the compressed alignment defined in the alignment object.
109+
110+
An optional sub-model extending the :py:class:`~torchtree.evolution.branch_model.BranchModel` class can be added to the tree likelihood model to model the rate of evolution across branches using the :keycode:`branch_model` key.
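The idea of a compressed alignment mentioned for :keycode:`site_pattern` can be sketched in plain Python: identical alignment columns are collapsed into a single pattern with a weight equal to the number of times it occurs. This is the generic technique only; the helper ``compress`` is hypothetical, and how torchtree stores patterns internally is not described here.

```python
from collections import Counter


def compress(sequences):
    """Collapse identical alignment columns into weighted site patterns."""
    # Each column becomes a tuple of states, one state per taxon
    columns = zip(*sequences.values())
    # Identical columns are counted together; the count is the pattern weight
    return Counter(columns)


# The alignment from the JSON example: all four columns differ
doc_alignment = {"A": "ACGT", "B": "AATT", "C": "ACTT"}
doc_patterns = compress(doc_alignment)

# A longer, made-up alignment in which the all-A column occurs three times
longer_alignment = {"A": "ACGTAA", "B": "AATTAA", "C": "ACTTAA"}
longer_patterns = compress(longer_alignment)

for pattern, weight in longer_patterns.items():
    print("".join(pattern), weight)
```

With compression, the likelihood of each distinct pattern needs to be computed only once and can then be weighted by its count.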
111+
112+
In the JSON object above, we have specified a tree likelihood model for an unrooted tree with a GTR substitution model and equal rates across sites.
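Some values in the JSON above are plain strings that match the ``id`` of an object defined elsewhere, for example ``"taxa": "taxa"`` in the alignment and ``"taxon": "A"`` in each sequence. Assuming such strings are meant as references to those ids, a configuration can be sanity-checked with the standard ``json`` module before running an analysis. The helper ``collect_ids`` is hypothetical and not part of torchtree; the configuration is a shortened version of the example.

```python
import json

config_text = """
[
  {"id": "taxa", "type": "Taxa",
   "taxa": [{"id": "A", "type": "Taxon"}, {"id": "B", "type": "Taxon"}]},
  {"id": "alignment", "type": "Alignment", "taxa": "taxa",
   "sequences": [{"taxon": "A", "sequence": "ACGT"},
                 {"taxon": "B", "sequence": "AATT"}]}
]
"""


def collect_ids(node, found=None):
    """Recursively gather every "id" value in a nested JSON structure."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        if "id" in node:
            found.add(node["id"])
        for value in node.values():
            collect_ids(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, found)
    return found


config = json.loads(config_text)
ids = collect_ids(config)
alignment = config[1]

# Every string reference should point at an id defined somewhere in the file
taxa_reference_ok = alignment["taxa"] in ids
taxon_references_ok = all(s["taxon"] in ids for s in alignment["sequences"])
print(taxa_reference_ok, taxon_references_ok)
```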
113+
114+
This modular design allows the definition of different tree likelihood models using different combinations of the sub-models.
115+
116+
For example, if we wanted to define a tree likelihood model with a proportion of invariant sites, we would change the value of the :keycode:`site_model` key to:
117+
118+
.. code-block:: JSON
119+
120+
{
121+
"id": "sitemodel",
122+
"type": "InvariantSiteModel",
123+
"invariant": {
124+
"id": "proportion",
125+
"type": "Parameter",
126+
"tensor": 0.5
127+
}
128+
}

docs/conf.py

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,8 @@
1313
import os
1414
import sys
1515
from datetime import date
16+
from docutils import nodes
17+
from docutils.parsers.rst import roles
1618

1719
sys.path.insert(0, os.path.abspath('..'))
1820

@@ -90,3 +92,14 @@
9092
"python": ("https://docs.python.org/3/", None),
9193
"torch": ("https://pytorch.org/docs/master/", None),
9294
}
95+
96+
def colorcode_role(name, rawtext, text, lineno, inliner, options={}, content=[]):
97+
# Create a literal node with the class "keycode"
98+
node = nodes.literal(text, text, classes=["keycode"])
99+
return [node], []
100+
101+
# Register the new role
102+
roles.register_local_role('keycode', colorcode_role)
103+
104+
def setup(app):
105+
app.add_css_file('custom.css')

docs/getting_started/install.rst

Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,41 @@
1+
Installation
2+
============
3+
4+
Installing the latest stable version
5+
------------------------------------
6+
7+
To install the latest stable version of **torchtree**, run the following command:
8+
9+
.. code-block:: bash
10+
11+
pip install torchtree
12+
13+
14+
Building torchtree from source
15+
------------------------------
16+
17+
If you'd like to build **torchtree** from source, follow these steps:
18+
19+
.. code-block:: bash
20+
21+
git clone https://github.com/4ment/torchtree
22+
pip install torchtree/
23+
24+
25+
Programs Installed
26+
------------------
27+
28+
By following either installation method, the following two programs will be installed:
29+
30+
* :command:`torchtree-cli`: A command-line interface for generating JSON configuration files for your analyses.
31+
* :command:`torchtree`: The main program that runs inference algorithms using the provided JSON configuration file.
32+
33+
34+
To verify the installation or explore available options, you can use the following commands:
35+
36+
.. code-block:: bash
37+
38+
torchtree --help
39+
torchtree-cli --help
40+
41+
These commands will display usage information and available options for both tools.
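The same check can be scripted. The sketch below uses only the standard library to look for the two programs on the ``PATH``; the helper ``find_programs`` is hypothetical and is not shipped with torchtree.

```python
import shutil


def find_programs(names=("torchtree", "torchtree-cli")):
    """Map each program name to its location on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


found = find_programs()
for name, path in found.items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If either program is reported as missing, the active Python environment is probably not the one in which **torchtree** was installed.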
