Improve documentation

4ment · 4ment · commit bb22ba7e89c4 · 2024-10-02T10:50:23.000+10:00
diff --git a/README.md b/README.md
@@ -7,12 +7,13 @@
 ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/torchtree)
 
 
-torchtree is a program designed for inferring phylogenetic trees from molecular sequences. Implemented in Python, it leverages [PyTorch] for automatic differentiation. The suite of inference algorithms encompasses variational inference, Hamiltonian Monte Carlo, maximum *a posteriori*, and Markov chain Monte Carlo.
+torchtree is a program designed for developing and inferring phylogenetic models. Implemented in Python, it leverages [PyTorch] for automatic differentiation. The suite of inference algorithms encompasses variational inference, Hamiltonian Monte Carlo, maximum *a posteriori*, and Markov chain Monte Carlo.
 
 - [Getting Started](#getting-started)
   - [Dependencies](#dependencies)
   - [Installation](#installation)
 - [Quick start](#quick-start)
+- [Documentation](#documentation)
 - [Plug-ins](#torchtree-plug-in)
 
 ## Getting Started
@@ -44,6 +45,11 @@ Check install
 torchtree --help
 ```
 
+## Documentation
+For detailed information on how to use `torchtree` and its features, please refer to the official documentation and API reference.
+ - [Documentation](https://4ment.github.io/torchtree)
+ - [API Reference](https://4ment.github.io/torchtree/autoapi/torchtree/index.html)
+
 ## Quick start
 `torchtree` requires a JSON file containing models and algorithms. A configuration file can be generated using `torchtree-cli`, a command line-based tool. This two-step process allows the user to adjust values in the configuration file, such as hyperparameters.
 
@@ -69,12 +75,15 @@ torchtree fluA.json
 ```
 
 ## torchtree plug-in
-torchtree can be easily extended without modifying the code base thanks its modular implementation. Some examples of external packages
+torchtree can be easily extended without modifying the code base thanks to its modular implementation. Some examples of plug-ins:
+
 - [torchtree-bito]
 - [torchtree-physher]
 - [torchtree-scipy]
 - [torchtree-tensorflow]
 
+A GitHub [template](https://github.com/4ment/torchtree-plugin-template) is available to assist in the development of a plug-in, and it is highly recommended to use it. This template provides a structured starting point, ensuring consistency and compatibility with `torchtree` while streamlining the development process.
+
 ## License
 
 Distributed under the GPLv3 License. See [LICENSE](LICENSE) for more information.
diff --git a/docs/_static/custom.css b/docs/_static/custom.css
@@ -0,0 +1,5 @@
+/* Custom color for inline code with the "keycode" role */
+.keycode {
+    color: #005b82;  /* Change this to your desired color */
+    background-color: #f0f8ff;  /* Optional: Add background color */
+}
diff --git a/docs/advanced/concepts.rst b/docs/advanced/concepts.rst
@@ -0,0 +1,95 @@
+Building blocks
+===============
+
+Parameter object
+----------------
+:py:class:`~torchtree.core.parameter.Parameter` objects play a central role in torchtree. They are used to define the parameters of models and distributions, and are involved in any kind of optimization or inference.
+A Parameter object contains a reference to a PyTorch tensor, which can be accessed using the :py:attr:`~torchtree.core.parameter.Parameter.tensor` property.
+There are different ways to define a Parameter object in a JSON file. The most common way is to define a :keycode:`tensor` key associated with a list of real numbers as shown below:
+
+.. code-block:: JSON
+
+    {
+        "id": "gtr_frequencies",
+        "type": "Parameter",
+        "tensor": [0.25, 0.25, 0.25, 0.25]
+    }
+
+Inside torchtree, this JSON object will be converted to a Python object:
+
+.. code-block:: python
+
+    Parameter("gtr_frequencies", torch.tensor([0.25, 0.25, 0.25, 0.25]))
+
+Another way to define the same object using a different initialization method is:
+
+.. code-block:: JSON
+
+    {
+        "id": "gtr_frequencies",
+        "type": "Parameter",
+        "full": [4],
+        "value": 0.25
+    }
+
+which will be converted in Python to:
+
+.. code-block:: python
+    :linenos:
+
+    Parameter("gtr_frequencies", torch.full([4], 0.25))
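The two JSON forms above describe the same values. A minimal pure-Python sketch of that equivalence follows; `expand_parameter` is a hypothetical helper written for illustration only (it is not part of torchtree) and it assumes a one-dimensional shape under the `full` key, as in the example above.

```python
import json


def expand_parameter(definition: dict) -> list:
    """Hypothetical helper: list the values described by a 1-D Parameter definition."""
    if "full" in definition:
        # "full" holds the shape and "value" the constant used to fill it
        (size,) = definition["full"]
        return [definition["value"]] * size
    # Otherwise the values are given explicitly under the "tensor" key
    return list(definition["tensor"])


explicit = json.loads(
    '{"id": "gtr_frequencies", "type": "Parameter", "tensor": [0.25, 0.25, 0.25, 0.25]}'
)
filled = json.loads(
    '{"id": "gtr_frequencies", "type": "Parameter", "full": [4], "value": 0.25}'
)

# Both definitions expand to the same four frequencies
same_values = expand_parameter(explicit) == expand_parameter(filled)
print(same_values)
```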
+
+
+TransformedParameter object
+---------------------------
+In torchtree, :py:class:`~torchtree.core.parameter.Parameter` objects are typically considered to be unconstrained.
+Optimizers (such as those used in ADVI and MAP) and samplers (e.g. HMC) will change the value of the tensor they encapsulate without checking if the new value is within the parameter's domain.
+However, in many cases, phylogenetic models contain constrained parameters such as branch lengths (positive real numbers) or equilibrium base frequencies (positive real numbers that sum to 1).
+For example, the GTR model expects the equilibrium base frequencies to be positive real numbers that sum to 1 and a standard optimizer will ignore such constraints.
+:py:class:`~torchtree.core.parameter.TransformedParameter` objects allow moving from unconstrained to constrained spaces using `transform <https://pytorch.org/docs/stable/distributions.html#torch.distributions.transforms.Transform>`_ objects available in PyTorch.
+
+We can replace the JSON object defining the GTR equilibrium base frequencies with a TransformedParameter object as shown below:
+
+.. code-block:: JSON
+
+    {
+        "id": "gtr_frequencies",
+        "type": "TransformedParameter",
+        "transform": "torch.distributions.StickBreakingTransform",
+        "x":{
+            "id": "gtr_frequencies_unconstrained",
+            "type": "Parameter",
+            "zeros": [3]
+        }
+    }
+
+This is equivalent to the following Python code:
+
+.. code-block:: python
+
+    import torch
+    from torchtree import Parameter, TransformedParameter
+
+    unconstrained = Parameter("gtr_frequencies_unconstrained", torch.zeros([3]))
+    transform = torch.distributions.StickBreakingTransform()
+    constrained = TransformedParameter("gtr_frequencies", unconstrained, transform)
+    
+An optimizer will change the value of the **gtr_frequencies_unconstrained** Parameter object. The **gtr_frequencies** TransformedParameter then applies the StickBreakingTransform to that value, and the resulting frequencies are used to update the transition rate matrix.
+
+In this example, the `StickBreakingTransform <https://pytorch.org/docs/stable/distributions.html#torch.distributions.transforms.StickBreakingTransform>`_ object transforms the unconstrained parameter **gtr_frequencies_unconstrained** into the constrained parameter **gtr_frequencies**.
+Note that the value of the :keycode:`transform` key is a string containing the full path to the PyTorch class that implements the transformation.
+Specifically, ``torch`` is the package name, ``distributions`` is the module name, and ``StickBreakingTransform`` is the class name.
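To make the effect of the transform concrete, here is a pure-Python sketch of a stick-breaking map from three unconstrained reals to four positive frequencies that sum to 1. It is written for illustration and is assumed to mirror what `StickBreakingTransform` computes; the function name `stick_breaking` is hypothetical and the code is not taken from PyTorch or torchtree.

```python
import math


def stick_breaking(x):
    """Map K-1 unconstrained reals to K positive values that sum to 1."""
    k = len(x)
    remaining = 1.0  # portion of the unit stick not yet assigned
    y = []
    for i, xi in enumerate(x):
        # Shifting by log(k - i) makes x = 0 correspond to equal proportions
        z = 1.0 / (1.0 + math.exp(-(xi - math.log(k - i))))
        y.append(z * remaining)  # break off a fraction z of what is left
        remaining *= 1.0 - z
    y.append(remaining)  # the last value takes whatever is left
    return y


# The zero vector used in the JSON example maps to uniform frequencies
frequencies = stick_breaking([0.0, 0.0, 0.0])
print(frequencies)
```

Under this sketch, any real-valued input yields a valid frequency vector, which is why an optimizer can move freely in the unconstrained space.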
+
+
+Models and CallableModels
+-------------------------
+Virtually every torchtree object that performs some kind of computation inherits from the :py:class:`~torchtree.core.model.Model` class.
+Computations can involve Parameter and/or other Model objects.
+The Distribution class we described earlier is derived from the Model class since it defines a probability distribution and returns a log probability.
+The GTR substitution model is also a Model object since its role is to calculate a transition probability matrix.
+
+A model that returns a value when called is said to be *callable* and it extends the :py:class:`~torchtree.core.model.CallableModel` abstract class.
+A distribution is a callable model since it returns the log probability of a sample.
+The class representing a tree likelihood model is also callable since it calculates the log likelihood and we will describe it further in the next section.
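The relationship between models and callable models can be sketched with a few toy classes. Everything below (`ToyModel`, `ToyCallableModel`, `ToyNormal`, and the `_call` method) is hypothetical and only illustrates the idea; it is not torchtree's class hierarchy or API.

```python
import math
from abc import ABC, abstractmethod


class ToyModel(ABC):
    """Hypothetical stand-in for a model: an identifiable object that computes something."""

    def __init__(self, id_):
        self.id = id_


class ToyCallableModel(ToyModel):
    """A model that returns a value when called."""

    @abstractmethod
    def _call(self):
        ...

    def __call__(self):
        return self._call()


class ToyNormal(ToyCallableModel):
    """A distribution is callable: it returns the log probability of a sample."""

    def __init__(self, id_, x, mean, sd):
        super().__init__(id_)
        self.x, self.mean, self.sd = x, mean, sd

    def _call(self):
        # Log density of a normal distribution evaluated at x
        z = (self.x - self.mean) / self.sd
        return -0.5 * z * z - math.log(self.sd) - 0.5 * math.log(2.0 * math.pi)


log_p = ToyNormal("prior", x=0.0, mean=0.0, sd=1.0)()
print(log_p)
```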
+
diff --git a/docs/advanced/tree_likelihood.rst b/docs/advanced/tree_likelihood.rst
@@ -0,0 +1,128 @@
+Tree likelihood model
+=====================
+
+**torchtree** is designed to infer parameters of phylogenetic models, with every analysis containing at least one tree likelihood object.
+This object is responsible for calculating the probability of an alignment given a tree and its associated parameters.
+Below, we describe the structure of a tree likelihood object through its JSON representation.
+
+.. code-block:: JSON
+    :linenos:
+
+    [
+     {
+       "id": "taxa",
+       "type": "Taxa",
+       "taxa": [
+         {
+           "id": "A",
+           "type": "Taxon"
+         },
+         {
+           "id": "B",
+           "type": "Taxon"
+         },
+         {
+           "id": "C",
+           "type": "Taxon"
+         }
+       ]
+     },
+     {
+       "id": "alignment",
+       "type": "Alignment",
+       "datatype": {
+         "id": "data_type",
+         "type": "NucleotideDataType"
+       },
+       "taxa": "taxa",
+       "sequences": [
+         {
+           "taxon": "A",
+           "sequence": "ACGT"
+         },
+         {
+           "taxon": "B",
+           "sequence": "AATT"
+         },
+         {
+           "taxon": "C",
+           "sequence": "ACTT"
+         }
+       ]
+     },
+     {
+       "id": "like",
+       "type": "TreeLikelihoodModel",
+       "tree_model": {
+         "id": "tree",
+         "type": "UnrootedTreeModel",
+         "newick": "((A:0.1,B:0.2):0.3,C:0.4);",
+         "branch_lengths": {
+           "id": "branch_lengths",
+           "type": "Parameter",
+           "tensor": 0.1,
+           "full": [
+             3
+           ]
+         }
+       },
+       "site_model": {
+         "id": "sitemodel",
+         "type": "ConstantSiteModel"
+       },
+       "substitution_model": {
+         "id": "substmodel",
+         "type": "GTR",
+         "rates": {
+           "id": "gtr_rates",
+           "type": "Parameter",
+           "tensor": 0.16666,
+           "full": [
+             6
+           ]
+         },
+         "frequencies": {
+           "id": "gtr_frequencies",
+           "type": "Parameter",
+           "tensor": 0.25,
+           "full": [
+             4
+           ]
+         }
+       },
+       "site_pattern": {
+         "id": "patterns",
+         "type": "SitePattern",
+         "alignment": "alignment"
+       }
+     }
+    ]
+
+The first object with type ``Taxa`` defines the taxa in the alignment. Each taxon is defined by an object with type ``Taxon``, which may contain additional information such as sampling date and geographic location.
+The second object is an alignment object with type :py:class:`~torchtree.evolution.alignment.Alignment` which contains the sequences of the taxa defined in the previous object.
+The third object is a tree likelihood model with type :py:class:`~torchtree.evolution.tree_likelihood.TreeLikelihoodModel` and is composed of four sub-models:
+
+* :keycode:`tree_model`: A tree model extending the :py:class:`~torchtree.evolution.tree_model.TreeModel` class which contains the tree topology and its associated parameters.
+* :keycode:`site_model`: A site model extending the :py:class:`~torchtree.evolution.site_model.SiteModel` class which contains rate heterogeneity parameters across sites, if any.
+* :keycode:`substitution_model`: A substitution model extending the :py:class:`~torchtree.evolution.substitution_model.abstract.SubstitutionModel` class which contains the parameters that parameterize the substitution process.
+* :keycode:`site_pattern`: A site pattern model extending the :py:class:`~torchtree.evolution.site_pattern.SitePattern` class which contains the compressed alignment defined in the alignment object.
+
+An optional sub-model extending the :py:class:`~torchtree.evolution.branch_model.BranchModel` class can be added to the tree likelihood model to model the rate of evolution across branches using the :keycode:`branch_model` key.
+
+In the JSON object above, we have specified a tree likelihood model for an unrooted tree with a GTR substitution model and equal rates across sites.
+
+This modular design allows the definition of different tree likelihood models using different combinations of the sub-models.
+
+For example, if we wanted to define a tree likelihood model with a proportion of invariant sites, we would change the value of the :keycode:`site_model` key to:
+
+.. code-block:: JSON
+
+    {
+      "id": "sitemodel",
+      "type": "InvariantSiteModel",
+      "invariant": {
+        "id": "proportion",
+        "type": "Parameter",
+        "tensor": 0.5
+      }
+    }
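The site pattern sub-model is described above as holding a compressed alignment. A minimal sketch of what such compression could look like is given below: identical alignment columns are collapsed into one pattern with a count. The `compress` function is hypothetical and is not torchtree's `SitePattern` implementation; the sequences are those of the JSON example.

```python
from collections import Counter

sequences = {"A": "ACGT", "B": "AATT", "C": "ACTT"}


def compress(sequences):
    """Collapse identical alignment columns into unique patterns with weights."""
    taxa = list(sequences)
    # Each column is a tuple with one character per taxon
    columns = zip(*(sequences[taxon] for taxon in taxa))
    counts = Counter(columns)
    patterns = list(counts)  # unique columns, in order of first appearance
    weights = [counts[pattern] for pattern in patterns]
    return patterns, weights


patterns, weights = compress(sequences)
print(patterns, weights)
```

In this small alignment every column is distinct, so nothing is saved. With longer alignments many columns repeat, and a likelihood only needs to be evaluated once per unique pattern and then weighted by its count.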
diff --git a/docs/conf.py b/docs/conf.py
@@ -13,6 +13,8 @@
 import os
 import sys
 from datetime import date
+from docutils import nodes
+from docutils.parsers.rst import roles
 
 sys.path.insert(0, os.path.abspath('..'))
 
@@ -90,3 +92,14 @@
     "python": ("https://docs.python.org/3/", None),
     "torch": ("https://pytorch.org/docs/master/", None),
 }
+
+def colorcode_role(name, rawtext, text, lineno, inliner, options={}, content=[]):
+    # Create a literal node with the class "keycode"
+    node = nodes.literal(text, text, classes=["keycode"])
+    return [node], []
+
+# Register the new role
+roles.register_local_role('keycode', colorcode_role)
+
+def setup(app):
+    app.add_css_file('custom.css')
diff --git a/docs/getting_started/install.rst b/docs/getting_started/install.rst
@@ -0,0 +1,41 @@
+Installation
+============
+
+Installing the latest stable version
+------------------------------------
+
+To install the latest stable version of **torchtree**, run the following command:
+
+.. code-block:: bash
+
+    pip install torchtree
+
+
+Building torchtree from source
+------------------------------
+
+If you'd like to build **torchtree** from source, follow these steps:
+
+.. code-block:: bash
+
+    git clone https://github.com/4ment/torchtree
+    pip install torchtree/
+
+
+Programs Installed
+------------------
+
+By following either installation method, the following two programs will be installed:
+
+* :command:`torchtree-cli`: A command-line interface for generating JSON configuration files for your analyses.
+* :command:`torchtree`: The main program that runs inference algorithms using the provided JSON configuration file.
+
+
+To verify the installation or explore available options, you can use the following commands:
+
+.. code-block:: bash
+
+    torchtree --help
+    torchtree-cli --help
+
+These commands will display usage information and available options for both tools.
diff --git a/docs/getting_started/json_reference.rst b/docs/getting_started/json_reference.rst
diff --git a/docs/getting_started/quick_start.rst b/docs/getting_started/quick_start.rst
diff --git a/docs/index.rst b/docs/index.rst