Oriol Guitart edited this page Feb 16, 2015 · 3 revisions

TDC Network Inference App


The TDC Network Inference App is a Cytoscape application that implements the winning method of the DREAM8 Breast Cancer Network prediction challenge 1A. The network inference method consists of two components: a biological prior component and a Granger-causality-based algorithm component.

1. Biological Prior Component

For each pathway, represented as a graph G, a heat diffusion process is used to derive an update for all of the nodes in G. The approach ignores the directionality of edges in G and uses the unsigned adjacency matrix A, where A[i,j]=A[j,i]=1 whenever an edge is present between two nodes i and j. Let D be a diagonal matrix with D[i,i] equal to the number of other genes that gene i interacts with, and D[i,j]=0 for all i not equal to j. The heat diffusion-based update is the matrix exponential of (D-A)*t, where D-A is known as the Laplacian matrix of G, its exponentiated form is known as the diffusion kernel, and t is an arbitrary time step set to 0.1, a value that has been shown to be useful in the context of biological network discovery (Vandin et al., 2012; Paull et al., 2013).
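The construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not the app's actual code: the function name `diffusion_kernel` and the toy graph are hypothetical, and the sketch assumes the conventional heat-kernel sign, exp(-t(D-A)), whereas the description above writes the exponent as (D-A)*t without stating a sign convention.

```python
import numpy as np

def diffusion_kernel(adjacency, t=0.1):
    """Heat-diffusion kernel exp(-t * (D - A)) of an undirected graph.

    The negative sign is the usual heat-kernel convention and is an
    assumption of this sketch.
    """
    A = np.asarray(adjacency, dtype=float)
    # Ignore edge direction: A[i, j] = A[j, i] = 1 whenever an edge exists.
    A = ((A + A.T) > 0).astype(float)
    np.fill_diagonal(A, 0.0)
    D = np.diag(A.sum(axis=1))   # D[i, i] = number of neighbours of node i
    L = D - A                    # graph Laplacian
    # L is symmetric, so exp(-t L) = V diag(exp(-t * w)) V^T.
    w, V = np.linalg.eigh(L)
    return (V * np.exp(-t * w)) @ V.T

# Toy directed path 0 -> 1 -> 2; direction is discarded inside the function.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])
K = diffusion_kernel(A, t=0.1)
```

With this sign convention the kernel is symmetric and each row sums to one, so `K[i, j]` can be read as the share of heat that starts at node i and is found at node j after time t.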

Submatrices are extracted from each diffusion kernel by keeping only the rows and columns that correspond to antibody-targeted proteins. The extracted submatrices are then combined, creating one matrix that contains only antibody-targeted proteins. Before this summary matrix is used, its protein names are re-mapped to the names of the antibodies that target them.
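A rough sketch of the extraction and combination step follows. The rule used to combine submatrices is not stated here, so element-wise summation is an assumption of this example, and the gene names, antibody names, and helper functions (`extract_submatrix`, `combine`) are hypothetical placeholders.

```python
import numpy as np

def extract_submatrix(kernel, genes, targets):
    """Keep only the rows and columns of `kernel` whose gene is in `targets`."""
    kept = [g for g in genes if g in targets]
    idx = [genes.index(g) for g in kept]
    return kept, kernel[np.ix_(idx, idx)]

def combine(submatrices, all_proteins):
    """Sum per-pathway submatrices into one protein-by-protein matrix.

    Summation is an assumption; the text does not specify the rule.
    """
    pos = {p: i for i, p in enumerate(all_proteins)}
    total = np.zeros((len(all_proteins), len(all_proteins)))
    for names, sub in submatrices:
        idx = [pos[n] for n in names]
        total[np.ix_(idx, idx)] += sub
    return total

# Two toy "kernels" standing in for per-pathway diffusion kernels.
targets = {"AKT1", "MTOR", "EGFR"}           # antibody-targeted proteins
genes1, k1 = ["AKT1", "MTOR", "TP53"], np.arange(9.0).reshape(3, 3)
genes2, k2 = ["EGFR", "AKT1"], np.ones((2, 2))

subs = [extract_submatrix(k1, genes1, targets),
        extract_submatrix(k2, genes2, targets)]
all_proteins = ["AKT1", "EGFR", "MTOR"]
total = combine(subs, all_proteins)

# Hypothetical protein-to-antibody mapping used to relabel rows and columns.
protein_to_antibody = {"AKT1": "AKT_pS473", "EGFR": "EGFR_pY1068", "MTOR": "mTOR_pS2448"}
labels = [protein_to_antibody[p] for p in all_proteins]
```

Here `TP53` is dropped because it is not antibody-targeted, and `AKT1`, which appears in both toy pathways, accumulates contributions from both.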

2. Granger-causality-based Component

LASSO-penalized regression is applied to infer causal relationships. At each time point, each data probe is regressed on all other values in the data matrix from the same cell line and stimulus conditions. Different inhibitors are treated as different examples for the same regression task. The algorithm chooses the LASSO meta-parameter value such that all auto-regression weights are exactly zero. Any remaining non-zero weights are therefore assumed to contain causal information beyond the auto-regression.
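The penalty selection can be illustrated with a small self-contained sketch. Several things here are assumptions rather than details from the description: the plain coordinate-descent solver, the search over a fixed grid, the choice of the smallest grid value that zeroes the auto-regression weights, and the synthetic data. The names `lasso_cd`, `smallest_penalty_zeroing`, and `auto_cols` are hypothetical.

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink x toward zero by lam (the LASSO proximal step)."""
    return np.sign(x) * max(abs(x) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + lam*||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Residual with feature j's current contribution added back.
            partial = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ partial / n
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w

def smallest_penalty_zeroing(X, y, auto_cols, grid):
    """Return the smallest lam in grid whose fit has zero weight on auto_cols."""
    for lam in sorted(grid):
        w = lasso_cd(X, y, lam)
        if np.all(w[auto_cols] == 0.0):
            return lam, w
    return None, None

# Synthetic data: 8 rows (standing in for inhibitor conditions), 6 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 6))
y = 1.5 * X[:, 2] + 0.3 * X[:, 0] + 0.05 * rng.normal(size=8)
auto_cols = [0, 1]  # hypothetical: the response probe's own values at other times

# At or above lam_max every weight is zero, so the grid always has a solution.
lam_max = np.max(np.abs(X.T @ y)) / len(y)
grid = np.linspace(0.05, 1.1, 22) * lam_max
lam, w = smallest_penalty_zeroing(X, y, auto_cols, grid)
```

Any entries of `w` outside `auto_cols` that remain non-zero at the selected penalty are the ones treated as carrying information beyond the auto-regression.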

The algorithm determines the direction of causal links from the temporal positioning of non-zero weights. For a given response probe, if a non-zero weight appears in a feature probe from a previous or current time point, that weight is assigned to the inferred edge from the feature probe to the response probe. Conversely, if a non-zero weight comes from a future time point, the weight is assigned to the edge from the response probe to the feature probe.
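The direction rule can be written down directly. In the sketch below, the function `assign_edges` and the dictionary layout keyed by `(feature_probe, feature_time)` are illustrative assumptions, as are the probe names.

```python
def assign_edges(response, response_time, weights):
    """Map regression weights to directed edges by temporal position.

    weights: dict mapping (feature_probe, feature_time) -> regression weight.
    Returns a dict mapping (source, target) -> list of assigned weights.
    """
    edges = {}
    for (feature, t), w in weights.items():
        # Zero weights carry no edge; auto-regression terms are zero by construction.
        if w == 0 or feature == response:
            continue
        if t <= response_time:
            edge = (feature, response)   # past or current value: feature -> response
        else:
            edge = (response, feature)   # future value: response -> feature
        edges.setdefault(edge, []).append(w)
    return edges

# Response probe "B" measured at time index 2.
weights = {("A", 1): 0.4, ("C", 3): -0.2, ("D", 2): 0.0}
edges = assign_edges("B", 2, weights)
print(edges)  # {('A', 'B'): [0.4], ('B', 'C'): [-0.2]}
```

Probe `A` precedes the response, so it yields the edge A to B. Probe `C` lies in the future, so the edge is reversed to B to C. Probe `D` has zero weight and produces no edge.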

The weights of each regression task are divided by the sum of their absolute values, so that each weight lies in the interval [-1,1]. Finally, to combine across time points, the values for each edge are averaged.
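A short sketch of the normalization and averaging follows. It assumes that an edge absent at a given time point contributes zero to the average, which the description does not specify; the function names and the edge labels are hypothetical.

```python
def normalize(weights):
    """Divide each weight by the sum of absolute weights of the regression task."""
    total = sum(abs(w) for w in weights.values())
    if total == 0:
        return dict(weights)
    return {edge: w / total for edge, w in weights.items()}

def average_over_time(per_time_edges):
    """Average each edge's normalized weight across time points.

    Edges missing at a time point count as zero (an assumption of this sketch).
    """
    all_edges = set().union(*per_time_edges)
    n = len(per_time_edges)
    return {e: sum(d.get(e, 0.0) for d in per_time_edges) / n for e in all_edges}

# Two time points for the same response probe "B".
t1 = normalize({("A", "B"): 2.0, ("C", "B"): -2.0})   # weights become 0.5 and -0.5
t2 = normalize({("A", "B"): 1.0})                     # weight becomes 1.0
combined = average_over_time([t1, t2])
```

In this toy case the edge from A to B averages to 0.75 and the edge from C to B to -0.25, both within [-1,1] as the normalization guarantees.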