Computes structural node embeddings for AntibodyForests networks. Supported algorithms are node2vec (https://arxiv.org/abs/1607.00653) and spectral graph embedding of either the adjacency or the Laplacian matrix. The node2vec model is currently available only if the R interface to Keras (the keras package) is installed.

AntibodyForests_embeddings(
  trees,
  graph.type,
  embedding.method,
  dim.reduction,
  color.by,
  num.walks,
  num.steps,
  p,
  q,
  window.size,
  num.negative.samples,
  embedding.dim,
  batch.size,
  epochs,
  tsne.perplexity,
  seed,
  parallel
)

Arguments

trees

AntibodyForests object or list of AntibodyForests objects - the sequence similarity or minimum spanning tree networks returned by the AntibodyForests function.

graph.type

string - the graph type stored in the AntibodyForests object that will be used as the function input. Currently supported network/analysis types: 'tree' for the minimum spanning trees or sequence similarity networks obtained from the main AntibodyForests function, 'heterogeneous' for the bipartite graphs obtained via AntibodyForests_heterogeneous, and 'dynamic' for the dynamic networks obtained from AntibodyForests_dynamics.

embedding.method

string - the embedding model/algorithm. 'node2vec' for an implementation of biased graph random walks and node2vec using the R keras package (may be slow on large graphs), 'spectral_adjacency' for spectral graph embedding of the adjacency matrix (using igraph's embed_adjacency_matrix() function), 'spectral_laplacian' for spectral graph embedding of the Laplacian matrix (using igraph's embed_laplacian_matrix() function).
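
As a rough illustration of the two spectral options, the sketch below calls the igraph functions named above on a toy graph. The toy tree and the choice of 2 embedding dimensions are assumptions made for the example; how AntibodyForests_embeddings calls these functions internally is not shown here.

```r
library(igraph)

# Toy undirected tree standing in for one lineage network (hypothetical data).
g <- make_tree(15, children = 2, mode = "undirected")

# Spectral embedding of the adjacency matrix into 2 dimensions.
adj_emb <- embed_adjacency_matrix(g, no = 2)

# Spectral embedding of the Laplacian matrix into 2 dimensions.
lap_emb <- embed_laplacian_matrix(g, no = 2)

# X holds one row per node and one column per embedding dimension.
dim(adj_emb$X)
dim(lap_emb$X)
```

Each row of X can then be treated as the embedding vector of the corresponding node.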

dim.reduction

string - dimensionality reduction algorithm for the resulting node2vec embeddings. Currently implemented methods include: 'umap', 'tsne' and 'pca'.
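
To show what the reduction step does in the 'pca' case, here is a minimal sketch using base R's prcomp() on a made-up embedding matrix. The matrix sizes and the use of prcomp() are assumptions for illustration, not necessarily what the function uses internally.

```r
set.seed(1)

# Hypothetical embedding matrix: 20 nodes x 16 latent dimensions.
emb <- matrix(rnorm(20 * 16), nrow = 20, ncol = 16)

# Principal component analysis on the centered embeddings.
pca <- prcomp(emb, center = TRUE, scale. = FALSE)

# Keep the first two components as 2D coordinates for a scatter plot.
coords <- pca$x[, 1:2]
head(coords)
```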

color.by

vector of strings - features to color the resulting scatter plots by. These features must be included as igraph vertex attributes when creating the AntibodyForests objects, by including them in the node.features parameter.
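
Since the coloring features must exist as igraph vertex attributes, a quick check before plotting can help. The sketch below uses a toy graph and invented feature names ('isotype', 'clonal_frequency'), both of which are assumptions for the example.

```r
library(igraph)

# Toy graph with one vertex attribute (hypothetical feature values).
g <- make_ring(5)
V(g)$isotype <- c("IgM", "IgG", "IgG", "IgA", "IgM")

# Features we would like to color by; 'clonal_frequency' was never attached.
color.by <- c("isotype", "clonal_frequency")

# Report requested features that are not vertex attributes of the graph.
missing_features <- setdiff(color.by, vertex_attr_names(g))
missing_features
```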

num.walks

integer - number of biased random walks to be performed for the node2vec training dataset.

num.steps

integer - number of steps per biased random walk.

p

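numeric - return parameter controlling how likely a random walk step is to revisit the node it just came from. In node2vec, the unnormalized weight of a return step is 1/p, so larger values of p make revisits less likely.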

q

numeric - in-out parameter controlling whether the walk moves to nodes closer to or farther from the previously visited node (e.g., with q > 1 the random walk is biased toward closer nodes; with q < 1 it 'jumps' to farther nodes more frequently).
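
The effect of p and q can be sketched with the second-order bias described in the node2vec paper: a candidate next node gets weight 1/p if it is the previous node, 1 if it is adjacent to the previous node, and 1/q otherwise. The helper node2vec_bias() and the toy adjacency matrix below are hypothetical and only illustrate the idea; they are not the package's implementation.

```r
# Transition probabilities for a walk that just moved prev -> curr.
# adj is a symmetric 0/1 adjacency matrix; p and q as in node2vec.
node2vec_bias <- function(adj, prev, curr, p = 1, q = 1) {
  nbrs <- which(adj[curr, ] != 0)
  w <- vapply(nbrs, function(x) {
    if (x == prev) {
      1 / p          # step back to the previous node
    } else if (adj[prev, x] != 0) {
      1              # candidate is also a neighbor of the previous node
    } else {
      1 / q          # candidate is farther away from the previous node
    }
  }, numeric(1))
  names(w) <- nbrs
  w / sum(w)         # normalize weights to probabilities
}

# Toy graph with edges 1-2, 2-3, 1-4, 2-4.
adj <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(2, 3), c(1, 4), c(2, 4))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# With p = 2 and q = 0.5, the far node 3 is favored over returning to node 1.
probs <- node2vec_bias(adj, prev = 1, curr = 2, p = 2, q = 0.5)
probs
```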

window.size

integer - size of sampling window in the skipgram model.
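
To make the role of the window concrete, the sketch below turns one walk into (target, context) training pairs, pairing each node with the nodes at most window.size positions away. The helper skipgram_pairs() and the example walk are hypothetical; negative sampling is not shown.

```r
# Build skip-gram (target, context) pairs from a single random walk.
skipgram_pairs <- function(walk, window.size = 2) {
  n <- length(walk)
  pairs <- lapply(seq_len(n), function(i) {
    # Positions inside the window around i, excluding i itself.
    ctx <- setdiff(max(1, i - window.size):min(n, i + window.size), i)
    data.frame(
      target = rep(walk[i], length(ctx)),
      context = walk[ctx],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, pairs)
}

# A walk over four nodes, using a window of one step on each side.
walk <- c("a", "b", "c", "d")
pairs <- skipgram_pairs(walk, window.size = 1)
pairs
```

A larger window produces more pairs per walk and links nodes that are farther apart along the walk.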

num.negative.samples

integer - number of negative samples to be considered in the skipgram model.

embedding.dim

integer - latent/embedding dimension of the node2vec output vectors.

batch.size

integer - training batch size of the node2vec model.

epochs

integer - number of training epochs for the node2vec model.

tsne.perplexity

numeric - perplexity parameter for the t-SNE dimensionality reduction.

seed

integer - random seed for the random walk steps of the node2vec model.

parallel

boolean - whether to execute the random walks in parallel or not.

Value

A scatterplot of reduced vector embeddings for each node in the graphs, colored by the features specified in color.by.

Examples

if (FALSE) {
  AntibodyForests_embeddings(
    output_networks,
    graph.type = 'tree', embedding.method = 'node2vec',
    dim.reduction = 'pca', num.walks = 10, num.steps = 10,
    embedding.dim = 64, batch.size = 32, epochs = 50
  )
}