Title: | Data Analysis Tools over Space of Phylogenetic Trees Using Tropical Geometry |
---|---|
Description: | Process phylogenetic trees with tropical support vector machine and principal component analysis defined with tropical geometry. Details about tropical support vector machine are available in : Tang, X., Wang, H. & Yoshida, R. (2020) <arXiv:2003.00677>. Details about tropical principle component analysis are available in : Page, R., Yoshida, R. & Zhang L. (2020) <doi:10.1093/bioinformatics/btaa564> and Yoshida, R., Zhang, L. & Zhang, X. (2019) <doi:10.1007/s11538-018-0493-4>. |
Authors: | Houjie Wang [aut, cre], Kaizhang Wang [aut], Grady Weyenberg [aut], Xiaoxian Tang [aut], Ruriko Yoshida [aut] |
Maintainer: | Houjie Wang <[email protected]> |
License: | GPL-3 |
Version: | 1.2.1 |
Built: | 2025-01-22 05:00:34 UTC |
Source: | https://github.com/houjiewang/rtropical |
Simulated Tree Data with Different Proximity Parameter Value
data(all_trees)
data(all_trees)
A list of length 12 with each element a sublist containing an
ape multiPhylo
object with 300 rooted trees on 5 tips and the tree categories.
Apicomplexa gene trees sample data set.
apicomplexa
apicomplexa
An ape multiPhylo
object with 268 rooted trees on 8 tips.
Chih-Horng Kuo, John P. Wares, Jessica C. Kissinger. The Apicomplexan Whole-Genome Phylogeny: An Analysis of Incongruence among Gene Trees Molecular Biology and Evolution, Volume 25, Issue 12, December 2008, Pages 2689–2698.
Unifies tip labels of all phylogenetic trees in multiPhylo
object the same as the first tree and returns the
cophenetic distance of their corresponding chronogram.
## S3 method for class 'multiPhylo' as.matrix(x, tipOrder = x[[1]]$tip.label, parallel = FALSE, ncores = 2, ...)
## S3 method for class 'multiPhylo' as.matrix(x, tipOrder = x[[1]]$tip.label, parallel = FALSE, ncores = 2, ...)
x |
an object of class |
tipOrder |
a numeric vector of order of leaf names to which all trees
in the |
parallel |
a logical value indicating if parallel computing should be used. (default: FALSE) |
ncores |
a numeric value indicating the number of threads utilized for multi-cored CPUs. (default: 2) |
... |
Not used. Other arguments to as.vector |
A data matrix with each row a vector representation of a chronogram. Each element of the vector is the distance between two leaves.
data(apicomplexa) data <- as.matrix(apicomplexa[1: 10]) # matrixize first ten trees
data(apicomplexa) data <- as.matrix(apicomplexa[1: 10]) # matrixize first ten trees
Computes the cophenetic distance and outputs them in a vector
of a phylogenetic tree in phylo
object
## S3 method for class 'phylo' as.vector(x, mode = "any")
## S3 method for class 'phylo' as.vector(x, mode = "any")
x |
A object of class phylo |
mode |
The same as |
A vector with its elements the distance between two leaves of the tree.
library(ape) tree <- rcoal(5) tree_vec <- as.vector(tree)
library(ape) tree <- rcoal(5) tree_vec <- as.vector(tree)
Obtain the optimal tropical hyperplane in the form of vectors from a cv.tropsvm object.
## S3 method for class 'cv.tropsvm' coef(object, ...)
## S3 method for class 'cv.tropsvm' coef(object, ...)
object |
a fitted |
... |
Not used. Other arguments. |
An output of the apex of the fitted optimal tropical hyperplane.
Obtain the optimal tropical hyperplane in the form of vectors from a tropsvm object.
## S3 method for class 'tropsvm' coef(object, ...)
## S3 method for class 'tropsvm' coef(object, ...)
object |
a fitted |
... |
Not used. Other arguments. |
An output of the apex of the fitted optimal tropical hyperplane.
Conduct k-fold cross validation for tropsvm and return an object "cv.tropsvm"
.
cv.tropsvm(x, y, parallel = FALSE, nfold = 10, nassignment = 10, ncores = 2)
cv.tropsvm(x, y, parallel = FALSE, nfold = 10, nassignment = 10, ncores = 2)
x |
a data matrix, of dimension nobs x nvars; each row is an observation vector. |
y |
a response vector with one label for each row/component of x. |
parallel |
a logical value indicating if parallel computing should be used. (default: FALSE) |
nfold |
a numeric value of the number of data folds for cross-validation. (default: 10) |
nassignment |
a numeric value indicating the size of the parameter grid of assignments. (default: 10) |
ncores |
a numeric value indicating the number of threads utilized for multi-cored CPUs. (default: 2) |
object with S3 class cv.tropsvm
containing the fitted model, including:
apex |
The negative apex of the fitted optimal tropical hyperplane. |
assignment |
The best |
index |
The best classification method tuned by cross-validation. |
levels |
The name of each category, consistent with categories in |
accuracy |
The validation accuracy for each fold. |
nfold |
The number of folds used in cross-validation. |
summary
, predict
, coef
and the tropsvm
function.
# data generation library(Rfast) set.seed(101) e <- 20 n <- 10 N <- 10 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm cv_tropsvm_fit <- cv.tropsvm(x, y, parallel = FALSE) summary(cv_tropsvm_fit) coef(cv_tropsvm_fit) # test with new data pred <- predict(cv_tropsvm_fit, newx) # check with accuracy table(pred, newy) # compute testing accuracy sum(pred == newy) / length(newy)
# data generation library(Rfast) set.seed(101) e <- 20 n <- 10 N <- 10 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm cv_tropsvm_fit <- cv.tropsvm(x, y, parallel = FALSE) summary(cv_tropsvm_fit) coef(cv_tropsvm_fit) # test with new data pred <- predict(cv_tropsvm_fit, newx) # check with accuracy table(pred, newy) # compute testing accuracy sum(pred == newy) / length(newy)
Coelacanths genome and transcriptome data
lungfish
lungfish
An ape multiPhylo
object with 1193 rooted trees on 10 tips.
Tom M. W. Nye, Xiaoxian Tang, Grady Weyenberg and Ruriko Yoshida. Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees, Biometrika, Volume 104, Issue 4, December 2017, Pages 901–922.
Visualize the second order tropical principle components in troppca
as a tropical triangle with projections on a two-dimensional plot via tropical isometry.
## S3 method for class 'troppca' plot( x, point_class = NULL, fw = FALSE, plot.tree = FALSE, leaf.names = NULL, ntopology = 6, ... )
## S3 method for class 'troppca' plot( x, point_class = NULL, fw = FALSE, plot.tree = FALSE, leaf.names = NULL, ntopology = 6, ... )
x |
a fitted |
point_class |
a vector of labels of all points in the given data matrix. Not needed for unlabeled data. (default: NULL) |
fw |
a logical variable to determine if to add Fermat-Weber point of the data projection. (default: FALSE) |
plot.tree |
a logical value indicating if to plot tree topology (default: FALSE) |
leaf.names |
the name of the leaves of the first tree in the data set (default: NULL) |
ntopology |
a numeric value indicating the number of most frequent tree topology to plot (default: 6) |
... |
Not used. Other arguments to plot |
plot.troppca
does not return anything other than the plot.
Predicts values based upon a model trained by cv.tropsvm
.
## S3 method for class 'cv.tropsvm' predict(object, newx, ...)
## S3 method for class 'cv.tropsvm' predict(object, newx, ...)
object |
a fitted |
newx |
a data matrix, of dimension nobs x nvars used as testing data. |
... |
Not used. Other arguments to predict. |
A vector of predicted values of a vector of labels.
summary
, coef
and the cv.tropsvm
function.
# data generation library(Rfast) e <- 20 n <- 10 N <- 10 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm cv_tropsvm_fit <- cv.tropsvm(x, y, parallel = FALSE) # test with new data pred <- predict(cv_tropsvm_fit, newx) # check with accuracy table(pred, newy) # compute testing accuracy sum(pred == newy) / length(newy)
# data generation library(Rfast) e <- 20 n <- 10 N <- 10 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm cv_tropsvm_fit <- cv.tropsvm(x, y, parallel = FALSE) # test with new data pred <- predict(cv_tropsvm_fit, newx) # check with accuracy table(pred, newy) # compute testing accuracy sum(pred == newy) / length(newy)
Predicts values based upon a model trained by tropsvm
.
## S3 method for class 'tropsvm' predict(object, newx, ...)
## S3 method for class 'tropsvm' predict(object, newx, ...)
object |
a fitted |
newx |
a data matrix, of dimension nobs x nvars used as testing data. |
... |
Not used. Other arguments to predict. |
A vector of predicted values of a vector of labels.
summary
, coef
and the tropsvm
function.
# data generation library(Rfast) e <- 100 n <- 10 N <- 10 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm tropsvm_fit <- tropsvm(x, y, auto.assignment = TRUE, ind = 1) # test with new data pred <- predict(tropsvm_fit, newx) # check with accuracy table(pred, newy) # compute testing accuracy sum(pred == newy) / length(newy)
# data generation library(Rfast) e <- 100 n <- 10 N <- 10 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm tropsvm_fit <- tropsvm(x, y, auto.assignment = TRUE, ind = 1) # test with new data pred <- predict(tropsvm_fit, newx) # check with accuracy table(pred, newy) # compute testing accuracy sum(pred == newy) / length(newy)
Read NEXUS-formatted trees from two categories into a data matrix
read.nexus.to.data.matrix(data.file1, data.file2)
read.nexus.to.data.matrix(data.file1, data.file2)
data.file1 |
A data set with trees from one category. |
data.file2 |
A data set with trees from the other category. |
A data matrix with the first x rows corresponding the x trees in the first file and the last y rows are the trees from the second file.
Read Newick-formatted trees in two categories into a data matrix
read.tree.to.data.matrix(data.file1, data.file2)
read.tree.to.data.matrix(data.file1, data.file2)
data.file1 |
A file containing trees in Newick form in a category. |
data.file2 |
A file containing trees in Newick form in an assumed different category. |
read.tree.to.data.matrix
has the same return as read.nexus.to.data.matrix
.
Simulated Tree Data
data(sim_trees)
data(sim_trees)
An ape multiPhylo
object with 300 rooted trees on 5 tips.
Return a summary with a more detailed explanation of the object "cv.tropsvm"
.
## S3 method for class 'cv.tropsvm' summary(object, ...)
## S3 method for class 'cv.tropsvm' summary(object, ...)
object |
a fitted |
... |
Not used. Other arguments to summary. |
A summary of the crucial information of a tropical support vector machine is printed, including the selected best assignment and classification methods and the validation accuracy of each data fold. The summary section of classification methods specifies the sectors and their intersections used to classify points of two different categories.
predict
, coef
and the cv.tropsvm
function.
# data generation library(Rfast) e <- 20 n <- 10 N <- 10 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm with cross-validation cv_tropsvm_fit <- cv.tropsvm(x, y, parallel = FALSE) summary(cv_tropsvm_fit)
# data generation library(Rfast) e <- 20 n <- 10 N <- 10 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm with cross-validation cv_tropsvm_fit <- cv.tropsvm(x, y, parallel = FALSE) summary(cv_tropsvm_fit)
Compute the tropical determinant for a given matrix. This is equivalent to solving an assignment problem.
tropdet(x)
tropdet(x)
x |
a square matrix |
The determinant of the given matrix,
R <- matrix(sample(1:9, 9), nrow = 3) tropdet(R)
R <- matrix(sample(1:9, 9), nrow = 3) tropdet(R)
Compute the tropical Fermat-Weber (FW) point for a given data matrix. The FW point minimizes the summed tropical distance to the trees described in the data matrix.
tropFW(x)
tropFW(x)
x |
a data matrix, of dimension nobs x nvars; each row is an observation vector. |
A list containing:
fw |
The fermat-weber point. |
distsum |
The sum of distance from each observation to the fermat-weber point. |
x <- matrix(rnorm(100), ncol = 10) tropFW(x)
x <- matrix(rnorm(100), ncol = 10) tropFW(x)
Approximate the principal component as a tropical linear space
for a given data matrix and returns the results as an object of class troppca
.
troppca.linsp(x, pcs = 2, iteration = list(), ncores = 2)
troppca.linsp(x, pcs = 2, iteration = list(), ncores = 2)
x |
a data matrix, of size n x e, with each row an observation vector. e is the dimension of the tropical space |
pcs |
a numeric value indicating the order of principal component. (default: 2) |
iteration |
a list with arguments controlling the iteration of the algorithm.
|
ncores |
a numeric value indicating the number of threads utilized for multi-cored CPUs. (default: 2) |
A list of S3 class "troppca"
, including:
pc |
The principal component as a tropical linear space |
obj |
The tropical PCA objective, the sum of tropical distance from each point to the projection. |
projection |
The projections of all data points. |
type |
The geometry of principal component. |
library(Rfast) n <- 100 e <- 10 sig2 <- 1 x <- rbind(rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(sig2, e))) troppca_fit <- troppca.linsp(x)
library(Rfast) n <- 100 e <- 10 sig2 <- 1 x <- rbind(rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(sig2, e))) troppca_fit <- troppca.linsp(x)
Approximate the principal component as a tropical polytope converted from tropical linear space for a given data matrix
via MCMC and return the results as an object of class troppca
.
troppca.linsp2poly(x, pcs = 2, iteration = list(), ncores = 2)
troppca.linsp2poly(x, pcs = 2, iteration = list(), ncores = 2)
x |
a data matrix, of size n x e, with each row an observation vector. e is the dimension of the tropical space |
pcs |
a numeric value indicating the order of principal component. (default: 2) |
iteration |
a list with arguments controlling the iteration of the algorithm.
|
ncores |
a numeric value indicating the number of threads utilized for multi-cored CPUs. (default: 2) |
A list of S3 class "troppca"
, including:
pc |
The principal component as a tropical linear space |
obj |
The tropical PCA objective, the sum of tropical distance from each point to the projection. |
projection |
The projections of all data points. |
type |
The geometry of principal component. |
library(Rfast) n <- 50 e <- 50 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) troppca_fit <- troppca.linsp2poly(x)
library(Rfast) n <- 50 e <- 50 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) troppca_fit <- troppca.linsp2poly(x)
Approximates the principal component as a tropical polytope for a given data matrix
via MCMC and return the results as an object of class troppca
.
troppca.poly(x, pcs = 2, nsample = 1000, ncores = 2)
troppca.poly(x, pcs = 2, nsample = 1000, ncores = 2)
x |
a data matrix, of size n x e, with each row an observation vector. e is the dimension of the tropical space#' |
pcs |
a numeric value indicating the order of principal component. (default: 2) |
nsample |
a numeric value indicating the number of samples of MCMC. (default: 1000) |
ncores |
a numeric value indicating the number of threads utilized for multi-cored CPUs. (default: 2) |
A list of S3 class "troppca"
, including:
pc |
The principal component as a tropical linear space |
obj |
The tropical PCA objective, the sum of tropical distance from each point to the projection. |
projection |
The projections of all data points. |
type |
The geometry of principal component. |
library(Rfast) n <- 50 e <- 50 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) troppca_fit <- troppca.poly(x) plot(troppca_fit)
library(Rfast) n <- 50 e <- 50 s <- 5 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) troppca_fit <- troppca.poly(x) plot(troppca_fit)
Compute projection of data points on a given tropical linear space.
tropproj.linsp(x, V)
tropproj.linsp(x, V)
x |
a data matrix, of size n x e, with each row an observation. |
V |
a data matrix, of dimension s x e, with each row a basis of tropical linear space. e is the dimension of the tropical space and s is the dimension of the linear space. |
A matrix of projections of all data points.
library(Rfast) n <- 100 e <- 10 sig2 <- 1 s <- 3 x <- rbind(rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(sig2, e))) V <- matrix(runif(s * e, -10, 10), nrow = s, ncol = e) x_proj <- tropproj.linsp(x, V) head(x_proj)
library(Rfast) n <- 100 e <- 10 sig2 <- 1 s <- 3 x <- rbind(rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(sig2, e))) V <- matrix(runif(s * e, -10, 10), nrow = s, ncol = e) x_proj <- tropproj.linsp(x, V) head(x_proj)
Project a point onto a given tropical polytope.
tropproj.poly(x, tconv)
tropproj.poly(x, tconv)
x |
a data vector, of length e. |
tconv |
a data matrix, of size e x s, with each column a vertex of the tropical polytope. e is the dimension of the tropical space and s is the number of vertices of the polytope |
A projected vector on the given tropical polytope.
# Generate a tropical polytope consisting of three trees each with 5 leaves library(ape) pltp <- sapply(1:3, function(i) { as.vector(rcoal(5)) }) # Generate an observation and vectorize it tree <- rcoal(5) tree_vec <- as.vector(tree) tropproj.poly(tree_vec, pltp)
# Generate a tropical polytope consisting of three trees each with 5 leaves library(ape) pltp <- sapply(1:3, function(i) { as.vector(rcoal(5)) }) # Generate an observation and vectorize it tree <- rcoal(5) tree_vec <- as.vector(tree) tropproj.poly(tree_vec, pltp)
Fit a discriminative two-class classifier via linear programming defined by the tropical hyperplane which maximizes the minimum tropical distance from data points to itself in order to separate the data points into sectors (half-spaces) in the tropical projective torus.
tropsvm(x, y, auto.assignment = FALSE, assignment = NULL, ind = 1)
tropsvm(x, y, auto.assignment = FALSE, assignment = NULL, ind = 1)
x |
a data matrix, of dimension nobs x nvars; each row is an observation vector. |
y |
a response vector with one label for each row/component of x. |
auto.assignment |
a logical value indicating if to provide an |
assignment |
a numeric vector of length 4 indicating the sectors of tropical hyperplane that the
data will be assigned to. The first and third elements in the |
ind |
a numeric value or a numeric vector ranging from 1 to 70 indicating which classification method
to be used. There are 70 different classification methods. Details of a given method can be retrieved by |
An object with S3 class tropsvm
containing the fitted model, including:
apex |
The negative apex of the fitted optimal tropical hyperplane. |
assignment |
The user-input or auto-found |
index |
The user-input classification method. |
levels |
The name of each category, consistent with categories in |
predict
, coef
and the cv.tropsvm
function.
# data generation library(Rfast) e <- 100 n <- 10 N <- 100 s <- 10 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm tropsvm_fit <- tropsvm(x, y, auto.assignment = TRUE, ind = 1) coef(tropsvm_fit) # test with new data pred <- predict(tropsvm_fit, newx) # check with accuracy table(pred, newy) # compute testing accuracy sum(pred == newy) / length(newy)
# data generation library(Rfast) e <- 100 n <- 10 N <- 100 s <- 10 x <- rbind( rmvnorm(n, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(n, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) y <- as.factor(c(rep(1, n), rep(2, n))) newx <- rbind( rmvnorm(N, mu = c(5, -5, rep(0, e - 2)), sigma = diag(s, e)), rmvnorm(N, mu = c(-5, 5, rep(0, e - 2)), sigma = diag(s, e)) ) newy <- as.factor(rep(c(1, 2), each = N)) # train the tropical svm tropsvm_fit <- tropsvm(x, y, auto.assignment = TRUE, ind = 1) coef(tropsvm_fit) # test with new data pred <- predict(tropsvm_fit, newx) # check with accuracy table(pred, newy) # compute testing accuracy sum(pred == newy) / length(newy)