Tune the mtry and ntree random forest parameters using a grid search approach.
Usage
tune(
x,
cls = "class",
mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2, mtry(x, cls = cls) +
mtry(x, cls = cls)/2, length.out = 4)),
ntree_range = 1000,
seed = 1234
)
# S4 method for AnalysisData
tune(
x,
cls = "class",
mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2, mtry(x, cls = cls) +
mtry(x, cls = cls)/2, length.out = 4)),
ntree_range = 1000,
seed = 1234
)
Arguments
- x
S4 object of class AnalysisData
- cls
sample information column to use
- mtry_range
numeric vector of mtry values to search
- ntree_range
numeric vector of ntree values to search
- seed
random number seed
Value
A list containing the optimal mtry and ntree parameters.
The list can be passed directly as the rf argument of the randomForest() method.
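As a rough illustration of how the returned list might be used, the sketch below tunes the parameters and then hands them to randomForest() through rf. It reuses the data preparation from the Examples section. The object names params and rf_analysis are illustrative, and it assumes that metabolyseR and metaboData are installed and that metabolyseR makes the %>% pipe available.

```r
library(metabolyseR)
library(metaboData)

# Prepare some data (same steps as in the Examples section)
x <- analysisData(abr1$neg[, 200:300], abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

# Tune once; the result is a list with `mtry` and `ntree` elements
params <- tune(x, cls = 'day')

# Pass the tuned list as the `rf` argument of randomForest()
rf_analysis <- randomForest(x, cls = 'day', rf = params)
```

Tuning once and reusing params keeps the later random forest analysis consistent with the tuned values.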
Details
Parameter tuning is performed by grid search of all combinations of the mtry_range and ntree_range vectors provided.
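The search grid can be pictured with a few lines of base R. This is only a sketch: base_mtry is a hypothetical stand-in for the value returned by mtry(x, cls = cls), the two ntree values are illustrative, and expand.grid() is used here to enumerate the combinations, which is not necessarily how the package does it internally.

```r
# Hypothetical baseline; in the package this comes from mtry(x, cls = cls)
base_mtry <- 10

# Same expression as the default `mtry_range` shown in Usage
mtry_range <- floor(seq(base_mtry - base_mtry / 2,
                        base_mtry + base_mtry / 2,
                        length.out = 4))

# Illustrative values only; the documented default is a single value, 1000
ntree_range <- c(500, 1000)

# One row per mtry/ntree combination to be evaluated
grid <- expand.grid(mtry = mtry_range, ntree = ntree_range)
print(grid)
```

With a baseline of 10, the default expression gives the mtry values 5, 8, 11 and 15, so two ntree values produce eight candidate combinations.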
The optimal parameter values are selected using out-of-bag estimates: the margin metric for classification and the rmse (root-mean-square error) metric for regression.
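The selection step might look something like the following base R sketch. The scores are made up for illustration, select_best() is a hypothetical helper rather than a package function, and it assumes that a larger margin and a smaller rmse indicate a better model.

```r
# Candidate combinations with made-up out-of-bag scores (illustrative only).
# Both metrics are shown side by side; a real run would use just one of them.
results <- data.frame(
  mtry   = c(5, 8, 11, 15),
  ntree  = 1000,
  margin = c(0.12, 0.19, 0.17, 0.15),  # classification: larger assumed better
  rmse   = c(2.4, 1.9, 2.1, 2.6)       # regression: smaller assumed better
)

# Hypothetical helper: return the best mtry/ntree pair as a list
select_best <- function(results, type = c("classification", "regression")) {
  type <- match.arg(type)
  best <- if (type == "classification") {
    which.max(results$margin)
  } else {
    which.min(results$rmse)
  }
  as.list(results[best, c("mtry", "ntree")])
}

best_params <- select_best(results, "classification")
```

Here best_params is a list with mtry and ntree elements, the same shape as the value documented above.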
Examples
library(metabolyseR)
library(metaboData)
## Prepare some data
x <- analysisData(abr1$neg[, 200:300], abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()
## Tune the `mtry` parameter for the `day` response
tune(x,cls = 'day')
#> $mtry
#> [1] 9
#>
#> $ntree
#> [1] 1000
#>