Tune the mtry and ntree random forest parameters using a grid search approach.
Usage
tune(
  x,
  cls = "class",
  mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2, mtry(x, cls = cls) +
    mtry(x, cls = cls)/2, length.out = 4)),
  ntree_range = 1000,
  seed = 1234
)
# S4 method for AnalysisData
tune(
  x,
  cls = "class",
  mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2, mtry(x, cls = cls) +
    mtry(x, cls = cls)/2, length.out = 4)),
  ntree_range = 1000,
  seed = 1234
)
Arguments
- x
S4 object of class AnalysisData
- cls
sample information column to use
- mtry_range
numeric vector of mtry values to search
- ntree_range
numeric vector of ntree values to search
- seed
random number seed
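The default mtry_range is easier to read with numbers substituted in. The sketch below evaluates the same expression for a hypothetical starting value of 10, which is an assumption for illustration only; in practice the value comes from mtry(x, cls = cls) and depends on the data.

```r
# Hypothetical stand-in for mtry(x, cls = cls); the real value depends on the data
m <- 10

# Same expression as the default: four evenly spaced values spanning
# m - m/2 to m + m/2, rounded down to whole numbers
mtry_range <- floor(seq(m - m/2, m + m/2, length.out = 4))
mtry_range
#> [1]  5  8 11 15
```

With the default ntree_range of 1000, only these four mtry values are compared, so the grid contains four combinations.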
Value
A list containing the optimal mtry and ntree parameters.
This is suitable for use as the rf argument in method randomForest().
Details
Parameter tuning is performed by a grid search over all combinations of the values in the mtry_range and ntree_range vectors provided.
The optimal parameter values are selected using out-of-bag estimates: the margin metric for classification and the rmse (root-mean-square error) metric for regression.
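The selection procedure described above can be sketched outside of the package. The following is a minimal illustration only, not the package's internal implementation: it assumes the CRAN randomForest package and the built-in iris data as stand-ins, and the candidate values are arbitrary.

```r
# Assumption: the CRAN 'randomForest' package and the built-in iris data
# stand in for the package internals and for real data.
library(randomForest)

set.seed(1234)

# Every combination of the candidate mtry and ntree values
grid <- expand.grid(mtry = c(1, 2, 3, 4), ntree = c(250, 500))

# Fit one forest per combination and record the mean out-of-bag margin
grid$margin <- vapply(seq_len(nrow(grid)), function(i) {
  fit <- randomForest::randomForest(
    x = iris[, 1:4],
    y = iris$Species,
    mtry = grid$mtry[i],
    ntree = grid$ntree[i]
  )
  # The votes stored on the fit are out-of-bag, so this is an OOB margin
  mean(as.numeric(randomForest::margin(fit)))
}, numeric(1))

# Larger margins indicate more confident correct classification
best <- grid[which.max(grid$margin), ]
optimal <- list(mtry = best$mtry, ntree = best$ntree)
optimal
```

For a regression response the same loop would apply, with the out-of-bag root-mean-square error recorded for each combination and the smallest value selected instead of the largest.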
Examples
library(metabolyseR)
library(metaboData)

## Prepare some data
x <- analysisData(abr1$neg[, 200:300], abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

## Tune the `mtry` parameter for the `day` response
tune(x, cls = 'day')
#> $mtry
#> [1] 9
#>
#> $ntree
#> [1] 1000
#>
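A further sketch shows how custom search ranges might be supplied and how the result could then be passed on through the rf argument of randomForest(). The range values here are chosen for illustration only, and the call to randomForest() assumes that its remaining arguments can be left at their defaults.

```r
library(metabolyseR)
library(metaboData)

## Prepare some data
x <- analysisData(abr1$neg[, 200:300], abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

## Search a custom grid of 4 x 2 = 8 combinations
## (illustrative values, not recommendations)
params <- tune(
  x,
  cls = 'day',
  mtry_range = c(3, 6, 9, 12),
  ntree_range = c(500, 1000)
)

## Pass the tuned parameters on via the rf argument
## (assumes the other randomForest() arguments are left at their defaults)
rf_analysis <- randomForest(x, cls = 'day', rf = params)
```

Adding more values to ntree_range multiplies the number of forests fitted, so run time grows with the product of the two vector lengths.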