Tune the mtry and ntree random forest parameters using a grid search approach.

Usage

tune(
  x,
  cls = "class",
  mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2, mtry(x, cls = cls) +
    mtry(x, cls = cls)/2, length.out = 4)),
  ntree_range = 1000,
  seed = 1234
)

# S4 method for AnalysisData
tune(
  x,
  cls = "class",
  mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2, mtry(x, cls = cls) +
    mtry(x, cls = cls)/2, length.out = 4)),
  ntree_range = 1000,
  seed = 1234
)

Arguments

x

S4 object of class AnalysisData

cls

sample information column to use as the response

mtry_range

numeric vector of mtry values to search

ntree_range

numeric vector of ntree values to search

seed

random number seed
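
As a rough illustration of the default `mtry_range` expression shown under Usage, the sketch below evaluates it in base R. The value `default_mtry <- 10` is an assumption standing in for whatever `mtry(x, cls = cls)` returns for a given data set; it is not taken from the package.

```r
# Hypothetical stand-in for the value returned by mtry(x, cls = cls)
default_mtry <- 10

# Same arithmetic as the default mtry_range: four values spanning
# half the default below to half the default above, rounded down
mtry_range <- floor(seq(default_mtry - default_mtry / 2,
                        default_mtry + default_mtry / 2,
                        length.out = 4))

print(mtry_range)
#> [1]  5  8 11 15
```

Under that assumption, the default search covers four candidate values centred on the default `mtry`.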

Value

A list containing the optimal mtry and ntree parameters. The list is suitable for use as the rf argument of the randomForest() method.

Details

Parameter tuning is performed by a grid search over all combinations of the values supplied in the mtry_range and ntree_range vectors. The optimal parameter values are selected using out-of-bag estimates: the margin metric for classification and the rmse (root-mean-square error) metric for regression.
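
The grid search described above can be sketched in base R as follows. This is a minimal illustration and not the package's implementation: `oob_rmse()` is a hypothetical placeholder for fitting a forest and extracting its out-of-bag rmse, and the candidate values are made up for the example.

```r
# Candidate values, as they might be passed to mtry_range and ntree_range
mtry_range <- c(5, 8, 11, 15)
ntree_range <- c(500, 1000)

# Every combination of the two vectors forms the search grid
grid <- expand.grid(mtry = mtry_range, ntree = ntree_range)

# Hypothetical placeholder score: a real search would train a random
# forest here and return its out-of-bag rmse
oob_rmse <- function(mtry, ntree) {
  abs(mtry - 8) + 1000 / ntree
}

# Score each row of the grid
grid$rmse <- mapply(oob_rmse, grid$mtry, grid$ntree)

# For rmse the smallest value wins; for the margin metric it is assumed
# here that the largest value would be selected instead
best <- grid[which.min(grid$rmse), ]
optimal <- list(mtry = best$mtry, ntree = best$ntree)

str(optimal)
```

With these made-up scores, the selected combination is `mtry = 8` and `ntree = 1000`, returned in the same list shape as described under Value.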

Examples

library(metabolyseR)
library(metaboData)

## Prepare some data
x <- analysisData(abr1$neg[,200:300],abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

## Tune the `mtry` parameter for the `day` response
tune(x,cls = 'day')
#> $mtry
#> [1] 9
#> 
#> $ntree
#> [1] 1000
#>