Perform structural enrichment using over-representation analysis of explanatory m/z features from random forest.

Usage

structuralEnrichment(
  x,
  structural_classifications,
  p_adjust_method = "bonferroni",
  split = c("none", "trends"),
  ...
)

# S4 method for RandomForest,tbl_df
structuralEnrichment(
  x,
  structural_classifications,
  p_adjust_method = "bonferroni",
  split = c("none", "trends"),
  ...
)

# S4 method for RandomForest,Construction
structuralEnrichment(
  x,
  structural_classifications,
  p_adjust_method = "bonferroni",
  split = c("none", "trends"),
  ...
)

Arguments

x

an object of S4 class RandomForest

structural_classifications

the structural classifications corresponding to the m/z features present in the object specified for argument x. This should be either a tibble as returned by construction::classifications() or an object of S4 class Construction.

p_adjust_method

the p-value adjustment method. One of the methods listed in p.adjust.methods.

split

split the explanatory features into further groups based on their trends. See details.

...

arguments to pass to metabolyseR::explanatoryFeatures()

Value

An object of S4 class StructuralEnrichment.

Details

Over-representation analysis is performed on the explanatory m/z features for each structural class within each experimental class comparison using Fisher's exact test.
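
As a rough illustration of the idea, the following minimal sketch runs an over-representation test for a single structural class using only base R. All counts are hypothetical, and the one-sided alternative and the layout of the contingency table are assumptions made for illustration; they are not taken from the package internals.

```r
## Hypothetical counts for one structural class in one comparison
explanatory_in_class <- 12    # explanatory features assigned to the class
explanatory_total    <- 153   # all explanatory features
background_in_class  <- 40    # all features assigned to the class
background_total     <- 1706  # all features

## 2 x 2 contingency table: class membership against explanatory status
## (the matrix is filled column by column)
contingency <- matrix(
  c(
    explanatory_in_class,
    explanatory_total - explanatory_in_class,
    background_in_class - explanatory_in_class,
    (background_total - explanatory_total) -
      (background_in_class - explanatory_in_class)
  ),
  nrow = 2,
  dimnames = list(
    class = c('in class', 'not in class'),
    feature = c('explanatory', 'not explanatory')
  )
)

## One-sided test for over-representation (an assumption of this sketch)
result <- fisher.test(contingency, alternative = 'greater')

## Adjust alongside hypothetical p-values from other structural classes
p_values <- c(result$p.value, 0.04, 0.2, 0.6)
adjusted <- p.adjust(p_values, method = 'bonferroni')
```

With the Bonferroni method, each p-value is multiplied by the number of tests and capped at 1, so the adjusted values are never smaller than the raw ones.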

For argument split = 'trends', the explanatory features can be split into further groups based on their trends. This is not supported for unsupervised random forest.

For random forest classification, this is for binary comparisons only. Structural enrichment is performed separately on the up regulated and down regulated explanatory features for each comparison. The up regulated and down regulated groups are based on the trends of log2 ratios between the comparison classes. Up regulated explanatory features have a higher median intensity in the right-hand class than in the left-hand class of the comparison. The opposite is true for the down regulated explanatory features.
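
The trend assignment described above can be sketched in base R as follows. The data frame, the feature names and the class labels 'A' (left-hand) and 'B' (right-hand) are hypothetical, and this is only an illustration of the log2 ratio rule, not the package's own implementation.

```r
## Hypothetical intensities for two features in a binary comparison of
## class 'A' (left-hand) against class 'B' (right-hand)
intensities <- data.frame(
  class = rep(c('A', 'B'), each = 3),
  feature_1 = c(10, 12, 11, 40, 44, 42),
  feature_2 = c(80, 82, 78, 20, 19, 21)
)

## Median intensity of each feature within each class
medians <- aggregate(. ~ class, data = intensities, FUN = median)

## log2 ratio of the right-hand class over the left-hand class
log2_ratio <- log2(
  unlist(medians[medians$class == 'B', -1]) /
    unlist(medians[medians$class == 'A', -1])
)

## A positive ratio means a higher median intensity in the right-hand class
trend <- ifelse(log2_ratio > 0, 'up regulated', 'down regulated')
```

Here feature_1 has the higher median in class 'B' and so falls in the up regulated group, while feature_2 falls in the down regulated group.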

For random forest regression, the explanatory features are split based on their Spearman's correlation coefficient with the response variable prior to structural enrichment analysis, giving positively correlated and negatively correlated subgroups.
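
A minimal base R sketch of this split is shown below. The response values and feature intensities are hypothetical, and splitting on the sign of the coefficient is an assumption of this illustration rather than a confirmed detail of the package.

```r
## Hypothetical continuous response and feature intensities
response <- c(1, 2, 3, 4, 5, 6)
features <- data.frame(
  feature_1 = c(2.1, 2.9, 4.2, 4.8, 6.5, 7.0),
  feature_2 = c(9.0, 7.5, 7.9, 5.1, 4.0, 2.2)
)

## Spearman's correlation coefficient of each feature with the response
rho <- vapply(
  features,
  function(intensity) cor(intensity, response, method = 'spearman'),
  numeric(1)
)

## Split on the sign of the coefficient (an assumption of this sketch)
direction <- ifelse(
  rho > 0,
  'positively correlated',
  'negatively correlated'
)
```

Each subgroup would then be tested for over-representation on its own.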

Examples

## Perform random forest on the example data 
random_forest <- assigned_data %>% 
  metabolyseR::randomForest(
    cls = 'class'
  )

## Perform structural enrichment analysis using the example structural classifications
structuralEnrichment(
  random_forest,
  structural_classifications
)
#> 
#> Random forest classification 
#> 
#> Samples:	 60 
#> Features:	 1706 
#> Response:	 class 
#> # comparisons:	 1 
#> 
#> 153 explanatory m/z features.
#> 88 structural classes total.
#> 5 significantly enriched structural classes.

## An example using split = 'trends'
## Perform binary random forest comparisons on the example data
random_forest <- assigned_data %>% 
  metabolyseR::randomForest(
    cls = 'class',
    binary = TRUE
  )

## Perform structural enrichment analysis using the example structural classifications
structuralEnrichment(
  random_forest,
  structural_classifications,
  split = 'trends'
)
#> 
#> Random forest classification 
#> 
#> Samples:	 60 
#> Features:	 1706 
#> Response:	 class 
#> # comparisons:	 6 
#> 
#> 443 explanatory m/z features.
#> 88 structural classes total.
#> 75 significantly enriched structural classes.