
Overview

This vignette outlines the client-side functionality of the grover package for accessing a raw mass spectrometry data repository hosted by a grover API. This functionality includes querying the repository contents, file transfer, file information retrieval, sample information retrieval and raw data conversion.

To get started we can first load the grover package.
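
Loading follows the usual pattern for an installed R package. The sketch below assumes grover has already been installed and only attaches it when it is available, so the snippet also runs cleanly on a machine without the package.

```r
# Attach grover when it is installed; otherwise report that it is missing.
grover_available <- requireNamespace("grover", quietly = TRUE)

if (grover_available) {
  library(grover)
} else {
  message("The grover package is not installed on this system.")
}
```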

Example API

For this example we will run an example grover API provided by the package. It runs as a background process, which allows us to interact with the API without having to use an alternative R console. To activate it, run the following:

grover_host <- grover(host = "127.0.0.1",
                     port = 8000,
                     auth = "1234",
                     repository = system.file('repository',
                                              package = 'grover'))

api <- groverAPI(grover_host,
                 background = TRUE,
                 log_dir = paste0(tempdir(),'/logs'),
                 temp_dir = paste0(tempdir(),'/temp'))

For further details on hosting a grover API see the Hosting a grover API vignette.

grover API host details

In order to access the API, we need to first provide the host details of the grover API. There are two ways to do this. The primary method is through the use of a configuration file that can then be parsed and specified when the API is activated. This should be in YAML format and have the structure shown below:

host: 127.0.0.1
port: 8000
auth: 1234

This specifies the host address, the port, and the authentication key, which must match the key with which the host has been configured.

The package contains an example file that we can parse here using the readGrover() function.

grover_client <- readGrover(system.file('grover_client.yml',package = 'grover'))

This returns an S4 object of class GroverClient that contains the API host information. The host information can be viewed by printing the object:

print(grover_client)
#> 
#> Grover Information
#> 
#> Host:        127.0.0.1 
#> Port:         8000 
#> Authentication:   1234
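
A configuration file with the same three fields can also be written from R before it is parsed. The sketch below is only illustrative: the file name and the use of tempdir() are assumptions, and the readGrover() call is guarded so that the snippet still runs where grover is not installed.

```r
# Write a client configuration with the three fields described above.
# The file name and location are arbitrary choices for this sketch.
config_path <- file.path(tempdir(), "grover_client.yml")
writeLines(c("host: 127.0.0.1",
             "port: 8000",
             "auth: 1234"),
           config_path)

# Parse the file with readGrover() when grover is available.
if (requireNamespace("grover", quietly = TRUE)) {
  my_client <- grover::readGrover(config_path)
  print(my_client)
}
```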

The second method, and the one that will be used for this example, is to specify the host details directly using the grover() function, as follows:

grover_client <- grover(host = "127.0.0.1",
                        port = 8000,
                        auth = "1234")

This enables us to access the grover API hosting the example data repository included in the package.

Using this host information, we can first check that the API is running using the extant() function as shown below:

extant(grover_client)
#> [1] TRUE
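
Because the example API is started as a background process, it may take a moment before it responds. One possible way to handle this is to poll until a check succeeds. The helper below, waitUntil(), is hypothetical and not part of grover. It treats both an error and a non-TRUE result as "not ready yet", since it is not stated here how extant() behaves when the host is unreachable.

```r
# Hypothetical helper (not part of grover): call `check` repeatedly until it
# returns TRUE or the attempts run out.
waitUntil <- function(check, attempts = 10, delay = 0.5) {
  for (i in seq_len(attempts)) {
    # An error or any non-TRUE result counts as "not ready yet".
    ready <- tryCatch(isTRUE(check()), error = function(e) FALSE)
    if (ready) {
      return(TRUE)
    }
    Sys.sleep(delay)
  }
  FALSE
}

# Intended usage with the client defined above:
# waitUntil(function() extant(grover_client))
```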

Querying the data repository contents

We can list the instruments available within the data repository using the following:

listInstruments(grover_client)
#> [1] "Thermo-Exactive"

As can be seen above, there is a single instrument available named Thermo-Exactive. To list the experiment directories available for this instrument, we can use the code below.

listDirectories(grover_client,'Thermo-Exactive')
#> [1] "Experiment_1"

This shows a single experiment data directory available. We can then list the contents of this directory to identify the data files available:

listFiles(grover_client,'Thermo-Exactive','Experiment_1')
#> [1] "QC01.raw"

We can see that there is a single raw data file available in this example repository.
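
The three listing functions can be chained to build an index of the whole repository. The helper below, repositoryIndex(), is a hypothetical sketch rather than part of grover. It assumes that grover is attached and that each listing function returns a character vector, as in the output shown above.

```r
# Hypothetical helper (not part of grover): walk instruments, experiment
# directories and files, returning one row per file.
repositoryIndex <- function(client) {
  rows <- list()
  for (instrument in listInstruments(client)) {
    for (directory in listDirectories(client, instrument)) {
      files <- listFiles(client, instrument, directory)
      if (length(files) > 0) {
        rows[[length(rows) + 1]] <- data.frame(instrument = instrument,
                                               directory = directory,
                                               file = files,
                                               stringsAsFactors = FALSE)
      }
    }
  }
  # Return an empty table with the same columns when nothing was found.
  if (length(rows) == 0) {
    return(data.frame(instrument = character(),
                      directory = character(),
                      file = character(),
                      stringsAsFactors = FALSE))
  }
  do.call(rbind, rows)
}

# Intended usage with the client defined above:
# repositoryIndex(grover_client)
```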

File information

File information such as file size and creation date can be retrieved. We can see this for the example file QC01.raw using the following:

fileInfo(grover_client,'Thermo-Exactive','Experiment_1','QC01.raw')
#> # A tibble: 1 × 6
#>   instrument      directory    file     extension        size  birth_time
#>   <chr>           <chr>        <chr>    <chr>     <fs::bytes>       <dbl>
#> 1 Thermo-Exactive Experiment_1 QC01.raw raw             6.88M 1676905442.

This can also be done directory-wide when multiple files are available.

directoryFileInfo(grover_client,'Thermo-Exactive','Experiment_1')
#> # A tibble: 1 × 6
#>   instrument      directory    file     extension        size  birth_time
#>   <chr>           <chr>        <chr>    <chr>     <fs::bytes>       <dbl>
#> 1 Thermo-Exactive Experiment_1 QC01.raw raw             6.88M 1676905442.
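
The birth_time column is returned as a number. Assuming it holds seconds since the Unix epoch (which the magnitude of the value suggests, but which is not stated here), it can be converted to a date-time with base R. The value below is copied from the printed output above.

```r
# birth_time value as printed in the output above (assumed to be seconds
# since 1970-01-01 00:00:00 UTC).
birth_time <- 1676905442

# Convert to a date-time in UTC and format it for display.
created <- as.POSIXct(birth_time, origin = "1970-01-01", tz = "UTC")
format(created, "%Y-%m-%d %H:%M:%S")
#> [1] "2023-02-20 15:04:02"
```

The same conversion could be applied to the whole birth_time column of the returned tibble.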

Transfer files

Individual files can be transferred from the repository using transferFile(), stipulating the instrument, experiment directory and file name. The outDir argument allows us to declare where the file will be downloaded to. In the example below, the file will be transferred to the current working directory.

transferFile(grover_client,
             'Thermo-Exactive',
             'Experiment_1',
             'QC01.raw',
             outDir = '.')

Similarly, we can transfer an entire directory:

transferDirectory(grover_client,
                  'Thermo-Exactive',
                  'Experiment_1',
                  outDir = '.')
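
Downloading into the working directory can clutter a project, so a dedicated download directory may be preferable. The sketch below prepares one first. The directory name is an arbitrary choice, and because it is not stated here whether the transfer functions create a missing outDir themselves, the directory is created explicitly. The transfer is only attempted when a grover_client object exists and the API responds.

```r
# Create a dedicated download directory (the name is an arbitrary choice).
out_dir <- file.path(tempdir(), "grover_downloads")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Only attempt the transfer when a client exists and the API is reachable.
if (exists("grover_client") && isTRUE(extant(grover_client))) {
  transferFile(grover_client,
               'Thermo-Exactive',
               'Experiment_1',
               'QC01.raw',
               outDir = out_dir)
}

# List whatever has been downloaded so far.
list.files(out_dir)
```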

Sample information

Thermo .raw mass spectrometry data files contain sample meta information within the file headers. This can be extracted and retrieved, in the form of a tibble, for a given file using:

sampleInfo(grover_client,'Thermo-Exactive','Experiment_1','QC01.raw')
#> QC01.raw ✔
#> # A tibble: 1 × 39
#>   `RAW file` RAW file …¹ Creat…² Opera…³ Numbe…⁴ Descr…⁵ Instr…⁶ Instr…⁷ Instr…⁸
#>   <chr>      <chr>       <chr>   <chr>     <dbl> <chr>   <chr>   <chr>   <chr>  
#> 1 QC01.raw   64          04/25/… Thermo        2 ""      Exacti… Thermo… C:/Xca…
#> # … with 30 more variables: `Serial number` <chr>, `Software version` <chr>,
#> #   `Firmware version` <chr>, Units <chr>, `Mass resolution` <chr>,
#> #   `Number of scans` <dbl>, `Number of ms2 scans` <dbl>, `Scan range` <dbl>,
#> #   `Time range` <dbl>, `Mass range` <dbl>, `Scan filter (first scan)` <chr>,
#> #   `Scan filter (last scan)` <chr>, `Total number of filters` <chr>,
#> #   `Sample name` <chr>, `Sample id` <chr>, `Sample type` <chr>,
#> #   `Sample comment` <chr>, `Sample vial` <chr>, `Sample volume` <chr>, …

Similarly, the sample information for an entire experiment run can be retrieved with:

runInfo(grover_client,'Thermo-Exactive','Experiment_1')
#> 
#> Genrating run info table for Experiment_1 containing 1 .raw files
#> # A tibble: 1 × 39
#>   `RAW file` RAW file …¹ Creat…² Opera…³ Numbe…⁴ Descr…⁵ Instr…⁶ Instr…⁷ Instr…⁸
#>   <chr>      <chr>       <chr>   <chr>     <dbl> <chr>   <chr>   <chr>   <chr>  
#> 1 QC01.raw   64          04/25/… Thermo        2 ""      Exacti… Thermo… C:/Xca…
#> # … with 30 more variables: `Serial number` <chr>, `Software version` <chr>,
#> #   `Firmware version` <chr>, Units <chr>, `Mass resolution` <chr>,
#> #   `Number of scans` <dbl>, `Number of ms2 scans` <dbl>, `Scan range` <dbl>,
#> #   `Time range` <dbl>, `Mass range` <dbl>, `Scan filter (first scan)` <chr>,
#> #   `Scan filter (last scan)` <chr>, `Total number of filters` <chr>,
#> #   `Sample name` <chr>, `Sample id` <chr>, `Sample type` <chr>,
#> #   `Sample comment` <chr>, `Sample vial` <chr>, `Sample volume` <chr>, …

Raw file conversion to .mzML format

With grover it is also possible to retrieve data files in .mzML format, converted from the .raw files. This file conversion uses the command line tool msconvert, implemented in R by the msconverteR package.

File conversion

To retrieve the example .raw file in .mzML format, the convertFile() function can be used. It takes inputs similar to those of transferFile(), shown previously. An entire experiment directory can be converted in the same way using convertDirectory().

convertFile(grover_client,
            'Thermo-Exactive',
            'Experiment_1',
            'QC01.raw',
            outDir = '.')
convertDirectory(grover_client,
                 'Thermo-Exactive',
                 'Experiment_1',
                 outDir = '.')

Conversion arguments

The args argument can be supplied to these conversion functions to pass specific conversion criteria to msconvert. The grover package contains a number of helper functions to simplify their use. The available functions are listed below.

conversionArgsMSlevel1
conversionArgsMSlevel2
conversionArgsMSlevel3
conversionArgsNegativeMode
conversionArgsPeakPick
conversionArgsPositiveMode

Calling one of these functions returns the appropriate argument string to be passed to msconvert.

conversionArgsPeakPick()
#> [1] "peakPicking true 1-"

Multiple functions can also be combined.

paste(conversionArgsPeakPick(),conversionArgsNegativeMode())
#> [1] "peakPicking true 1- polarity negative"

A full list of the available msconvert arguments can be found in the msconvert documentation. The example below shows the use of conversionArgsPeakPick() to retrieve centroided data in .mzML format.

convertFile(grover_client,
            'Thermo-Exactive',
            'Experiment_1',
            'QC01.raw',
            args = conversionArgsPeakPick(), 
            outDir = '.')

Conclusion

To conclude, we can terminate the process running the example grover API instance.

api$kill()
#> [1] TRUE