Overview
This vignette outlines the steps required to host a web API that provides access to a mass spectrometry data repository. It covers the repository structure, the host configuration, running the API and request logging.
To get started we can first load the grover package.
library(grover)
Data repository structure
The structure of the data repository should consist of a hierarchy of directories within a central data repository directory. The structure is outlined below.
repository
└── instrument
    └── experiment
        └── sample.raw
Within the central data repository directory are instrument directories specifying the mass spectrometers from which the data were generated. Within these instrument directories, the data should be organised into directories for each experiment. The experiment directories should then contain the .raw mass spectrometry data files and any other associated files such as meta information or instrument methods.
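The layout described above can be sketched with base R alone. This is a minimal illustration that builds a throwaway repository in the temporary directory; the names `instrument-1` and `experiment-1` are hypothetical placeholders, and the `.raw` file is an empty stand-in rather than real instrument output.

```r
# Build a throwaway repository with the instrument/experiment/sample layout.
# "instrument-1" and "experiment-1" are hypothetical placeholder names.
repository <- file.path(tempdir(), "repository")
experiment_dir <- file.path(repository, "instrument-1", "experiment-1")
dir.create(experiment_dir, recursive = TRUE, showWarnings = FALSE)

# Empty stand-in for a real .raw mass spectrometry file
file.create(file.path(experiment_dir, "sample.raw"))

# List the .raw files relative to the repository root
raw_files <- list.files(repository, pattern = "\\.raw$", recursive = TRUE)
raw_files
```

Listing the files recursively like this is a quick way to check that every data file sits exactly two directory levels below the repository root.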
Configuration
In order to activate the API, we first need to specify the host details. This can be done in two ways. The primary method is to use a configuration file that is parsed and then supplied when the API is activated. This file should be in YAML format and have the structure shown below:
host: 127.0.0.1
port: 8000
auth: 1234
repository: ./data
This specifies the host address, the port, the authentication key used for data security, and the directory path on the host system to the data repository.
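As a minimal sketch, a configuration file with these four fields could be written from R using only base functions. The file location in the temporary directory is an assumption made for illustration; the field values are those shown above.

```r
# Write a configuration file containing the four fields described above.
# The file location is a hypothetical choice for this illustration.
config_path <- file.path(tempdir(), "grover_host.yml")
writeLines(
  c("host: 127.0.0.1",
    "port: 8000",
    "auth: 1234",
    "repository: ./data"),
  config_path
)

# Inspect the file contents before using it
config_lines <- readLines(config_path)
config_lines
```

The resulting path could then be passed to readGrover() in place of the packaged example file.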
The package contains an example file that we can parse here using the readGrover() function.
grover_host <- readGrover(system.file('grover_host.yml',package = 'grover'))
This returns an S4 object of class GroverHost that contains the API host information. The host information can be viewed by printing the object:
print(grover_host)
#>
#> Grover Information
#>
#> Host: 127.0.0.1
#> Port: 8000
#> Authentication: 1234
#> Repository: ./data
The second method, and the method that will be used for this example, is to specify the host details directly using the grover() function like the following:
grover_host <- grover(host = "127.0.0.1",
                      port = 8000,
                      auth = "1234",
                      repository = system.file('repository',
                                               package = 'grover'))
This enables us to host the example data repository included in the package.
Running the API
The API can be activated using groverAPI(). For the purposes of this example, the background argument will also be specified as TRUE to enable the API to be run in a background process. This will enable us to interact with the API without having to move to an alternative R console. Run the following to start the API:
api <- groverAPI(grover_host,
                 background = TRUE,
                 log_dir = paste0(tempdir(),'/logs'),
                 temp_dir = paste0(tempdir(),'/temp'))
The log_dir argument has also been specified. See the final section for details on request logging by the API.
Running the API in the background returns a callr package process object. We can test the status of the background process in which the API is running using:
api$is_alive()
#> [1] TRUE
From the client side, the extant() function can be used to check whether the API is live.
extant(grover_host)
#> [1] TRUE
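When the API is started in the background from a script, a request sent immediately afterwards may arrive before the server is listening. This is an assumption about background processes in general rather than documented grover behaviour. A minimal sketch of a polling helper is shown below; `wait_until()` is a hypothetical function written for this illustration and is not part of grover.

```r
# Hypothetical helper (not part of grover): repeatedly call a zero-argument
# check function until it returns TRUE or the timeout (in seconds) expires.
wait_until <- function(check, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # Treat an error from the check as "not ready yet"
    ok <- tryCatch(isTRUE(check()), error = function(e) FALSE)
    if (ok) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Intended use with the objects defined above (requires a running API):
# wait_until(function() extant(grover_host))
```

Wrapping the check in tryCatch() means the helper works whether a failed check returns FALSE or raises an error.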
For further details on client-side access to the hosted mass spectrometry data, see the Accessing raw mass spectrometry data from a grover API vignette.
Finally, the background API process can be terminated using:
api$kill()
#> [1] TRUE
Further details on options for securely deploying web APIs can be found in the plumber package documentation.
Request logging
The log_dir argument to groverAPI() specifies the directory in which request logs are written. In the previous section, this was set to the logs directory within the temporary directory. The snippet below can be used to identify the log files generated by our API request.
logs <- list.files(paste0(tempdir(),'/logs'),full.names = TRUE)
logs
#> [1] "/tmp/RtmpG5uCr0/logs/grover_2023-02-20.log"
The contents of this log file can be accessed as below:
readLines(logs[1])
#> [1] "INFO [2023-02-20 15:06:04] GET /extant 1234 200 0.034"
The log entry above shows our single request using extant() whilst the API was live. This log entry also includes, in order of appearance, the date and time of the request, the request type, the requested function, the auth argument specified, the status of the request and the processing time of the request.
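The fields of a log entry can be pulled apart with a regular expression. The sketch below uses base R only and assumes that every entry follows the layout of the single example line above; `parse_log_line()` and the field names it returns are hypothetical labels chosen for this illustration. The unit of the processing time is not stated in the log, so the value is kept as a plain number.

```r
# Hypothetical helper: split one log line into its fields, assuming the
# layout "LEVEL [timestamp] METHOD endpoint auth status duration".
parse_log_line <- function(line) {
  pattern <- "^(\\S+) \\[([^]]+)\\] (\\S+) (\\S+) (\\S+) (\\d+) ([0-9.]+)$"
  fields <- regmatches(line, regexec(pattern, line, perl = TRUE))[[1]]
  if (length(fields) == 0) {
    stop("Unrecognised log line: ", line)
  }
  list(
    level    = fields[2],
    time     = fields[3],              # kept as text; time zone is unknown
    method   = fields[4],
    endpoint = fields[5],
    auth     = fields[6],
    status   = as.integer(fields[7]),
    duration = as.numeric(fields[8])   # unit not stated in the log
  )
}

entry <- parse_log_line("INFO [2023-02-20 15:06:04] GET /extant 1234 200 0.034")
entry$endpoint
entry$status
```

Applying the helper over the result of readLines() with lapply() would give one parsed record per request, which could then be filtered by status or endpoint.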