The idea here is that we want to describe how to build a “context” and then evaluate one or more expressions in it. This is loosely related to approaches like docker and packrat in that we want contexts to be isolated from one another, but different in that portability is more important than isolation.
Imagine that you have an analysis to run on another computer with packages that come from various repositories (e.g., drat, Bioconductor, etc). The other computer may already have some packages installed, so you don’t want to waste time and bandwidth re-installing them. So things end up littered with constructs like
if (!require("mypkg")) {
  install.packages("mypkg")
  library(mypkg)
}
If these packages are coming from GitHub (or worse, also have dependencies on GitHub), the bootstrap code gets out of hand very quickly and tends to be non-portable.
Creating separate libraries (rather than sharing one from your personal computer) will be important if the architecture differs (e.g., you run Windows but you want to run code on a Linux cluster).
The idea here is that context helps describe a context made from the above ingredients and then attempts to recreate it on a different computer (or in a different directory on your computer).
A minimal context looks like this:
path <- tempfile()
ctx <- context::context_save(path = path)
#> [ init:id ] 7fc79fad6bfb6e3adf37878447caf1e2
#> [ init:db ] rds
#> [ init:path ] /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T//RtmpgA4GJC/file11943a278ce5
#> [ save:id ] 50bd72bb90fdb9422a0347fcb4f56b6e
#> [ save:name ] skarn_caiman
ctx
#> <context>
#> - packages: list(attached = character(0), loaded = character(0))
#> - root_id: 7fc79fad6bfb6e3adf37878447caf1e2
#> - id: 50bd72bb90fdb9422a0347fcb4f56b6e
#> - name: skarn_caiman
#> - root: list(id = "7fc79fad6bfb6e3adf37878447caf1e2", path = "/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T//RtmpgA4GJC/file11943a278ce5", db = <environment>)
#> - db: <environment>
Typically one would use the arguments packages and sources to describe the requirements of any tasks that you’ll be running.
Once a context is defined, tasks can be defined in the context. These are simply R expressions associated with the identifier of a context.
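The call that creates a task is not shown in this text. As a hedged sketch, assuming that context::task_save() takes a quoted expression and a context and returns the task id, creating the task for sin(1) might look like the following. The minimal context is repeated so that the snippet stands alone, and the guard simply skips the calls when the package is not installed.

```r
# The expression to run later; quote() captures it without evaluating it
expr <- quote(sin(1))

if (requireNamespace("context", quietly = TRUE)) {
  # Same minimal context as in the earlier example
  path <- tempfile()
  ctx <- context::context_save(path = path)
  # Assumption: task_save(expr, context) registers the expression against
  # the context and returns the task id (a key)
  t <- context::task_save(expr, ctx)
}
```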
The task t above is just a key that can be used to retrieve information about the task later.
context::task_expr(t, ctx)
#> sin(1)
Several such tasks may exist, though here there is only one:
context::task_list(ctx)
#> [1] "2800a2d77fe5c0a89a1d14af097e0284"
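To make the idea of "an expression stored under a key" concrete, here is a minimal base R sketch. The sketch_* helpers and the tasks/ directory layout are hypothetical and only for illustration; they are not the package's functions or its actual storage format.

```r
# Hypothetical helpers for illustration; not part of the context package.
sketch_task_save <- function(expr, root) {
  dir.create(file.path(root, "tasks"), recursive = TRUE, showWarnings = FALSE)
  # Random 32-character hexadecimal key, similar in shape to the ids above
  id <- paste(sample(c(0:9, letters[1:6]), 32, replace = TRUE), collapse = "")
  # Store the unevaluated expression in a file named after the key
  saveRDS(expr, file.path(root, "tasks", id))
  id
}

# Look up the stored expression for a key
sketch_task_expr <- function(id, root) {
  readRDS(file.path(root, "tasks", id))
}

# List all keys known under this root
sketch_task_list <- function(root) {
  list.files(file.path(root, "tasks"))
}

root <- tempfile()
id <- sketch_task_save(quote(sin(1)), root)
sketch_task_expr(id, root)  # the stored call
sketch_task_list(root)      # the single key saved so far
```

Because only the key is passed around, anything that can see the same root directory can look the expression up again later.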
To run a task we first need to “load” the context (this will actually load any required packages and source any scripts), then pass this through to task_run:
res <- context::task_run(t, context::context_load(ctx))
#> [ context ] 50bd72bb90fdb9422a0347fcb4f56b6e
#> [ library ]
#> [ namespace ]
#> [ source ]
#> [ root ] /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T//RtmpgA4GJC/file11943a278ce5
#> [ context ] 50bd72bb90fdb9422a0347fcb4f56b6e
#> [ task ] 2800a2d77fe5c0a89a1d14af097e0284
#> [ expr ] sin(1)
#> [ start ] 2023-05-25 15:02:47.937252
#> [ ok ]
#> [ end ] 2023-05-25 15:02:47.946731
This prints the result of restoring the context and running the task:
context: the context id
library: calls to library() to load packages and attach namespaces
namespace: calls to loadNamespace(); these packages were present but not attached in the context.
source: there was nothing to source() here so this is blank; otherwise it would be a list of filenames.
root: the directory within which all our context/task files will be located
context: this is repeated here because we’ve finished the load part of the above statement
task: the task id
expr: the expression to evaluate
start: start time
ok: indication of success
end: end time
After all that, here is the result:
res
#> [1] 0.841471
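The load-then-run sequence reported in the log (library, namespace, source, then the expression) can be approximated in base R. This is a rough sketch of the idea under the assumption that loading amounts to these three kinds of call; sketch_load_and_run() and double_it() are hypothetical names and this is not the package's implementation.

```r
# Hypothetical helper for illustration; not the package's implementation.
sketch_load_and_run <- function(expr, attached = character(0),
                                loaded = character(0),
                                sources = character(0)) {
  # Evaluate in a fresh environment so sourced definitions stay contained
  envir <- new.env(parent = globalenv())
  for (p in attached) library(p, character.only = TRUE)  # [ library ]
  for (p in loaded) loadNamespace(p)                      # [ namespace ]
  for (f in sources) sys.source(f, envir = envir)         # [ source ]
  eval(expr, envir)                                       # [ expr ]
}

# A throwaway script that defines one function
src <- tempfile(fileext = ".R")
writeLines("double_it <- function(x) 2 * x", src)

sketch_load_and_run(quote(double_it(sin(1))), loaded = "stats", sources = src)
```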
The result can also be retrieved using task_result():
context::task_result(t, ctx)
#> [1] 0.841471
This is not immensely useful as it is; it’s just evaluation with more steps. Typically we’d do this in another process. You can do this with callr:
res <- callr::rscript(file.path(path, "bin", "task_run"), c(path, t),
                      echo = TRUE, show = TRUE)
#> Running /Library/Frameworks/R.framework/Resources/bin/Rscript \
#> /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T//RtmpgA4GJC/file11943a278ce5/bin/task_run \
#> /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T//RtmpgA4GJC/file11943a278ce5 \
#> 2800a2d77fe5c0a89a1d14af097e0284
#> [ hello ] 2023-05-25 15:02:48.522568
#> [ wd ] /Users/runner/work/context/context/vignettes
#> [ init ] 2023-05-25 15:02:48.531
#> [ hostname ] Mac-1685026763345.local
#> [ process ] 4675
#> [ version ] 0.5.0
#> [ open:db ] rds
#> [ context ] 50bd72bb90fdb9422a0347fcb4f56b6e
#> [ library ]
#> [ namespace ]
#> [ source ]
#> [ parallel ] running as single core job
#> [ root ] /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T//RtmpgA4GJC/file11943a278ce5
#> [ context ] 50bd72bb90fdb9422a0347fcb4f56b6e
#> [ task ] 2800a2d77fe5c0a89a1d14af097e0284
#> [ expr ] sin(1)
#> [ start ] 2023-05-25 15:02:48.568951
#> [ ok ]
#> [ end ] 2023-05-25 15:02:48.576941
#>
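The same pattern (a separate process is handed a root directory, evaluates the stored expression, and leaves the result on disk) can be sketched with base R alone. The file names task.rds, result.rds and runner.R are hypothetical and chosen for illustration; the script that context installs under bin/ does considerably more, as the log above shows.

```r
root <- tempfile()
dir.create(root)

# Store the expression where the other process can find it
saveRDS(quote(sin(1)), file.path(root, "task.rds"))

# A tiny runner script: read the expression, evaluate it, save the result
runner <- file.path(root, "runner.R")
writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "expr <- readRDS(file.path(args[[1]], 'task.rds'))",
  "saveRDS(eval(expr), file.path(args[[1]], 'result.rds'))"
), runner)

# Launch a fresh R process using the Rscript that ships with this R
rscript <- file.path(R.home("bin"), "Rscript")
status <- system2(rscript, c(shQuote(runner), shQuote(root)))

# Back in the parent process, read the result from disk
result <- readRDS(file.path(root, "result.rds"))
```

The parent and child share nothing except the directory, which is what makes it possible to run the child on a different machine that can see the same files.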