
Create a bulk set of tasks by applying one expression row-wise to data. Variables in data take precedence over variables in the environment in which expr was created. There is no "pronoun" support yet (see the rlang docs). Use !! to pull a variable from the environment if you need to, but be careful not to inject something really large (in practice, almost any vector) or you'll end up with a revolting expression and poor backtraces. We will likely change some of these semantics later, so be careful about relying on them.
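The precedence rule above can be illustrated with a small base-R sketch. This is only an analogy built on eval(), not hipercow's implementation, and the names `a`, `b` and `offset` are made up for illustration. A column in the data masks a variable of the same name in the calling environment, while names that are not columns fall through to the environment.

```r
# Base-R analogy only; hipercow's own evaluation may differ in detail.
d <- data.frame(a = 1:3, b = c(10, 20, 30))
a <- 100       # same name as a column in `d`, so the data wins
offset <- 0.5  # not a column, so it is found in the environment

expr <- quote(a * b + offset)
env <- environment()

# Evaluate the expression once per row, looking in the row first and
# falling back to the environment where the expression was written.
results <- vapply(
  seq_len(nrow(d)),
  function(i) eval(expr, as.list(d[i, , drop = FALSE]), env),
  numeric(1)
)
results

# With hipercow, the masked environment value of `a` would be reached
# with `!!` (an untested sketch, based on the description above):
# bundle <- task_create_bulk_expr(sqrt(b * !!a), d)
```

Here `a` inside the expression refers to the column, not the value 100. Injecting with !! is reasonable for a scalar like this, but becomes unwieldy for anything larger.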

Usage

task_create_bulk_expr(
  expr,
  data,
  environment = "default",
  bundle_name = NULL,
  driver = NULL,
  resources = NULL,
  envvars = NULL,
  parallel = NULL,
  root = NULL
)

Arguments

expr

An expression, as for task_create_expr.

data

Data that you wish to inject row-wise into the expression.

environment

Name of the hipercow environment to evaluate the task within.

bundle_name

Name to pass to hipercow_bundle_create when making a bundle. If NULL we use a random name. We always overwrite, so if bundle_name already refers to a bundle it will be replaced.

driver

Name of the driver to use to submit the task. The default (NULL) depends on your configured drivers. If you have no drivers configured, no submission happens (or indeed is possible). If you have exactly one driver configured, we'll submit your task with it. If you have more than one driver configured, then we will error, though in future versions we may fall back on a default driver if you have one configured. If you pass FALSE here, submission is prevented even if you have a driver configured.

resources

A list generated by hipercow_resources giving the cluster resource requirements to run your task.

envvars

Environment variables as generated by hipercow_envvars, which you might use to control your task. These will be combined with the default environment variables (see vignette("details"); these can be overridden by the option hipercow.default_envvars), and any driver-specific environment variables (see vignette("windows")). Variables provided here have the highest precedence. You can unset an environment variable by setting it to NA.

parallel

Parallel configuration as generated by hipercow_parallel, which defines which method, if any, will be used to initialise your task for parallel execution.

root

A hipercow root, or path to it. If NULL we search up your directory tree.
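As a rough sketch of how the optional arguments fit together, the call below passes a bundle name, resources and environment variables in one go. The bundle name "sqrt_products" and the variable MY_SETTING are hypothetical, and it is an assumption that the example driver set up by hipercow_example_helper accepts these settings unchanged.

```r
library(hipercow)

# Temporary hipercow root with the example driver, as in the
# package examples.
cleanup <- hipercow_example_helper()

d <- data.frame(a = 1:5, b = runif(5))

bundle <- task_create_bulk_expr(
  sqrt(a * b),
  d,
  # Hypothetical name; an existing bundle with this name is replaced.
  bundle_name = "sqrt_products",
  # Request one core per task.
  resources = hipercow_resources(cores = 1),
  # Hypothetical variable; values given here take highest precedence.
  envvars = hipercow_envvars(MY_SETTING = "on")
)

cleanup()
```

Passing driver = FALSE to the same call should create the tasks without submitting them, per the description of driver above.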

Value

A hipercow_bundle object, which groups together tasks, and for which you can use a set of grouped functions to get status (hipercow_bundle_status), results (hipercow_bundle_result) etc.
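A minimal sketch of the grouped functions mentioned above, assuming the example helper from the package examples is available. It checks the status of every task in the bundle and then flattens the list of results into a vector, which only makes sense here because each task returns a single number.

```r
library(hipercow)

cleanup <- hipercow_example_helper()

d <- data.frame(a = 1:5, b = runif(5))
bundle <- task_create_bulk_expr(sqrt(a * b), d)

# Block until all tasks in the bundle have finished.
hipercow_bundle_wait(bundle)

# One status per task, in the order the tasks were created.
status <- hipercow_bundle_status(bundle)

# Results come back as a list with one element per task; flatten it
# because each element here is a single number.
values <- unlist(hipercow_bundle_result(bundle))

cleanup()
```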

See also

hipercow_bundle_wait, hipercow_bundle_result for working with bundles of tasks

Examples

cleanup <- hipercow_example_helper()
#> This example uses a special helper

# Suppose we have a data.frame:
d <- data.frame(a = 1:5, b = runif(5))

# We can create a "bundle" by applying an expression involving "a"
# and "b":
bundle <- task_create_bulk_expr(sqrt(a * b), d)
#> Submitted 5 tasks using 'example'
#> Created bundle 'blithesome_flyingfox' with 5 tasks

# Once you have your bundle, interact with it using the bundle
# analogues of the usual task functions:
hipercow_bundle_wait(bundle)
#> [1] TRUE
hipercow_bundle_result(bundle)
#> [[1]]
#> [1] 0.9632601
#> 
#> [[2]]
#> [1] 0.2307582
#> 
#> [[3]]
#> [1] 1.295559
#> 
#> [[4]]
#> [1] 1.84843
#> 
#> [[5]]
#> [1] 0.6159669
#> 

cleanup()
#> Cleaning up example