Using Girder Worker with Girder

The most common use case of Girder Worker is running processing tasks on data managed by a Girder server. Typically, either a user action or an automated process running on the Girder server initiates the execution of a task that runs on a Girder Worker.

The task to be run must be installed in both the Girder server environment and the worker environment. If you are using a built-in plugin, you only need to install girder-worker in the Girder server environment. If you are using a custom task plugin, pip install it on both the workers and the Girder server environment.

Running tasks as Girder jobs

Once the task is installed, starting a job is as simple as importing the task into the Python environment and calling delay() on it. The following example assumes your task exists in a package called my_worker_tasks:

from my_worker_tasks import my_task

result = my_task.delay(arg1, arg2, kwarg1='hello', kwarg2='world')

Here the result variable is a Celery result object with Girder-specific properties attached. Most importantly, it contains a job attribute that is the created job document associated with this invocation of the task. That job will be owned by the user who initiated the request, and Girder Worker will automatically update its status according to the task’s execution state. Additionally, any standard output or standard error data will be automatically added to the log of that job. You can also set fields on the job using the delay() keyword arguments girder_job_title, girder_job_type, girder_job_public, and girder_job_other_fields. For instance, to set the title and type of the created job:

result = my_task.delay(girder_job_title='This is my job', girder_job_type='my_task')
job = result.job  # the job document lives on the result's job attribute
assert job['title'] == 'This is my job'
assert job['type'] == 'my_task'

The Girder job details page can show a dictionary of metadata passed in the meta field of girder_job_other_fields:

job = my_task.delay(
    girder_job_title='This is my job',
    girder_job_type='my_task',
    girder_job_other_fields={'meta': {'special_key': 'Special Value'}},
).job
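
The job-related keyword arguments described above can be assembled in one place before calling delay(). The following is a minimal sketch. The helper name girder_job_kwargs is hypothetical and not part of Girder Worker, and girder_job_public is assumed here to take a boolean. Only the girder_job_* keyword names themselves come from the text above.

```python
def girder_job_kwargs(title, job_type, public=False, meta=None):
    """Build the girder_job_* keyword arguments for a task's delay() call.

    Hypothetical helper for illustration; it is not part of Girder Worker.
    """
    kwargs = {
        'girder_job_title': title,
        'girder_job_type': job_type,
        # Assumption: girder_job_public is a boolean flag.
        'girder_job_public': public,
    }
    if meta is not None:
        # Metadata goes under the 'meta' key of girder_job_other_fields.
        kwargs['girder_job_other_fields'] = {'meta': dict(meta)}
    return kwargs


job_kwargs = girder_job_kwargs(
    'This is my job', 'my_task', meta={'special_key': 'Special Value'})
```

The resulting dictionary could then be unpacked into the call, for example my_task.delay(arg1, arg2, **job_kwargs).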

Downloading files from Girder for use in tasks

Note

This section applies to Python tasks. If you are using the built-in docker_run task, it has its own set of transforms for dealing with input and output data, which are detailed in The docker_run Task documentation.

The following example makes use of a Girder Worker transform for passing a Girder file into a Girder Worker task. The girder_worker_utils.transforms.girder_io.GirderFileId transform causes the file with the given ID to be downloaded locally to the worker node, and its local path will then be passed into the function in place of the transform object. For example:

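from girder_worker_utils.transforms.girder_io import GirderFileId
from my_worker_tasks import my_task

def process_file(file):
    # file is a Girder file document; the worker downloads it and passes
    # the local path to the task as input_file.
    return my_task.delay(input_file=GirderFileId(file['_id'])).job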
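
On the worker side, the task function only sees a local filesystem path, because the transform is resolved before the function is called. The sketch below shows what the body of such a task might look like. The function name count_lines is hypothetical, and registering it as a Girder Worker task is deliberately omitted so that the example runs on its own.

```python
def count_lines(input_file):
    """Count the lines in the file at the local path ``input_file``.

    Hypothetical task body; registration as a Girder Worker task is omitted.
    When invoked through delay() with a GirderFileId transform, input_file
    is expected to be the path of the downloaded copy on the worker node.
    """
    with open(input_file, 'rb') as f:
        # Iterating a binary file yields one chunk per newline-terminated line.
        return sum(1 for _ in f)
```

Because the function takes an ordinary path, it can be exercised locally against any file before it is wired up as a task.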