Image processing

This example will introduce how to use the Girder Worker to build some simple image processing tasks using Pillow. We will learn how to chain several tasks together in a workflow and finally how to run these workflows both locally and through a remote worker.

Download and view an image

For our first task, we will download a png image from a public website and display it on the screen. We begin by defining a new task that will take a single image object and call its show method.

A task is a special kind of dictionary with keys inputs and outputs as well as other metadata describing how these objects will be used. In the case of simple Python scripts, they can be provided inline as we have done in this example. Each input and output spec in a task is a dict with the following keys:

name
The name designated to the datum. This is used both for connecting tasks together in a workflow and, in the case of Python tasks, the name of the variable injected into/extracted from the tasks scope.
type
The general data type expected by the task. See Types and formats for a list of types provided by the worker’s core library as well as Application Plugins for additional data types provided by optional plugins.
format
The specific representation or encoding of the data type. The worker will automatically convert between different data formats provided that they are of the same base type.
show_image = {
    'inputs': [{'name': 'the_image', 'type': 'image', 'format': 'pil'}],
    'outputs': [],
    'script': 'the_image.show()'
}

In order to run the task, we will need to provide an input binding that tells the worker where it can get the data to be injected into the port. Several I/O modes are supported; in this case, we provide a public URL to an image that the worker will download and open using Pillow. Notice that the worker downloads and reads the file as part of the automatic data format conversion.

lenna = {
    'type': 'image',
    'format': 'png',
    'url': 'https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png'
}

Finally to run this task, we only need to provide the task object and the input binding to girder_worker.tasks.run(). The object returned by this function contains data extracted and converted through the task’s output ports.

output = girder_worker.tasks.run(show_image, {'the_image': lenna})

Perform an image blur inside a workflow

Now that we know how to generate a simple task using the worker, we want to learn how to connect multiple tasks together in a workflow. The worker’s pythonic API allows us to do this easily. Let’s create a new task that performs a blur operation on an image. This might look like the following:

blur_image = {
   'inputs': [
      {'name': 'blur_input', 'type': 'image', 'format': 'pil'},
      {'name': 'blur_radius', 'type': 'number', 'format': 'number'}
   ],
   'outputs': [{'name': 'blur_output', 'type': 'image', 'format': 'pil'}],
   'script': """
from PIL import ImageFilter
blur_output = blur_input.filter(ImageFilter.GaussianBlur(blur_radius))
"""
}

Notice that this task takes an additional numeric input that acts as a parameter for the blurring filter. Connecting our show_image task, we can view the result of our image filter. First, we create a new workflow object from the girder_worker.core.specs module.

from girder_worker.core.specs import Workflow
wf = Workflow()

Next, we add all the tasks to the workflow. The order in which the tasks are added is insignificant because the worker will automatically sort them according to their position in the workflow.

wf.add_task(blur_image, 'blur')
wf.add_task(show_image, 'show')

Finally, we connect the two tasks together.

wf.connect_tasks('blur', 'show', {'blur_output': 'the_image'})

Running a workflow has the same syntax as running a single task.

output = girder_worker.tasks.run(
   wf,
   inputs={
      'blur_input': lenna,
      'blur_radius': {'format': 'number', 'data': 5}
   }
)
Blur image workflow
lenna lenna10

Using a workflow to compute image metrics

Finally, we will create a few more tasks to generate a more complicated workflow that returns some number of interest about an image. First, let’s create a task to subtract two images from each other.

subtract_image = {
    'inputs': [
        {'name': 'sub_input1', 'type': 'image', 'format': 'pil'},
        {'name': 'sub_input2', 'type': 'image', 'format': 'pil'}
    ],
    'outputs': [
        {'name': 'diff', 'type': 'image', 'format': 'pil'},
    ],
    'script': """
from PIL import ImageMath
diff = ImageMath.eval('abs(int(a) - int(b))', a=sub_input1, b=sub_input2)
"""
}

Now another task will compute the average pixel value of the input image.

mean_image = {
    'inputs': [
        {'name': 'mean_input', 'type': 'image', 'format': 'pil'},
    ],
    'outputs': [
        {'name': 'mean_value', 'type': 'number', 'format': 'number'},
    ],
    'script': """
from PIL import ImageStat
mean_value = ImageStat.Stat(mean_input).mean[0]
"""
}

Finally, let’s add all of the tasks to a new workflow and make the appropriate connections.

wf = Workflow()
wf.add_task(blur_image, 'blur1')
wf.add_task(blur_image, 'blur2')
wf.add_task(subtract_image, 'subtract')
wf.add_task(mean_image, 'mean')

wf.connect_tasks('blur1', 'subtract', {'blur_output': 'sub_input1'})
wf.connect_tasks('blur2', 'subtract', {'blur_output': 'sub_input2'})
wf.connect_tasks('subtract', 'mean', {'diff': 'mean_input'})

This workflow performs blurring operations on a pair of input images, computes the difference between them, and returns the average value of the difference. Let’s see how this works with our sample image. Notice that in this case, there is a conflict between the input port names of the two blur tasks. We must specify which port we are referring to by prefixing the port name with the task name.

output = girder_worker.tasks.run(
   wf,
   inputs={
      'blur1.blur_input': lenna,
      'blur1.blur_radius': {'format': 'number', 'data': 1},
      'blur2.blur_input': lenna,
      'blur2.blur_radius': {'format': 'number', 'data': 8},
   }
)
print output['mean_value']['data']