Custom Task Types With BenchPress

Time being the limited resource that it is, it took a little while to wrap up, but BenchPress is now open source.

BenchPress is intended to represent many different types of payloads via simple JSON configuration, but the project is still new and the existing task definition language doesn’t (yet) give users much flexibility. Fortunately, it’s pretty straightforward to make your own custom task types, so in this post I’ll show how to make a “hello world” custom task type. You can also check out the sample code on GitHub.

BenchPress basics

The basic structure of the JSON you submit to the job controller is simple.

{
    "task": {
        "type": "HELLO-WORLD",
        "config": {
            # whatever you want
        }
    }
}

The config can be any JSON you want for your task type. The type is a semi-magical string that is used to identify a few classes that comprise a specific type of task; you’ll see how that string is used later.

TaskFactory and friends

I’ll go from the bottom up to explain the task execution structure. There are two types of nodes in BenchPress: worker and controller. Typically there is only one controller, but theoretically there could be many if you want. A job is submitted to the controller, which splits its sole task among the available workers. Each worker gets its own partition of the overall work.

Fundamentally, the work that a worker does is just a collection of Runnable instances. The Runnables are made on each worker by a TaskFactory instance. This is the relevant method of the TaskFactory interface:

Collection<Runnable> getRunnables(UUID jobId, int partitionId, UUID workerId,
    TaskProgressClient taskProgressClient, AtomicInteger reportSequenceCounter)
    throws IOException;

The method parameters represent the generic information available to every task — its parent job id, the id of the worker it’s running on, and some necessities for reporting progress back to the controller. In order to keep TaskFactory simple, the work of creating a TaskFactory has been pushed off to another interface, the TaskFactoryFactory (which is sure to drive Joel Spolsky nuts). The TaskFactoryFactory’s job is to create a TaskFactory given the JSON config, so its sole method is simply this:

TaskFactory getTaskFactory(ObjectReader objectReader, JsonNode configNode)
    throws IOException;

It’s up to you to read whatever you want out of the JSON and construct your flavor of TaskFactory.
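To make the relationship between the two interfaces more concrete, here is a minimal, self-contained sketch. It deliberately does not use the real BenchPress types: SketchTaskFactory and SketchTaskFactoryFactory are hypothetical stand-ins with trimmed-down signatures (no progress-reporting parameters, no IOException), a plain Map stands in for the Jackson JsonNode config, and the "greeting" config key is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for TaskFactory. The real interface also takes
// progress-reporting parameters and declares IOException.
interface SketchTaskFactory {
    Collection<Runnable> getRunnables(UUID jobId, int partitionId, UUID workerId);
}

// Hypothetical stand-in for TaskFactoryFactory. A Map replaces the JsonNode config.
interface SketchTaskFactoryFactory {
    SketchTaskFactory getTaskFactory(Map<String, String> config);
}

final class HelloWorldSketchTaskFactory implements SketchTaskFactory {
    private final String greeting;

    // Collects output so the sketch is easy to inspect; a real task would log instead.
    final List<String> output = Collections.synchronizedList(new ArrayList<>());

    HelloWorldSketchTaskFactory(String greeting) {
        this.greeting = greeting;
    }

    @Override
    public Collection<Runnable> getRunnables(UUID jobId, int partitionId, UUID workerId) {
        // One Runnable is enough for a greeting; a heavier task might return one per thread.
        List<Runnable> runnables = new ArrayList<>();
        runnables.add(() -> output.add("Greeting: " + greeting));
        return runnables;
    }
}

final class HelloWorldSketchTaskFactoryFactory implements SketchTaskFactoryFactory {
    @Override
    public SketchTaskFactory getTaskFactory(Map<String, String> config) {
        // "greeting" is an assumed config key, not something BenchPress defines.
        return new HelloWorldSketchTaskFactory(config.getOrDefault("greeting", "Hello, world!"));
    }
}
```

The point of the split is that the factory-factory is the only piece that knows how to interpret the config, while the factory only needs the already-parsed values and the per-worker ids.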

TaskPartitioner

The JSON handed to the TaskFactoryFactory is the JSON pertaining to one individual worker’s partition of the overall work, not the original task JSON. Since the task JSON is specific to each task type, the code to split up the original task into the per-worker partitions must necessarily be provided by the task type as well. So, we have the TaskPartitioner interface:

List<Partition> partition(UUID jobId, int workers, String progressUrl, String finishedUrl,
    ObjectReader objectReader, JsonNode configNode, ObjectWriter objectWriter) throws IOException;

The workers param is how many workers the task should be split for. The two URLs are needed to create a Partition, and the ObjectReader, JsonNode and ObjectWriter params let the implementation deserialize its configuration info, split as desired, and re-serialize.
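How a task splits itself is entirely up to the task type. As one illustration, assuming a task whose config boils down to "do N units of work", the core of a partitioner could be an even split like the sketch below. PartitionSketch and its split method are hypothetical names, not part of BenchPress; a real TaskPartitioner would read the total from configNode and write each quota into that partition's config before building the Partition objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BenchPress.
final class PartitionSketch {
    private PartitionSketch() {
    }

    // Splits totalUnits of work into one quota per worker, as evenly as possible.
    static List<Integer> split(int totalUnits, int workers) {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        if (totalUnits < 0) {
            throw new IllegalArgumentException("totalUnits must not be negative");
        }
        List<Integer> quotas = new ArrayList<>(workers);
        int base = totalUnits / workers;
        int remainder = totalUnits % workers;
        for (int partitionId = 0; partitionId < workers; partitionId++) {
            // The first `remainder` partitions each take one extra unit.
            quotas.add(base + (partitionId < remainder ? 1 : 0));
        }
        return quotas;
    }
}
```

With this approach, PartitionSketch.split(10, 3) yields quotas of 4, 3 and 3, so no worker is more than one unit busier than any other.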

Hooking up a custom task type

BenchPress needs to know which TaskFactoryFactory and TaskPartitioner to hand the config JSON to, based on the contents of the type JSON field. This is done with the com.palominolabs.benchpress.job.id.Id annotation and Guice multibindings. Annotate your TaskFactoryFactory and TaskPartitioner implementations (which might be just one class):

@Id("HELLO-WORLD")
final class HelloWorldTaskFactoryFactory implements TaskFactoryFactory {
...

@Id("HELLO-WORLD")
final class HelloWorldTaskPartitioner implements TaskPartitioner {
...

and add the Guice bindings:

public final class HelloWorldModule extends AbstractModule {
    @Override
    protected void configure() {
        Multibinder.newSetBinder(binder(), TaskFactoryFactory.class)
            .addBinding().to(HelloWorldTaskFactoryFactory.class);
        Multibinder.newSetBinder(binder(), TaskPartitioner.class)
            .addBinding().to(HelloWorldTaskPartitioner.class);
    }
}

Since your classes are instantiated by Guice, you are free to use @Inject on your TaskFactoryFactory and TaskPartitioner constructors if you need anything beyond the provided ObjectReader, etc.
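If you are curious how an annotation plus a multibound set can resolve the type string to an implementation, the following sketch shows the general technique. It is not BenchPress’s actual code: SketchId is a hypothetical stand-in for the Id annotation, SketchRegistry is an invented class, and the Set passed to its constructor plays the role of the set that Guice would assemble from the Multibinder bindings.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the BenchPress Id annotation.
// RUNTIME retention is what makes the reflective lookup below possible.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SketchId {
    String value();
}

// Hypothetical registry that indexes plugin instances by their annotation value.
final class SketchRegistry<T> {
    private final Map<String, T> byId = new HashMap<>();

    // `plugins` stands in for the Set<T> a Guice Multibinder would provide.
    SketchRegistry(Set<T> plugins) {
        for (T plugin : plugins) {
            SketchId id = plugin.getClass().getAnnotation(SketchId.class);
            if (id == null) {
                throw new IllegalArgumentException(plugin.getClass() + " has no @SketchId");
            }
            if (byId.put(id.value(), plugin) != null) {
                throw new IllegalArgumentException("Duplicate id: " + id.value());
            }
        }
    }

    // Looks up the plugin registered for the given task type string.
    T get(String taskType) {
        T plugin = byId.get(taskType);
        if (plugin == null) {
            throw new IllegalArgumentException("Unknown task type: " + taskType);
        }
        return plugin;
    }
}

@SketchId("HELLO-WORLD")
final class HelloWorldSketchPlugin {
}
```

The appeal of this pattern is that adding a task type never requires touching a central switch statement: a new annotated class and one extra binding are enough.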

Finally, you’ll need to tell BenchPress to use your custom module. You can do so by adding the jar for your custom code to the lib directories in the worker and controller tarballs and starting with an extra system property that is set to a comma-separated list of extra module names:

-Dbenchpress.plugin.module-names=com.foo.benchpress.helloworld.HelloWorldModule

Since both the controller and worker need the custom code (for the TaskPartitioner and TaskFactoryFactory, respectively), you’ll need to do this for both services.

Once that’s all done, you should be able to submit your job JSON to the controller and have it work. In the case of the sample “HELLO-WORLD” task type, you should see a logging message like this:

2012-08-17 14:32:43,111 [pool-5-thread-2] INFO  MDC[] c.p.b.e.h.HelloWorldTaskFactory - Greeting: Hello, world!

Posted by Marshall Pierce
