Jobs Service

From Ocean Framework Documentation Wiki

The Jobs service provides a mechanism for running asynchronous jobs of various kinds.

Asynchronous jobs are especially useful for time-consuming updates, for resource-intensive work, and for work that depends on external services which may not always be available. Typical use cases include uploads, sending email, or booking tickets. Normally, jobs are concerned with modifying data, but there are also use cases involving time-consuming search operations.

Jobs may be left to execute and terminate in their own time, but they can also be polled for status during their execution. This allows a browser service consumer to check the progress of a particular job without incurring any wait states; furthermore, jobs are implemented and aggressively cached in such a way that frequent polling does not place any increased burden on the resource servers. In fact, in this architecture, polling from web clients is to be encouraged.

The Jobs service is entirely AWS based: it uses AWS Simple Queue Service (SQS) to handle job queues and AWS DynamoDB to store job descriptions. Because both are managed services that scale horizontally, the Jobs service itself can scale to very large workloads.

The service is intended for handling requests to local Ocean services. Interfacing with external systems is better done by wrapping them in local services. The Jobs service handles authentication transparently, and also honours redirects.

Jobs consist of multi-step REST HTTP JSON requests. As jobs are cached in Varnish just like everything else, they can be aggressively polled for progress and status without increasing server load.

The Jobs service also handles retries, error states and poison jobs in a controlled way.

Each job step is guaranteed to run at least once, but not exactly once: there will always be unavoidable edge cases, although they are extremely rare, in which a step is executed more than once. For instance, an external service may be down, in which case the Jobs service will retry the step. For this reason, job steps should be idempotent, or otherwise insensitive to repeated execution and restarts. In some cases, true idempotency isn't needed: for instance, it is usually immaterial whether a user receives an email message twice.
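One common way to make a step safe under at-least-once execution is to record a key for each completed unit of work and skip repeats. The following is a minimal sketch of that idea, not Ocean code: `IdempotentStep`, `book_ticket` and the key format are hypothetical names, and the in-memory dictionary stands in for what would have to be a durable store in a real service.

```python
class IdempotentStep:
    """Runs an action once per key, even if invoked repeatedly.

    Hypothetical helper, not part of Ocean. Completed keys are kept in
    memory here; a real service would persist them.
    """

    def __init__(self, action):
        self._action = action
        self._results = {}  # step_key -> result of the first successful run

    def run(self, step_key, payload):
        # A retried step finds its key already recorded and returns the
        # stored result instead of repeating the side effect.
        if step_key in self._results:
            return self._results[step_key]
        result = self._action(payload)
        # Record the key only after the action succeeds, so that a failed
        # attempt can still be retried.
        self._results[step_key] = result
        return result


bookings = []


def book_ticket(payload):
    """Illustrative side effect: appends one booking per call."""
    bookings.append(payload["seat"])
    return len(bookings)


step = IdempotentStep(book_ticket)
first = step.run("job-1/step-2", {"seat": "12A"})
second = step.run("job-1/step-2", {"seat": "12A"})  # retry: no second booking
```

The retry returns the result of the first run and leaves `bookings` with a single entry, which is the behaviour a retried job step needs.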

A web application written to take full advantage of asynchronous processing using the Jobs service

  • permits high scalability,
  • allows the application to be decoupled across service layers,
  • creates a consistently responsive user experience for the human web client user, regardless of the amount of processing in other service tiers or in third-party services,
  • makes it possible to automatically scale the server infrastructure depending on actual workload (queue length) rather than just processor load.
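
The last point, scaling on queue length rather than processor load, can be illustrated with a simple sizing rule. This is a sketch under stated assumptions: `desired_workers`, `jobs_per_worker` and the minimum and maximum bounds are hypothetical and are not taken from Ocean or from any AWS API.

```python
import math


def desired_workers(queue_length, jobs_per_worker, min_workers=1, max_workers=20):
    """Return a worker count sized to the backlog rather than to CPU load.

    All parameters are illustrative assumptions: `jobs_per_worker` is the
    backlog one worker is expected to absorb, and the bounds keep the pool
    within a configured range.
    """
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    needed = math.ceil(queue_length / jobs_per_worker)
    # Clamp the result between the configured minimum and maximum.
    return max(min_workers, min(max_workers, needed))
```

With these defaults, an empty queue keeps the minimum of one worker, a backlog of 45 jobs at 10 jobs per worker asks for 5 workers, and a very long queue is capped at the maximum.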

We encourage you to write your applications in an asynchronous manner as far as possible. Asynchronous processing increases availability, reliability and scalability, and it also enhances the user experience of the system.

NOTE: Only local services should be permitted to create and delete AsyncJobs. External clients may safely be given the right to poll an existing AsyncJob, but should always go through an Ocean service to create and delete them.

Job Flow

Jobs are created by making POST requests to the Jobs service, which enqueues them on a high-speed, highly scalable, persistent AWS messaging bus for processing by a pool of asynchronous workers. The POST request returns immediately with a new AsyncJob resource representing the job.

A conditional GET request to the self hyperlink of this resource representation will poll Ocean for any changes in the job's status without placing any further load on the servers themselves. (See Conditional GETs.)
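
A polling client might look roughly like the sketch below. It assumes the AsyncJob representation is served with an `ETag` header and that the client revalidates with `If-None-Match`; the page does not say which validator Ocean uses, so treat that as an assumption. The `fetch` callable, the URL and the JSON body are hypothetical, and the fake transport exists only so the example runs without a network.

```python
def poll_job(fetch, url, etag=None):
    """Poll a job once using a conditional GET.

    `fetch(url, headers)` is a caller-supplied function returning
    (status_code, response_headers, body). Returns (changed, etag, body).
    """
    headers = {"If-None-Match": etag} if etag else {}
    status, response_headers, body = fetch(url, headers)
    if status == 304:
        # Not Modified: the representation the client holds is still current.
        return False, etag, None
    if status != 200:
        raise RuntimeError(f"unexpected status {status} polling {url}")
    return True, response_headers.get("ETag"), body


def make_fake_fetch(current_etag, body):
    """Build a stand-in transport that honours If-None-Match."""
    def fetch(url, headers):
        if headers.get("If-None-Match") == current_etag:
            return 304, {"ETag": current_etag}, None
        return 200, {"ETag": current_etag}, body
    return fetch


# Hypothetical URL and body, used only for illustration.
job_url = "https://example.com/async_jobs/1"
fetch = make_fake_fetch('"v1"', '{"state": "running"}')

changed, etag, body = poll_job(fetch, job_url)           # first poll: full body
changed_again, etag, _ = poll_job(fetch, job_url, etag)  # unchanged: 304
```

The second poll carries the validator from the first response, so an unchanged job is answered with 304 and no body.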

The asynchronous workers will pick jobs from the queue, authenticate using the provided credentials if necessary, and then execute the steps one by one. As each step is completed, the AsyncJob resource representation will be updated for the benefit of polling service consumers.
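
The worker behaviour described above can be sketched as a loop that runs the steps in order, retries a failing step a limited number of times, and records progress after each step. This is an illustration only: `run_job`, `max_attempts` and the field names `steps`, `last_completed_step`, `succeeded` and `failed` are assumptions, not the actual AsyncJob attributes, and queue handling and authentication are left out.

```python
def run_job(job, execute_step, max_attempts=3):
    """Execute a job's steps in order, recording progress after each one.

    `job` is a dict with a "steps" list; `execute_step(step)` performs one
    step and raises an exception on failure. All field names here are
    illustrative assumptions.
    """
    job["last_completed_step"] = None
    job["succeeded"] = False
    job["failed"] = False
    for index, step in enumerate(job["steps"]):
        for attempt in range(1, max_attempts + 1):
            try:
                execute_step(step)
                break  # step done, move on to the next one
            except Exception:
                if attempt == max_attempts:
                    # Out of attempts: mark the job failed and skip the
                    # remaining steps.
                    job["failed"] = True
                    return job
        # Update the visible job state so that pollers see the progress.
        job["last_completed_step"] = index
    job["succeeded"] = True
    return job


calls = []


def flaky(step):
    """Illustrative step runner: the email step fails on its first attempt."""
    calls.append(step)
    if step == "send email" and calls.count(step) == 1:
        raise ConnectionError("service unavailable")


job = run_job({"steps": ["upload", "send email"]}, flaky)
```

Here the email step is attempted twice, which is why steps need to tolerate repeated execution.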


The AsyncJob Resource
An AsyncJob is an asynchronous, multi-step job.
The CronJob Resource
CronJobs are AsyncJobs that run periodically.
The ScheduledJob Resource
ScheduledJobs are AsyncJobs that run once, at a particular point in time.