Overview

From Ocean Framework Documentation Wiki

Structure

Ocean is written for the cloud and comes with full support for deployment on Amazon AWS. Structurally, an Ocean environment looks like this:

[Figure: OceanStructure.png]

The above is just one of the different Ocean environments. It's very likely you'll be using more than one.

The networking layer for an Ocean environment looks like this:

[Figure: OceanEnv.png]

The outermost layer, facing the world, is accessed exclusively via HTTPS. This includes end-user client sites presented by browsers as well as external services using the API. All external requests to Ocean are terminated by Amazon's SSL terminators.

Next, all external requests pass through a high-capacity Varnish cache layer, redundantly deployed and with automatic failover for uninterrupted traffic. It serves as an enterprise-level reverse proxy cache for all Ocean requests. Varnish also rewrites HTTP headers to handle CORS preflight requests, which allows browsers to access the API in exactly the same way as a remote service.

Next, requests pass through an Application Load Balancer, which directs each request to the appropriate auto-scaling group of microservice instances. All Ocean API requests are individually authenticated and authorised by each microservice instance per URL and HTTP method, that is, at the atomic level. Each service validates authorisation with the Auth service automatically before permitting the action to proceed.
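As a minimal sketch of what authorisation per URL and HTTP method means, the following checks every request against an exact (method, path) pair. The `GRANTS` table, the token values, the paths, and the function names are hypothetical stand-ins; the actual Auth service API is not described on this page.

```python
from typing import Dict, Set, Tuple

# Hypothetical grants: token -> set of (HTTP method, URL path) pairs it may use.
# In Ocean the decision is made by the Auth service; this table only stands in for it.
GRANTS: Dict[str, Set[Tuple[str, str]]] = {
    "token-alice": {("GET", "/v1/users/1"), ("PUT", "/v1/users/1")},
    "token-bob": {("GET", "/v1/users/1")},
}


def is_authorised(token: str, method: str, path: str) -> bool:
    """Return True only if this exact method/path pair is granted to the token."""
    return (method.upper(), path) in GRANTS.get(token, set())


def handle(token: str, method: str, path: str) -> int:
    """Return an illustrative HTTP status: the action proceeds only after authorisation."""
    if not is_authorised(token, method, path):
        return 403
    return 200
```

In this sketch a token that may GET a resource is still refused a PUT on the same URL, which is the granularity the paragraph above describes.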

Authorisation is just one example of when a service needs to call another service. In Ocean, cross-service API calls are always done via the external load balancer. This means that internal calls are also cached, which is extremely important for the programming model and for high performance. From a security perspective it should be pointed out that a service instance never directly connects to another instance. Service instances are in fact completely isolated from each other on the networking level.

Everything is consistently cached. This means that authorisation is extremely fast, and that consistent aggressive caching can be provided to user-written Ocean Ruby on Rails applications and services. To support this, Varnish has been configured to accept PURGE and BAN HTTP requests from the local networks, which allows Ocean to transparently expire resources and their collections in the Varnish cache as the underlying resources change.
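The invalidation requests mentioned above are ordinary HTTP requests with a non-standard method. A minimal sketch, assuming a hypothetical internal cache host `varnish.internal` and hypothetical resource paths; how Varnish interprets a BAN (for example, which URLs it matches) depends on its VCL configuration, which is not shown on this page.

```python
import urllib.request


def invalidation_request(method: str, url: str) -> urllib.request.Request:
    """Build (but do not send) a cache invalidation request for Varnish."""
    if method not in ("PURGE", "BAN"):
        raise ValueError("expected PURGE or BAN")
    return urllib.request.Request(url, method=method)


# Hypothetical internal cache address and resource paths.
purge = invalidation_request("PURGE", "http://varnish.internal/v1/users/1")
ban = invalidation_request("BAN", "http://varnish.internal/v1/users")
```

Passing either object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a cache to talk to.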

Thus, the Ocean cache can be treated authoritatively, and the data model gracefully extends to the whole of the World Wide Web.

Any SQL databases needed by the services are individually deployed to Amazon RDS. Each environment builds its own DynamoDB tables dynamically, as well as any other resources required.

Authentication, which uses BCrypt, depends only on DynamoDB, which means that logins scale with DynamoDB's capacity.

Each service instance is connected via a resilient ZeroMQ transport to the Log Service, which uses a hosted Redis ElastiCache cluster for log aggregation. Logs are collected via Kinesis live streams to CloudSearch and are archived in S3 and, eventually, Glacier.

Full exposure

All services are fully exposed to the Internet. Therefore, they are all individually deployed, load-balanced, and protected against flooding. It also means that every access, even a purely local one, requires authentication and authorisation.

Web servers that serve the front-end application are databaseless. As they are also exposed to the web, via HTTPS or plain HTTP, they are deployed, load-balanced, and protected against flooding in the same manner. All accesses that web clients make to the back-end Ocean services are of course also authenticated and authorised. For maximum caching efficiency and speed, such authorisations can be shared between all clients authenticating as a particular user (of which the specific application client is one).

REST

All APIs are REST APIs. The term “representational state transfer”, REST, was introduced and defined in 2000 by Roy Fielding in his doctoral dissertation. Fielding is one of the principal authors of the Hypertext Transfer Protocol (HTTP) specification versions 1.0 and 1.1.

REST is a set of conventions for manipulating resources over HTTP which simplifies API complexity dramatically. It allows programmers to use remote stateful resources in an object-oriented fashion. REST APIs are centered around resources of various kinds (such as a user, a shopping cart, a game player inventory, a stream of news items, a live TV transmission, etc) and therefore enable programmers to work at a higher level of abstraction than that promoted by purely imperative RPC calls. Abstraction is important in complex systems.

Ocean uses REST in its fullest sense. Our implementation follows the HATEOAS hyperlink convention, without which REST would merely be RPC. Fielding puts it quite categorically: “REST interfaces must be hypermedia-driven in order to be called REST at all”. Thus, Ocean implements Level 3 of the Richardson maturity model.

With HATEOAS hyperlinks in every object enumerating everything that can be done with it in its current state, client and server are almost completely decoupled from each other. The client never constructs any follow-up URIs; it simply follows hyperlinks in resource representations. This allows the server to evolve its functionality independently, without requiring any modifications to the client.

Clients only need to know where to go to obtain the basic resource; from that point, the resource representation itself contains the next steps. Clients don’t need to know any conventions for composing URIs. Such conventions aren’t just unnecessary; they are actively shunned as out-of-band information. REST is not in any way about URI conventions – it’s about the inherent dynamics of hypermedia.
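A minimal sketch of a client that follows hyperlinks instead of composing URIs. The representation below is hypothetical: the `_links` and `href` keys, the link names, and the `api.example.com` URLs are assumptions made for illustration, not the documented Ocean format.

```python
import json

# Hypothetical representation; "_links", "href", and the link names are assumptions.
body = json.loads("""
{
  "cart": {
    "item_count": 2,
    "_links": {
      "self": {"href": "https://api.example.com/v1/carts/42"},
      "checkout": {"href": "https://api.example.com/v1/carts/42/checkout"}
    }
  }
}
""")


def link(resource: dict, name: str):
    """Return the href for a named hyperlink, or None if the action is not offered."""
    entry = resource.get("_links", {}).get(name)
    return entry["href"] if entry else None


cart = body["cart"]
checkout_url = link(cart, "checkout")  # follow what the server offered
refund_url = link(cart, "refund")      # None: not offered in the current state
```

The client only looks links up by name. Whether an action is possible is decided by the presence of the link, and the URL it points to is never inspected or assembled by the client.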

The purpose of some of the strictness of this and other REST constraints, Fielding explains, is "software design on the scale of decades: every detail is intended to promote software longevity and independent evolution. Many of the constraints are directly opposed to short-term efficiency."

TIP: This video explains the principles of REST and HATEOAS in an informal way. You should watch it, especially if you have a background in Java or C#. It's an excellent explanation of all basic principles of massively scalable, reliable distributed web systems. That's the application domain for which Ocean is designed.

API Versioning

All Resources are versioned. The version number of each Resource is visible in its URL. All clients should specify the exact version of each Resource when first requesting it from the server. After sending the request to create, obtain, update or delete a Resource, the client should completely ignore the version of the Resource, as the API may transparently upgrade the Resource to a later version.

Thus, you may request a Resource of version v1 but receive the Resource in v8 or any later version. Resources are guaranteed to be compatible. Similarly, you may PUT a Resource of a certain version but receive the same updated object in a more recent version format. In this way, the API is able to evolve transparently. At any point in time, resources of later versions than those available when a client was written can, may, and will float around your data structures.

This is another reason for the oft-repeated rule in these docs: never draw any conclusions from the structure of any URL the API gives you. It may change without notice. Don't build URLs except for the public ones.

NOTE: Since the resource version is part of the URI, different versions of the same resource are guaranteed to be cached separately. Were the version to be stored in an HTTP header, some public caches would confuse resources of different versions.
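The point about caching can be illustrated with a toy cache keyed on the full URL, which is how a URL-keyed cache distinguishes entries. This is only a sketch: the URLs and response bodies are hypothetical, and a real cache such as Varnish does considerably more.

```python
from typing import Dict, Optional

cache: Dict[str, str] = {}


def store(url: str, body: str) -> None:
    """Cache a response body under its full URL."""
    cache[url] = body


def lookup(url: str) -> Optional[str]:
    """Return the cached body for this exact URL, or None on a miss."""
    return cache.get(url)


# Hypothetical URLs: the version segment makes them distinct cache keys.
store("https://api.example.com/v1/users/1", '{"name": "Ann"}')
store("https://api.example.com/v2/users/1", '{"first_name": "Ann"}')
```

Because the version is part of the key, the two entries cannot overwrite each other. Had both versions shared one URL and differed only in a header, a cache keyed on the URL alone would hold a single entry for both.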

JSON

JSON is used as the data format for server resource representations, parameters, message bodies, etc. JSON is much more compact and lightweight than XML, and it’s quicker to parse and construct. It’s also natively supported in JavaScript, which simplifies things for front-end web clients.

In earlier SOAs of the more massively enterprisey kind, XML was the norm. Nowadays, as the term SOA has acquired a wider meaning, XML is increasingly becoming a legacy data format, giving way to the more modern JSON, which is now the most common data format for web service data exchange.

UTF-8

UTF-8 is used throughout. Services should convert to and from UTF-8 as necessary when interfacing to external or legacy systems using other encodings.
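A minimal sketch of converting at the boundary to a legacy system. The helper names and the choice of Latin-1 as the legacy encoding are assumptions made for illustration.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a legacy encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def from_utf8(data: bytes, target_encoding: str) -> bytes:
    """Convert UTF-8 bytes to the encoding a legacy system expects."""
    return data.decode("utf-8").encode(target_encoding)


# Bytes as a hypothetical Latin-1 legacy system would send them.
legacy = "Malmö".encode("latin-1")
utf8 = to_utf8(legacy, "latin-1")
```

Converting only at the edges keeps everything inside the system in a single encoding. Note that `from_utf8` raises `UnicodeEncodeError` if the text contains characters the target encoding cannot represent.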

HTTPS

All services are exposed to the world via HTTPS. Internally, between individual services, HTTP is used. The architecture makes full use of Amazon's elastic load balancing and SSL termination facilities.

Amazon VPC

  1. Amazon VPC gives complete control over the infrastructure, meaning the size, properties and number of server instances can be adjusted to fit the actual needs, as can network properties. This reduces turn-around times and keeps hosting costs at a minimum.
  2. Amazon offers dynamic, automatic scaling of web server capacity according to the number of incoming requests. During high customer traffic peaks, the Amazon infrastructure can seamlessly deploy, and load-balance, as many extra server instances as required to keep up with the amount of traffic. When traffic drops, instances are stopped. This reduces costs in a way not possible when renting physical server blades at a hosting firm.
  3. A VPC allows for secure networking. Ocean uses multiple subnets for different environments (such as production, staging, and development) and also for different types of instance functionality within each environment. Access thus is layered and restricted on the networking level. Besides being good practice, it's a requirement for PCI compliance.
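The layered subnet layout described in point 3 can be sketched with Python's `ipaddress` module. The address ranges, prefix lengths, and role names below are hypothetical and only illustrate the idea of one block per environment, subdivided by instance function.

```python
import ipaddress

# Hypothetical address plan: one /16 VPC, one /20 per environment,
# and one /24 per type of instance functionality within each environment.
vpc = ipaddress.ip_network("10.0.0.0/16")
environments = ["production", "staging", "development"]
roles = ["cache", "services", "databases"]

# zip() stops after the last environment, so only the first three /20 blocks are used.
env_blocks = dict(zip(environments, vpc.subnets(new_prefix=20)))

plan = {
    env: dict(zip(roles, block.subnets(new_prefix=24)))
    for env, block in env_blocks.items()
}
```

With this plan, `plan["staging"]["cache"]` is a subnet that overlaps no production subnet, so access rules can be written per environment and per role at the network level.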

Infrastructure as code

Ocean is a DevOps framework. Infrastructure is code. Chef is used to manage all infrastructure and to handle deployment and releases without any need to restart servers. Downtime should not occur. The Ocean pipeline is designed for continuous delivery and deployment.

New server instances are deployed with a single terminal command. Other Ocean servers will pick up the existence of a new instance and adjust their configurations accordingly. No downtime is required.