We're doing HTTP right!

Alright, maybe we're not. But we take the stance that we like how we're approaching HTTP. I gave a talk on our approach to HTTP at PyGrunn 2012 titled “I'm doing HTTP wrong”, and shortly afterwards I accidentally uploaded an unfinished draft of an article about the same topic to my personal blog.

That caused quite some confusion on Hacker News, especially because the article on my blog was incomplete and never intended for publication. So this is the second attempt to explain what we're doing and why.

Traditional Approach to HTTP

If you look at how most applications deal with HTTP, you typically find two common approaches: being very closely tied to HTTP, or staying as far away from it as possible. The former would be your average web framework: you have some sort of request object that encapsulates the incoming HTTP request, and you create a response object and return it to the user agent. The latter case would be abominations like XMLRPC that basically reduce HTTP to a bare transport layer. By doing that they give up a whole bunch of really cool features that HTTP provides, such as caching, transfer encodings, content negotiation and other things.

What We Did Differently

When we started out creating our service we did not really have HTTP as a big target in mind. We started with the idea that we wanted a very lightweight transport format that follows certain schemas: basically serializing a struct into a binary blob, with a plain TCP connection as the transport of choice.
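To make that concrete, here is a minimal sketch of what such a schema-driven binary exchange could look like. The field layout and function names are invented for illustration; this is not our actual protocol:

    import struct

    # Hypothetical schema: a player id and a score, both unsigned
    # 32-bit integers in network byte order.  Because both ends share
    # the schema, no field names or framing need to travel on the wire.
    PLAYER_SCORE = struct.Struct('!II')

    def send_player_score(sock, player_id, score):
        sock.sendall(PLAYER_SCORE.pack(player_id, score))

    def recv_exactly(sock, n):
        buf = b''
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise IOError('connection closed mid-message')
            buf += chunk
        return buf

    def recv_player_score(sock):
        return PLAYER_SCORE.unpack(recv_exactly(sock, PLAYER_SCORE.size))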

When we had the basic implementation of that in place, however, it became quite clear that for the first game to use this service (RAD Soldiers) HTTP was actually a pretty good choice, and that HTTP could work in other places as well. Since we already had the schema/TCP based communication in place, we then added HTTP support on top of that, which turned out to be a pretty enlightening process.

We were never exposing methods, always resources and ways to manipulate them: REST, basically. All our functionality is now internally structured in a way that, even without executing any request specific code, we can already figure out what will happen. We read the HTTP headers in, and based on the request line and headers alone we figure out what's supposed to happen (request method + URL + headers is all we need). With that information we find the handling function in the code base and look at the metadata associated with it.

This metadata tells us what semantics the function follows and what kind of data it wants. That allows us to accept any arbitrary incoming format and emit any outgoing one: we can transparently accept URL encoded data, JSON data, binary blobs (even without keys!) and other things. Likewise, based on the semantics defined for the function, we know what status codes to emit, how to signal errors and more.
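As a rough sketch of the idea (the decorator, registry and field names here are made up for illustration, not our actual code), the declarative side could look like this:

    # Hypothetical registry mapping (method, URL rule) to handlers.
    ENDPOINTS = {}

    def endpoint(method, rule, accepts, status=200):
        """Attach routing and schema metadata to a handler."""
        def decorator(f):
            f.meta = {'accepts': accepts, 'status': status}
            ENDPOINTS[(method, rule)] = f
            return f
        return decorator

    @endpoint('POST', '/players', accepts={'name': str, 'team_id': int},
              status=201)
    def create_player(data):
        # `data` arrives as a plain dict of Python primitives; nothing
        # HTTP specific ever reaches the handler itself.
        return {'id': 42, 'name': data['name']}

The dispatcher only needs the request method, URL and headers to look up the handler and its metadata, so it knows what the body should contain before it reads a single byte of it.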

Freebies

The best part of this, however, is that we solve a whole bunch of potential problems basically for free. For instance, one of the problems that was popularized again last year is that hash tables can be exploited by triggering a whole bunch of collisions, forcing them to degenerate into linked lists and exhibit horrible runtime performance. With our system that never becomes a problem, because we already know which kind of data we're not interested in, so we can reject it while parsing the JSON or URL encoded data. This also gives us very predictable memory usage on the workers, because we never have to read in things we know will never be relevant.
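With Python's built-in json module you can approximate that idea with an object_pairs_hook. This sketch simply drops undeclared keys before they ever land in a dictionary (it filters at every nesting level, which is cruder than what a real schema-aware parser would do):

    import json

    def parse_with_schema(raw, allowed_keys):
        def hook(pairs):
            # Keys the endpoint never declared are thrown away, so an
            # attacker can't flood the resulting dict with colliding keys.
            return dict((k, v) for k, v in pairs if k in allowed_keys)
        return json.loads(raw, object_pairs_hook=hook)

    # parse_with_schema('{"name": "x", "junk1": 1, "junk2": 2}',
    #                   allowed_keys={'name'})  ->  {'name': 'x'}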

Additionally, once the data is handed to the function it's reduced to a Python dictionary with other Python primitives in it that can be passed around easily. The function can execute entirely independently of HTTP, which allows us to pass this data around without having to care too much about it.

One of the things that are quite common in Python frameworks especially is the idea of a middleware component that modifies responses to do common things. This sometimes involves buffering streamed data. Doing that sort of thing on the WSGI layer has the downside that a WSGI response can by design be either buffered or streamed, and a middleware has to support both. If it's a streamed response, you can't tell in advance how much memory will be wasted if you actually start buffering up the whole thing. What's even worse is that some techniques (like long polling) depend on responses never being buffered. WSGI has no signalling system that would tell a middleware whether it's actually allowed to force buffering or not.
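To illustrate the dilemma, here is a deliberately naive buffering middleware (simplified: it ignores exc_info and the write callable). Nothing in the WSGI protocol tells it whether the wrapped application is streaming on purpose:

    class BufferingMiddleware(object):
        """Buffers the whole response, e.g. to set a Content-Length.
        For a long polling app this silently breaks everything."""

        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            captured = {}
            def capture(status, headers, exc_info=None):
                captured['status'] = status
                captured['headers'] = headers

            # Unbounded memory use if the app streams indefinitely.
            body = b''.join(self.app(environ, capture))
            headers = [(k, v) for k, v in captured['headers']
                       if k.lower() != 'content-length']
            headers.append(('Content-Length', str(len(body))))
            start_response(captured['status'], headers)
            return [body]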

REST is cool, HTTP is lacking

If you use HTTP APIs, there are always situations where you run into things that are not solved by HTTP itself. For instance, signalling errors on the HTTP layer is basically limited to a handful of numeric status codes; in order to figure out what's really happening you need to look at the response body. The nice thing about using declarative HTTP is that we can support our own error codes, error names and descriptions, as well as the proper HTTP status code, at the same time. As a user of the API you probably want to check whether the HTTP status code indicates success, and if it does not, look at the body to find the more descriptive exception. If you are a proxy, you ignore the body entirely and just act upon the HTTP status code.
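In practice that can be as simple as an error body carrying our own code, name and description next to the HTTP status code. The exact shape below is invented for illustration:

    import json

    def error_response(http_status, error_code, error_name, description):
        # Proxies and caches act on `http_status` alone; API clients
        # read the richer body to find out what actually went wrong.
        body = json.dumps({'error': {
            'code': error_code,          # our own stable error number
            'name': error_name,          # machine readable identifier
            'description': description,  # human readable explanation
        }})
        headers = [('Content-Type', 'application/json'),
                   ('Content-Length', str(len(body)))]
        return http_status, headers, body

    # error_response(404, 1203, 'player_not_found',
    #                'No player with that id exists.')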

No Streaming

Now, with all of this in mind, it becomes quite clear that we're not doing streaming with that kind of API. Not in the sense that we can't stream to the client or deal with a client that streams data in, but we never pass partial content through to the client; the streaming happens on the actual HTTP server. So what do we do with responses too large to fit into our worker memory? The easy answer is that we don't handle those through this type system. Large responses are limited by design to things like file uploads, which we handle separately.

Why do we do that? Because the whole point of the exercise is to create a unified API usage experience, not just on our side, but also on the client library's side. For both sides the problem is reduced because we know that all data transmitted is generally quite small. No special logic is required to deal with the problem.

The other limitation is that this is basically restricted to APIs. You can't run a traditional web application that renders out HTML on top of this. In theory it could work, but at least our implementation is just not designed for it.

If you want to emit a whole bunch of data that is created on the fly and generally follows some sort of schema, you can for instance send it out as chunks of data on top of a websocket connection.
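A sketch of what such chunking could look like, independent of the actual websocket library used (the message shape is made up for illustration):

    import json

    def iter_chunks(records, chunk_size=100):
        """Split a large, schema-following result set into small,
        self-contained messages that can be sent one at a time."""
        batch = []
        for record in records:
            batch.append(record)
            if len(batch) >= chunk_size:
                yield json.dumps({'type': 'chunk', 'items': batch})
                batch = []
        if batch:
            yield json.dumps({'type': 'chunk', 'items': batch})
        yield json.dumps({'type': 'end'})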

Traditional Web Applications

For traditional web applications that render HTML and process form data this design does not work. We have two answers to that: the first one is that you can have a separate code path if you render out HTML. That's basically how we implement the authorization platform and the user settings at the moment. The second is that soon (and maybe already!) you can just write the user interface in JavaScript on top of such an HTTP based API.

Too Long; Didn't Read

We started out writing a schema based, custom TCP protocol. We put it on top of HTTP, started supporting different formats, and found out that this is actually a pretty awesome way to build HTTP based APIs.

The slides from the presentation at PyGrunn can be found on our speakerdeck.