Cap'n Proto 0.8: Streaming flow control, HTTP-over-RPC, fibers, etc.
Today I’m releasing Cap’n Proto 0.8.
What’s new?
- Multi-stream Flow Control
- HTTP-over-Cap’n-Proto
- KJ improvements
- Lots and lots of minor tweaks and fixes.
Multi-stream Flow Control
It is commonly believed, wrongly, that Cap’n Proto doesn’t support “streaming”, in the way that gRPC does. In fact, Cap’n Proto’s object-capability model and promise pipelining make it much more expressive than gRPC. In Cap’n Proto, “streaming” is just a pattern, not a built-in feature.
Streaming is accomplished by introducing a temporary RPC object as part of a call. Each streamed message becomes a call to the temporary object. Think of this like providing a callback function in an object-oriented language.
For instance, server -> client streaming (“returning multiple responses”) can look like this:
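(The following is an illustrative sketch, not an excerpt from a real schema; the interface and method names are placeholders, reusing streamingCall() and sendChunk() from the discussion below.)

```capnp
@0xeb3f9ebe6e9c59d3;  # arbitrary file ID for this sketch

interface MyInterface {
  streamingCall @0 (callback :Callback) -> ();
  # The client passes in a callback object; the server calls it once per
  # "response" and returns from streamingCall() when the stream ends.

  interface Callback {
    sendChunk @0 (chunk :Data) -> ();
  }
}
```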
Or for client -> server streaming, the server returns a callback:
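(Another sketch with placeholder names; this time the server returns the callback object, and the client calls it once per chunk.)

```capnp
@0xf254c62fd8526a7a;  # arbitrary file ID for this sketch

interface MyInterface {
  streamingCall @0 () -> (callback :Callback);
  # The server returns a callback object; the client calls it once per chunk.

  interface Callback {
    sendChunk @0 (chunk :Data) -> ();
  }
}
```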
Note that the client -> server example relies on promise pipelining: when the client invokes streamingCall(), it does NOT have to wait for the server to respond before it starts making calls to the callback. Using promise pipelining (which has been a built-in feature of Cap’n Proto RPC since its first release in 2013), the client sends messages to the server that say: “Once my call to streamingCall() is finished, take the returned callback and call this on it.”
Obviously, you can also combine the two examples to create bidirectional streams. You can also introduce “callback” objects that have multiple methods, methods that themselves return values (maybe even further streaming callbacks!), etc. You can send and receive multiple new RPC objects in a single call. Etc.
But there has been one problem that arises in the context of streaming specifically: flow control. Historically, if an app wanted to stream faster than the underlying network connection would allow, then it could end up queuing messages in memory. Worse, if other RPC calls were happening on the same connection concurrently, they could end up blocked behind these queued streaming calls.
In order to avoid such problems, apps needed to implement some sort of flow control strategy. An easy strategy was to wait for each sendChunk() call to return before starting the next call, but this would incur an unnecessary network round trip for each chunk. A better strategy was for apps to allow multiple concurrent calls, but only up to some limit before waiting for in-flight calls to return. For example, an app could limit itself to four in-flight stream calls at a time, or to 64kB worth of chunks.
This sort of worked, but there were two problems. First, this logic could get pretty complicated, distracting from the app’s business logic. Second, the “N-bytes-in-flight-at-a-time” strategy only works well if the value of N is close to the bandwidth-delay product (BDP) of the connection. If N was chosen too low, the connection would be under-utilized. If too high, it would increase queuing latency for all users of the connection.
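(As a hypothetical example, a 100 Mbps link with a 50 ms round trip has a BDP of roughly 12.5 MB/s × 0.05 s ≈ 625 kB, so an N in that neighborhood would suit that connection, while the same N on a fast LAN would just add queuing latency.)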
Cap’n Proto 0.8 introduces a built-in feature to manage flow control. Now, you can declare your streaming calls like this:
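(Sketched against the hypothetical client -> server example from above; the done() method is explained below.)

```capnp
@0xa9a1e9f28c6a5f3b;  # arbitrary file ID for this sketch

interface MyInterface {
  streamingCall @0 () -> (callback :Callback);

  interface Callback {
    sendChunk @0 (chunk :Data) -> stream;
    # Declared with -> stream instead of -> ().

    done @1 () -> ();
    # Called by the client after the last chunk, to check for errors.
  }
}
```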
Methods declared with -> stream behave like methods with empty return types (-> ()), but with special behavior when the call is sent over a network connection. Instead of waiting for the remote site to respond to the call, the Cap’n Proto client library will act as if the call has “returned” as soon as it thinks the app should send the next call. So, now the app can use a simple loop that calls sendChunk(), waits for it to “complete”, then sends the next chunk. Each call will appear to “return immediately” until such a time as Cap’n Proto thinks the connection is fully-utilized, and then each call will block until space frees up.
When using streaming, it is important that apps be aware that error handling works differently. Since the client side may indicate completion of the call before the call has actually executed on the server, any exceptions thrown on the server side obviously cannot propagate to the client. Instead, we introduce a new rule: if a streaming call ends up throwing an exception, then all later method invocations on the same object (streaming or not) will also throw the same exception. You’ll notice that we added a done() method to the callback interface above. After completing all streaming calls, the caller must call done() to check for errors. If any previous streaming call failed, then done() will fail too.
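To make this concrete, here is a minimal C++ sketch of such a loop, written against the hypothetical schema above. It assumes the standard generated-code naming conventions (sendChunkRequest(), doneRequest()) applied to that made-up schema and a synchronous .wait() style; it is an illustration, not code from the Cap’n Proto library.

```c++
#include "my-interface.capnp.h"  // hypothetical generated header for the sketch schema
#include <kj/async.h>
#include <kj/array.h>

// Streams all chunks, relying on -> stream flow control: each send() promise
// resolves as soon as Cap'n Proto thinks the next chunk should be sent, which
// is not necessarily when the server has finished processing this one.
void sendAll(MyInterface::Callback::Client callback,
             kj::ArrayPtr<const kj::ArrayPtr<const kj::byte>> chunks,
             kj::WaitScope& waitScope) {
  for (auto chunk: chunks) {
    auto req = callback.sendChunkRequest();
    req.setChunk(chunk);
    req.send().wait(waitScope);
  }

  // done() is an ordinary call; if any earlier streaming call failed, the
  // error surfaces here.
  callback.doneRequest().send().wait(waitScope);
}
```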
Under the hood, Cap’n Proto currently implements flow control using a simple hack: it queries the send buffer size of the underlying network socket, and sets that as the “window size” for each stream. The operating system will typically increase the socket buffer as needed to match the TCP congestion window, and Cap’n Proto’s streaming window size will increase to match. This is not a very good implementation for a number of reasons. The biggest problem is that it doesn’t account for proxying: with Cap’n Proto it is common to pass objects through multiple nodes, which automatically arranges for calls to the object to be proxied through the middlemen. But, the TCP socket buffer size only approximates the BDP of the first hop. A better solution would measure the end-to-end BDP using an algorithm like BBR. Expect future versions of Cap’n Proto to improve on this.
Note that this new feature does not come with any change to the underlying RPC protocol! The flow control behavior is implemented entirely on the client side. The -> stream declaration in the schema is merely a hint to the client that it should use this behavior. Methods declared with -> stream are wire-compatible with methods declared with -> (). Currently, flow control is only implemented in the C++ library. RPC implementations in other languages will treat -> stream the same as -> () until they add explicit support for it. Apps in those languages will need to continue doing their own flow control in the meantime, as they did before this feature was added.
HTTP-over-Cap’n-Proto
Cap’n Proto 0.8 defines a protocol for tunnelling HTTP calls over Cap’n Proto RPC, along with an adapter library that bridges it to the KJ HTTP API. Thus, programs written to send or receive HTTP requests using KJ HTTP can easily be adapted to communicate over Cap’n Proto RPC instead. It’s also easy to build a proxy that converts regular HTTP protocol into Cap’n Proto RPC, and vice versa.
In principle, http-over-capnp can achieve similar advantages to HTTP/2: Multiple calls can multiplex over the same connection with arbitrary ordering. But, unlike HTTP/2, calls can be initiated in either direction, can be addressed to multiple virtual endpoints (without relying on URL-based routing), and of course can be multiplexed with non-HTTP Cap’n Proto traffic.
In practice, however, http-over-capnp is new, and should not be expected to perform as well as mature HTTP/2 implementations today. More work is needed.
We use http-over-capnp in Cloudflare Workers to communicate HTTP requests between components of the system, especially into and out of sandboxes. Using this protocol, instead of plain HTTP or HTTP/2, allows us to communicate routing and metadata out-of-band (rather than e.g. stuffing it into private headers). It also allows us to design component APIs using an object-capability model, which turns out to be an excellent choice when code needs to be securely sandboxed.
Today, our use of this protocol is fairly experimental, but we plan to use it more heavily as the code matures.
KJ improvements
KJ is the C++ toolkit library developed together with Cap’n Proto’s C++ implementation. Ironically, most of the development in the Cap’n Proto repo these days is actually improvements to KJ, in part because it is used heavily in the implementation of Cloudflare Workers.
- The KJ Promise API now supports fibers. Fibers allow you to execute code in a synchronous style within a thread driven by an asynchronous event loop. The synchronous code runs on an alternate call stack. The code can synchronously wait on a promise, at which point the thread switches back to the main stack and runs the event loop. We generally recommend that new code be written in asynchronous style rather than using fibers, but fibers can be useful in cases where you want to call a synchronous library, and then perform asynchronous tasks in callbacks from said library. See the pull request for more details.
- The new kj::Executor API can be used to communicate directly between event loops on different threads. You can use it to execute an arbitrary lambda on a different thread’s event loop. Previously, it was necessary to use some OS construct like a pipe, signal, or eventfd to wake up the receiving thread. (See the sketch after this list.)
- KJ’s mutex API now supports conditional waits, meaning you can unlock a mutex and sleep until such a time as a given lambda function, applied to the mutex’s protected state, evaluates to true. (Also sketched after this list.)
- The KJ HTTP library has continued to be developed actively for its use in Cloudflare Workers. This library now handles millions of requests per second worldwide, both as a client and as a server (since most Workers are proxies), for a wide variety of web sites big and small.
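To make the kj::Executor and conditional-wait items above more concrete, here are two small sketches. They are illustrations, not excerpts from KJ: the program structure and names like waitForFirstItem are invented, while kj::getCurrentThreadExecutor(), Executor::executeSync(), and kj::MutexGuarded<T>::when() are the APIs the list refers to.

```c++
#include <kj/async.h>
#include <kj/thread.h>
#include <kj/debug.h>

int main() {
  // Set up a KJ event loop on the main thread.
  kj::EventLoop loop;
  kj::WaitScope waitScope(loop);

  // An Executor bound to the main thread's event loop. It can be handed to
  // other threads, which use it to run lambdas on this loop.
  const kj::Executor& mainExecutor = kj::getCurrentThreadExecutor();

  // A promise the main thread will wait on, fulfilled by code that the worker
  // thread schedules onto the main loop.
  auto paf = kj::newPromiseAndFulfiller<int>();
  kj::PromiseFulfiller<int>& fulfiller = *paf.fulfiller;

  kj::Thread worker([&mainExecutor, &fulfiller]() {
    // executeSync() blocks the worker until the lambda has run on the main
    // thread's event loop, then returns the lambda's result.
    int result = mainExecutor.executeSync([&fulfiller]() {
      fulfiller.fulfill(123);  // runs on the main thread's loop
      return 42;
    });
    KJ_ASSERT(result == 42);
  });

  // Running the event loop (inside wait()) is what lets the scheduled lambda
  // execute.
  KJ_ASSERT(paf.promise.wait(waitScope) == 123);
  return 0;
}
```

And a conditional wait, where one thread blocks until another has added something to the shared state:

```c++
#include <kj/mutex.h>
#include <kj/vector.h>

// Shared state protected by a mutex.
kj::MutexGuarded<kj::Vector<int>> queue;

int waitForFirstItem() {
  // when() blocks until the predicate (evaluated with the lock held) returns
  // true, then runs the second lambda, still under the lock, and returns its
  // result.
  return queue.when(
      [](const kj::Vector<int>& q) { return q.size() > 0; },
      [](kj::Vector<int>& q) { return q[0]; });
}

void addItem(int value) {
  // Modifying the protected state wakes any thread blocked in when().
  queue.lockExclusive()->add(value);
}
```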
Towards 1.0
Cap’n Proto has now been around for seven years, with many huge production users (such as Cloudflare). But, we’re still on a 0.x release? What gives?
Well, to be honest, there are still a lot of missing features that I feel are critical to Cap’n Proto’s vision, the most obvious one being three-party handoff. But, so far I just haven’t had a real production need to implement those features. Clearly, I should stop waiting for perfection.
Still, there are a couple smaller things I want to do for an upcoming 1.0 release:
- Properly document KJ, independent of Cap’n Proto. KJ has evolved into an extremely useful general-purpose C++ toolkit library.
- Fix a mistake in the design of KJ’s AsyncOutputStream interface. The interface currently does not have a method to write EOF; instead, EOF is implied by the destructor. This has proven to be the wrong design. Since fixing it will be a breaking API change for anyone using this interface, I want to do it before declaring 1.0.
I aim to get these done sometime this summer…