The Bixby Serverless Execution Infrastructure

Paper Author: Adonis Fung adonis.fung@samsung.com

This paper explains some of the security enhancements introduced in JavaScript Runtime Version 2.

Prior Work

Serverless computing is one of the major paradigm shifts advanced by modern cloud computing: the responsibility for scaling and maintaining servers and language runtimes is outsourced entirely to cloud providers. Freed from many operational concerns, developers can focus on perfecting their application logic. To make that possible, cloud providers must build a secure, shared, multi-tenant infrastructure that isolates tenants and thwarts cross-tenant and host-targeted attacks. Serverless infrastructure is recommended, and even made the default, across all popular virtual assistant (VA) platforms. It is where the different capabilities (Skills for Alexa, Actions for Google, or capsules for Bixby) execute to fulfill users' requests, or intents.

Note

For a complete glossary of terms used for Bixby Developers, see our Glossary.

The Needs

Hosting cloud-based virtual assistant skills in an existing serverless computing infrastructure is a convenient and straightforward option. When we looked more closely, however, we identified the following characteristics and questioned whether this was the right approach for our capsule developers.

A service most often encapsulates its business logic in its own servers, and the service is exposed through well-defined APIs serving different mobile and web clients.
With that, what a capsule needs to fulfill a user's intent is a small code snippet that formats slotted parameters from an intent to match different individual remote API requirements.
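
As a minimal sketch of the kind of snippet described above, the following function reshapes slotted intent parameters into a request for a remote API. The function name `buildForecastRequest`, the slot names, and the `api.example.com` endpoint are all hypothetical and are not part of the Bixby platform. They only illustrate how small such fulfillment code typically is.

```javascript
// Hypothetical fulfillment snippet: maps slotted intent parameters onto the
// query format expected by an imaginary remote weather API.
function buildForecastRequest(slots) {
  const params = new URLSearchParams({
    q: slots.city.trim(),
    // The imaginary API expects ISO dates (YYYY-MM-DD), not Date objects.
    date: slots.date.toISOString().slice(0, 10),
    units: slots.unit === "fahrenheit" ? "imperial" : "metric",
  });
  return { method: "GET", url: `https://api.example.com/forecast?${params}` };
}

// Example slots, as they might arrive from a resolved intent (assumed shape).
const request = buildForecastRequest({
  city: " Seoul ",
  date: new Date("2024-05-01T00:00:00Z"),
  unit: "celsius",
});
console.log(request.url);
```

Nothing here needs file, process, or raw socket access, which is the observation that motivates a runtime much narrower than a general-purpose one.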

Do They Fit Virtual Assistant Use Cases?

Existing serverless infrastructure is typically designed with compatibility and security in mind. To support as many language runtimes as possible, its servers are built to host and execute arbitrary binaries. In other words, it is designed for general-purpose computing, not specifically for virtual assistant fulfillment. Essentially, a skill's fulfillment code is hosted as a web service in an existing serverless infrastructure in order to serve intent requests initiated from VA cloud servers. Below, we consider two of the most popular architectures, both of which are used by their parent companies to offer cloud-based virtual assistant skills.

In Firecracker (the infrastructure backing AWS Lambda), each tenant is isolated in an individual microVM using Linux's Kernel-based Virtual Machine (KVM) technology. Everything from a guest kernel up to a language runtime is packed into a tenant unit. KVM virtualizes the hardware for the guest kernel, so the attack surface exposed to the host kernel is greatly limited.

In gVisor (the infrastructure backing Google Cloud Functions), each tenant is isolated in a Linux container. A user-space kernel and the language runtime exist in each tenant unit. All system call (syscall) requests are routed at runtime through the user-space kernel, which on the one hand exposes a wide range of syscalls to the binary running on top of it, and on the other hand confines the extent of any impact to the container. The syscall exposure that trickles down to the host kernel is significantly reduced.

We believe that these existing serverless infrastructures are overkill for the computation needs of capsule fulfillment.

Our Approach

Building a new serverless infrastructure customized for capsule fulfillment is not an easy task. It involves very careful tradeoffs between efficiency, performance, and security. We focused on supporting Google's V8 JavaScript engine, which powers the most popular browsers, including Google Chrome, as well as the Node.js runtime that existing serverless infrastructure offers to JavaScript developers. To give capsule developers the best development experience, code submission, deployment, and execution are also streamlined and closely integrated with our platform.

The very first thing we did was to cut all the unnecessary, enabled-by-default, and dangerous features out of the bare V8 runtime, including WebAssembly, the optimizing compiler, executable memory, and the ability to execute dynamic code (eval()). The complete and latest JavaScript language set is still preserved for developers, and is more than enough for the purpose of data massaging. Instead of offering hundreds of nicely wrapped APIs (such as execve, socket, file, crypto, and http, which touch a broad set of syscalls), we carefully added only the most needed features: an HTTP request library, plus config and secret libraries to support authenticated calls to external API servers. To compensate for the performance lost by disabling compilation optimizations, we compile every function module ahead of time and preload it in memory. This is desirable because (1) these function modules are mostly small code snippets, and (2) it avoids a class of type confusion vulnerabilities that were once classified as zero-days. This lightweight yet still powerful V8 runtime gives us a much cleaner slate for capsule fulfillment than a bulky Node.js runtime would.
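
The paper does not say how these features were switched off inside the Bixby runtime. As an illustration only, the sketch below uses the `codeGeneration` option of Node.js's `vm.createContext` to create a context where V8 refuses to generate code from strings or compile WebAssembly. V8 also has command-line flags with a similar effect, such as `--disallow-code-generation-from-strings` and `--jitless`. The helper `tryRun` is a hypothetical name, and Node's `vm` module is not a security boundary, so this is a demonstration of the switches rather than of the actual hardening.

```javascript
const vm = require("node:vm");

// A context in which string-based code generation (eval, new Function) and
// WebAssembly compilation are disabled. Ordinary JavaScript still runs.
const context = vm.createContext(
  {},
  { codeGeneration: { strings: false, wasm: false } }
);

// Hypothetical helper: run source in the restricted context and report
// whether it completed or threw.
function tryRun(source) {
  try {
    return { ok: true, value: vm.runInContext(source, context) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// Plain data massaging is unaffected by the restrictions.
const plain = tryRun("[1, 2, 3].map((n) => n * 2).join(',')");
// Both forms of dynamic code execution are rejected by V8.
const viaEval = tryRun("eval('1 + 1')");
const viaFunction = tryRun("new Function('return 1 + 1')()");

console.log({ plain, viaEval, viaFunction });
```

The point of the example is that the language itself stays intact for ordinary data transformation while the dynamic code paths are closed.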

We do not externalize capsule code as web services, so it can never be reached from the outside and therefore does not require developers to implement request authenticity checks. Instead, we host developer-submitted capsule code in our own secure serverless fulfillment infrastructure, which closely integrates with our platform.

Upon receiving an execution request, our lightweight process is cloned into a separate Linux container. This per-request container has its own unique PID, user, mount, and network namespaces. The container's mounts allow access only to the compiled in-memory modules and the code directory of that specific capsule. At the same time, the networking interface is stripped, leaving only loopback. With egress blocked, the only way to return the execution result and to make outgoing HTTP requests is through the same well-defined messaging interface on which the execution request arrived. This newly created container, as a per-request tenant, is then further confined by appropriate resource limits and tight seccomp (syscall filter) policies. The container is terminated once the execution is completed.

Such a per-request tenant model is very different from the existing model, in which one tenant serves multiple requests. The ephemeral nature of our tenants enables us to eliminate some very common attacks that depend on state persisting in a long-lived JavaScript runtime, including rootkit and prototype pollution attacks. This is achievable only with a lightweight container and JavaScript runtime. We can keep the latency of scaling out a per-request tenant to a millisecond or two (as opposed to spawning an AWS microVM, which can take up to 125ms in the case of cold boot and scale-out events).
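
To see why an ephemeral, per-request tenant removes prototype pollution as a persistent threat, consider the following minimal sketch. It uses Node.js `vm` contexts as a stand-in for tenants. That is an assumption made for illustration: the infrastructure described above uses Linux containers, and a `vm` context is not a security boundary. The names `maliciousCapsule`, `benignCapsule`, and `runPerRequest` are hypothetical.

```javascript
const vm = require("node:vm");

// Untrusted code that pollutes Object.prototype as a side effect.
const maliciousCapsule = `
  Object.prototype.isAdmin = true;
  ({}).isAdmin;
`;
// Later, unrelated code that checks whether a fresh object looks polluted.
const benignCapsule = `({}).isAdmin === true`;

// Long-lived tenant: one context serves many requests, so the pollution
// left behind by an earlier request is visible to a later one.
const sharedContext = vm.createContext({});
vm.runInContext(maliciousCapsule, sharedContext);
const pollutedInShared = vm.runInContext(benignCapsule, sharedContext);

// Per-request tenant: every request gets a brand-new context that is
// discarded afterwards, so nothing carries over between requests.
function runPerRequest(source) {
  return vm.runInContext(source, vm.createContext({}));
}
runPerRequest(maliciousCapsule);
const pollutedPerRequest = runPerRequest(benignCapsule);

console.log({ pollutedInShared, pollutedPerRequest });
```

In the shared context the second request observes the polluted prototype, while in the per-request model it does not, because the polluted state was thrown away with its tenant.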

While a guest kernel or a user-space kernel is arguably a stronger and more generic means of protecting the underlying kernel, we approach the problem by reducing attack surfaces from the ground up, producing a lightweight runtime that strikes the right balance between security and performance. We can limit each untrusted per-request tenant to only 22 low-risk syscalls, which is in itself strong isolation from the underlying kernel. By comparison, Firecracker needs 36 and gVisor requires 64 to function. As an additional layer of defense, we compartmentalize capsules into different VMs based on their riskiness.

In a nutshell, the serverless fulfillment infrastructure is yet another technological advancement brought by Bixby. Its novel design fundamentally eliminates classes of vulnerabilities that are present in the general-purpose JavaScript runtimes used by other VAs (improper request authenticity checks, type confusion, dynamic code execution, rootkit attacks, and prototype pollution), while minimizing latency, all without requiring capsule code to be externalized as web services. It provides strong isolation, efficient scaling, and performant execution. Perhaps most importantly, it is free to capsule developers!
