Understanding the AWS Lambda Runtime: Insights on the Execution Environment and Language Proficiency

Apr 15, 2023 Edited: Mar 18, 2025

Understanding language proficiency and environmental factors

I’ve been using Node.js as my main language for years now, coming from JavaScript as a frontend developer trying to make my way upstream to the backend world.

While most languages are strongly tied to their execution environment, today we can see many languages (or runtimes) being moved across the stack to achieve the ultimate “1 stack, 1 language” grail.

I have been using AWS Lambda and other FaaS providers for a long time. The concept fascinates many developers, as it provides maximum abstraction over infrastructure. However, reasoning about cost in terms of mere monetary expenditure does not capture the full cost.

That full cost is obscured when language proficiency is considered out of context from the running environment. In other words, a skilled Node.js backend developer used to “classic” long-running server processes could perform worse than a novice, FaaS-and-cloud-native Node.js backend developer.

It’s no news that thoroughly understanding the execution environment can make all the difference. For managed environments this is truer than ever, and sometimes trickier to achieve.

XaaS providers are on a mission to empower the user while keeping their own profit predictable. While doing so, they tend to provide more of a “user manual” than a “repair guide/technical manual” to the end user, leaving their paid support to fill the gap and keeping the secrets that could leak out and inadvertently benefit competing companies.

Key takeaway: it’s always crucial to thoroughly understand the execution environment, and with FaaS and other managed solutions this is truer than ever.

Following the image in the mirror: a free dive into custom lambda base images

As said, general knowledge about how your Node.js function is invoked when using AWS Lambda is available and well documented.

When we invoke a Lambda function, the request spins up (or reuses, more on this later) a Firecracker microVM.

How this container actually receives the raw “data” from the “outside” (e.g. HTTP request parameters, raw data bytes, etc.) is unclear.

What we do know, instead, is how the data gets delivered to the hosting runtime, and how the output finds its way back to the caller/user.

The provided AWS runtimes implement the Lambda Runtime Interface Client (RIC) inside of them, so by default the end user doesn’t have to worry about this part.

As said before, not worrying about something means that somebody else is doing the work for us. And that somebody may not share every detail of how that part is handled.

The AWS runtimes provided by default are completely managed, so you only need to specify which runtime you want to use (Node.js, Java, etc…), the proper runtime version and machine architecture, together with your code.
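As a concrete example, selecting a provided runtime really is just a matter of naming it at deploy time. Here is a minimal sketch using the AWS SDK for JavaScript v3, where the function name, role ARN and zip path are placeholders:

import { readFile } from 'node:fs/promises';
import { LambdaClient, CreateFunctionCommand } from '@aws-sdk/client-lambda';

const client = new LambdaClient({ region: 'eu-west-1' });

// The provided runtime is picked by identifier: language + version + architecture.
await client.send(new CreateFunctionCommand({
  FunctionName: 'my-fn',                                // placeholder
  Runtime: 'nodejs18.x',                                // provided runtime identifier
  Architectures: ['arm64'],                             // or 'x86_64'
  Handler: 'index.handler',                             // <file>.<export> inside the zip
  Role: 'arn:aws:iam::123456789012:role/my-fn-role',    // placeholder IAM role
  Code: { ZipFile: await readFile('./function.zip') },  // your code
}));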

Here is the list of all the available provided runtimes:

Read more: https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html

As you can see, AWS also allows the use of a custom runtime, letting users define their own runtime and binary to be used (we’ll get a feel for what a runtime must actually implement later on).

The closest thing we can dig into to find out more about how this intermediate layer works is one of the provided base container images for those runtimes.

This information is provided by AWS, but it doesn’t grant a clear view of what’s inside (even with the full Dockerfile provided).

Read more: https://github.com/aws/aws-lambda-base-images

Here is a Dockerfile example from the x86_64 base image for the Node.js 18 runtime:

FROM scratch

ADD 30441d6ce90470c9604b932759e7f34055026a3ff96317f04bc09a4699048831.tar.xz /
ADD 7829ac41fab0373bf7e054ee844a8b5016a7a3ac054c320610febadc7208831b.tar.xz /
ADD a2bf88aa56cd464927ee361b327ae91b163074f559d83951208b8d6249ec5d76.tar.xz /
ADD bc1e45da307eac0fe6729de93733b142e8518c70c37c46ce879cb5920cccd5f1.tar.xz /
ADD cff0e394510a68bf111cfff9c0da93670b1c6caa9851aee7a0d7d0281ef5e6eb.tar.xz /
ADD dca5659218052c59006ac4c769e0b0fd5ffe3c98c5a7cf1627ada1c6e40e0745.tar.xz /

ENV LANG=en_US.UTF-8
ENV TZ=:/etc/localtime
ENV PATH=/var/lang/bin:/usr/local/bin:/usr/bin/:/bin:/opt/bin
ENV LD_LIBRARY_PATH=/var/lang/lib:/lib64:/usr/lib64:/var/runtime:/var/runtime/lib:/var/task:/var/task/lib:/opt/lib
ENV LAMBDA_TASK_ROOT=/var/task
ENV LAMBDA_RUNTIME_DIR=/var/runtime

WORKDIR /var/task

ENTRYPOINT ["/lambda-entrypoint.sh"]

A bunch of tars being extracted inside a scratch base image, with some (important) environment variables.

Not much, but …

What is that lambda-entrypoint.sh? Where is my code? What actually calls/includes/imports/runs my code? Let’s tackle all of these things one at a time.

Diving inside AWS base images… no seriously, dive in.

Let’s crack open the image generated by the above Dockerfile.

I’ll use a very useful tool called dive. It comes in handy for a lot of things regarding Docker image composition and image layer inspection, and it’s especially useful during image optimization and multi-stage image building.

Read more: https://github.com/wagoodman/dive

AWS-provided images are available both via the Amazon ECR public image registry and the Docker public registry, so I’ll take one of those instead of building the image from the Dockerfile above. By doing this, I’ll be able to double-check the layers extracted from the public image against the steps in the Dockerfile.

dive public.ecr.aws/lambda/nodejs:18

Here’s the result: as you can see, the intermediate steps add (in the order they are issued) the contents of the tar archives from the Dockerfile, one image layer each.

Container Entrypoint: Follow the white rabbit

The main entrypoint script for the container is, as stated in the Dockerfile, lambda-entrypoint.sh.

This script needs a specific handler name to be passed as its only parameter.

This is the first layer that decides, via the presence of the AWS_LAMBDA_RUNTIME_API env variable, whether the container entrypoint needs to run through the emulated AWS Runtime Interface (for local runs) or not.

We will continue to follow the “common” flow, so no emulation is in place and the script in RUNTIME_ENTRYPOINT gets loaded.

Let’s see what’s inside the default entrypoint in /var/runtime/bootstrap

This script includes lots of useful information for understanding how, and in which ways, the Node.js runtime fits inside a configurable and lightweight running environment like the one AWS Lambda offers.

Let’s try to take away some points:

  • The Node.js-specific runtime version is exposed in the AWS_EXECUTION_ENV variable

  • The NODE_PATH variable is filled using a bunch of different folders. This way, lots of dependencies become visible and requirable from the runtime without being bundled explicitly (e.g. the AWS SDK is provided in the base image using this method)

  • The Node.js executable is launched with a calculated set of NODE_ARGS arguments. These arguments may vary in number, but a few of them are always passed

  • The user has the chance to inject an intermediate wrapper between this bash script and the user function loader execution in Node.js, using the AWS_LAMBDA_EXEC_WRAPPER env variable. If it is present and points to a valid file, that file is launched with the Node.js executable path and its arguments (NODE_ARGS) as parameters (see the sketch below)
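To make that last point concrete, here is a sketch of such a wrapper. AWS’s own examples typically use a small bash script for this; I’ll write it in Node.js for consistency. The /opt/wrapper.js path and the injected V8 flag are purely illustrative:

#!/usr/bin/env node
// Hypothetical wrapper, e.g. /opt/wrapper.js (made executable), pointed to by
// AWS_LAMBDA_EXEC_WRAPPER. bootstrap invokes it with the runtime command line:
//   wrapper <path-to-node> <NODE_ARGS...>
const { spawnSync } = require('node:child_process');

const [interpreter, ...nodeArgs] = process.argv.slice(2);

// Example tweak: prepend an extra V8 option before delegating to the real
// runtime (a bash wrapper would `exec` here; spawnSync keeps us as the parent).
const { status } = spawnSync(interpreter, ['--max-old-space-size=256', ...nodeArgs], {
  stdio: 'inherit',
});
process.exit(status ?? 1);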

Wow, lots of things going on right now, but eventually we did manage to understand a little bit more.

One of the most relevant findings is that a good number of bash wrapper scripts get loaded before the Node.js runtime. Those wrappers feed parameters and options to the runtime to make it run aware of its context, and they also give the user the ability to tweak things and inject other wrappers and parameters even before the actual JavaScript user code is loaded (and executed).

It is also clear that, up to this point, the provided image only adds a thin layer between the user’s code and the managed runtime solution. The actual interfacing with external events is not to be found in the bootstrap part, with the only exception of the AWS Runtime Interface Emulator wrapper injection we saw.

However, the AWS RIC and the user code loading and execution are implemented right in the next step, so let’s keep chasing the white rabbit down the rabbit hole one more time and look at the content of /var/runtime/index.mjs

Maybe ASCIInema is not the best way of sharing this… let’s try another way

AWS Lambda Runtime Interface Client implementation

If we try to dump the runtime folder using this command:

docker run --entrypoint /bin/bash public.ecr.aws/lambda/nodejs:18 -c 'yum install -y tar gzip &>/dev/null && tar czf - /var/runtime' | tar xzf -

we can inspect its content more clearly.

Note: we could also refer to this repository, but it’s unclear if and how often AWS updates it to match the latest provided runtime.

The main thing we can observe in there is the presence of a custom HTTP client module with native bindings, called rapid-client.

Some people online think it is related to the Rapid API product, but I didn’t find any official reference on the matter.

What’s indeed interesting is that, in order to provide the right abstraction between the nature of the caller event and our function, AWS built a set of HTTP APIs that abstract what’s behind the code invocation. While this approach enables a worry-free runtime environment, it could potentially add some latency due to the nature of HTTP communication (even if done locally inside the container) through the available API(s), such as the one the AWS Lambda Runtime Interface Client talks to.
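To get a feel for what that interface looks like, here is roughly the loop that any runtime, provided or custom, has to run against the documented Runtime API endpoints. This is a simplified sketch assuming Node.js 18+ (global fetch, top-level await in an .mjs file) and an ESM handler, with init-time and error edge cases left out:

// runtime-loop.mjs: a bare-bones Runtime API client, not the real index.mjs
import { pathToFileURL } from 'node:url';

const api = `http://${process.env.AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime`;

// Resolve the user handler from _HANDLER ("<file>.<export>"), .mjs assumed for brevity.
const [file, fnName] = process.env._HANDLER.split('.');
const { [fnName]: handler } = await import(
  pathToFileURL(`${process.env.LAMBDA_TASK_ROOT}/${file}.mjs`).href
);

while (true) {
  // Long-poll: this request blocks until the Lambda service has an event for us.
  const next = await fetch(`${api}/invocation/next`);
  const requestId = next.headers.get('lambda-runtime-aws-request-id');

  try {
    const result = await handler(await next.json());
    // Hand the result back over HTTP: this is what "returning from a Lambda" really is.
    await fetch(`${api}/invocation/${requestId}/response`, {
      method: 'POST',
      body: JSON.stringify(result),
    });
  } catch (err) {
    await fetch(`${api}/invocation/${requestId}/error`, {
      method: 'POST',
      body: JSON.stringify({ errorType: err?.name ?? 'Error', errorMessage: String(err) }),
    });
  }
}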

My guess is that the communication between the Firecracker microVM instance that hosts our code and the outside caller leverages the Firecracker vsock device.

Read more: https://github.com/firecracker-microvm/firecracker/blob/main/docs/vsock.md

This architectural choice kind of forces AWS to implement the host-to-guest (VM) communication via virtual sockets, and the only reliable way to abstract it across all the available runtimes (and to let custom runtimes implement the same mechanism with ease) is indeed to provide an HTTP API as the single interface in between.

On the implementation side of the RIC, the way AWS handled the need to keep the RIC and the user code loader together in the same interface client was to glue them into a double-layered, duplex interface: on one side the actual RIC (talking to the Runtime Interface API and keeping track of the invocation order), on the other the user runtime loader/executor (which finds and then executes the user-provided code).
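As for the loader half, here is a minimal sketch of what resolving the conventional “<file>.<export>” handler string involves. The extension is hardcoded to .mjs for brevity; the real index.mjs is far more thorough (package.json “type”, .js/.cjs extensions, nested handler paths, and so on):

// loader.mjs: sketch of the user code loader/executor side of the RIC
import path from 'node:path';
import { pathToFileURL } from 'node:url';

export async function loadHandler(taskRoot, handlerString) {
  // e.g. taskRoot = '/var/task', handlerString = 'index.handler'
  const [moduleName, exportName] = handlerString.split('.');
  const moduleUrl = pathToFileURL(path.join(taskRoot, `${moduleName}.mjs`)).href;
  const userModule = await import(moduleUrl); // CJS would need its own branch here
  const fn = userModule[exportName];
  if (typeof fn !== 'function') {
    throw new TypeError(`Handler "${handlerString}" is not a function`);
  }
  return fn;
}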

Another thing worth noticing is that every invocation of the user’s handler code happens in response to an I/O interaction (namely, the RIC’s request handler). In Node.js, especially when using ESM, this heavily impacts the execution order of microtasks, other I/O callbacks, and scheduled timer callbacks.
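A tiny example of what this means in practice: module scope (and top-level await) runs during the init phase, while the handler always fires from inside the RIC’s I/O callback. A sketch, with a resolved Promise standing in for real setup work:

// index.mjs: illustrating the two execution phases of the managed Node.js runtime
console.log('init: module evaluation, runs once per execution environment');

// Top-level await settles during init, before the first event is delivered.
const setup = await Promise.resolve('warmed up'); // stand-in for real async setup

export async function handler(event) {
  // Runs inside an I/O callback: the response to the RIC's long-poll on
  // /invocation/next. Microtasks queued here drain before the next poll.
  console.log('invoke:', setup);
  return { statusCode: 200 };
}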

The fact that the RIC is the only valid trigger that executes our code (or, we should say, our Function) should make us rethink the way we write Node.js for Lambda environments: lazy loading of resources, connection pooling, dynamic import or runtime require(s) of modules, code path warmup and data structure preloading, V8 tricks for strings and objects, code loading order and dependency size, singletons via module patterns, code bundling, tree-shaking. In other words, not only are we not 100% in charge of our execution environment (which is sometimes exactly what we want, since we chose a fully managed autoscaling FaaS solution), but we should also keep in mind that if we write code as we would for a long-running Node.js process, we could end up with unexpected results or even wrong performance outcomes.
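To close the loop, here is a sketch of how a few of those patterns combine. All names are hypothetical: connectToDb stands in for a real driver, ./report.mjs for a heavy, rarely used dependency:

// handler.mjs: warm-invocation-friendly patterns in one place
let dbClientPromise; // module-scope singleton: survives across warm invocations

function getDb() {
  // Lazy, memoized init: pay the connection cost once per execution
  // environment, and only on the code paths that actually need it.
  dbClientPromise ??= connectToDb();
  return dbClientPromise;
}

async function connectToDb() {
  // Stand-in for a pooled client you would create eagerly in a long-running server.
  return { query: async (sql) => [{ sql }] };
}

export async function handler(event) {
  const db = await getDb();
  if (event.needsReport) {
    // Dynamic import keeps a heavy dependency off the cold-start path.
    const { generateReport } = await import('./report.mjs'); // hypothetical module
    return generateReport(await db.query('SELECT 1'));
  }
  return db.query('SELECT 1');
}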

Useful Links

Firecracker VM

AWS Lambda Runtime API

AWS Lambda Official Base Node.js Images [Docker Hub, Amazon ECR Public Registry]

~LBRDan