Executive summary

Serverless servers that cold boot in single digit milliseconds

Tidedge Serverless turns full virtual machines into ultra fast serverless runtimes. Run anything you want. Node.js. Postgres. Nginx. Java. Custom binaries. Scale to zero on prem or in the cloud and still answer requests in under 150 milliseconds from hundreds of kilometers away.

Cold boot full VM in < 4 ms
Full stack serverless not just functions
Anywhere: local, on prem, cloud
GPU passthrough with memory snapshots
Built for people who think traditional serverless is too slow and too limited.

What we built

Tidedge Serverless is a "serverless server" platform. It does not just run functions. It runs full operating systems and complete applications as if they were tiny functions. The same stack works on a laptop, in an air gapped datacenter, or across cloud regions.

Core ideas

The platform is built from a new ultra fast VM runtime, a global content addressed store, and a load balancer that scales each VM to zero and back up in single digit milliseconds.

  • VVM runtime that can use different hypervisors and runs on any operating system
  • Custom Linux kernel tailored for instant startup and prebound sockets
  • Image layers split into distribution, framework, and application
  • Tiered CAS for kernels, layers, and persistent volumes
  • Cluster aware load balancing from single machine to multi region

Why this matters

Traditional serverless platforms restrict runtimes and struggle with cold starts. Containers are flexible but slow to start and heavy. We combine the flexibility of full VMs with the speed of edge functions.

  • Run anything, from Node.js to Postgres
  • Serverless databases and stateful services become practical
  • Exact same image locally and in production
  • Works on prem, in air gapped environments, and in the public cloud

Cold boot full VM
< 4 ms
unikernel style image, CAS based boot

Node.js API, 700 km away
< 150 ms
cold start, full runtime, JSON response

Postgres cold start
< 250 ms
existing data, query served on first boot

Architecture overview

At a high level Tidedge Serverless treats each VM as a function. Images are built from Dockerfiles into three kinds of layers: distribution, framework, and application. A cluster aware VVM runtime starts these images in a few milliseconds.

VVM runtime

Virtual Virtual Machine

A thin runtime that can plug into different hypervisors. This makes the platform portable across Linux, Mac, Windows, and cloud providers. The VVM is responsible for loading the kernel, wiring the root filesystem, and starting execution.

Custom kernel

Prebound sockets and fast boot

We ship a custom Linux kernel that adds a module for "prebinding" network ports. The kernel can hand off pre opened socket handles directly to runtimes like Node.js. This removes connection overhead and was the last missing piece to beat Cloudflare Workers on cold start latency.
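The handoff described above can be emulated in user space: instead of binding a port at startup, the process adopts a listening socket that is already open, the same pattern systemd socket activation uses. A minimal Python sketch, assuming the pre-opened listener arrives as file descriptor 3 (the fd number and the fallback path are illustrative, not the platform's actual interface):

```python
import socket

def get_listener(inherited_fd: int = 3) -> socket.socket:
    """Adopt an inherited, already-bound-and-listening fd as a socket.

    When the kernel (or a supervisor) hands us a pre-opened socket,
    wrapping the fd costs nothing: no bind(), no listen(), no port
    negotiation. fd 3 mirrors the systemd socket-activation convention
    and is an assumption here.
    """
    try:
        # fileno=... adopts the existing descriptor as-is; family and
        # type are auto-detected from the fd.
        return socket.socket(fileno=inherited_fd)
    except OSError:
        # Fallback for local runs without an inherited fd: bind normally.
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", 0))
        srv.listen(16)
        return srv
```

Skipping bind and listen in the common path is where the 40 to 50 milliseconds of Node.js port binding savings quoted below come from.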

Layer system

Distribution, framework, app

Each image is built from three logical layers: an OS layer like Alpine or Debian, a framework layer like Node.js, Java, or .NET, and an application layer. All layers are stored in a content addressed store as OCI compatible blobs.
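As a rough illustration of content addressing (not the platform's actual code), a CAS keys every blob by its own SHA-256 digest, OCI style, so identical layers deduplicate automatically and any tier can verify a blob it receives:

```python
import hashlib

class CAS:
    """Toy content-addressed store: blobs keyed by their own digest."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    @staticmethod
    def digest(blob: bytes) -> str:
        # OCI-style digest string, e.g. "sha256:9f86d0..."
        return "sha256:" + hashlib.sha256(blob).hexdigest()

    def put(self, blob: bytes) -> str:
        key = self.digest(blob)
        self._blobs[key] = blob          # idempotent: same content, same key
        return key

    def get(self, key: str) -> bytes:
        blob = self._blobs[key]
        assert self.digest(blob) == key  # verify integrity on read
        return blob

# An image is then just an ordered list of layer digests:
# [distribution_digest, framework_digest, app_digest]
```

Because the key is derived from the content, pushing the same Node.js framework layer from a hundred applications stores it exactly once.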

GPU support

GPU passthrough with memory snapshots

Full GPU passthrough to VMs enables ML inference, rendering, and compute workloads. When snapshotting a VM, GPU memory is captured and restored, allowing instant resume of GPU accelerated applications.

  • Direct GPU access with near native performance
  • GPU memory included in VM snapshots
  • Instant restore of CUDA and OpenCL contexts
  • Scale GPU workloads to zero and back in milliseconds
Storage tiers

Three level CAS

Storage cost and replication matter at scale. We keep layers and volumes in a three tier layout.

  • Local CAS on each physical server for hot data
  • Shared CAS per datacenter or region
  • Central repository that holds everything and feeds the others

Layers can be cached and expired independently at each tier.
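The tiered lookup can be sketched as a fall-through with backfill; the class and dict-backed tiers below are illustrative assumptions, not the real storage code:

```python
class TieredCAS:
    """Toy three-tier lookup: local server -> shared region -> central.

    A miss at a lower tier is served from a higher one and backfilled on
    the way down, so the next boot on the same server hits the local tier.
    Each tier can still evict entries independently.
    """

    def __init__(self) -> None:
        self.local: dict[str, bytes] = {}    # hot data on this server
        self.shared: dict[str, bytes] = {}   # per datacenter or region
        self.central: dict[str, bytes] = {}  # holds everything

    def fetch(self, key: str) -> bytes:
        for tier in (self.local, self.shared, self.central):
            if key in tier:
                blob = tier[key]
                # Backfill so subsequent fetches stay close to the VM.
                self.local[key] = blob
                self.shared[key] = blob
                return blob
        raise KeyError(key)
```

The first cold boot in a region pays the central fetch once; every later boot of the same image in that region is served from the shared or local tier.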

Networking

HTTP 1, HTTP 2, HTTP 3

The load balancer already supports HTTP 1 and HTTP 2 for ingress. HTTP 3 and QUIC support is nearly complete. Early tests show at least 30 ms saved on each new connection, and QUIC opens up interesting possibilities for efficient cluster internal traffic.

Performance highlights

The platform is built with one ruthless metric in mind: how fast can we answer the first request when nothing is running? The answer is "fast enough that it feels like a function call".

Cold start numbers

  • VM boot time below 4 milliseconds on typical hardware
  • Simple Node.js JSON endpoint responding in under 150 milliseconds from a region 700 kilometers away
  • Postgres VM with an existing dataset responding to a query in under 250 milliseconds
  • Custom kernel module that shaves 40 to 50 milliseconds off Node.js port binding alone

Compared to conventional platforms

These numbers are ahead of common edge function offerings, and they are achieved without restricting runtimes. You are starting full operating systems, not stubs.

Example latency breakdown

# simple Node.js + express JSON endpoint
lookup: 0.008 s
connect: 0.025 s
start: 0.090 s
total: 0.109 s

# workers style function on Cloudflare, same test
lookup: 0.008 s
connect: 0.133 s
start: 0.015 s
total: 0.152 s

# Tidedge Serverless answers from a full VM faster than an edge function

Use cases

Once a full VM can boot in a few milliseconds many workloads that were painful on classic serverless platforms become straightforward.

Full stack serverless

Run entire web applications as serverless units: frontend, backend, background jobs, queues. No limits on libraries or binaries.

  • Back office apps
  • APIs with custom dependencies
  • Cron style jobs and schedulers

Serverless databases

Cold boot Postgres or MongoDB on demand. Use block device volumes for high IO workloads or S3 style volumes for event oriented workloads.

  • Bursty analytics workloads
  • Occasional data processing jobs
  • Per tenant database models

On prem and air gapped

Some environments cannot send data to public clouds. The same stack runs fully on prem and even air gapped.

  • Industrial and factory systems
  • Healthcare and regulated environments
  • Defence and critical infrastructure

GPU accelerated workloads

Run ML inference, rendering, and compute workloads with full GPU passthrough. GPU memory snapshots let you scale to zero without losing state.

  • On demand ML inference endpoints
  • Video transcoding and image processing
  • Scientific computing and simulations

Developer workflow

The platform uses Dockerfiles as its build language, so you stay in the workflow you already know. Build and run on macOS, Windows, or Linux, then ship the exact same image to your own racks or ours without inventing a new packaging step.

Use Dockerfiles, not a new DSL

The CLI includes a Dockerfile interpreter that supports multi stage builds. It can act as a general build server and can also output images ready for our VVM runtime.

# example application Dockerfile
# Stage 1: Build stage
FROM alpine:3.21.3 AS build

# Install build-base for compiling C++ code
RUN apk update && apk add --no-cache build-base

# Set the working directory
WORKDIR /app

# Copy the source code into the container
COPY hello.cpp .

# Compile the C++ code statically
RUN g++ -o hello hello.cpp -static

# Stage 2: Runtime stage (pinned to match the build stage)
FROM alpine:3.21.3

# Copy the static binary from the build stage
COPY --from=build /app/hello /hello

# Document the service port and run the binary
EXPOSE 80
CMD ["/hello"]

# turn it into a serverless VM image
$ edge app build
$ # test locally
$ edge app run
$ # push using OCI and make edge aware of your application
$ edge app push

Same image everywhere

  • Develop and debug locally on your laptop
  • Run the exact same image in an on prem cluster
  • Run the same image in a cloud region
  • No containers. No Docker/Podman. No tricks needed

Observability

You get full logs from each VM: complete stdout and stderr. Metrics can be wired into your existing observability stack. In the future we plan to add auto instrumentation, beyond just basic metrics and logs.

Under the hood

For the low level crowd, here is a taste of what actually happens when a request hits the platform.

Request path

  • The load balancer receives a TCP/UDP, message queue, or web request
  • It looks up the target application and checks for existing instances
  • If none exist, it asks the VVM for a prewarmed application
  • If none exist, it asks the VVM for a snapshot
  • If none exist, it asks the VVM for a new VM in that region
  • VVM pulls layers from the CAS if needed and assembles the image
  • The custom kernel prebinds the sockets
  • Init system maps the final application and hands off control
  • The framework or application process receives the already opened socket and responds to the request
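The cascade above can be condensed into a simple placement decision; the function, names, and return values below are illustrative assumptions, not the platform's actual API:

```python
def place(app: str, running: set, prewarmed: set, snapshots: set) -> str:
    """Decide how to serve a request, cheapest option first.

    Order mirrors the request path: an already-running instance beats a
    prewarmed application, which beats restoring a snapshot, which beats
    booting a fresh VM from CAS layers.
    """
    if app in running:
        return "route-to-running"
    if app in prewarmed:
        return "activate-prewarmed"
    if app in snapshots:
        return "restore-snapshot"
    return "boot-new-vm"
```

Even the worst case, booting a new VM, stays in the low millisecond range, which is why the whole cascade feels like a function call to the client.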

Boot pipeline

  • Kernel image and rootfs are prepared as CAS objects
  • Pages are mapped in a way that avoids unnecessary disk seeks
  • Hypervisor is invoked with minimal configuration
  • Startup script jumps directly into the application entry point
  • For most applications, binding is skipped since the socket is already open

Roadmap

The platform already runs real workloads. The next milestones focus on more protocols, more storage backends, and more regions.

HTTP 3 and QUIC everywhere

Internal and external traffic over HTTP 3 to reduce latency and improve multi region connectivity.

More storage backends

Additional S3 compatible and block storage providers. Integration with encrypted CAS backends to support untrusted cloud storage.

Auto-instrumentation of code

Easily collect performance data, logs, and spans without third party libraries, using eBPF and OpenTelemetry to keep observability overhead under 1%.

Multi-GPU and inference clusters

Scaling GPU workloads across multiple GPUs and building inference clusters with automatic load balancing.

Try the technical preview

We are looking for people who like to break things: infra engineers, serverless fans, low-level coders. If that is you, we would love your feedback.

Get involved

The project is still young. The best way to shape it is to use it and tell us where it breaks.

Talk to us

Join the Discourse to share feedback, ask questions, or just lurk and watch experiments.

Join Discourse →

Follow progress

Follow us on LinkedIn for benchmarks, breakdowns, and weird kernel hacks.

LinkedIn profile →

Watch demos

Short videos that show real cold starts, VM builds, and latency tests.

YouTube channel →