Unpacking Node.js Memory - From Raw Bytes to Usable Data

Introduction: Why peek under the hood?

I have been working with Node for many years, and recently I realized that even though I use it daily, I am not actually sure what exactly happens under the hood when it comes to some of the more advanced features: file management, streams, buffers, and memory management.

Honestly, most of the time when working with Node, I do things the way I have always done them, and usually AI helps me autocomplete the more advanced use cases, so things work - I really don’t have to care.

This is my small attempt to at least make things like ArrayBuffer, Buffer and TypedArray a bit more clear - even though you might not need it in your everyday work.

ArrayBuffer - raw memory

We have to start somewhere, and the best place is probably at the very beginning - the ArrayBuffer.

So what exactly is the ArrayBuffer? At its core, it is just a fixed-size chunk of binary data: a sequence of bytes whose length is set when it is created.

A quick reminder: a byte is 8 bits, and a bit is the smallest unit of data in a computer. It can be either 0 or 1. So a byte can represent 256 different values.

Let’s have a look at how to create an ArrayBuffer in Node.js:

const buffer = new ArrayBuffer(4); // 4 bytes

This creates a new ArrayBuffer of 4 bytes. We can store 4 raw bytes of memory in this. So what exactly can we do with it? Not much, actually. The ArrayBuffer is just a chunk of memory. We cannot read or write directly to it. We need to use a TypedArray or a DataView (the latter we won’t touch in this article) to access the data in the buffer.
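To make the "not much" concrete, here is a small sketch of what a bare ArrayBuffer does offer. The variable names are just mine for illustration; byteLength and slice are standard ArrayBuffer members.

```javascript
// A 4-byte ArrayBuffer: we can ask for its size and copy ranges of it,
// but we cannot index into the bytes directly.
const raw = new ArrayBuffer(4);

const size = raw.byteLength;       // 4
const direct = raw[0];             // undefined: no index access to the bytes
const firstHalf = raw.slice(0, 2); // a new 2-byte ArrayBuffer (a copy)

console.log(size, direct, firstHalf.byteLength); // 4 undefined 2
```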

Giving Meaning to Bytes: TypedArray

The ArrayBuffer does not have any meaning on its own. The same bytes can be interpreted in many different ways. This is where TypedArrays come into play.

So, exactly how does a TypedArray allow us to interpret things differently? We’ll dive into that, but first we’ll look into what a TypedArray is.

A TypedArray is actually exactly what it sounds like - an array where every element is of one specific type. Usually different types of numbers. You have all probably heard of floats, doubles and ints. I won’t dive into the details of these types; I will focus on the integers in this article.

Common Types of Integers

Some common types of integers are:

  • 8-bit integer (1 byte)
  • 16-bit integer (2 bytes)
  • 32-bit integer (4 bytes)
  • 64-bit integer (8 bytes)

The difference between these integers is the amount of memory they use, as you can see in the list. The more memory they use, the larger numbers they can represent.

  • A signed 8-bit integer can represent numbers from -128 to 127
  • A signed 16-bit integer can represent numbers from -32768 to 32767
  • A signed 32-bit integer can represent numbers from -2147483648 to 2147483647
  • A signed 64-bit integer can represent numbers from -9223372036854775808 to 9223372036854775807

When talking about numbers, you might also hear the terms signed and unsigned. Simply put, this just means that the number can be negative or not. So a signed 8-bit integer can be from -128 to 127, and an unsigned 8-bit integer can be from 0 to 255. Both have 256 possible values; they just cover a different range.
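A quick way to see signed versus unsigned in action is to look at one and the same byte through two views. This is only a sketch: it uses Int8Array and Uint8Array, which are introduced properly in the next section, and the variable names are my own.

```javascript
// One byte of memory, seen through a signed and an unsigned 8-bit view.
const memory = new ArrayBuffer(1);
const unsignedView = new Uint8Array(memory);
const signedView = new Int8Array(memory);

unsignedView[0] = 200;        // bit pattern 11001000
console.log(unsignedView[0]); // 200
console.log(signedView[0]);   // -56 (200 - 256)

signedView[0] = -1;           // bit pattern 11111111
console.log(unsignedView[0]); // 255
```

The bits never change between the two reads. Only the rule for turning them into a number does.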

The TypedArray

Some of the most common TypedArray types are:

  • Int8Array - 8 bit signed integer - 1 byte
  • Uint8Array - 8 bit unsigned integer - 1 byte
  • Int16Array - 16 bit signed integer - 2 bytes
  • Uint16Array - 16 bit unsigned integer - 2 bytes
  • Int32Array - 32 bit signed integer - 4 bytes
  • Uint32Array - 32 bit unsigned integer - 4 bytes
  • Float32Array - 32 bit float - 4 bytes
  • Float64Array - 64 bit float - 8 bytes
  • BigInt64Array - 64 bit signed integer - 8 bytes
  • BigUint64Array - 64 bit unsigned integer - 8 bytes

There are other ones as well, but in this article we will focus mostly on the integers.

To create a TypedArray on top of an ArrayBuffer, we pass the ArrayBuffer to the constructor of the TypedArray. The type is chosen by picking the constructor, for example Int8Array or Uint32Array.

const buffer = new ArrayBuffer(4); // 4 bytes

const int8Array = new Int8Array(buffer);

console.log(int8Array); // Int8Array(4) [0, 0, 0, 0]
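The byte sizes from the list above are also available at runtime. As a rough sketch (the 8-byte buffer and the summary shape are just choices I made for the example), we can put several views on the same memory and compare how many elements each one sees:

```javascript
// Eight bytes of memory, viewed through several TypedArray types.
const eightBytes = new ArrayBuffer(8);

const views = [
  new Uint8Array(eightBytes),
  new Int16Array(eightBytes),
  new Uint32Array(eightBytes),
  new Float64Array(eightBytes),
  new BigInt64Array(eightBytes),
];

// For every view: element size in bytes, and how many elements fit in 8 bytes.
const summary = views.map((view) => ({
  type: view.constructor.name,
  bytesPerElement: view.BYTES_PER_ELEMENT,
  length: view.length,
}));

console.table(summary);
// Uint8Array: 1 byte each, 8 elements
// Int16Array: 2 bytes each, 4 elements
// Uint32Array: 4 bytes each, 2 elements
// Float64Array and BigInt64Array: 8 bytes each, 1 element
```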

So What? How Does This Help Us?

The TypedArray allows us to interpret the data in the ArrayBuffer in a specific way. We can read and write to the TypedArray, and it will automatically convert the data to the correct type.

The thing is this - based on which TypedArray we use, we can interpret the same data in different ways. Same 4 bytes can mean two completely different things based on which view we use.

Let’s compare the Int8Array and the Int32Array:

const buffer = new ArrayBuffer(4); // 4 bytes

const int8Array = new Int8Array(buffer);
const int32Array = new Int32Array(buffer);

console.log(int8Array); // Int8Array(4) [0, 0, 0, 0]
console.log(int32Array); // Int32Array(1) [0]

// You can use TypedArray to write to the buffer
int8Array[0] = 1; // [1, 0, 0, 0]
int8Array[1] = 2; // [1, 2, 0, 0]
int8Array[2] = 3; // [1, 2, 3, 0]
int8Array[3] = 4; // [1, 2, 3, 4]

console.log(int8Array); // Int8Array(4) [1, 2, 3, 4]
console.log(int32Array); // Int32Array(1) [67305985]

What? What happened here? Let’s recap.

  • The smallest unit of memory we can address (a bit is smaller, but we cannot address it on its own) is a byte.
  • We created an ArrayBuffer of 4 bytes.
  • We created an Int8Array and an Int32Array based on the same ArrayBuffer.
  • We wrote 4 bytes to the Int8Array. [1, 2, 3, 4].
  • The 8-bit array has 4 numbers.
  • The 32-bit array has 1 number.

As mentioned earlier - the TypedArray helps us interpret the data in the ArrayBuffer in a specific way. The Int8Array interprets the data as 4 numbers, and the Int32Array interprets the data as 1 number.

I won’t dive too deep into why that is; for that you need to read up on binary numbers and how they work. To summarise it simply:

  • An 8-bit integer requires 1 byte of memory to represent a number. We have 4 bytes, so we can represent 4 numbers.
  • A 32-bit integer requires 4 bytes of memory to represent a number. We have 4 bytes, so we can represent exactly 1 number.
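If you are curious where 67305985 comes from, it can be reproduced with plain arithmetic. This sketch assumes a little-endian machine, which is what practically all common desktop and server CPUs are; a TypedArray uses the byte order of the platform it runs on.

```javascript
const bytes = [1, 2, 3, 4]; // the order in which the bytes sit in memory

// Little-endian: the first byte in memory is the least significant one.
const littleEndian =
  bytes[0] + bytes[1] * 2 ** 8 + bytes[2] * 2 ** 16 + bytes[3] * 2 ** 24;

console.log(littleEndian);                               // 67305985
console.log(littleEndian.toString(16).padStart(8, "0")); // 04030201

// Big-endian would read the same bytes the other way around.
const bigEndian =
  bytes[0] * 2 ** 24 + bytes[1] * 2 ** 16 + bytes[2] * 2 ** 8 + bytes[3];

console.log(bigEndian); // 16909060
```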

Final Words on TypedArray

The TypedArray is a powerful tool that allows us to interpret the data in the ArrayBuffer in different ways. However, it is important to know that a TypedArray does not own the data. It is just a view on the data. If we change the data through the TypedArray, we change the data in the ArrayBuffer. That is why, in the snippet above, we could use two “views” on the same memory chunk.

It is also possible to create a TypedArray directly, without creating an ArrayBuffer ourselves. This is done by passing an array to the constructor of the TypedArray, which allocates a new ArrayBuffer behind the scenes and copies the values into it:

const int8Array = new Int8Array([1, 2, 3, 4]); // 4 bytes
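What you pass to the constructor decides whether you get a copy or a second view. A small sketch of the difference, with variable names made up for the example:

```javascript
const source = new Int8Array([1, 2, 3, 4]);

const copy = new Int8Array(source);              // copies the values into new memory
const sameMemory = new Int8Array(source.buffer); // a second view on the same memory

source[0] = 99;

console.log(copy[0]);                  // 1  (the copy is unaffected)
console.log(sameMemory[0]);            // 99 (the view sees the change)
console.log(source.buffer.byteLength); // 4  (the ArrayBuffer created for us)
```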

The Buffer - Making Things Easier

What we have talked about so far is quite low-level, and I guess it might not be something you use every day. This next step might be a bit more familiar to you if you have worked with Node before - let’s talk about the Buffer.

A Buffer is a global object in Node.js that allows us to work with binary data. It does sound very similar to a TypedArray, does it not? Well, it is actually built on top of the TypedArray.

The Buffer is a subclass of Uint8Array, so you could say that it is a TypedArray. The difference is that the Buffer has some additional methods that make it easier to work with binary data.
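Here is a short sketch of what that subclass relationship looks like in practice. The sample bytes are arbitrary; reduce comes from Uint8Array, while readInt32LE and toString("hex") are methods Buffer adds on top.

```javascript
const buf = Buffer.from([1, 2, 3, 4]);

console.log(buf instanceof Uint8Array); // true
console.log(buf.length);                // 4

// Inherited from Uint8Array: ordinary typed array methods.
const total = buf.reduce((sum, byte) => sum + byte, 0);
console.log(total); // 10

// Added by Buffer: helpers for reading numbers and converting to text.
const asInt32 = buf.readInt32LE(0); // read 4 bytes as a little-endian 32-bit integer
const asHex = buf.toString("hex");
console.log(asInt32); // 67305985
console.log(asHex);   // 01020304
```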

Binary data refers to data stored as a sequence of bytes, which might represent various types of information like images, audio, or even text in a specific character encoding.

The Buffer is a bit more high level than the TypedArray, and it is easier to work with. For example, we can create a Buffer from a string, and it will automatically convert the string to binary data.

const buffer = Buffer.from("Hello World"); // 11 bytes
console.log(buffer); // <Buffer 48 65 6c 6c 6f 20 57 6f 72 6c 64>

const int8Array = new Int8Array(buffer); // copies the 11 bytes into a new Int8Array
console.log(int8Array); // Int8Array(11) [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100]

Why Buffer?

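As mentioned earlier, the Buffer is a subclass of Uint8Array, so it is actually a TypedArray. This means that we can use the TypedArray methods on the Buffer. It also means that we can get the underlying ArrayBuffer from the Buffer. One caveat: Node often carves small Buffers out of a larger shared pool, so the underlying ArrayBuffer can be bigger than the Buffer itself, and we need byteOffset and byteLength to find our bytes in it:

const buffer = Buffer.from("Hello World"); // 11 bytes
const arrayBuffer = buffer.buffer; // the backing ArrayBuffer, which may be larger than 11 bytes
// Copy out only the bytes that belong to this Buffer
const ownBytes = arrayBuffer.slice(buffer.byteOffset, buffer.byteOffset + buffer.byteLength); // ArrayBuffer(11)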

These types are all tightly coupled, so why should we use the Buffer?

The Buffer was created to make working with binary data easier in Node. Besides the example above where we use it with a simple string, it also has methods for converting between strings and bytes in different encodings. It is also the default way to work with files.
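As a rough illustration of those encoding methods, here is the same text moved between a few of the encodings Buffer supports ("utf8", "hex" and "base64"). The variable names are mine.

```javascript
const text = "Hello World";
const utf8Bytes = Buffer.from(text, "utf8");

const asHexString = utf8Bytes.toString("hex");
const asBase64 = utf8Bytes.toString("base64");
console.log(asHexString); // 48656c6c6f20576f726c64
console.log(asBase64);    // SGVsbG8gV29ybGQ=

// And back again: decode the base64 string into bytes, then into text.
const roundTrip = Buffer.from(asBase64, "base64").toString("utf8");
console.log(roundTrip); // Hello World

// Characters and bytes are not the same thing: "é" is 1 character but 2 bytes in UTF-8.
const accented = "\u00e9";
console.log(accented.length, Buffer.byteLength(accented, "utf8")); // 1 2
```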

Files and Buffers

When we read a file in Node without specifying an encoding, we get a Buffer back. This is because files are binary data, and the Buffer is the standard way to work with binary data in Node.

Imagine we have a file called file.txt with the following content:

Hello World

We can read the file using the fs module in Node:

import fs from "fs";

const buffer = fs.readFileSync("file.txt");

console.log(buffer); // <Buffer 48 65 6c 6c 6f 20 57 6f 72 6c 64>

As you can see, we get a Buffer back. We can use the Buffer methods to convert it to a string:

const str = buffer.toString("utf8"); // 'Hello World'

We also get Buffers when working with streams. Streams allow us to read and write data in chunks, which is very useful when working with large files. When we read a file using a stream, we get a Buffer back for each chunk of data that is read.

import fs from "fs";

const stream = fs.createReadStream("file.txt");

stream.on("data", (chunk) => {
  console.log(chunk); // <Buffer 48 65 6c 6c 6f 20 57 6f 72 6c 64>
});

Why is this good when working with large files? Imagine you have a large file of 1GB. Read in one go, it would end up as a single 1GB chunk of memory - 1GB of 1s and 0s - which is quite heavy to hold all at once. Instead, we can read it in chunks, which allows us to process it without running out of available memory.
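To get a feel for the idea without needing a real 1GB file, here is a sketch that only mimics chunked reading. fakeFileChunks is a hypothetical stand-in for a read stream that I made up for this example, and the 10 MiB total and 64 KiB chunk size are arbitrary. The point is that the loop only ever holds one chunk at a time.

```javascript
// Hypothetical stand-in for a read stream: lazily yields Buffers of up to
// chunkSize bytes until totalBytes have been produced.
function* fakeFileChunks(totalBytes, chunkSize) {
  for (let offset = 0; offset < totalBytes; offset += chunkSize) {
    yield Buffer.alloc(Math.min(chunkSize, totalBytes - offset));
  }
}

let chunkCount = 0;
let bytesSeen = 0;
let largestChunk = 0;

// Process a pretend 10 MiB file, 64 KiB at a time.
for (const chunk of fakeFileChunks(10 * 1024 * 1024, 64 * 1024)) {
  chunkCount += 1;
  bytesSeen += chunk.length;
  largestChunk = Math.max(largestChunk, chunk.length);
}

console.log(chunkCount);   // 160
console.log(bytesSeen);    // 10485760
console.log(largestChunk); // 65536
```

We touched all 10 MiB, but never needed more than 64 KiB of it in memory at once.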

So, if we take a look at what we know now:

  • We want to read a text file in Node
  • We use the fs module to read it, and we receive a Buffer
  • The Buffer is a subclass of Uint8Array
  • The Uint8Array is a view on an ArrayBuffer
  • The ArrayBuffer is a fixed-size chunk of binary data. It is just raw bytes of memory.
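The whole stack can be walked in a few lines. This is just a sketch: I use Buffer.from with a string in place of a real file read, and the names are made up.

```javascript
// Stand-in for what reading file.txt would give us.
const fileLike = Buffer.from("Hello World");

console.log(fileLike instanceof Buffer);             // true
console.log(fileLike instanceof Uint8Array);         // true
console.log(fileLike.buffer instanceof ArrayBuffer); // true

// A plain Uint8Array view over exactly the bytes that belong to this Buffer.
// byteOffset and byteLength matter because the ArrayBuffer may be larger.
const plainView = new Uint8Array(
  fileLike.buffer,
  fileLike.byteOffset,
  fileLike.byteLength
);

console.log(plainView.length); // 11
console.log(plainView[0]);     // 72, the byte for "H"
```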

Even though we started off small, with just a chunk of memory, we have now built up a small stack of abstractions that help us work with binary data in Node. We have the ArrayBuffer, the TypedArray, and the Buffer. Each of these abstractions helps us work with binary data in a more convenient way.

This contrived example just shows us how to work with text files, but the same principles apply to all types of files. We can read and write binary data using the Buffer, and we can use the TypedArray to interpret the data in different ways.

Final Words

Hopefully, this small introduction to the ArrayBuffer, TypedArray, and Buffer has helped you at least get a little bit more familiar with how Node works under the hood. It might not be the most useful thing to know for everyday work, but maybe it will make file operations a bit easier to understand.