VARUNA JAYASIRI

@vpj

Shared memory with Node.js

May 13, 2015

This is a short tutorial on writing a simple node.js add-on that shares memory among node.js processes.

One of the limitations of node.js/io.js is that they are single threaded. The only way to use multiple cores in the processor is to run multiple processes [1]. But then each process works in its own memory space, which doesn't help if you want multiple processes working on the same memory block. Sharing a block is required in memory-intensive tasks that cannot be efficiently sharded.

All the source code is available on GitHub (https://github.com/vpj/node_shm).

Node addon

You need node-gyp installed to build the node module.

npm install node-gyp -g

I think the node.js version matters, as the addon API has changed between releases. I was working on node 0.12.2 when I tested this out.

binding.gyp is required by node-gyp to build the addon.

Then comes shm_addon.cpp. This is a very basic addon with a single export, createSHM, which creates a shared memory block of 800,000 bytes (or attaches to it if it already exists) with read and write permission for all users [2].

Shared memory is allocated with shmget and attached to the address space of the process with shmat.

shmid = shmget( key, MEM, IPC_CREAT | 0666 );
data = (char *)shmat( shmid, NULL, 0 );
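Both calls can fail, so it helps to check their return values. The following is a minimal sketch of the same two calls with error checks, not the code from shm_addon.cpp. The function names attachSharedMemory and releaseSharedMemory are hypothetical, and the key is whatever value the caller chooses.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Block size used in the article: 200,000 32-bit integers.
const size_t MEM = 800000;

// Hypothetical helper: creates the segment for `key` if it does not exist,
// otherwise attaches to the existing one. Returns nullptr on failure.
char *attachSharedMemory(key_t key, size_t size, int *shmidOut) {
  int shmid = shmget(key, size, IPC_CREAT | 0666);
  if (shmid == -1) {
    std::fprintf(stderr, "shmget failed: %s\n", std::strerror(errno));
    return nullptr;
  }
  void *data = shmat(shmid, nullptr, 0);
  if (data == (void *)-1) {  // shmat signals failure with (void *)-1
    std::fprintf(stderr, "shmat failed: %s\n", std::strerror(errno));
    return nullptr;
  }
  if (shmidOut != nullptr) *shmidOut = shmid;
  return static_cast<char *>(data);
}

// Hypothetical helper: detaches the mapping and marks the segment for
// removal, so it no longer lingers after the last process detaches.
bool releaseSharedMemory(char *data, int shmid) {
  bool detached = shmdt(data) == 0;
  bool removed = shmctl(shmid, IPC_RMID, nullptr) == 0;
  return detached && removed;
}
```

Two calls to attachSharedMemory with the same key return pointers to the same segment, which is what lets separate processes see each other's writes.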

The addon keeps a pointer to the memory block and reuses it if createSHM is called more than once by the same node.js program [3]. createSHM returns an ArrayBuffer initialized with the pointer to the shared memory.

Local<ArrayBuffer> buffer = ArrayBuffer::New(isolate, (void *)data, MEM);

The node module shm_addon is built with node-gyp using the following commands.

node-gyp configure
node-gyp build

The node addon will be created in build/Release/shm_addon.node.

Parent and child programs

This is a simple counting problem to illustrate how shared memory can be used. We will populate the array of 200,000 32-bit integers with the sequence 0,1,2,...,998,999,0,1,2,...,998,999,0,1,2,... so that each integer between 0 and 999 appears in 200 positions. Each of the child programs (workers) will count the number of occurrences of each integer between 0 and 999 by inefficiently traversing the array 1,000 times.
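The workload described above can be sketched in a few lines. This is an illustration in C++ rather than the CoffeeScript in child.coffee, and the names populate and countOccurrences are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const int VALUES = 1000;    // the integers 0..999
const int LENGTH = 200000;  // number of 32-bit integers in the array

// Fill the array with 0,1,...,999,0,1,...,999,...
void populate(int32_t *a, int n) {
  for (int i = 0; i < n; ++i) a[i] = i % VALUES;
}

// Deliberately inefficient: one full pass over the array per value,
// so the array is traversed VALUES (1,000) times.
std::vector<int> countOccurrences(const int32_t *a, int n) {
  std::vector<int> counts(VALUES, 0);
  for (int v = 0; v < VALUES; ++v) {
    for (int i = 0; i < n; ++i) {
      if (a[i] == v) ++counts[v];
    }
  }
  return counts;
}
```

With 200,000 entries and 1,000 distinct values, every count comes out to 200, which gives each worker an easy way to check its result.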

spawn.coffee is the parent program that starts the child processes. child.coffee is the child program.

Both the parent program and the child program attach the shared memory by calling the node addon.

shm = require './build/Release/shm_addon'
a = new Int32Array shm.createSHM()

We measure only the time the child processes take to count. The time it takes for processes to spawn and exit is excluded, so the child processes start counting only when they receive something on standard input. The number of child processes can be set with CHILDREN.

process.stdin.on 'data', (msg) ->
  start()
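The same parent/worker pattern can be sketched without node, as a rough illustration of the idea rather than a port of spawn.coffee and child.coffee. It assumes fork and a pipe per worker in place of spawned node processes and standard input. The names worker and runWorkers are hypothetical.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

const int LENGTH = 200000;  // 32-bit integers in the shared block
const int VALUES = 1000;    // the block holds 0..999 repeated

// Worker: attach by key, block until the start byte arrives, then count.
// Exits with 0 only if every value occurs LENGTH / VALUES times.
static void worker(key_t key, int startFd) {
  int shmid = shmget(key, LENGTH * sizeof(int32_t), 0666);
  if (shmid == -1) _exit(2);
  void *p = shmat(shmid, nullptr, SHM_RDONLY);
  if (p == (void *)-1) _exit(2);
  const int32_t *a = static_cast<const int32_t *>(p);
  char go;
  if (read(startFd, &go, 1) != 1) _exit(3);  // wait for the parent's signal
  bool ok = true;
  for (int v = 0; v < VALUES; ++v) {
    int count = 0;
    for (int i = 0; i < LENGTH; ++i)
      if (a[i] == v) ++count;
    if (count != LENGTH / VALUES) ok = false;
  }
  _exit(ok ? 0 : 1);
}

// Parent: create and fill the block, start the workers, time only the
// counting phase. Returns the number of successful workers, -1 on setup error.
int runWorkers(key_t key, int children, double *elapsedMs) {
  int shmid = shmget(key, LENGTH * sizeof(int32_t), IPC_CREAT | 0666);
  if (shmid == -1) return -1;
  void *p = shmat(shmid, nullptr, 0);
  if (p == (void *)-1) return -1;
  int32_t *a = static_cast<int32_t *>(p);
  for (int i = 0; i < LENGTH; ++i) a[i] = i % VALUES;

  std::vector<pid_t> pids;
  std::vector<int> startFds;
  for (int c = 0; c < children; ++c) {
    int fds[2];
    if (pipe(fds) != 0) break;
    pid_t pid = fork();
    if (pid == 0) {
      close(fds[1]);
      worker(key, fds[0]);  // never returns
    }
    close(fds[0]);
    if (pid < 0) { close(fds[1]); break; }
    pids.push_back(pid);
    startFds.push_back(fds[1]);
  }

  // All workers are now waiting, so spawn time is excluded from the timing.
  auto start = std::chrono::steady_clock::now();
  for (int fd : startFds) {
    char go = 1;
    ssize_t n = write(fd, &go, 1);
    (void)n;
    close(fd);
  }
  int succeeded = 0;
  for (pid_t pid : pids) {
    int status = 0;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
        WEXITSTATUS(status) == 0)
      ++succeeded;
  }
  auto end = std::chrono::steady_clock::now();
  if (elapsedMs != nullptr)
    *elapsedMs = std::chrono::duration<double, std::milli>(end - start).count();
  shmdt(p);
  shmctl(shmid, IPC_RMID, nullptr);  // remove the segment when done
  return succeeded;
}
```

Calling runWorkers(key, 4, &ms) with any unused key starts four workers against one block. The workers attach read-only because they only count, and the parent removes the segment at the end so it does not show up in later ipcs output.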

Running coffee spawn.coffee starts the processes, does the counting, and shows the time it took to complete.

You can inspect the allocated shared memory by running the ipcs command.

IPC status from <running system> as of Tue Apr 14 13:58:16 IST 2015
T     ID     KEY        MODE       OWNER    GROUP
Shared Memory:
m  65536 0x000019a5 --rw-rw-rw- varunajayasiri    staff
m  65537 0x000019a4 --rw-rw-rw- varunajayasiri    staff
m  65538 0x000019a2 --rw-rw-rw- varunajayasiri    staff

Results

bench.coffee was used to find the time a single process takes to count.

@chethiyaa did some testing on a quad core i7.

| # children | single process (ms) | multi process (ms) |
| 1          | 398                 | 430                |
| 2          | 782                 | 394                |
| 4          | 1626                | 415                |
| 8          | 3300                | 799                |
| 16         | 6285                | 1594               |
| 32         |                     | 3183               |
| 64         |                     | 6372               |
| 128        |                     | 13049              |
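One way to read the table is as a speedup ratio per row. The sketch below assumes that the single process column is the time one process needs for the same total amount of counting as the corresponding multi-process run, which the article does not state explicitly. The names Result, measuredResults and speedup are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Result {
  int children;
  double singleMs;  // single process (ms)
  double multiMs;   // multi process (ms)
};

// Rows of the table that have both timings.
std::vector<Result> measuredResults() {
  return {{1, 398.0, 430.0},
          {2, 782.0, 394.0},
          {4, 1626.0, 415.0},
          {8, 3300.0, 799.0},
          {16, 6285.0, 1594.0}};
}

// Ratio of single-process time to multi-process time.
double speedup(const Result &r) { return r.singleMs / r.multiMs; }
```

Under that assumption the ratios are roughly 0.93, 1.98, 3.92, 4.13 and 3.94 for 1, 2, 4, 8 and 16 children, so the gain levels off at about 4 on the quad core machine.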

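[1] Node modules like parallel.js (http://adambom.github.io/parallel.js/) fork new processes when running on node and use web workers in the browser.

[2] shmget (documentation: http://man7.org/linux/man-pages/man2/shmget.2.html) allocates shared memory and shmat (documentation: http://man7.org/linux/man-pages/man2/shmat.2.html) attaches the shared memory block.

[3] Since the ArrayBuffer is constructed with a memory pointer, it will be external. That is, the memory will not be garbage collected and the addon will have to free it. Here's the v8 documentation for ArrayBuffer: http://bespin.cz/~ondras/html/classv8_1_1ArrayBuffer.html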

Shared memory limits are quite small by default on some systems, so trying to allocate a lot of shared memory will give errors. This article (http://seriousbirder.com/blogs/linux-understanding-shmmax-and-shmall-settings/) gives details on viewing and changing these settings.
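On Linux the limits can also be read programmatically. The sketch below assumes the kernel exposes them under /proc/sys/kernel/ (shmmax is the largest single segment in bytes, shmall the system-wide total in pages). Other systems expose them differently. The function name readKernelLimit is hypothetical.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: reads one numeric setting from a /proc file.
// Returns false if the file is missing or does not hold a number.
bool readKernelLimit(const std::string &path, unsigned long long *out) {
  std::ifstream in(path.c_str());
  unsigned long long value = 0;
  if (!(in >> value)) return false;
  if (out != nullptr) *out = value;
  return true;
}
```

Comparing the requested block size against the value read from /proc/sys/kernel/shmmax before calling shmget gives a clearer error than a failed allocation.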
