RunPage tech overview: JS Sandboxing

In this post I will explain how RunPage runs sandboxed JavaScript code in your browser.

How the sandboxing works

RunPage achieves sandboxing by running the provided code inside a dedicated Web Worker. The worker first obtains the AsyncFunction constructor, which is not exposed as a global, using the following code.

const AsyncFunction = Object.getPrototypeOf(async function(){}).constructor;

This constructor is used to create an async function with the script-block code as the function body, which is then executed as shown below.

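try {
    // The constructor throws synchronously if the script-block code has a syntax error.
    const f = new AsyncFunction("globalThis", "api", "\"use strict\";\n" + scriptBlockCode);
    // f returns a promise, so a runtime error inside the script-block
    // surfaces as a rejection of result rather than in this catch.
    result = f(SharedGlobal, Api);
} catch (e) {
    // Report script error
}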

SharedGlobal is passed in as globalThis, shadowing the worker's real global object inside the script-block, so that script-blocks on a page can share objects among themselves through it. Api provides access to all the APIs provided by RunPage.
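To make the sharing behaviour concrete, here is a minimal sketch of compiling and running a few script-blocks against one shared object. The names runScriptBlock, sharedGlobal, and the api.log method are hypothetical stand-ins for illustration, not RunPage's actual code, and the sketch awaits the returned promise so that runtime errors are caught as well.

```javascript
// Obtain the AsyncFunction constructor (it is not exposed as a global).
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Hypothetical stand-ins for RunPage's real SharedGlobal and Api objects.
const sharedGlobal = {};
const api = { log: (...args) => console.log(...args) };

// Compile one script-block and run it.
// Resolves to { ok: true, value } on success or { ok: false, error } on failure.
async function runScriptBlock(code) {
  try {
    const f = new AsyncFunction("globalThis", "api", '"use strict";\n' + code);
    // Awaiting is what routes errors thrown inside the block into the catch.
    return { ok: true, value: await f(sharedGlobal, api) };
  } catch (e) {
    return { ok: false, error: String(e) };
  }
}

async function demo() {
  // The first block stores a value on the shared object ...
  const first = await runScriptBlock("globalThis.total = 40; return 'set';");
  // ... and a later block reads it back.
  const second = await runScriptBlock("return globalThis.total + 2;");
  // Strict mode turns an assignment to an undeclared name into an error.
  const third = await runScriptBlock("undeclared = 1;");
  return { first, second, third };
}
```

Because the parameter is literally named globalThis, code inside a script-block that writes to globalThis only touches the shared object, not the worker's real global scope.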

The worker is instantiated when the page is executed. The same worker instance is used for all script-blocks on the page and is disposed of when the execution is complete. So every run creates a new worker instance and disposes of it as soon as the run finishes. This ensures that no memory leak persists from one run to another and that state is properly reset on every run.

The use of a worker also ensures that there is no DOM access; however, other browser APIs such as fetch remain available.

The main thread, which creates the worker, passes the code of each script-block to the worker one at a time. Only after the code of the first script-block has executed and the main thread has received its output does it send the code of the next script-block for execution. This means that on an error the main thread can stop the run immediately and skip the rest of the script-blocks. It also allows the main thread to set a time limit for each script-block's execution. If it does not hear back from the worker within the set time, it can terminate the worker, effectively killing that run.
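The one-at-a-time protocol with a per-block time limit could be sketched roughly as follows. This is an illustration under assumptions rather than RunPage's actual implementation: the function names, the { code } request shape, and the { ok, value } or { ok, error } reply shape are all hypothetical. The worker argument only needs the standard Web Worker members postMessage, onmessage, and terminate.

```javascript
// Send one script-block to the worker and wait for its reply or a timeout.
function runOneBlock(worker, code, timeoutMs) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error("Script block timed out"));
    }, timeoutMs);
    worker.onmessage = (event) => {
      clearTimeout(timer);
      resolve(event.data); // assumed reply shape: { ok, value } or { ok, error }
    };
    worker.postMessage({ code }); // assumed request shape
  });
}

// Run the blocks strictly one after another, stopping at the first failure.
async function runPage(worker, blocks, timeoutMs) {
  const outputs = [];
  try {
    for (const code of blocks) {
      const reply = await runOneBlock(worker, code, timeoutMs);
      outputs.push(reply);
      if (!reply.ok) break; // skip the remaining script-blocks
    }
  } catch (e) {
    // Reached when a block exceeds its time limit.
    outputs.push({ ok: false, error: e.message });
  } finally {
    // Dispose of the worker after every run, whether it finished, failed, or hung.
    worker.terminate();
  }
  return outputs;
}
```

Terminating the worker is the only way the main thread can stop a script-block stuck in a synchronous loop, which is why the timeout lives on the main thread rather than inside the worker.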

Finally, the use of a worker ensures that the UI does not freeze while the script-block code is running.

Challenges with the implementation

The biggest challenge is passing data between the main thread and the worker. The browser automatically serializes objects, using the structured clone algorithm, when they are passed between these two contexts. However, some objects cannot be serialized, functions being the most common example. So many complex objects are converted into JSON before being sent across.
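As an illustration of that JSON conversion, the sketch below flattens an object into plain data before it would be handed to postMessage. The helper name toPlainData and the sample object are made up for this example. The conversion is lossy: function-valued properties are dropped and Date objects become strings.

```javascript
// Convert a value into a plain copy that is always safe to pass to postMessage.
// JSON.stringify omits function-valued and undefined properties,
// and turns Date objects into ISO 8601 strings.
function toPlainData(value) {
  return JSON.parse(JSON.stringify(value));
}

// A hypothetical script-block result mixing data with behaviour.
const result = {
  rows: [{ name: "a", size: 3 }],
  createdAt: new Date(0),
  format: (row) => row.name.toUpperCase(), // posting this as-is would throw a DataCloneError
};

const safe = toPlainData(result);
// safe.rows is preserved, safe.format is gone, safe.createdAt is now a string.
```

The trade-off is that anything JSON cannot represent is silently lost, so the receiving side has to rebuild richer objects itself if it needs them.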

Some APIs provided by RunPage allow access to other blocks on the page, such as file selector, input and table blocks. These actually require access to those blocks’ DOM elements. The API on the worker side does this by passing instruction messages to corresponding “server” code living on the main thread. The code on the main thread accesses the DOM, gets the appropriate data from it and passes it back to the worker.
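This worker-to-main-thread arrangement is essentially a small request/response protocol. A minimal sketch of the idea follows, assuming a message shape of { id, method, args } for requests and { id, ok, value } or { id, ok, error } for replies. The names createClient and createServer and the message shapes are hypothetical, not RunPage's actual protocol.

```javascript
// Worker side: turn a message exchange into a promise-returning call.
// `post` sends a message to the main thread (for example self.postMessage).
function createClient(post) {
  let nextId = 1;
  const pending = new Map(); // request id -> { resolve, reject }
  return {
    call(method, ...args) {
      return new Promise((resolve, reject) => {
        const id = nextId++;
        pending.set(id, { resolve, reject });
        post({ id, method, args });
      });
    },
    // Feed every reply received from the main thread into this function.
    handleReply({ id, ok, value, error }) {
      const entry = pending.get(id);
      if (!entry) return; // unknown or already settled request
      pending.delete(id);
      if (ok) entry.resolve(value);
      else entry.reject(new Error(error));
    },
  };
}

// Main-thread side: the "server" that owns DOM access.
// `handlers` maps method names to functions; `post` sends a message to the worker.
function createServer(handlers, post) {
  return async function handleRequest({ id, method, args }) {
    try {
      if (!Object.prototype.hasOwnProperty.call(handlers, method)) {
        throw new Error("Unknown method: " + method);
      }
      post({ id, ok: true, value: await handlers[method](...args) });
    } catch (e) {
      post({ id, ok: false, error: e.message });
    }
  };
}
```

In a browser, a handler on the main thread would read from the DOM (for instance the current value of an input block) and the worker-side API would simply await client.call(...). The request id is what lets several calls be in flight at once without their replies getting mixed up.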

There is one more challenge which I have not been able to solve yet: reporting clear, precise errors. Right now the stack trace is captured and presented as output to the page user, but the stack trace includes lines from the worker's own code and hence can be confusing to the end user. It also does not clearly report the exact line and column in the script-block code where the error occurred. Fortunately, the code can still be debugged by putting a debugger statement in the script-block code and opening the browser's developer tools. The browser will correctly pause at that point, and the full browser debugging facilities can be used.
