WebAssembly 101: Bringing Bytecode to the Web
Credit to Author: David Maciejak| Date: Thu, 13 Apr 2017 06:58:13 -0700
FortiGuard Labs has put together answers to some of the most frequently asked questions about the emerging technology called WebAssembly (WA).
What is WebAssembly?
WebAssembly is a low-level, portable, binary format for the web that aims to speed up web apps. It is designed to parse faster (up to 20X), and execute faster than JavaScript (JS).
When was it announced?
The WebAssembly Community Group was created in April 2015, with the mission of “promoting early-stage cross-browser collaboration on a new, portable, size- and load-time-efficient format suitable for compilation to the web.”
How do I start?
You will have to set up the Emscripten SDK with Binaryen to convert your C/C++ or even Rust code to WA “.wasm” binary files, or write the Lisp-like S-expression form directly in the “.wast” (or “.wat”) text format, as explained in Figure 1, below.
Figure 1: from source code to the web
You can start with this online tool to get your hands dirty and have a quick look at it.
From the disassembled output on the right, you can see the first two lines:
0000000: 0061 736d ; WASM_BINARY_MAGIC
0000004: 0b00 0000 ; WASM_BINARY_VERSION
The first line holds the magic number 0x6d736100 (read as a little-endian 32-bit value), which is the byte sequence “\0asm”: a NUL byte followed by “asm”. The second line shows the version number, which here is 0xb. Since the current WA version number is 0xd, the bytecode generated by this online tool can’t be used in the current versions of web browsers, but it is still interesting to look at the code. When WebAssembly is finally released, the version number will be set back to 0x1.
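The header check described above can be sketched in a few lines of JavaScript. This is a minimal illustration, not part of any official tooling: the helper name readWasmHeader and the sample byte array are assumptions made for this example.

```javascript
// Hypothetical helper: reads the 8-byte preamble of a WebAssembly binary.
// Both fields are stored as little-endian 32-bit integers.
function readWasmHeader(bytes) {
  if (bytes.length < 8) {
    throw new Error('A WebAssembly binary needs at least 8 header bytes');
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const magic = view.getUint32(0, true);   // bytes 00 61 73 6d
  const version = view.getUint32(4, true); // e.g. 0b 00 00 00
  return { magic, version, isWasm: magic === 0x6d736100 };
}

// The two lines shown in the dump above (magic, then version 0xb).
const onlineToolHeader = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,
  0x0b, 0x00, 0x00, 0x00,
]);

const header = readWasmHeader(onlineToolHeader);
console.log(header.isWasm, '0x' + header.version.toString(16));
```

A check like this only identifies the file type and format version; it says nothing about whether the rest of the module is valid.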
How does it work?
Currently, WebAssembly needs to be loaded and compiled by JavaScript.
Basically, four steps are needed:
- Load the wasm bytes
- Compile them into a module
- Instantiate the module
- Run the function(s)
Which translates to:
fetch('your_code.wasm')
  .then(response => response.arrayBuffer())
  .then(bytes => WebAssembly.instantiate(bytes, {}))
  .then(result => result.instance.exports.your_exported_function());
As you can see above, “WebAssembly.instantiate” can be used to compile and instantiate the module at the same time.
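To make the compile and instantiate steps visible on their own, here is a minimal sketch that keeps them separate. It assumes the released binary format (version 0x1) and uses a tiny hand-assembled module exporting a hypothetical function named add, so that nothing needs to be fetched. It also uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors, which is reasonable for a module this small; browsers may restrict synchronous compilation of larger modules on the main thread.

```javascript
// Step 1: the wasm bytes (embedded here instead of being fetched).
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                         // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // code section, one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,             // local.get 0, local.get 1, i32.add, end
]);

// Step 2: compile the bytes into a module.
const addModule = new WebAssembly.Module(addModuleBytes);

// Step 3: instantiate the module (no imports are needed here).
const addInstance = new WebAssembly.Instance(addModule, {});

// Step 4: run the exported function.
const sum = addInstance.exports.add(2, 3);
console.log(sum);
```

A compiled module can be instantiated several times, which is one reason to keep the two steps apart.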
What can WebAssembly be used for?
WebAssembly is the next evolution of asm.js, which uses a very restricted subset of JavaScript that is best suited as a compilation target for C compilers. Like asm.js, it does not include JavaScript objects or direct access to the Document Object Model (DOM). Essentially, it only allows arithmetic operations and manipulations on typed arrays.
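For readers who have never seen asm.js, the following sketch shows what “arithmetic operations and manipulations on typed arrays” looks like in practice. The module and function names (AsmModule, add, sumFirst) are invented for this illustration. The code is ordinary JavaScript, so it runs even in an engine that ignores the "use asm" hint.

```javascript
// Illustrative asm.js-style module: only integer arithmetic and typed arrays.
function AsmModule(stdlib, foreign, heap) {
  "use asm";
  var HEAP32 = new stdlib.Int32Array(heap);

  // The "|0" annotations tell the engine that values are 32-bit integers.
  function add(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0;
  }

  // Sums the first n 32-bit integers stored in the heap.
  function sumFirst(n) {
    n = n | 0;
    var i = 0, total = 0;
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
      total = (total + (HEAP32[(i << 2) >> 2] | 0)) | 0;
    }
    return total | 0;
  }

  return { add: add, sumFirst: sumFirst };
}

// The heap is a plain ArrayBuffer shared between JS and the module.
const asmHeap = new ArrayBuffer(0x10000);
new Int32Array(asmHeap).set([1, 2, 3]);

const asmExports = AsmModule(globalThis, {}, asmHeap);
console.log(asmExports.add(2, 3), asmExports.sumFirst(3));
```

There are no objects, strings, or DOM calls inside the module; anything of that kind has to stay on the JavaScript side.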
Some preliminary demos showed that the best wasm implementation of Fibonacci number generation outperforms the best JS implementation by more than 350%.
For now, WebAssembly is simply mimicking what JS can do, but the plan is to extend it to things that would be difficult to do in JS without making the language even more complex, such as SIMD (Single Instruction, Multiple Data) support by default, threads, and shared memory.
Popular video game publishers are already on board, and have begun porting some of their 3D-capable engines by combining WebAssembly with WebGL 2.0. For a good example, you can try Zen Garden from Epic.
Is this the end of JavaScript?
WebAssembly will help JS more than it will hurt it. It will bring language diversity and increased performance for critical functionalities in the web. It should not only be seen as an improvement to JS, however, but also to web browsers in general.
Five years from now, our usage of JS will be radically different. In many instances today, it can be extremely challenging to deal with JS code, most of which is often hidden behind tortuous libraries.
Because of its ease of use and simple design, we predict that more and more code will be transpiled from C++ or Python to JS, or even compiled directly to WebAssembly. That means you will not have to learn a brand new coding language. The JS VMs will still be here, but tools will evolve to get the best out of them.
What’s the difference between WebAssembly and Rich Internet Applications built on top of MS ActiveX/Adobe Flash/Oracle Java Applet/MS Silverlight/Google NaCl?
RIAs failed to build a standard open format because distinct private companies promoted each variant separately.
For example, Microsoft promoted ActiveX technology within MS Internet Explorer. That technology enabled developers to reuse packaged functionality in a web page through COM components.
Google provided Native Client (NaCl) to let developers package C/C++ code for the browser, but again, it was only supported by Chrome, and was not what we can really call portable.
A few years ago, Mozilla opened the performance door with the release of asm.js. They were the first to come up with the idea of using only a strict subset of JS. By limiting the language features, they were able to predict how the VM would react, and could therefore make assumptions that removed some unnecessary checks and improved performance. This, however, also restricted the dynamic behaviour of the language.
All these technologies built the foundation of what WA is today.
WebAssembly runs within the JS VM and uses a subset of its features, which means it will not only be compatible with any device able to run the newest web browsers, but will also be backwards compatible. To make this happen, a polyfill is still in development; the idea is to translate each function into a semantically equivalent JS function. It will be slower, but it will run.
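Until such a polyfill exists, a page can at least detect whether WebAssembly is available and fall back to plain JS otherwise. The sketch below is not the polyfill itself: supportsWebAssembly, addFallback, and pickAdd are hypothetical names, and the fallback is written by hand rather than translated automatically. The probe assumes the released version 0x1 header.

```javascript
// Returns true when the engine exposes WebAssembly and accepts an empty module.
function supportsWebAssembly() {
  if (typeof WebAssembly !== 'object' ||
      typeof WebAssembly.validate !== 'function') {
    return false;
  }
  // Smallest possible module: magic number followed by version 1.
  const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
  return WebAssembly.validate(emptyModule);
}

// Hand-written JS equivalent of a 32-bit integer add.
function addFallback(a, b) {
  return (a + b) | 0;
}

// Chooses the wasm export when usable, otherwise the JS fallback.
function pickAdd(wasmAdd) {
  return supportsWebAssembly() && typeof wasmAdd === 'function'
    ? wasmAdd
    : addFallback;
}

console.log(supportsWebAssembly(), pickAdd(undefined)(2, 3));
```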
What does it look like?
As its name implies, the final form of WebAssembly is a low-level byte code that can be translated to assembly – but not the same kind of CPU assembly you might already know.
Let’s take the “Hello world” example, which is the first thing many programmers try to achieve when learning a new programming language. (Note: While this is an example most programmers are familiar with, such an example doesn’t really fit the language as there are no printing functions in WA by default. That’s why the code below has to import the missing functions from a standard library via the JS and then pass the required arguments.)
The C library function, defined as size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream), writes data from the array pointed to by ptr to the given stream.
The wast code is followed by the wasm bytecode version, thanks to the online tool mentioned above.
Code 1: wast Hello World example (from github)
Code 2: wasm bytecode version of the Hello World example
Even if that bytecode could be written by hand, it’s highly unlikely that any programmer would do that. Instead, they can choose the wast S-expression form (which is defined here), or other higher-level programming languages that are more human readable and can generate at least equivalent code and, at best, compiler-optimized code.
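The import mechanism mentioned in the note above (WA has no printing functions, so they must be supplied from JS) can be shown with a much smaller module than Hello World. The sketch below is not the article's example: it is a hand-assembled module, assuming the released version 0x1 format, that imports a hypothetical function env.log and exports a hypothetical function run, which calls log with the constant 42.

```javascript
const importingModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version 1
  0x01, 0x08, 0x02,                               // type section, two types
  0x60, 0x01, 0x7f, 0x00,                         //   type 0: (i32) -> ()
  0x60, 0x00, 0x00,                               //   type 1: () -> ()
  0x02, 0x0b, 0x01,                               // import section, one import
  0x03, 0x65, 0x6e, 0x76,                         //   module name "env"
  0x03, 0x6c, 0x6f, 0x67,                         //   field name "log"
  0x00, 0x00,                                     //   kind: function, type 0
  0x03, 0x02, 0x01, 0x01,                         // one local function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = func 1
  0x0a, 0x08, 0x01, 0x06, 0x00,                   // code section, one body, no locals
  0x41, 0x2a, 0x10, 0x00, 0x0b,                   // i32.const 42, call 0, end
]);

// JS supplies the function that the module cannot provide itself.
const printedValues = [];
const importObject = {
  env: {
    log: value => {
      printedValues.push(value);
      console.log('from wasm:', value);
    },
  },
};

const importingInstance = new WebAssembly.Instance(
  new WebAssembly.Module(importingModuleBytes),
  importObject
);
importingInstance.exports.run();
```

Printing a string works the same way in principle, except that the module passes an offset and a length into its memory and the JS side decodes the bytes.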
Which level of security can we expect with WA, and what does it mean for the cyber threat landscape?
WebAssembly, when run in a browser, is designed to execute in a safe, sandboxed environment and, like other web code, is subject to the same-origin and permissions policies. As defined on Wikipedia, same-origin policy “prevents a malicious script on one page from obtaining access to sensitive data on another web page through that page’s Document Object Model.”
This may sound like the best possible solution in the best of all possible worlds.
However, as the past has shown multiple times, malware authors always find a way to abuse or divert new technologies for their own benefit. For example, they are already using chained JS obfuscation layers, taken either from popular publicly available projects or from homemade custom functions, with the aim of evading antivirus detection.
So what we can easily predict is that WA could be used as an advanced obfuscation or encryption layer. It’s not something that a well-trained analyst could not overcome, but debugging and digging into exploit kits is going to get harder and take longer.
Currently, if you are curious and right-click in your browser to see the wasm module, the result will depend on which web browser you are using. You might just see a “native code” function reference in the developer debugger, or a warning message, as in Mozilla Firefox (see picture 1), or WA code as text, as in Google Chrome (see picture 2).
Picture 1: Mozilla Firefox WA debugging console
Picture 2: Google Chrome developer tool for WA
Web browsers will need to evolve to embed smarter WA debugging tools. In any case, a malicious module can be downloaded (as it needs to run on your machine) and then disassembled to give the reverser an idea of its purpose, unless, as in the case of .NET, a code obfuscator is developed that prevents reverting the bytecode back to source code. Such a tool would not necessarily be nefarious: source code owners sometimes legitimately use obfuscation to protect their intellectual property.
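Even without a full disassembler, an analyst can get a first idea of a downloaded module by listing what it imports and exports. The sketch below uses the standard WebAssembly.Module.imports and WebAssembly.Module.exports functions. The helper name summarizeModule and the sample bytes (a hand-assembled module, assuming the version 0x1 format, that imports env.log and exports run) are assumptions made for this illustration. Compiling a module does not execute it.

```javascript
// Hypothetical triage helper: compiles the bytes and lists the module interface.
function summarizeModule(bytes) {
  const compiled = new WebAssembly.Module(bytes);
  return {
    imports: WebAssembly.Module.imports(compiled)
      .map(entry => `${entry.module}.${entry.name} (${entry.kind})`),
    exports: WebAssembly.Module.exports(compiled)
      .map(entry => `${entry.name} (${entry.kind})`),
  };
}

// Stand-in for a downloaded sample: imports env.log, exports run.
const suspectBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,
  0x03, 0x02, 0x01, 0x01,
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b,
]);

const summary = summarizeModule(suspectBytes);
console.log(summary.imports, summary.exports);
```

The import list is often the more telling half, since it shows which JS functions the module expects the page to hand over.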
When will WA be released?
Mozilla Firefox 52 (released on March 7, 2017), Google Chrome 57 (released on March 9, 2017), and Opera 44 (released on March 21, 2017) already support it and enable it by default. Other major browser vendors, like Microsoft and Apple, are also making progress with its implementation. You can follow the development status online.
How can it be disabled?
Disabling that feature depends on the web browser you are using.
For Google Chrome, enter the URL “chrome://flags/#enable-webassembly” and change the drop-down entry to “Disabled”. Note that you will have to restart your browser for the change to take effect.
For Mozilla Firefox, enter the URL “about:config” and locate the preference called “javascript.options.wasm”, then double click on the boolean value to switch it to False, which effectively disables it.
-= FortiGuard Lion Team =-