WebAssembly Micro Runtime (github.com/intel)
228 points by ColinEberhardt on May 10, 2019 | hide | past | favorite | 67 comments


Cool. The whole repo is about 24K lines, and the vmcore_wasm dir is ~8300 lines.

   ~/git/languages/wasm-micro-runtime$ find . -name '*.[ch]' |xargs wc -l |sort -n
   ...
    472 ./core/iwasm/runtime/vmcore_wasm/wasm-opcode.h
    478 ./core/shared-lib/utils/runtime_timer.c
    537 ./core/shared-lib/platform/zephyr/bh_thread.c
    581 ./core/iwasm/runtime/platform/zephyr/wasm_math.c
    590 ./core/shared-lib/mem-alloc/ems/ems_alloc.c
    841 ./core/iwasm/lib/native-interface/attr-container.c
    893 ./core/iwasm/lib/native/libc/libc_wrapper.c
   1220 ./core/iwasm/runtime/vmcore_wasm/wasm-runtime.c
   2160 ./core/iwasm/runtime/vmcore_wasm/wasm-interp.c
   2874 ./core/iwasm/runtime/vmcore_wasm/wasm-loader.c
  24534 total
There is also a C++ VM here that’s ~7700 lines. I haven’t tried either one but it would be interesting to see how they compare!

https://github.com/WebAssembly/wabt/tree/master/src/interp

   ~/git/languages/wabt/src/interp$ wc -l *.cc *.h
  1863 binary-reader-interp.cc
  3585 interp.cc
   642 interp-disassemble.cc
   740 interp-trace.cc
    43 binary-reader-interp.h
   735 interp.h
    94 interp-internal.h
   7702 total


Interesting; apparently WebAssembly is a relatively small spec to implement. Someone at our local Rust user group has also shown off a WASM runtime that runs on microcontrollers.


Arduino-compatible?


How does this compare to Mozilla's wasi? [1]

[1] - https://wasi.dev/


WASI is a system interface for WebAssembly modules; it is not an implementation. Therefore, these are quite different things.

However, they could have decided to use WASI as the interface for their wasm runtime; it looks like they didn't.


A bit off topic, but I was wondering: can the WASI API be used directly from C/C++ without compiling to wasm? Basically, to get native performance but still benefit from the cross-platform APIs that WASI will expose.


That depends entirely on the implementation (and leaves open the question of ABI and binary file format).


Doesn't WASI define a sort of ABI (the data types, the functions, arguments, return types, etc.)?


Yes, but WASI doesn't define a calling convention or a binary format, because those are defined by WASM itself. Your proposed native WASI would need to define them.


Ah okay, I see. Thanks for your answer.


It's likely this project was started before WASI was announced.


Gosh, somebody should be proofreading that README. It seems a little unprofessional of Intel when I cannot understand what's meant in certain sentences...


It was quite readable to me, although some sentences stood out. The way I see it: a developer whose first language is not English (or even any of the Western languages that share similarities with it) wrote a fairly good README, which could be improved further by native speakers.

If you ask me to write a similar README in Japanese/Chinese/Thai, I wouldn't even know where to start.


I envy people who suck at grammar but are still capable of expressing their ideas because they know their vocabulary. They somehow managed to succeed at the hard parts before mastering the easy parts of a language.


Arguably, vocabulary is the easy part.


No, vocabulary is the hard part. It's where you spend most of your learning time.


If you define vocabulary to include not just the words, but the appropriate usage-in-context, then sure. Arguably that's not a short formal list of rules you memorize, but an intuitive understanding of thousands of specific cases that can only be acquired through massive exposure to the language.

I'm not sure what the word for that is, it's neither vocabulary nor grammar, and yet more important than either of them -- actually even more important than both vocabulary and grammar put together (lists of words and lists of rules).


Vocabulary is hard, but in asynchronous communication (like writing READMEs), dictionaries allow you to reach far beyond your own vocabulary. Even choosing the right word for a specific context and audience is well covered by good dictionaries.

For grammar there are barely any useful tools. Even tools like Grammarly are fairly primitive.


Why don't you volunteer?


I was wondering if it would be suitable for microcontrollers (with, let's say, a few hundred kB of memory, not MB). I can't immediately find the answer, but there appears to be a port to Zephyr OS in the repo, so that is promising.

There's also the beginnings of a sensor manager and API so clearly small embedded use cases are intended.

It's still a very small project (looks like one contributor so far), but that's no bad thing...


TinyGo (Go for embedded devices - AVR, Cortex, Bluepill, STM32F4, etc) compiles to Wasm as well:

https://github.com/tinygo-org/tinygo/

No garbage collector yet, but it's on the ToDo list and probably not too far off. :)


Do you really want garbage collection when your device has 2 KB of RAM?


It depends. There's no one-size-fits-everything approach for this stuff. :)


No, you can't run a GC at all. The minimum size of the WASM heap is 64 KiB per the spec; otherwise a module has to run in stack-only mode. So running a GC on a microcontroller with less memory than that doesn't make much sense.


We already have Lua [1] and MicroPython [2], as well as a myriad of Forths, Schemes, etc., running on MCUs. So arguably Wasm would be a better fit for an MCU than these two, as it doesn't need to be parsed and the interpreter doesn't need to handle dynamic types.

[1] http://www.eluaproject.net/

[2] https://micropython.org/


What would be the advantage of running WASM on a microcontroller instead of compiling the code directly for the microcontroller's architecture?


I'm interested in WASM for microcontrollers to have the application guaranteed free of side effects (due to the sandboxing). This allows the logic to be tested thoroughly in isolation, with reasonably high certainty that it does not do anything funky, like write to some hardware register in some edge case.

Such code would not be able to do I/O directly, but must receive it from outside the sandbox. It could be a pure function implementing a state machine of the form `state, outputs = next(state, inputs)`, where state, inputs, and outputs are plain data. This structure is amenable to generating test cases, be it via property testing or fuzzing, or to regression tests created by capturing a stream of state transitions in serialized form.

There are languages which make this possible, like Haskell, or maybe even Rust or Ada. However, in the embedded world, C is what everyone knows and uses, so there would be a benefit to staying within that ecosystem.


Cool, but aren't side-effects just what you want when using a microcontroller to, you know, control things?

I work in embedded development and the amount of pure logic is really small in our software but I guess that varies with the niche you're in.


All software is a model of side effects, but if you delegate the execution of your side effects to a third party (usually the language's runtime or an application framework), all of your code can be side-effect free. If neither of those is available, you can at least isolate your side-effect-performing code to a small footprint.


Yes, you are very right. The function of most microcontroller/embedded systems is almost all input/output (sensing/actuating). The challenge is that combined software+hardware systems can be hard to test, especially together.

The traditional way to simplify QA is to use a Hardware Abstraction Layer in the code, through which hardware side effects happen. So you have a classic layer cake like:

    | App Logic (imperative, stateful)         |
    | Hardware Abstraction Layer (interface)   |
    | Input/Output (implementations)           |
Then during testing of the software application logic, a mocked implementation of the HAL can be used. When done with due care, this works OK. In this basic model it is OK to read input and cause outputs "anywhere" in the application logic. That is easy and convenient, but (I argue) it causes pure logic to be rare. Which is unfortunate, since pure logic would be much easier to test. State also tends to be spread across the code base, which can hide stateful behavior that is critical to test.

In the proposed model, the HAL functionality is split into two distinct parts: input and output drivers. And the application logic does not call the HAL, but gets called with Input and produces a description of Output. So the layer cake kinda tips sideways:

    HW input driver > Input (data) >  App Logic (pure) > Output (data) > HW output driver
                                   |                   |
                                   <      State        <
During test of application logic we have:

    input generator -> App Logic -> output capture
and can then perform validations on whole sequences of Input/Output pairs. And debugging can trivially access whole sequences of State changes.

Those familiar with dataflow programming might find this very familiar. Frontend people might see parallels to the unidirectional flow of data in reactive UI frameworks like React. Simulation-minded people might see that this structure is very amenable to Discrete-Event Simulation.

I have used this model to good effect across many (relatively simple) embedded/IoT systems over the last few years. One thing I really like is that it makes temporal logic very easy, because in this model the current time (be it ticks or wall time) is just a type of input. So it is easy to stimulate, visualize, and make assertions across whole timelines of behavior, and one can see many such timelines at the same time. Similar to what Bret Victor showed with Mario in Inventing on Principle.

Have been meaning to write more about this for a long time, so apologies for the mini blogpost :) Some very related writing in http://www.jonnor.com/2017/03/host-based-simulation-for-embe...

PS: the comment by pault is to the point, and right on the money.


It's great to see these software ideas being used in embedded systems. From my limited experience with embedded code, most engineers in the domain treat their software more like hardware. They test it WITH specific hardware and don't abstract it, abstraction being more of a software thing.

I would like to read a blog post about what you're doing. It would also be great to connect it to the other software communities doing similar things with different names.

I think I first encountered these ideas in the Java world around 2005, with its strong focus on testing. It's basically "dependency injection" or dependency inversion.

I just did some Googling and found this good overview:

https://gist.github.com/kbilsted/abdc017858cad68c3e7926b0364...

It's very much an idea from the "enterprise" software world. I've never been in that world, but it does seem like they are grappling with complex problems and systems, and this architecture has proven itself in that domain. It's not surprising to me that it's also useful in the embedded domain.

I would say it's just "modularity". If your dependencies are hard-coded, then you have no modularity in your software at all. The whole thing is one big ball of mud which you can either take or leave. "Functions" aren't really modular if they have nontrivial hard-coded dependencies (particularly ones that do I/O or depend on state).

Other names:

- Capability-based security / object capabilities (WASM is influenced by these ideas, which originated from EROS as far as I know. The E language was an influential object-capability language.) The idea of "ambient authority" is useful.

- https://news.ycombinator.com/item?id=14523728 -- a thread where Go programmers are rediscovering dependency injection. You can "invert" state or I/O. Those are independent decisions, but the same concept. In my larger programs, I tend to abstract both of them.

- In Haskell, you use State and IO monads. They are parameters, not things you go "looking for".

- Good overview of enterprise world: https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-a... (Clean architecture, ports and adapters, onion architecture, hexagonal architecture, functional core / imperative shell, etc.)

I think Java eventually went overboard with "DI frameworks", which I never used. I just wire up the dependencies manually in main() and it works fine.

-----

Also, I bought a somewhat obscure book many years ago about dataflow programming and hardware, which I think is somewhat related:

https://www.amazon.com/Programming-SIMPL-Way-John-Collins/dp...

At its most fundamental, SIMPL is a set of library functions which allow the passing of encapsulated messages between cooperating processes. These processes may be local to one host computer or spread over a network.


I assume this is meant for cases where there's no LLVM backend for the target architecture and you're forced to use the vendor's proprietary C compiler. Then you can use that to compile this runtime and run any program written in a WASM-targeting language on top of it.


I'm assuming safer loadable modules (there's no memory protection on most MCUs) and a choice of languages that have wasm backends. In some circumstances, being able to run the same code in a browser in a test environment as on the device could also be great (for education, for example).


I would imagine a scaled-down, cut-back version of WASM could be viable on MCUs as a form of interpreted program to be loaded from an external memory device; the WASM compiler/interpreter would likely already take up much of a small MCU's memory.


It will be well ahead of interpreters running on MCUs (the overhead of parsing alone is enough to disqualify them for production use), but still hopelessly far behind native code.

WASM was made with running on regular desktop CPUs in mind. Wasm is not truly a register machine, nor a stack machine, but a horrible hybrid in between them, thanks to design by committee.

Mutable locals, accessible undefined values, and a stack-pointer protection technique that kills both pipelining and speculative execution. Yes, that's the combined downsides of both stack and register VMs.

There were MCUs natively running Java a long time ago, but all failed on the market. The idea proved infeasible.


So the BASIC Stamp was a dream... I know it was popular for doing some automation on industrial machines.


> It will be much more ahead of interpreters running on MCUs (overhead of parsing alone is enough to disqualify them for production use,) but still hopelessly far behind native code.

To be clear, this is also interpreted, so it doesn't have an advantage there.


> There were MCU that were naively running Java long time ago, but all failed on the market. The idea proved non feasible

Not sure they failed for technical reasons; embedded CPUs just got fast enough, and typical memory big enough, to run a JIT.


To begin with, why run a JIT if you can run native code optimised to the limit?

There's really no value there. The positive connotation of the "web native" label is not so positive in the world of the dour and severe people that "hardcore embedded developers" are.


The main limiting factor for embedded is cost. If you start from the position that you want to run something written in Java, there will be a selection of ways to do that. Most Java applications will require RAM external to the CPU, and the lowest-cost DRAM chips will be whatever is used at the time in high-volume consumer electronics. By the time ARM chips with Jazelle were introduced, the DRAM capacity at that optimal price had increased a lot from when those chips were designed, to the point that the increased memory requirements of a JIT were no longer a problem.


Yes, definitely aimed at embedded devices.


I'm always interested in running WASM on microcontrollers, but AFAIK there is a severe limitation in the WASM spec: the minimum heap size (one WASM memory page) is 64 KiB, which is too big for many microcontrollers. Otherwise a WASM module must run in stack-only mode.

Unless this problem is resolved, I don't think WASM can be adopted widely in the world of microcontrollers.


I'm working on a Wasm interpreter. Yes, Wasm ideally needs a smaller page size for embedded applications. But the 64 KiB page size is more of an interoperability specification: if you own the environment and you're already doing bounds checking on memory accesses, you can treat the memory size as virtual and dynamically allocate only what is used.


> But, the 64kB page size is more of an interoperability specification.

Could you tell me a bit more about that? Is there something significant about the particular number, 64 kB?


Why is it a problem that the wasm heap can't be smaller than 64 KB? At that level you could manage all your memory explicitly, by creating fixed-size arrays.


We could, and it is always better to have fixed-size arrays, but having no choice is a totally different situation from doing something preferable. It will make your life much harder in some operations, such as string manipulation.

Having a minimum of 64 KB of WASM memory is still a huge cost for mainstream microcontrollers. Most dynamic memory use in embedded systems requires at most ~10 KB (and often less than 1 KB) of memory. We are just throwing away the rest of the precious memory.


On a microcontroller, one probably controls both the runtime and the application. Is there something that prevents configuring the runtime to have only 1 kB of memory (stack and heap)?


The README says "small footprint" but does not seem to say what that means numerically.


Big companies like Intel should be able to afford to pay technical writers whose native language is English. Consider:

> WASM is already the LLVM official backend target

This implies that the only thing LLVM compiles to is WASM.

Better would be something like "WASM is already an official LLVM backend target" or "LLVM already compiles to WASM".


> Big companies like Intel should be able to afford to pay technical writers whose native language is English.

Do they actually need native English speakers? I'd assume they'd just need people proficient in English :)


They don't seem to require a CLA right now, so, how about you send a pull request?


The "submit a pull request" method of responding to problems, especially for a project supported by billion dollar companies, needs to die in a fire.

I really prefer to have documentation written by people who understand how things actually work on the inside.


... but whose native language often isn't English.


I suspect the response will disappear as soon as people cease complaining about minor things in code/projects/items they got for free.


I doubt this was written by a technical writer. Most likely, it's maintained by the developers who work on the project.


Hold on... so if I implement a wasm runtime somewhere, I can compile random things for it. brb, writing wasm runtime for avr and z80


Note:

“Features

WASM interpreter (AOT is planned)”

You’ll compile a program to have it interpreted at the end.

Does anybody know some specific case where this direction is advantageous?


This is how a lot of interpreters work - they compile source code to some intermediary bytecode (usually an imaginary stack or register machine) and then interpret that.

It's usually faster than executing directly over some sort of AST, and also fast when you re-execute the same script over again (that's why Python creates .pyc files for modules).


In this instance I think that by AOT they mean that at some point the runtime will be able to compile WASM into machine code ahead of time and execute that instead of interpreting it during run-time.

WASM already is the intermediate format and VM that you describe.


> Does anybody know some specific case where this direction is advantageous?

Very slow memory. Decades ago, a lot of interpreters did the same trick internally:

1. tokenise the code, 2. convert it to some compact binary serialisation format used internally by the rest of the interpreter.

The trick was to make the interpreter code remain in the CPU cache as long as possible, while the code being interpreted did not require fetching from RAM.


Interpreters save space. That's why the Apple II used interpretation extensively.


Seems quite interesting; I hope to see more info about it running on an actual device.


Every time someone submits wasm related content, I feel obliged to link this classic talk:

https://www.destroyallsoftware.com/talks/the-birth-and-death...

Bernhardt, Gary – The Birth & Death of JavaScript (PyCon 2014)


Honest question, why is this talk so relevant? I agree it had a good foresight, but I watched it and am a bit surprised by how much it is mentioned.


My guess as to why it’s mentioned so much is that the general expectation is that many people haven’t seen it yet. When people do watch it, they’re surprised by how much good foresight there was, and so they repeat the cycle.


The presentation style is quite funny. It was even funnier at the time, when such a thing was considered almost inconceivable. Now, it seems prescient.


This talk is (thankfully) obsolete now. We don't need to write (or even to compile to) JS to have multi platform, we just need to write/generate wasm :-)



