Over the past few days, I’ve been working on an SPH fluid simulator with rust, compiled to wasm. The goal here is more “real-time & pretty-looking” than “offline and accurate”. sandspiel was a great motivator to start working on this.
Update: You can check out the demo here, or the source code here
I like the way cargo and rust handle having different targets. rustup target list lets you see a list of targets that rust can compile to, and cargo build --target lets you build your rust codebase into a specific target. wasm-pack pretty much handles all the details however, making it a breeze to get started. I used the Rust and WebAssembly tutorial to get started, and this covered a lot of the bases.
While I had used rust a bit in 2018, I never worked on a substantial project with it. My memory of the language is quite faint, and I had to keep referring to The Rust Programming Language book. I still haven’t dealt with smart pointers and the like. My point: I am new to rust. It’s also worth noting that the rust+wasm ecosystem is quite new, and should get much better over time.
I’ll write about the fluid simulation in a later post. This post documents my experiences with the language, rust. (I might revisit this a few weeks later and find some things silly. Such is learning.)
Here are my observations, in no particular order:
The compiler gives quite nice error messages:
error[E0425]: cannot find value `width` in this scope
  --> src/universe.rs:38:31
   |
38 |         let accel = Grid::new(width, self.height, H, &self.particles);
   |                               ^^^^^ help: try: `self.width`
That was a clean error message with a very specific possible fix. Here’s another:
error[E0282]: type annotations needed
   --> src/universe.rs:143:13
    |
143 |         let ddv = 2.0 * izip!(&neighbours, &x_ijs, &dWs).map(|(pj, x_ij, dW)| {
    |             ^^^
    |             |
    |             cannot infer type
    |             consider giving `ddv` a type
There were many better examples, but I cannot recall how to reproduce them.
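For what it’s worth, here is a minimal sketch of how an E0282 like the one above can usually be resolved. The original expression is cut off in the error output, so this assumes, purely for illustration, that the iterator chain ends in a sum; `scaled_sum_of_squares` and its input are hypothetical and not taken from the simulator.

```rust
// Hypothetical stand-in for the truncated expression in the error above.
fn scaled_sum_of_squares(xs: &[f32]) -> f32 {
    // `sum()` is generic over its output type, so the compiler needs a hint.
    // The turbofish `::<f32>` supplies the type the error message asks for.
    let ddv = 2.0 * xs.iter().map(|x| x * x).sum::<f32>();
    ddv
}

fn main() {
    println!("{}", scaled_sum_of_squares(&[1.0, 2.0, 3.0]));
}
```

Annotating the type at the point where it is ambiguous (here, on `sum`) tends to be the most reliable fix.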
Targeting wasm has its ups and downs. I understand that it is still in early development, and most of the warts will get sorted out over time. For one, I found it quite odd to have to use the js_sys crate to sample random numbers (according to the rust-wasm book). This made it extremely difficult to debug my non-wasm target. Eventually I found out that the rand package works on both wasm and non-wasm, and have been using that since. Another dependency didn’t have wasm support, but I could get around this with feature gating in Cargo.toml:
[target.'cfg(not(target_arch="wasm32"))'.dependencies]
kiss3d = "0.20.1"
nalgebra = "0.18.0"
Note: While kiss3d claims to support wasm, I want more control over the WebGL rendering. I’m using kiss3d just to debug my code without going through the browser every time.
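On the Rust side, any code that touches a target-specific dependency needs a matching `cfg` attribute, otherwise the wasm build fails on the missing crate. A minimal sketch of that pattern follows; `build_flavour` is a hypothetical function used only to show the gating, and the real code would put the kiss3d-based visualizer behind the non-wasm branch.

```rust
// Compiled only when targeting wasm32.
#[cfg(target_arch = "wasm32")]
fn build_flavour() -> &'static str {
    "wasm"
}

// Compiled for every other target; this is where code depending on
// native-only crates (such as a debug visualizer) would live.
#[cfg(not(target_arch = "wasm32"))]
fn build_flavour() -> &'static str {
    "native"
}

fn main() {
    println!("running the {} build", build_flavour());
}
```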
A super-simple way to step through my code would be nice. It seems possible, but a more out-of-the-box solution like QtCreator or pycharm provide would be a great addition to the rust ecosystem. IntelliJ Rust doesn’t support debugging (yet?). Debugging the wasm code seems more difficult, but thankfully I have been able to test everything on a Linux binary directly before generating wasm code. println! debugging can only get you so far. Step-by-step debugging is a great way to understand the control flow of an existing code base.
The rust plugin for Visual Studio Code handles auto-complete well. The only problem is the Rust Language Server seems to compile everything again, bloating the required hard disk space.
The people on the discord channel and r/rust are very nice and helpful. This creates a conducive environment for learning and making progress.
A lot about getting traits to work right feels like black magic in rust. For instance, it’s not possible to simply add a “trait object” to a struct, like:
trait Accelerator {
    // ...
}
struct Universe {
    accel: Accelerator, // This does not work
}
The reasoning behind this is explained in the advanced types section of the book, but simple OOP patterns shouldn’t require this much friction to implement. The error messages talk about dyn and the Sized marker; and while I shall eventually dig into this, it seems excessive at this point. I ended up using a Box type to store the “trait object”, which works but requires a dereference every time I want to call a function of the trait.
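A minimal sketch of the Box approach, for reference. The trait name and method signature are borrowed from snippets elsewhere in this post; `BruteForce` and `demo_universe` are hypothetical stand-ins, not the simulator’s actual Grid or Quadtree.

```rust
trait Accelerator {
    fn nearest_neighbours(&self, i: usize, r: f32) -> Vec<usize>;
}

// Hypothetical O(n) implementation, used only to have something to box.
struct BruteForce {
    positions: Vec<(f32, f32)>,
}

impl Accelerator for BruteForce {
    fn nearest_neighbours(&self, i: usize, r: f32) -> Vec<usize> {
        let (xi, yi) = self.positions[i];
        self.positions
            .iter()
            .enumerate()
            // Keep every other particle within distance r of particle i.
            .filter(|&(j, &(xj, yj))| {
                j != i && (xj - xi).powi(2) + (yj - yi).powi(2) <= r * r
            })
            .map(|(j, _)| j)
            .collect()
    }
}

struct Universe {
    // The trait object lives behind a pointer, so the struct has a known size.
    accel: Box<dyn Accelerator>,
}

fn demo_universe() -> Universe {
    Universe {
        accel: Box::new(BruteForce {
            positions: vec![(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)],
        }),
    }
}

fn main() {
    let universe = demo_universe();
    println!("{:?}", universe.accel.nearest_neighbours(0, 1.0));
}
```

Method calls like `universe.accel.nearest_neighbours(..)` go through auto-deref, so no explicit `*` is needed there. An explicit dereference such as `&*universe.accel` only shows up when passing the object to something that expects a `&dyn Accelerator`.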
Also, wasm_bindgen does not yet support lifetimes and traits for struct definitions.
This is one of those things I’ll curse when I have to deal with it, but it’s for the best. Warnings about unused functions help catch bugs and prune old code.
But it is much faster once cached.
This is one area in which rust truly shines! By following this and this, it was quite straightforward to generate a flamegraph [1]:
This helped me pin down some bottlenecks, such as the Vec spending too much time reallocating memory, which I avoided with a Vec::with_capacity.
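A small sketch of the idea, assuming the number of elements is known up front; `fill_preallocated` is a hypothetical helper, not code from the simulator. It records the capacity before filling so you can see that it does not change.

```rust
// Fills a vector with n values after reserving space for all of them.
// Returns the vector together with the capacity observed before any push.
fn fill_preallocated(n: usize) -> (Vec<f32>, usize) {
    // One allocation up front instead of repeated grow-and-copy cycles.
    let mut values = Vec::with_capacity(n);
    let capacity_before = values.capacity();
    for i in 0..n {
        values.push(i as f32);
    }
    (values, capacity_before)
}

fn main() {
    let (values, capacity_before) = fill_preallocated(1000);
    // The capacity is unchanged, so no reallocation happened while pushing.
    println!("{} {} {}", values.len(), capacity_before, values.capacity());
}
```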
Flamegraphs aren’t very useful without solid benchmarks however, and that’s where criterion fits in! This is a fantastic library that lets you perform statistical analysis of performance improvements/regressions. criterion gives you confirmation that your change is in fact increasing (or decreasing) the performance of your program. Here is a sample output from criterion:
solver_step 0.001 time: [3.2821 ms 3.3249 ms 3.3661 ms]
change: [-52.895% -51.747% -50.570%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
Now I KNOW that my change had a positive influence on my runtime! criterion also leverages gnuplot to generate reports:
It’s super important to measure performance and identify bottlenecks in performance-critical code, and tools like criterion and flamegraphs help.
While multi-threading for wasm isn’t ready yet, I still wanted to play around with rayon for regular binary targets. This is what the docs had to say:
First, you will need to add rayon to your Cargo.toml and put extern crate rayon in your main file (lib.rs, main.rs).
Next, to use parallel iterators or the other high-level methods, you need to import several traits. Those traits are bundled into the module rayon::prelude. It is recommended that you import all of these traits at once by adding use rayon::prelude::* at the top of each module that uses Rayon methods.
These traits give you access to the par_iter method which provides parallel implementations of many iterative functions such as map, for_each, filter, fold, and more.
I followed those steps, and replaced iters in my hot loops with par_iters. In addition, I had to add the Sync marker to one of my struct’s members. And then it just worked! Easy 2x performance improvement. It works well here because there are no mutable data dependencies between the threads: I didn’t spend any time fighting the type system. YMMV.
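To illustrate why the lack of mutable data dependencies matters, here is a rough sketch using only the standard library’s `std::thread::scope`, not rayon itself (rayon’s `par_iter` does the chunking and scheduling for you). The neighbour-counting functions and the 1D positions are hypothetical; the point is that every thread reads the shared input and writes only to its own disjoint slice of the output.

```rust
use std::thread;

// Serial reference: for each particle, count particles closer than h
// (the particle itself is included in its own count).
fn neighbour_counts_serial(xs: &[f32], h: f32) -> Vec<usize> {
    xs.iter()
        .map(|&xi| xs.iter().filter(|&&xj| (xi - xj).abs() < h).count())
        .collect()
}

// Parallel version: shared read-only input, disjoint mutable output chunks.
fn neighbour_counts_parallel(xs: &[f32], h: f32, n_threads: usize) -> Vec<usize> {
    let n_threads = n_threads.max(1);
    let mut out = vec![0usize; xs.len()];
    // Ceiling division so that every element lands in some chunk.
    let chunk = ((xs.len() + n_threads - 1) / n_threads).max(1);
    thread::scope(|s| {
        for (c, out_chunk) in out.chunks_mut(chunk).enumerate() {
            let start = c * chunk;
            s.spawn(move || {
                for (k, slot) in out_chunk.iter_mut().enumerate() {
                    let xi = xs[start + k];
                    *slot = xs.iter().filter(|&&xj| (xi - xj).abs() < h).count();
                }
            });
        }
    });
    out
}

fn main() {
    let xs = [0.0, 0.1, 0.2, 5.0];
    println!("{:?}", neighbour_counts_parallel(&xs, 0.15, 2));
}
```

Because no thread ever needs to mutate something another thread can see, the borrow checker accepts this without locks, which matches the experience described above.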
emu is also on my radar, if I ever want to run it on the GPU.
There are way too many rust concepts that need to be learnt before you can build a usable program. While the borrow checker, lifetimes, rust’s smart pointers, etc. may be simple to someone experienced, they induce a lot of friction for a beginner working on a first program.
Look at the amount of redundancy here:
impl<'a, T> Accelerator for Quadtree<'a, T> where T: HasPosition {
    fn nearest_neighbours(&self, i: usize, r: f32) -> Vec<usize> {
        // ...
    }
}
impl<'a, T> Quadtree<'a, T> where T: HasPosition {
    pub fn new(width: f32, height: f32, items: &'a [T]) -> Self {
        // ...
    }
}
Additionally, when lifetimes and traits accumulate, the syntax can get quite reminiscent of C++.
I used the kiss3d library to create a debug visualizer. It was quite simple to get cargo to generate only the wasm library, or only a Linux binary. This workflow makes it easy to debug work in progress. This might change as I diverge the features between the two builds, obviously.
clamp for floats is still a pending RFC. It’s nice that the community discusses these changes extensively, but some working version of clamp with its behaviour documented would’ve been nicer.
Using rust is mostly pleasant. I’ll have more to write about once the renderer is built. Additionally, it seems prudent that I understand more about the borrow checker and the means offered by Box, Rc, and family to work with it.
[1] Use perf record --call-graph dwarf and pipe the output of stackcollapse-perf to rust-unmangle to get better flamegraphs.