Rust, Looking Forward in 2018
This past year I was pretty well a full-time Rust developer working on a handful of key projects:
Each of these saw a non-trivial amount of work poured into them by myself and others. I wrote some – but not enough – about the work that went on:
- Announcing Cernan, in which Cernan is open-sourced
- How can I optimize this data structure?, in which `quantiles::ckms::CKMS` gets an 80% performance bump with help from the Rust community
- Encheapening Cernan Internal Metrics, in which atomics find use in cernan to reduce self-telemetry contention
- Hopper rework, in which I slowly whittle down the poor contention performance of postmates/hopper
- American Fuzzy Lop'ing Rust, in which I describe running AFL against quantiles' CKMS implementation
- John Koenig has been landing excellent patches to cernan, improving its at-least-once delivery story and integrating async IO:
- Introducing Graceful Shutdown
- At Least Once Delivery on Graceful Shutdown
- Introduces Kinesis Sink + Improves Sink Interface.
It's been a busy time for the Rust community in general. The compilation time required to generate objects feels much improved since the start of the year, and the introduction of `cargo check` for sure improves my life as a developer. Rust's compile-time messages also got quite a lot more helpful over the year, which I've noticed as 2017 turned out to be the first year I was coaching folks new to working in Rust. It's a small thing, maybe, but my favorite fiddly improvement to Rust came in 1.18.0 in the form of automatic struct field re-ordering: just one less thing to have to keep in mind. There are also notably more 1.0 versions of important crates in the ecosystem, a welcome state of affairs for cernan's growing dependency list.
What do I aim to get done or hope to see done in Rust in 2018? In the cernan project we have three main objectives:
- Improve hopper's performance,
- fork rust-lua53 and
- get cernan back onto crates.
Hopper is cernan's disk-backed MPSC implementation. While it works pretty well, it is not as fast as I would like it to be in high-contention environments which, uh, is the environment cernan puts it in. Mostly this is just down to there being one Big Dumb Lock for serializing reads and writes. So, when I ask myself "What could I do to improve cernan?" hopper is where my mind heads immediately.
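To make the Big Dumb Lock problem concrete, here is a minimal sketch of a queue where every sender and the receiver go through the same mutex. This is not hopper's code: `LockedQueue` and `run_producers` are hypothetical names made up for illustration, and there is no disk backing here at all. It only shows the shape of the contention.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a queue guarded by one lock. Not hopper's code.
struct LockedQueue<T> {
    inner: Arc<Mutex<VecDeque<T>>>,
}

// Manual Clone so that cloning a handle never requires `T: Clone`.
impl<T> Clone for LockedQueue<T> {
    fn clone(&self) -> Self {
        LockedQueue {
            inner: Arc::clone(&self.inner),
        }
    }
}

impl<T> LockedQueue<T> {
    fn new() -> Self {
        LockedQueue {
            inner: Arc::new(Mutex::new(VecDeque::new())),
        }
    }

    /// Every writer takes the single lock...
    fn send(&self, item: T) {
        self.inner.lock().unwrap().push_back(item);
    }

    /// ...and so does the reader, so all parties serialize on it.
    fn recv(&self) -> Option<T> {
        self.inner.lock().unwrap().pop_front()
    }
}

/// Spawns `producers` threads that each send `per_producer` items,
/// then drains the queue and returns how many items came out.
fn run_producers(producers: u64, per_producer: u64) -> usize {
    let queue = LockedQueue::new();
    let handles: Vec<_> = (0..producers)
        .map(|id| {
            let q = queue.clone();
            thread::spawn(move || {
                for n in 0..per_producer {
                    q.send(id * per_producer + n);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let mut received = 0;
    while queue.recv().is_some() {
        received += 1;
    }
    received
}
```

Nothing is lost, but the more producers there are, the more time they spend waiting on each other rather than doing work.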
Second and third things: way back in April 2017 I had reports that MUSL builds of cernan were failing because of the lua dependency's need for GNU Readline. I submitted a patch to upstream but never heard back. Which makes sense. Maintaining open-source software can be a real drain. Anyway, we forked our lua dependency with the required fix and never pushed this fork up to crates. That means, since that time, cernan has been off the crates ecosystem. I'd like to make a fair few changes in the way cernan uses lua internally and I think it makes sense to fork and support a lua now. Or, at least, investigate the alternatives that exist in the ecosystem. But! Cernan will be back on crates in 2018.
There are also improvements that I would like to make to cernan but can't yet, pending stabilization work in the Rust compiler. For starters, cernan is a big project and many people who use it don't enable all of its features. Still yet, we compile all those features in and that's a drag for compilation time and binary bloat. I think I need only a few things from rustc to use feature gates effectively, captured in this issue. Here they are anyway, to save you a click:
In quantiles there are a number of compaction tricks I could pull off if I had the ability to specialize an implementation over types. For instance, quantiles' CKMS is generic with regard to some numeric type `T` and we use contiguous storage of `T`s to answer quantile queries. If `T` is small enough there's no reason why we can't default to storing multiple `T`s into a single machine word or toss out saturating numeric operations in cases where we can reason that for some `T` overflows in the compaction step would not happen. That'd be Tracking issue for specialization (RFC 1210) completed, I think.
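As a rough idea of the packing half of that, here is a hand-written sketch for the single case of `T = u16` on a 64-bit word. It is not code from quantiles: `pack` and `unpack` are hypothetical helpers, and the lane layout (lane 0 in the low bits) is my assumption. Specialization is what would let a generic implementation pick something like this automatically for small `T`.

```rust
/// Packs four `u16` values into one `u64`, lane 0 in the lowest 16 bits.
/// Hypothetical helper for illustration only.
fn pack(values: [u16; 4]) -> u64 {
    values
        .iter()
        .enumerate()
        .fold(0u64, |word, (lane, &v)| word | (u64::from(v) << (16 * lane)))
}

/// Recovers the four `u16` lanes from a packed `u64`.
fn unpack(word: u64) -> [u16; 4] {
    let mut out = [0u16; 4];
    for (lane, slot) in out.iter_mut().enumerate() {
        // The `as u16` cast keeps only the low 16 bits after the shift.
        *slot = (word >> (16 * lane)) as u16;
    }
    out
}
```

Four samples then take the space of one `u64`, at the cost of a shift and a mask on every access.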
With regard to teaching folks Rust, I sure do wish there were a book-standard way of hooking gdb/lldb into test runs. Honestly, this is probably just something I need to do a little research on but debuggers and tests keep coming up and I have no answer.
Further with regard to teaching folks Rust, somehow every person I've helped this year has got stuck on two things:
- ownership in HashMaps and
- writing iterators
I think Tracking issue for RFC 2033: Experimentally add coroutines to Rust will go a long way toward resolving the issue folks have with writing out iterators long-hand. It's not hard to do, just fiddly, and until you see some examples it's easy to goof them. I did my own self until a kind soul came along and redid the hopper iterators right. With regard to HashMaps, I guess my thoughts there are a little more nebulous. Hash maps are a really common, basic building block structure in dynamic languages. Seems to me that part of the issue people have learning Rust is they structure code in terms of HashMaps and run into ownership issues pretty quick, faster than they might normally. If-not-exists-then-insert head-scratchers are resolved by using `HashMap::entry` but somehow `entry` has yet to be obviously named enough for new-to-Rust programmers. Ends up being a pretty good example for ownership discussions, that said.
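For folks who haven't seen either one written out, here is a small sketch of both stumbling blocks. `Countdown` and `count_words` are hypothetical names made up for this post, not anything from cernan or hopper. The first is an iterator written long-hand, the second is if-not-exists-then-insert done through `HashMap::entry`.

```rust
use std::collections::HashMap;

/// A long-hand iterator that yields `remaining`, `remaining - 1`, ..., 1.
/// Hypothetical type for illustration only.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            // Returning None tells the caller the iterator is exhausted.
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

/// Counts occurrences of each word. `entry` hands back a slot for the key,
/// `or_insert(0)` fills it only if it was empty, and the `+= 1` then
/// updates the value in place through the returned mutable reference.
fn count_words<'a>(words: &[&'a str]) -> HashMap<&'a str, u32> {
    let mut counts = HashMap::new();
    for &word in words {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}
```

The `entry` version does a single lookup and never holds a borrow from `get` across an `insert`, which is usually where the borrow checker complaints start.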
Oh! I wish Rust/AFL worked on OS X. My work-laptop is OS X and it's been a real pain keeping a Linux laptop handy for AFL runs. Seems like Invalid section names in Mach-O file on OS X w/ custom target spec is the only outstanding issue. Cargo fuzz is pretty spiffy, though.
Lastly, maybe just maybe we'll release cernan 1.0 this year.