I’ve been learning Rust lately. I have also got back to playing futsal with some friends. Weirdly enough, these two worlds intersected a few weeks ago. This is a summary of my experience learning Rust: starting with reading about it here and there, eventually getting the book, doing a Hackathon project at Onfido using it, and then spinning up my own personal project as a way to experiment further with the language.
At this point, Rust is a pretty well-known programming language. One of the first blog posts that really drew my attention to the language appeared, as you would expect given that Rust started at Mozilla, on the Mozilla Hacks blog. In that post, Lin Clark introduced Project Quantum. Rust was said to be “a language that was free of […] data races”, and Servo (the browser engine developed in Rust at Mozilla) “made full use of […] fine-grained parallelism”.
Ever since that “first” blog post, more and more articles have been popping up on popular aggregators like Hacker News and Lobsters. And every time, I would get that FOMO feeling. The tipping point came when I learned about more and more developers in the Elixir community who were looking at Rust as a way to squeeze the most performance out of their resources. This was true for Bleacher Report, Discord and, more recently, SimpleBet.
To start out, I got the official book, “The Rust Programming Language”, and started reading it at the end of my summer vacation. I stopped for a while, and started thinking about ways I could actually practice what the book was preaching. My initial idea was to implement the Skip List data structure in Rust. It’s one of the most fun data structures around, so I was excited about reading the paper that originally introduced it and about the possibility of implementing it in Rust, obviously!
So I set out to do so. I installed rustup, started playing with cargo to generate my “Hello, world!” project and moved on to a bit of online tutorial reading about idiomatically writing singly linked lists in Rust. My co-worker Tom Forbes, who had also been reading about Rust, warned me it might not be an easy task for a first project, given this sort of data structure implementation usually comes with lots of reference handling and non-trivial ownership transfers as a consequence of Rust’s ownership rules and borrow checking.
Around the time this was happening, we had a Hackathon at Onfido. My colleague Daniel Caixinha suggested we try to rewrite one of our machine learning services in Rust, as a way to experiment with the language. He got me and Tom excited, and soon we were a team of 8 trying to actually rewrite two microservices instead of just one. Having that many people on board was great. We got to feel the different pain points of people more used to one or more of VM-backed, garbage-collected and dynamically typed languages. I definitely felt them too. In the end, we completed the rewrite of one of the projects, which allowed us to experiment with Rocket, Serde and the Rust bindings for TensorFlow. It was a great feeling to have a working version of something after only 2 days of hacking, when the majority of people had not yet programmed a single line of Rust.
After the Hackathon, I got back to the singly linked lists. I was having to delve into Box, lifetimes and lots of… I had been warned… ownership transfers. While it made sense, given that the language’s constraints optimise for safety (while trying not to compromise on speed), it didn’t feel right.
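To give a flavour of what that looks like, here is a minimal sketch of a singly linked stack in the style of the usual tutorials. It is not the code I actually wrote; the names Stack and Node are made up for illustration.

```rust
/// A minimal singly linked stack. `Stack` and `Node` are illustrative names.
pub struct Stack<T> {
    head: Option<Box<Node<T>>>,
}

struct Node<T> {
    value: T,
    // Each node owns the next one through a heap-allocated Box.
    next: Option<Box<Node<T>>>,
}

impl<T> Stack<T> {
    pub fn new() -> Self {
        Stack { head: None }
    }

    /// Pushing moves the old head into the new node: an ownership transfer.
    pub fn push(&mut self, value: T) {
        let new_node = Box::new(Node {
            value,
            // `take()` leaves `None` behind so the old head can be moved out.
            next: self.head.take(),
        });
        self.head = Some(new_node);
    }

    /// Popping moves the head out and hands its value to the caller.
    pub fn pop(&mut self) -> Option<T> {
        self.head.take().map(|node| {
            self.head = node.next;
            node.value
        })
    }

    /// The returned reference borrows from `self`; lifetime elision
    /// ties the two together without any explicit annotation.
    pub fn peek(&self) -> Option<&T> {
        self.head.as_ref().map(|node| &node.value)
    }
}
```

Even in something this small, almost every line is about who owns what, which is roughly where the "it didn’t feel right" came from.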
But I found hope.
One of my personal Rust heroes is Andrew Gallant. He’s the maintainer of a number of cool projects like rust-lang/regex. These are projects known for their performance. Since they seemed like good candidates for role-model projects, I went looking around the code. It looked so much cleaner than the stuff I was having to do to make a singly linked list work. 😭 Was I looking at idiomatic Rust code? Error handling with the Result type, structs with well-defined fields and mostly primitive or user-defined types, almost never needing to explicitly set lifetimes (thanks to the lifetime elision rules).
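As a rough illustration of that style (not taken from any of those projects), here is a small sketch that parses a booking time. The Time struct, the TimeError enum and the parse_time function are all hypothetical names.

```rust
use std::fmt;
use std::num::ParseIntError;

/// A struct with well-defined, primitive fields.
#[derive(Debug, PartialEq)]
pub struct Time {
    pub hour: u8,
    pub minute: u8,
}

/// Everything that can go wrong, spelled out as a type.
#[derive(Debug, PartialEq)]
pub enum TimeError {
    MissingColon,
    NotANumber(ParseIntError),
    OutOfRange,
}

impl fmt::Display for TimeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TimeError::MissingColon => write!(f, "expected HH:MM"),
            TimeError::NotANumber(e) => write!(f, "invalid number: {}", e),
            TimeError::OutOfRange => write!(f, "hour or minute out of range"),
        }
    }
}

impl std::error::Error for TimeError {}

// Lets the `?` operator convert integer parse failures automatically.
impl From<ParseIntError> for TimeError {
    fn from(e: ParseIntError) -> Self {
        TimeError::NotANumber(e)
    }
}

/// Parses a string such as "18:30". No explicit lifetimes are needed here.
pub fn parse_time(input: &str) -> Result<Time, TimeError> {
    let (h, m) = input.split_once(':').ok_or(TimeError::MissingColon)?;
    let hour: u8 = h.parse()?;
    let minute: u8 = m.parse()?;
    if hour > 23 || minute > 59 {
        return Err(TimeError::OutOfRange);
    }
    Ok(Time { hour, minute })
}
```

The caller decides what to do with each failure, and the happy path reads top to bottom thanks to the ? operator.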
While skip lists are fun (I highly recommend reading the paper about them, and particularly understanding the trade-offs when comparing them to other data structures), it didn’t feel like the right project to continue pursuing in light of the conflicting goals I had: learning the language, not getting sucked into premature optimisation details, and having fun!
One way I was having fun outside of programming was by playing five-a-side football with a group of friends. We usually play at Estádio Universitário de Lisboa, and the way I book the football or futsal pitch is by filling in an online form that sends an e-mail with my booking details. Every time, wherever I am, I need to fill in the same details over and over again. Other than the day, everything is exactly the same: my name, fiscal number, phone number, e-mail, booking hours, preferred pitch, etc. One way to automate this would be to build a template with the absolutely necessary information for a booking, and have a way to submit that info without having to type it all into the form by hand.
I settled on inspecting the POST request. I now had a list of query string parameters where the information for the booking lived. I had the endpoint, and some headers and fields I wasn’t sure about. But that was enough to get going. I put my Rust belt on. Created some structs. Sprinkled some serde magic on top of them for easy serialisation to various output formats (e.g., JSON, query string, YAML, etc.).
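To make the idea concrete without pulling in any crates, here is a hand-written, standard-library-only stand-in for the query string case. This is not how serde does it, and it is not my actual code: the Booking struct, its field names and both functions are hypothetical.

```rust
/// Hypothetical booking fields; the real form uses different names.
pub struct Booking {
    pub name: String,
    pub phone: String,
    pub date: String,
}

/// Encodes a value the way HTML forms do (application/x-www-form-urlencoded):
/// letters, digits and `*-._` are kept, spaces become `+`, and every other
/// byte is percent-encoded.
pub fn form_encode(input: &str) -> String {
    let mut out = String::new();
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'*' | b'-' | b'.' | b'_' => {
                out.push(byte as char)
            }
            b' ' => out.push('+'),
            other => out.push_str(&format!("%{:02X}", other)),
        }
    }
    out
}

impl Booking {
    /// Joins the fields into a `key=value&key=value` body.
    pub fn to_form_body(&self) -> String {
        let pairs = [
            ("name", self.name.as_str()),
            ("phone", self.phone.as_str()),
            ("date", self.date.as_str()),
        ];
        pairs
            .iter()
            .map(|(key, value)| format!("{}={}", key, form_encode(value)))
            .collect::<Vec<_>>()
            .join("&")
    }
}
```

With serde, the per-field plumbing in to_form_body disappears behind a derive, and the same struct can be written out in other formats too, which is exactly why I reached for it.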
At this point, I was ready to hard-code some values to fill in the structs, but I didn’t have a way to perform an actual request yet. And so, ureq came to the rescue. When I started using it, async-await hadn’t landed in stable Rust yet, and the ureq library provided an intuitive sync approach to HTTP request/response handling. It was also more than enough for what I was trying to achieve. Or so I thought…
std::net::ToSocketAddrs used by ureq might return IPv4 or IPv6 addresses, and because the website we try to contact is not fully IPv6 ready, I needed to hack a way to set a preferred IP version in order to prioritise IPv4 on demand. You can find that here.
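The gist of that hack can be sketched with the standard library alone. This is a simplified illustration rather than the actual patch: IpPreference, order_by_preference and resolve_preferring are hypothetical names, and wiring the result into ureq is left out.

```rust
use std::net::{SocketAddr, ToSocketAddrs};

/// Which IP version should be tried first.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum IpPreference {
    V4,
    V6,
}

/// Moves addresses of the preferred version to the front.
/// The sort is stable, so the resolver's order is kept within each group.
pub fn order_by_preference(
    mut addrs: Vec<SocketAddr>,
    preferred: IpPreference,
) -> Vec<SocketAddr> {
    // `false` sorts before `true`, so preferred addresses come first.
    addrs.sort_by_key(|addr| match preferred {
        IpPreference::V4 => !addr.is_ipv4(),
        IpPreference::V6 => !addr.is_ipv6(),
    });
    addrs
}

/// Resolves something like "example.com:443" and reorders the results.
pub fn resolve_preferring(
    host_and_port: &str,
    preferred: IpPreference,
) -> std::io::Result<Vec<SocketAddr>> {
    let addrs: Vec<SocketAddr> = host_and_port.to_socket_addrs()?.collect();
    Ok(order_by_preference(addrs, preferred))
}
```

Reordering rather than filtering keeps the non-preferred addresses around as a fallback if the preferred ones fail to connect.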
When doing the first couple of live requests (I was using a sinatra Ruby mock server in the beginning), I couldn’t really get a successful e-mail back, although I was seeing 200 responses from the server. Apparently, the form just kept returning 200 even when it didn’t actually accept the request. After tinkering a bit with it, I understood this was actually a Drupal form that relied on two “special” fields: form_build_id, which were fetched from the DOM of the retrieved form web page. This helps prevent basic replay attacks as well as CSRF. So now I needed to scrape the retrieved web page.
For scraping, I resorted again to a well-supported library in Rust. scraper did the trick here. It has a nice API for selecting DOM nodes and searching in them. This was all I needed to fetch the form_build_id from the DOM. Cool. Now I could make an actual request. After a few submissions with invalid input (e.g., phone numbers can’t have spaces in them), I got it working as expected.
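For the curious, the extraction step boils down to something like the sketch below. It deliberately does not use scraper: it is a naive, standard-library-only string search standing in for a proper CSS selector, it assumes double-quoted attributes, and hidden_input_value is a hypothetical helper name.

```rust
/// Finds the `value` attribute of the tag carrying `name="<field>"`.
/// Naive on purpose: a real HTML parser handles quoting, casing and
/// malformed markup far better than this.
pub fn hidden_input_value<'a>(html: &'a str, field: &str) -> Option<&'a str> {
    let name_attr = format!("name=\"{}\"", field);
    let name_pos = html.find(&name_attr)?;

    // Walk back to the start of the enclosing tag and forward to its end.
    let tag_start = html[..name_pos].rfind('<')?;
    let tag_end = name_pos + html[name_pos..].find('>')?;
    let tag = &html[tag_start..tag_end];

    // Pull out whatever sits between value=" and the closing quote.
    let marker = "value=\"";
    let value_start = tag.find(marker)? + marker.len();
    let value_len = tag[value_start..].find('"')?;
    Some(&tag[value_start..value_start + value_len])
}
```

This is one of the few places where an explicit lifetime shows up: the returned slice borrows from the page, not from the field name.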
I brushed it up with a nice-looking CLI with two distinct forms of input. You can either use a list of option arguments for each field to send as part of the request, or use a YAML configuration file with the necessary form data. For YAML deserialisation, serde was used again, via serde_yaml. CLI functionality was achieved using the well-known clap Rust library, which even allows you to use a YAML file to describe your CLI options.
I called it
There are still lots of improvements to make. If I think I’ll keep having fun doing those, I might get back to this project. Otherwise, I’ll just move on to something else. It was fun and it served its purpose well. Rust is nice. It is fun. It’s also rigid but with that rigidity comes power. Power to not fail miserably in production. Power to perform. If not in Rust, at least in futsal!