From Python to Rust: First Impressions

I did a lot of systems programming in the last couple of months. In Python. Time to explore how I would do that in a language that is closer to the system yet brings guarantees I’d love to have in Python as well: Rust. But what are the core concepts, and what does Rust code feel like? Let’s find out.

Motivation

A good share of my recent work involves building Python libraries around proprietary communication protocols that handle data retrieval, decoding, validation, and proper error handling. These libraries are expected to be correct and to handle erroneous data robustly. Runtime exceptions are to be avoided. From early on, I used type hints extensively, together with mypy for static type checking, to catch a subset of structural runtime errors during development already.

Python, typed

So far, this works quite nicely by typing data at the method boundary (i.e., typing method signatures and letting mypy infer the rest). Code then looks like this:

import typing
import serial as pyserial
def read_packet(
    conn: pyserial.Serial,
    checksum: typing.Callable[[bytes], bytes],
    timeout: typing.Optional[float] = 2.5,
) -> typing.Tuple[bytearray, str]:
    ...

A couple of thoughts while pondering this kind of code:

  • Typing. This has been a deliberate choice, as explained above. Apart from the open discussion in the Python community about the extent to which type hints should be employed and their future direction, type hints trade dynamism for robustness. Which is fine in my case.
  • Systems programming. Python is well suited to the world of systems programming (think RPi, Arduino, etc.). On the downside, Python is certainly not famous for its speed.
  • State, mutable vs immutable. Control over the state and its mutation.

Concerns like the speed of the Python interpreter aside, I was curious to explore whether other languages would suit me better in this particular realm, combining the speed of programming (including typing everything), robustness, and the joy of programming I feel when using languages like Python or Ruby.

Having gone this far already in catching type errors during development rather than at execution time, the question arose whether Python is still the best choice (I am aware that there is no strict order on the fit of programming languages to the realm I described).

Get Ready

I am the typical JetBrains user (RubyMine, PyCharm, IntelliJ), but for Rust, I aimed to try something new: VSCode. Once you have properly installed Rust (which gives you rustc, the Rust compiler, and cargo, the Rust package manager), all I needed in VSCode was a proper font with neat ligatures and a couple of plugins: Rust, crates, Better TOML, Bracket Pair Colorizer 2.

Hello Rust

Rust has been close to, if not at, the top of the list of programming languages I wanted to learn and apply to real-life problems for a couple of years. Now seemed to be the ideal moment to give it a try. Before going into more detail and comparing Rust’s concepts to approaches in other programming languages, let’s see how the above Python example would read when translated as-is to Rust:

pub fn read_packet(
    conn: &mut dyn prelude::SerialPort,
    checksum: fn(&[u8]) -> u16,
    timeout: Option<time::Duration>,
) -> Result<Vec<u8>, &str> {
    ...
}

A couple of points are noteworthy in this signature compared to its Python version:

  • The function explicitly requests mutable (and thus exclusive!) access to the serial connection. This is particularly important as our serial connection can only be used by one process at a time! This intent is not formulated explicitly in the Python code and could lead to runtime errors if this implicit contract is not obeyed.
  • The checksum function parameter looks essentially identical, except that we have more fine-grained control over the types. While in the Python version we can only state that we expect a series of bytes (unrestricted) as the return value, the Rust version defines this contract more precisely as exactly one unsigned 16-bit word.
  • The timeout parameter is essentially the same in both samples except for the lack of default arguments in Rust. I find this a bit inconvenient at first glance.

Concerning the (unfortunate) lack of default arguments, three approaches appear to be most common to bypass this restriction:
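As a minimal sketch, here are three workarounds one commonly encounters: passing an Option and falling back to a default inside the function, bundling optional parameters into a struct that implements Default, and builder-style methods. Whether these are exactly the three approaches meant above is an assumption; all names (DEFAULT_TIMEOUT, ReadOptions, effective_timeout, with_timeout) and the retries field are hypothetical and serve illustration only.

```rust
use std::time::Duration;

// Hypothetical default, mirroring the 2.5 s of the Python signature.
const DEFAULT_TIMEOUT: Duration = Duration::from_millis(2500);

// Approach 1: take an `Option` and fall back to the default inside the function.
fn effective_timeout(timeout: Option<Duration>) -> Duration {
    timeout.unwrap_or(DEFAULT_TIMEOUT)
}

// Approach 2: bundle optional parameters in a struct implementing `Default`.
#[derive(Debug, Clone, PartialEq)]
struct ReadOptions {
    timeout: Duration,
    retries: u8, // hypothetical second option
}

impl Default for ReadOptions {
    fn default() -> Self {
        ReadOptions { timeout: DEFAULT_TIMEOUT, retries: 3 }
    }
}

// Approach 3: builder-style methods that override single fields.
impl ReadOptions {
    fn with_timeout(mut self, timeout: Duration) -> Self {
        self.timeout = timeout;
        self
    }
}

fn main() {
    // The caller omits the timeout by passing `None`.
    println!("{:?}", effective_timeout(None));
    // Struct update syntax: override one field, keep the remaining defaults.
    let opts = ReadOptions { retries: 1, ..Default::default() };
    println!("{:?}", opts);
    // Builder-style call chain.
    let opts = ReadOptions::default().with_timeout(Duration::from_secs(1));
    println!("{:?}", opts);
}
```

The Option variant stays closest to the Python signature, while the struct-based variants tend to scale better once a function grows more than one or two optional parameters.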

Potpourri of Concepts

Rust is neither Python nor C++ and certainly not “something in between”. Instead, Rust comes with its own philosophy and bag of language concepts. Be warned that the following subsections contain a biased selection from my first attempts to get an idea of how Rust works. 🙂 These notes are incomplete. For what I do mention, however, I try to relate it to concepts in Python (or various other languages I like).

Data Types

Rust comes with the types you would expect from a general-purpose programming language: booleans, numbers (integers, floating-point), strings, collections (lists, maps, structs), enums. Like any good programming language close(r) to the system, integers come in a variety of flavors: signed and unsigned, and of different sizes (8 bit to 128 bit). The same applies to floating-point numbers, which come in 32 bit and 64 bit. Examples:

fn main() {
  let measure: bool = true;
  let port: u8 = 13u8; // 8 bit (unsigned) port number
  let baud_rate: u32 = 9_600; // 32 bit (unsigned) baud rate
  let temperature: f32 = 22.3; // 32 bit IEEE-754 floating-point number
}

It starts getting more interesting with collections. Sequences of objects come in three flavors: tuples, arrays, and vectors. Tuples are of fixed length and potentially mixed type. Arrays are single-type fixed-size collections, with vectors being their variable-size counterpart.1 Examples:

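fn main() {
  let mut temperatures: Vec<f32> = vec![];
  // Measure temperatures for a while, pushing new values onto `temperatures`
  temperatures.push(22.1);
  temperatures.push(22.5);

  // Fixed-size array holding the ten most recent readings (illustrative values)
  let recent: [f32; 10] = [22.3; 10];
  // Slice borrowing the last three readings as a sliding window
  let temperatures_sliding_win: &[f32] = &recent[7..];
  println!("Window: {:?}", temperatures_sliding_win);

  let status: (f32, bool) = (22.3, true);
  println!("Average temperature: {}. Is it up-to-date? {}", status.0, status.1);
}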

Another notable difference between arrays and vectors is where their contents are stored. As the size of an array is known at compile time, it can be stored on the stack. Vectors, however, usually do not have a size known at compile time, so their contents are stored on the heap. Please refer to the official Rust documentation for a refresher on these memory management concepts.
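A small sketch of this difference, with made-up readings: the length of an array is part of its type, so its size in bytes is a compile-time constant, whereas a vector only learns its length while the program runs.

```rust
use std::mem::size_of;

fn main() {
    // The size of an array is part of its type and known at compile time.
    let window: [f32; 10] = [0.0; 10];
    println!(
        "array: {} elements, {} bytes",
        window.len(),
        size_of::<[f32; 10]>()
    );

    // A vector grows at runtime; its elements live in a heap allocation.
    let mut temperatures: Vec<f32> = Vec::new();
    temperatures.extend_from_slice(&[21.9, 22.3, 22.8]);
    println!(
        "vector: {} elements, capacity {}",
        temperatures.len(),
        temperatures.capacity()
    );
}
```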

Ownership

This is the concept I most looked forward to when starting Rust, for a simple reason: mutating state is usually quite easy in most languages I have worked with to date, whereas managing state (including who is allowed to mutate it under what conditions) is not. Having a language that has the management of ownership at its core just felt right.

So what is ownership? It is the guarantee that at any given point in time, any value has exactly one owner. Once the value’s owner disappears (i.e., at the end of its scope), the value ceases to exist. When passing values around, e.g., as arguments in function calls, you have to make your intent about ownership explicit in Rust. For instance, let’s say we have a vector of measurement data. When passing it as-is to another function, we move ownership to the callee and are thus unable to access it afterward. When starting with Rust, this can lead to enlightening moments like this:

pub fn own(data: Vec<f32>) {
    println!("Data: {:?}", data);
    // +-------------------^
    // | owner of `data`; will cease to exist when function scope ends
}
pub fn main() {
    let data: Vec<f32> = vec![1.2, 2.4, 3.6, 4.8];
    own(data);
    //  ^--- ownership of `data` moved here to function `own`
    println!("Data: {:?}", data);
    //                     ^--- access to `data` will fail here since we don't own it anymore
}

What we should have done instead is to lend data to the own function rather than passing ownership to it—i.e., change own so that it borrows data rather than consuming it. All it takes is to make data a read-only reference:

pub fn own(data: &Vec<f32>) {
    // ...
}
pub fn main() {
    let data: Vec<f32> = vec![1.2, 2.4, 3.6, 4.8];
    own(&data);
    // ...
}

When coming from C++, seeing references being read-only by default is kind of a surprise. To pass (or expect) a mutable reference, use &mut instead of &.
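A minimal sketch of a mutable borrow, using a hypothetical helper to_fahrenheit that modifies the measurement data in place:

```rust
// Hypothetical helper: converts all readings in place from Celsius to Fahrenheit.
fn to_fahrenheit(data: &mut Vec<f32>) {
    for value in data.iter_mut() {
        *value = *value * 9.0 / 5.0 + 32.0;
    }
}

fn main() {
    // The binding itself must be declared `mut` to hand out a mutable reference.
    let mut data: Vec<f32> = vec![0.0, 100.0];
    to_fahrenheit(&mut data);
    // Ownership never left `main`, so `data` is still accessible here.
    println!("Data: {:?}", data);
}
```

Note that both sides have to spell out the intent: the function asks for &mut in its signature, and the caller writes &mut at the call site.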

Contracts via Traits

Traits are for formulating contracts on data: being iterable, cloneable, drawable, …; you get the point. This sounds similar to the concept of interfaces. In Python, something similar could be achieved with abstract base classes (ABC), although that feels un-pythonic. More pythonic would be to rely on structural subtyping by employing protocols (added to the typing module with Python 3.8, or available from typing_extensions in earlier Python versions). An example:

from typing import List
from typing_extensions import Protocol


class Device(Protocol):
    def connect(self, reconnect: bool) -> None:
        ...
    
    def disconnect(self) -> None:
        ...

        
class RaspberryPi:
    def __init__(self, name: str) -> None:
        self.name = name
        self._connected = False
    
    def connect(self, reconnect: bool = True) -> None:
        if self._connected and reconnect:
            # reconnect...
            pass
        self._connected = True

    def disconnect(self) -> None:
        self._connected = False


devices: List[Device] = [RaspberryPi("Model 3B+")]

The RaspberryPi class fulfills the Device protocol without us specifying it explicitly (i.e., inheriting from Device). Translated to Rust, the Device protocol will become a trait, and the data contained in the concrete RaspberryPi class will be moved to a struct:

pub trait Device {
    fn connect(&mut self) -> ();
    fn disconnect(&mut self) -> ();
}

pub struct RaspberryPi {
    pub name: String,
    connected: bool,
}

impl Device for RaspberryPi {
    fn connect(&mut self) -> () {
        self.connected = true;
    }

    fn disconnect(&mut self) -> () {
        self.connected = false;
    }
}

Did you notice? In the above Rust version, we used a struct with functions defined on it rather than combining data and functions into classes. That’s not an accident: Rust does not have classes. However, this does not imply that Rust would lack most of the underlying concepts that one usually associates with classes:

  • combination of data with functions (behavior) to operate thereon;
  • establishing hierarchical relationships to (amongst others) share data and behavior;
  • generalizing functions to work with different types of data (polymorphism).

In Rust, traits (like the Device above) formulate the contracts needed to achieve polymorphism. Instead of relying on inheritance (or mixins when applicable), behavior is shared via default trait implementations: just add an implementation (function body) to a function inside a trait.
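A minimal sketch of such a default implementation. The methods is_connected and ensure_connected are hypothetical additions to the Device trait, made up for this illustration:

```rust
pub trait Device {
    fn connect(&mut self);
    fn disconnect(&mut self);
    fn is_connected(&self) -> bool;

    // Default implementation: every implementor gets this for free,
    // but may still override it.
    fn ensure_connected(&mut self) {
        if !self.is_connected() {
            self.connect();
        }
    }
}

pub struct RaspberryPi {
    pub name: String,
    connected: bool,
}

impl Device for RaspberryPi {
    fn connect(&mut self) {
        self.connected = true;
    }

    fn disconnect(&mut self) {
        self.connected = false;
    }

    fn is_connected(&self) -> bool {
        self.connected
    }
    // `ensure_connected` is not listed here: the default from the trait applies.
}

fn main() {
    let mut pi = RaspberryPi { name: String::from("Model 3B+"), connected: false };
    pi.ensure_connected();
    println!("{} connected: {}", pi.name, pi.is_connected());
}
```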

What if we want to add a reconnect function to our example above? We could add it to the trait itself with a default implementation, but we can also add it independently. In Python, this would look something like this:

def reconnect(device: Device) -> None:
    device.disconnect()
    device.connect()

The key here is that the device parameter is expected to implement the Device protocol. In Rust, the corresponding concept is to define a function generic in a type T. Just like in the Python case, this type T should be bound to (read: restricted to types implementing) Device:

pub fn reconnect<T: Device>(device: &mut T) -> () {
       // ^--- function generic in type `T`, with `T` bounded by `Device`
    device.disconnect();
    device.connect();
}
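To see the generic function in action, here is a self-contained sketch. CountingDevice is a hypothetical implementor made up for this example, and reconnect_dyn is a hypothetical variant that takes a trait object instead. The generic version is compiled once per concrete type (static dispatch), while the dyn version resolves the method calls at runtime (dynamic dispatch).

```rust
pub trait Device {
    fn connect(&mut self);
    fn disconnect(&mut self);
}

// Hypothetical device that counts how often it has been connected.
pub struct CountingDevice {
    pub connects: u32,
    pub connected: bool,
}

impl Device for CountingDevice {
    fn connect(&mut self) {
        self.connects += 1;
        self.connected = true;
    }

    fn disconnect(&mut self) {
        self.connected = false;
    }
}

// Static dispatch: compiled separately for each concrete `T`.
pub fn reconnect<T: Device>(device: &mut T) {
    device.disconnect();
    device.connect();
}

// Dynamic dispatch: the concrete method is looked up at runtime.
pub fn reconnect_dyn(device: &mut dyn Device) {
    device.disconnect();
    device.connect();
}

fn main() {
    let mut device = CountingDevice { connects: 0, connected: false };
    reconnect(&mut device);
    reconnect_dyn(&mut device);
    println!("connected {} times", device.connects);
}
```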

Testing

Unit tests are co-located with their respective code. However, they don’t just float around in the code, but are organized into a tests module and marked with #[cfg(test)]. Example:

pub fn pp_bytes(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|byte| format!("{:02X}", byte))
        .collect::<Vec<String>>()
        .join(" ")
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_pp_bytes_empty() {
        assert_eq!(pp_bytes(&[]), "")
    }

    #[test]
    fn test_pp_bytes_non_empty() {
        assert_eq!(pp_bytes(&[0, 1, 2, 3]), "00 01 02 03")
    }
}

Integration tests, on the other hand, go into a top-level tests folder. Doctests are also already built-in, just like in Python and Elixir. Let’s try to convert our unit tests for pp_bytes from above directly into a doc comment example (assuming the library is called serial_lib):

/// Pretty-print a slice of bytes as space-separated two-digit hex values.
/// 
/// # Examples
/// 
/// An empty input gives an empty string:
/// 
/// ```
/// let pp = serial_lib::pp_bytes(&[]);
/// assert_eq!(pp, "");
/// ```
/// 
/// A non-empty collection of bytes yields a space-separated string of two-digit hex values:
/// 
/// ```
/// let pp = serial_lib::pp_bytes(&[0, 1, 2, 3]);
/// assert_eq!(pp, "00 01 02 03");
/// ```
pub fn pp_bytes(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|byte| format!("{:02X}", byte))
        .collect::<Vec<String>>()
        .join(" ")
}

Run cargo test to check if the tests are actually recognized and executed:

$ cargo test
   Compiling serial_lib v0.1.0 (~/serial_lib)
    Finished test [unoptimized + debuginfo] target(s) in 0.63s
     Running target/debug/deps/serial_lib-371b240b0d2136bb

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests serial_lib

running 2 tests
test src/lib.rs - pp_bytes (line 98) ... ok
test src/lib.rs - pp_bytes (line 105) ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

When it comes to testing frameworks, there is nothing close to pytest in the Rust world but rather a whole zoo of frameworks. The most notable feature I currently miss is parameterized tests (though apparently not the only one). They can partly be achieved by employing macros to roll your own parametrization. The landscape of mocking libraries is also still quite diverse. From the little I have read so far, mockall appears to be worth looking into.
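A minimal sketch of such a home-grown parametrization for pp_bytes. The macro name pp_bytes_cases and the case names are hypothetical. Each entry expands into its own #[test] function, so failures are reported per case:

```rust
pub fn pp_bytes(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|byte| format!("{:02X}", byte))
        .collect::<Vec<String>>()
        .join(" ")
}

#[cfg(test)]
mod tests {
    use super::*;

    // Hypothetical helper macro: expands each `name: (input, expected)` entry
    // into its own `#[test]` function.
    macro_rules! pp_bytes_cases {
        ($($name:ident: ($input:expr, $expected:expr),)*) => {
            $(
                #[test]
                fn $name() {
                    assert_eq!(pp_bytes($input), $expected);
                }
            )*
        };
    }

    pp_bytes_cases! {
        empty: (&[], ""),
        single: (&[255], "FF"),
        several: (&[0, 1, 2, 3], "00 01 02 03"),
    }
}

fn main() {
    println!("{}", pp_bytes(&[0, 1, 2, 3]));
}
```

This is clearly less convenient than pytest’s parametrize decorator, but it works with nothing beyond the standard test harness.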

Where to go from here?

After this first very, very brief look at the Rust language, the official Rust book, and a few other resources, it’s evident that Rust offers lots of concepts to learn, quite apart from learning how to write idiomatic Rust. As that’s rather abstract, let’s get more concrete. My next step in learning Rust is to rebuild some of the protocols mentioned in the introduction of this article, with a focus on efficiency and non-blocking IO.

  1. That’s actually a didactic lie. :-D Vectors can be of mixed type by using trait objects: Vec<Box<dyn MyTrait>>.
