Rust vs Python. Rust impl & final results
Hello, dear readers,
Welcome to the second part of our series comparing the performance of Rust and Python when solving the same problem. If you missed the first part, you can find it here.
Rust is a powerful language that ensures you write correct and efficient code. The official website for Rust highlights the following key benefits:
Performance
Reliability
Productivity
The only notable downside of Rust is its steep learning curve. However, the compiler provides helpful suggestions, and with time, you will get accustomed to the borrow checker and ownership model.
Rust is highly praised for its ability to avoid bugs related to manual memory management. For example, you can check out Rust vs Common C++ Bugs. Rust also allows fearless concurrency (though it is race condition-free, it is not deadlock-free), has zero-cost abstractions, and can be compiled to WebAssembly. However, let's return to our main topic. The Rust version of our project relies on several tools:
cargo as the build system
cargo-llvm-cov to measure code coverage
criterion.rs for benchmarking
The full code for the Rust version is available at msgpack-py-vs-rs.
Data model
Let's define the Item model that represents a single message:
use rmpv::Value;

#[derive(Debug, PartialEq)]
pub struct Item<'a> {
    pub id: i32,
    pub process_id: i32,
    pub thread_id: i32,
    pub timestamp_ns: i64,
    pub line: i32,
    pub value: f32,
    pub filename: &'a str,
    pub path: &'a str,
}

impl<'a> Item<'a> {
    pub fn from_list(arr: &'a [Value]) -> Item<'a> {
        Item {
            id: arr[0].as_i64().unwrap() as i32,
            process_id: arr[1].as_i64().unwrap() as i32,
            thread_id: arr[2].as_i64().unwrap() as i32,
            timestamp_ns: arr[3].as_i64().unwrap(),
            line: arr[4].as_i64().unwrap() as i32,
            value: arr[5].as_f64().unwrap() as f32,
            filename: arr[6].as_str().unwrap(),
            path: arr[7].as_str().unwrap(),
        }
    }
}
Let's recap what 'a means here. The Item struct has two fields, filename and path, which are references with the lifetime 'a. This lifetime 'a is a generic lifetime parameter that specifies how long the references in these fields are valid. The rest of the fields (id, process_id, thread_id, timestamp_ns, line, value) are not references, so they do not have lifetimes associated with them. When I create an instance of Item, the 'a lifetime must be connected to the lifetime of the data that filename and path point to. This means that the references filename and path cannot outlive the data they point to.
The method Item::from_list accepts &'a [Value] (a slice of Value) with lifetime 'a and returns Item<'a>. This prevents any potential dangling references, ensuring that the Item struct cannot outlive the data it references in the arr slice.
I'm using &str (a string slice) instead of String to avoid unnecessary copies: the actual string was already allocated inside Value during parsing!
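The borrow checker enforces this relationship at compile time. Here is a minimal standalone sketch (a plain Record struct stands in for Item, so the example compiles without rmpv):

```rust
// A struct that borrows a string slice, like `filename` and `path` in `Item`.
#[derive(Debug, PartialEq)]
struct Record<'a> {
    name: &'a str,
}

fn main() {
    let owned = String::from("items.msgpack"); // owns the data
    let rec = Record { name: &owned };         // `rec` borrows from `owned`
    // drop(owned); // uncommenting this fails to compile: `rec` is used below,
    //              // so it would outlive the data it references
    println!("{:?}", rec);
}
```

Trying to drop (or move) the owner while the borrowing struct is still in use is exactly the kind of dangling-reference bug the lifetime 'a rules out.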
Parser
Let's define how we want to use the parser/reader from the caller side:
fn main() -> Result<()> {
    let s = Instant::now();
    let f = File::open("items.msgpack")?;
    let rdr = BufReader::new(f);
    let mut msgs: usize = 0;
    let mut sum_of_ids: i64 = 0;
    let mut parser: ItemMsgPackParser<BufReader<File>> = ItemMsgPackParser::new(rdr);
    let on_next = |value: &Item| {
        sum_of_ids += value.id as i64;
        msgs += 1;
        Ok(())
    };
    parser.parse(on_next)?;
    println!(
        "Read {msgs} messages, sum_of_ids is {sum_of_ids} in {} ms",
        s.elapsed().as_millis()
    );
    Ok(())
}
on_next is a closure that accepts a reference to Item and returns the Result<()> type.
Let's define the generic MsgPackParser type first:
pub struct MsgPackParser<R> {
    reader: R,
}

impl<R: std::io::Read> MsgPackParser<R> {
    pub fn new(reader: R) -> Self {
        MsgPackParser { reader }
    }

    pub fn parse(
        &mut self,
        mut on_next: impl FnMut(&Value) -> errors::Result<()>,
    ) -> errors::Result<()> {
        loop {
            match rmpv::decode::read_value(&mut self.reader) {
                Ok(value) => on_next(&value)?,
                Err(err) => {
                    if err.kind() != UnexpectedEof {
                        // In properly written msgpack files this should not happen, log and return error
                        warn!("Failed with err: {err}, kind: {}", err.kind());
                        return Err(errors::Error::from(err));
                    } else {
                        // Reached EOF, we can stop the loop
                        break;
                    }
                }
            }
        }
        Ok(())
    }
}
As you may have noticed, it has a generic type R that must implement std::io::Read. This way I can pass anything that implements that trait, for example std::fs::File or std::io::Cursor. This simplifies unit testing and benchmarking.
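To illustrate why this design helps testing, here is a minimal sketch (count_bytes is a trivial stand-in for the parser, assumed for illustration): any Read source works, so an in-memory Cursor can replace a real file in a unit test.

```rust
use std::io::{Cursor, Read};

// A trivial stand-in for the parser: anything implementing `Read` works,
// whether it is a `File`, an in-memory `Cursor`, or a network stream.
fn count_bytes<R: Read>(mut reader: R) -> std::io::Result<usize> {
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    Ok(buf.len())
}

fn main() -> std::io::Result<()> {
    // In a unit test, an in-memory `Cursor` replaces a real file on disk.
    let n = count_bytes(Cursor::new(b"\x01\x02\x03".as_slice()))?;
    assert_eq!(n, 3);
    println!("read {n} bytes");
    Ok(())
}
```

The same test can later be pointed at File::open(...) without touching the function under test; that is the payoff of being generic over Read.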
Note the signature of on_next:
mut on_next: impl FnMut(&Value) -> errors::Result<()>
It must be FnMut so that the provided closure on_next can be called repeatedly and may mutate state.
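A standalone sketch of that distinction (call_three_times is a hypothetical helper): a closure that mutates its captured state implements FnMut but not plain Fn, so the parser must accept FnMut.

```rust
// Calls the provided closure several times; taking `impl FnMut` allows the
// closure to mutate whatever it captures between calls.
fn call_three_times(mut f: impl FnMut(i32)) {
    for i in 0..3 {
        f(i);
    }
}

fn main() {
    let mut sum = 0;
    // This closure mutates the captured `sum`, so it is `FnMut`, not `Fn`.
    call_three_times(|i| sum += i);
    assert_eq!(sum, 3); // 0 + 1 + 2
    println!("sum = {sum}");
}
```

This mirrors the main function above, where on_next mutates msgs and sum_of_ids on every parsed message.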
MsgPackParser is too generic; I want a type that knows how to convert rmpv::Value to our Item<'a>, so let's define a type ItemMsgPackParser<R> for that. It uses MsgPackParser<R> and Item::from_list from above to parse and convert a generic Value to the Item type:
pub struct ItemMsgPackParser<R> {
    parser: MsgPackParser<R>,
}

impl<R: std::io::Read> ItemMsgPackParser<R> {
    pub fn new(reader: R) -> Self {
        ItemMsgPackParser {
            parser: MsgPackParser::new(reader),
        }
    }

    pub fn parse(
        &mut self,
        mut on_next: impl FnMut(&Item) -> errors::Result<()>,
    ) -> errors::Result<()> {
        self.parser.parse(|value| match value {
            Value::Array(arr) => {
                let item = Item::from_list(arr.as_slice());
                on_next(&item)?;
                Ok(())
            }
            other => {
                let t = match other {
                    Value::Nil => "Nil",
                    Value::Boolean(_) => "Boolean",
                    Value::Integer(_) => "Integer",
                    Value::F32(_) => "F32",
                    Value::F64(_) => "F64",
                    Value::String(_) => "String",
                    Value::Binary(_) => "Binary",
                    Value::Array(_) => "Array",
                    Value::Map(_) => "Map",
                    Value::Ext(_, _) => "Ext",
                };
                let msg = format!("Expected `Array` type but got `{}`", t);
                Err(Error::from(ErrorKind::ItemMsgPackParser(msg)))
            }
        })?;
        Ok(())
    }
}
Error handling
If you're new to Rust, you may have noticed the question mark operator ?. It unwraps a valid value, or returns the error and propagates it to the calling function; check The question mark operator, ? to understand it better. The last piece of the code is the type errors::Result. This type represents all known errors that can happen during parsing; it uses the thiserror crate to simplify error handling:
use thiserror::Error;

#[derive(Error, Debug)]
#[error(transparent)]
pub struct Error(Box<ErrorKind>);

#[derive(Error, Debug)]
pub enum ErrorKind {
    #[error("IoError: {0}")]
    IoError(#[from] std::io::Error),
    #[error("MsgPackDecodeError: {0}")]
    MsgPackDecodeError(#[from] rmpv::decode::Error),
    #[error("ItemMsgPackParser: {0}")]
    ItemMsgPackParser(String),
}

impl<E> From<E> for Error
where
    ErrorKind: From<E>,
{
    fn from(err: E) -> Self {
        Error(Box::new(ErrorKind::from(err)))
    }
}

pub type Result<T> = std::result::Result<T, Error>;
You may notice that the ErrorKind is boxed; the reason is to limit the size of Result<T>. The size of the enum ErrorKind equals its largest variant plus padding, and in some cases it can become quite large; a method returning a type that contains ErrorKind would then need to reserve that much stack space for the return value (check my article Rust: enum, boxed error and stack size mystery for a deep dive on the topic).
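A standalone sketch of the effect (the BigKind enum here is hypothetical, not the one above; its 256-byte variant exaggerates the problem to make it measurable):

```rust
use std::mem::size_of;

// A hypothetical error enum with one large inline variant.
#[allow(dead_code)]
enum BigKind {
    Io(std::io::Error),
    Message([u8; 256]), // 256-byte payload stored inline
}

#[allow(dead_code)]
struct Unboxed(BigKind);
#[allow(dead_code)]
struct Boxed(Box<BigKind>);

fn main() {
    // The unboxed wrapper must be at least as large as its biggest variant...
    assert!(size_of::<Unboxed>() >= 256);
    // ...while the boxed wrapper is just a single pointer.
    assert_eq!(size_of::<Boxed>(), size_of::<usize>());
    println!(
        "unboxed: {} bytes, boxed: {} bytes",
        size_of::<Unboxed>(),
        size_of::<Boxed>()
    );
}
```

With the box, every Result<T, Error> stays pointer-sized on the error side, regardless of how many large variants ErrorKind grows over time.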
Benchmark
Let's use criterion.rs to write a benchmark that parses msgpack from memory, the same way we did for Python. criterion.rs requires the Rust project to be a library, not a binary. Adding a new benchmark is quite easy and requires two steps:
Modify rust/msgpack-core/Cargo.toml#L23 by adding a [[bench]] section
Create a benches folder and put the source code of the bench there, using the name from the [[bench]] section
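As a sketch, the added Cargo.toml section might look like this (the bench name is an assumption; it must match the file name under benches/):

```toml
[[bench]]
name = "parsing_benchmark"  # assumed name; expects benches/parsing_benchmark.rs
harness = false             # criterion provides its own main, so disable libtest's harness
```

Setting harness = false is what lets criterion_main! take over as the benchmark entry point.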
The code of the benchmark:
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use msgpack_core::msgpack_parser::ItemMsgPackParser;
use std::io::Cursor;

fn parse_items(bytes: &[u8]) {
    let mut parser = ItemMsgPackParser::new(Cursor::new(bytes));
    parser
        .parse(|v| {
            // Let's consume `v` using `black_box` to make sure compiler won't get rid of unused arg
            black_box(v);
            Ok(())
        })
        .unwrap();
}

pub fn criterion_benchmark(c: &mut Criterion) {
    // The file contains exact same data as for Python benchmark
    let bytes = include_bytes!("../test_resources/10000_items.msgpack").to_vec();
    c.bench_function("in_memory_stream_benchmark for 10000 messages", |b| {
        b.iter(|| parse_items(bytes.as_slice()))
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
Benchmark results
Initial results were quite odd on Windows: it was twice as slow as on Ubuntu 20.04 running under WSL2 on the same Windows host machine. I suspected that the memory allocator was the root cause, so I decided to add two allocators, snmalloc-rs and jemallocator, to verify the hypothesis.
The results are interesting, especially for Windows:
snmalloc-rs, which uses Microsoft's snmalloc, makes the Rust code run almost twice as fast (48.6% faster)
snmalloc-rs outperforms jemallocator on Ubuntu 24.04 LTS as well
with the default allocator, the version compiled on Ubuntu 24.04 LTS is 35% faster than the Windows 10 one
MAD is the median absolute deviation
SD is the standard deviation
Average throughput in MBytes/s is calculated from the average message size of 55.9439 bytes/msg (559439 bytes is the total size of 10000 messages)
Comparison with Python
Let's combine the best results from Python and Rust to see which is faster. The winner is Rust (🦀): it is 35 times faster than Python (🐍).
Comments and suggestions are welcome! Thank you for your time.