std::unique_ptr and Rust

If you are a first-time reader of my blog series, a warm welcome to you! 😊 I recommend that you spend 30 seconds reading the Introduction, which explains how the articles are created.

Takeaways

Today I would like to discuss the Rust replacement for std::unique_ptr. We will focus on similarities and differences, along with some extensions that we find in Rust. After reading, you should:

  • have a clear path for the transition from C++ to Rust when writing or rewriting code and looking for similar constructs,

  • know the similarities between C++ and Rust when it comes to unique ownership of dynamic allocations,

  • have a seed for further exploring Rust around this topic.

Intro

Back in the old days, we all used plain old pointers in C++. Many of us knew that this leads to many issues and came up with different abstractions to solve them, until the C++ Standards Committee introduced std::unique_ptr in C++11 (with std::make_unique following in C++14). From the moment it landed in major compilers (e.g., GCC 4.5, who still remembers it?), it became the main way to manage dynamically allocated memory with ownership being taken care of.

By contrast, in Rust, raw pointers were never a real problem because, from the beginning and by design, they cannot be dereferenced [1]. Nevertheless, dynamic allocation and ownership of the underlying objects were needed as a fundamental code construct that is used in almost all codebases. That's why the Box<T> type was provided.

Comparison

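As you already know, Box<T> is a direct replacement for std::unique_ptr<T>. Let's start with a comparison:

| Property                                        | Box<T>  | std::unique_ptr<T> |
|-------------------------------------------------|---------|--------------------|
| Movable                                         | Yes     | Yes                |
| Copyable                                        | No      | No                 |
| Automatic memory deallocation                   | Yes     | Yes                |
| Constructable from previously allocated memory  | No [1]  | Yes                |
| Can leak resources                              | No [1]  | Yes                |
| Allow in-place construction                     | No/Yes  | Yes                |
| Can be empty (aka nullptr)                      | No      | Yes                |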

Let's have a detailed look at the above points

Movability and Copyability

In both languages, the compiler makes sure that once an object is created, it has only one owner at a time. This is achieved by different means: in C++ by deleting the copy constructor and copy assignment operator, in Rust by move semantics and by Box not implementing the Copy trait.

#include <memory>
#include <utility>

struct Data {
    int someField;
};

int main() {
    auto dynamicData = std::make_unique<Data>();
    // auto newObject = dynamicData; // ❌ Will not compile, the copy constructor is deleted

    auto movedObject = std::move(dynamicData);  // ✅ All fine, the owned memory was moved into this instance
    return 0;
}

And equivalent in Rust

struct Data {
    some_field: i32,
}

fn main() {
    let dynamic_data: Box<Data> = Box::new(Data { some_field: 0 });
    let new_object: Box<Data> = dynamic_data; // ✅ This is valid because ownership of the Box is moved to new_object
    // Copy cannot be made because Box does not implement the Copy trait
    // println!("{}", dynamic_data.some_field); // ❌ This would cause a compile-time error because dynamic_data has been moved to new_object
}
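Since a Box cannot be copied implicitly, a duplicate has to be requested explicitly. The sketch below assumes that Data derives Clone (the struct above does not), and deep_copy is a hypothetical helper used only to show that clone() produces a second, independent heap object:

```rust
#[derive(Clone, Debug, PartialEq)]
struct Data {
    some_field: i32,
}

/// Hypothetical helper: returns the original box together with an independent deep copy.
fn deep_copy(original: Box<Data>) -> (Box<Data>, Box<Data>) {
    // `clone()` allocates new heap memory and clones the `Data` into it.
    let copy = original.clone();
    (original, copy)
}

fn main() {
    let (mut original, copy) = deep_copy(Box::new(Data { some_field: 7 }));

    // Mutating one box does not affect the other.
    original.some_field = 8;
    println!("original = {:?}, copy = {:?}", original, copy);

    // The two boxes point at different heap allocations.
    let same_address = std::ptr::eq(&*original, &*copy);
    println!("same allocation: {}", same_address);
}
```

The copy is always visible at the call site as clone(), so an expensive duplication cannot happen silently.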

Automatic memory deallocation

As in the previous point, the language and the compiler ensure that once the owner's lifetime ends, its destructor (Drop in Rust) is called, which in turn calls the destructor (or Drop) of the held T and frees the memory allocated by the system allocator.

The only subtle difference here is that std::unique_ptr supports custom deleters that can be part of the object instance and the type. This is currently not possible in stable Rust when using the plain Box type.

💡
Custom allocator support for Box (Box::new_in) is already available on nightly, so it may land in a stable release at some point
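Where C++ code would reach for a custom deleter, a common Rust substitute is to put the cleanup logic into Drop for the owned type itself. A minimal sketch, where Resource, run and the shared log are hypothetical names used only for illustration:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical resource that records its own cleanup in a shared log.
struct Resource {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Resource {
    // Runs when the owning Box is dropped, before the heap memory is freed.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("released {}", self.name));
    }
}

/// Creates two boxed resources in an inner scope and returns the cleanup log.
fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _first = Box::new(Resource { name: "first", log: Rc::clone(&log) });
        let _second = Box::new(Resource { name: "second", log: Rc::clone(&log) });
        // Both boxes go out of scope here; locals are dropped in reverse declaration order.
    }
    let entries = log.borrow().clone();
    entries
}

fn main() {
    for line in run() {
        println!("{}", line); // "released second", then "released first"
    }
}
```

Unlike a std::unique_ptr deleter, this logic belongs to T rather than to the pointer type, so every owner of a Resource gets the same cleanup.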

Constructable from previously allocated memory

In C++, std::unique_ptr lets us construct its instance from a previously created T*. Moreover, you can change the managed memory during its lifetime via the .reset(T*) method call. This opens up a whole bunch of checks that developers have to do to ensure that, after the creation of std::unique_ptr, the underlying memory is correct:

  • nullptr check on T* upon creation

  • ensure via external tooling or extensive reviews that reset is not called with nullptr, or each subsequent dereference has a nullptr check beforehand

In the end, construction from a raw pointer is still dangerous because you also need to ensure that no one else has this pointer and will not attempt to release it. We can clearly see that even though C++ provided constructs that solve a lot of pointer issues for us, it's still error-prone, easy to misuse, and becomes tricky to catch in review or during the maintenance period as bugs emerge.

In Rust, the Box cannot be created from previously allocated memory [1]. That's it.

So simple, so safe. Box will take care of allocation, take over the pointer, and hide it inside so you cannot misuse it. No nullptr checks, no hard-to-spot replacement of managed objects. You are fully covered.

Let’s head into an example

#include <cstdint>
#include <iostream>
#include <memory>

struct Data {
  int32_t someData;
};

int main() {
  auto* instance = new Data();
  std::unique_ptr<Data> ptr{instance};

  std::cout << ptr->someData << std::endl; // ✅ All good for now
  // Some time later in the code base
  ptr.reset(); // ❌ `ptr` now holds nullptr, so the next dereference is undefined behavior
  std::cout << ptr->someData << std::endl; // ❌ Undefined behavior: may crash or not, depending on the compiler (try it on cpp.sh and godbolt)

  std::unique_ptr<Data> ptrOther{nullptr}; // ❌ Same issue as after reset: `ptrOther` holds nullptr, so dereferencing it is undefined behavior

  std::cout << ptrOther->someData << std::endl;
}

And equivalent in Rust

// None of these errors can happen in Rust 🎊
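When a pointer really has to be optional, the usual Rust spelling is Option<Box<T>>. The sketch below mirrors the C++ example under that assumption; read_field is a hypothetical helper, and take() plays roughly the role of reset():

```rust
struct Data {
    some_data: i32,
}

/// Hypothetical helper: reads the field only if the box is present.
fn read_field(ptr: &Option<Box<Data>>) -> Option<i32> {
    // The compiler forces us to handle the empty case before touching the data.
    ptr.as_ref().map(|data| data.some_data)
}

fn main() {
    let mut ptr: Option<Box<Data>> = Some(Box::new(Data { some_data: 42 }));
    println!("{:?}", read_field(&ptr)); // Some(42)

    // Rough counterpart of `ptr.reset()`: the old value is moved out and dropped here.
    drop(ptr.take());
    println!("{:?}", read_field(&ptr)); // None, and no undefined behavior

    // Rough counterpart of `std::unique_ptr<Data>{nullptr}`.
    let ptr_other: Option<Box<Data>> = None;
    match &ptr_other {
        Some(data) => println!("{}", data.some_data),
        None => println!("nothing to read"),
    }
}
```

The emptiness is part of the type, so forgetting the check is a compile-time error instead of a runtime surprise.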

Can leak resources

Using std::unique_ptr, one can leak resources by simply calling release. This gives the raw pointer back to the user (so now let's hope we remember to delete it) and leaves the instance holding nullptr, which brings us to the "Can be empty" point below.

Box does not let the resource slip away by accident [1]. Normally, you can only obtain a reference to it. Giving up ownership requires an explicit call such as Box::into_raw or Box::leak, which consumes the Box, and turning a raw pointer back into a Box requires unsafe code. In safe Rust, there is no way to end up with a dangling pointer or an empty Box.

Can be empty (aka nullptr)

As in previous points, std::unique_ptr can be constructed from any pointer, so it can be nullptr! As mentioned, it is cumbersome to prove that every dereference accesses a valid memory location.

In Rust, you are covered. There is no way to have nullptr or a wrong memory location in a Box [1].

Unsafe Rust

Everything that was written above for Box is true as long as you are using safe Rust. However, in practice, many of the mentioned operations, like releasing resources, are needed in complex implementations. That's why Box also provides raw-pointer APIs similar to those of std::unique_ptr, such as Box::into_raw and the unsafe Box::from_raw. The core difference is that taking ownership of a raw pointer or dereferencing it has to be clearly annotated in code via unsafe {} blocks, which draw attention during reviews. Additionally, in practice, this is needed mostly by a few libraries that are later used widely, so the majority of us will simply never need it, staying SAFE ☺️.

💡
If you are curious what more powers we get once we go to unsafe Rust, check it out here
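To illustrate, here is a minimal sketch of a release()-style round trip through a raw pointer. Box::into_raw itself is a safe function, while dereferencing the pointer and rebuilding the Box with Box::from_raw must happen inside unsafe. The helper name round_trip is hypothetical:

```rust
struct Data {
    some_field: i32,
}

/// Hypothetical helper: gives up ownership as a raw pointer and takes it back.
fn round_trip(boxed: Box<Data>) -> Box<Data> {
    // Similar to `unique_ptr::release()`: the Box is consumed, and we are now
    // responsible for freeing the allocation. This call itself is safe.
    let raw: *mut Data = Box::into_raw(boxed);

    unsafe {
        // SAFETY: `raw` came from `Box::into_raw` above, has not been freed,
        // and nothing else owns or aliases it.
        (*raw).some_field += 1;
        // Similar to constructing a `unique_ptr` from `T*`.
        Box::from_raw(raw)
    }
}

fn main() {
    let boxed = round_trip(Box::new(Data { some_field: 41 }));
    println!("{}", boxed.some_field); // prints 42
}
```

The unsafe block marks the exact spot where a reviewer has to verify that the pointer is valid and uniquely owned.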

Rust extensions

Box is a fundamental building block in Rust. It has dedicated compiler support and comes with additional guarantees that std::unique_ptr does not have. For example, for a sized T, Box<T> is guaranteed to have the same memory footprint as T*, with no overhead.
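The footprint claim is easy to check with std::mem::size_of. A small sketch, where pointer_sizes is a hypothetical helper and Data stands in for any sized type:

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Data {
    some_field: i32,
}

/// Returns the sizes of a raw pointer, a Box and an optional Box for `Data`.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<*const Data>(),
        size_of::<Box<Data>>(),
        size_of::<Option<Box<Data>>>(),
    )
}

fn main() {
    let (raw, boxed, optional) = pointer_sizes();
    // All three are the size of one machine pointer: `None` reuses the null value.
    println!("raw = {raw}, Box = {boxed}, Option<Box> = {optional}");

    // A boxed slice additionally carries its length, so in practice it is two pointers wide.
    println!("Box<[i32]> = {}", size_of::<Box<[i32]>>());
}
```

Because a Box is never null, Option<Box<T>> can use the null pointer to represent None, which is why the optional variant costs nothing extra.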

Additionally, Box provides a very powerful and robust API, well-integrated with custom and base types, allowing conversions, extraction, and other manipulations. One notable feature is the ability to provide an object API that depends on self being Box. This enables developers to model APIs that are only available for a given type once it's wrapped in Box.

struct Data {
    some_field: i32,
}

impl Data {
    /// This method is only available on `Box<Data>`
    fn available_on_boxed(self: &Box<Self>) -> i32 {
        self.some_field * 3
    }

    fn square(&self) -> i32 {
        self.some_field * self.some_field
    }
}

fn main() {
    let dynamic_data: Box<Data> = Box::new(Data { some_field: 4 });
    println!("{}", dynamic_data.available_on_boxed());
    println!("{}", dynamic_data.square());

    let data: Data = Data { some_field: 0 };
    //  println!("{}", data.available_on_boxed()); // ❌ Will not compile: no method named `available_on_boxed` found for struct `Data` in the current scope

    // Some other integrations
    let vector_data = vec![1, 2, 34];
    let boxed_vector_data: Box<[i32]> = vector_data.into_boxed_slice(); // Vector to dynamically allocated array (aka boxed slice)

    println!("{:?}", dynamic_data.some_field); // Accessing the field directly, without an explicit dereference, as Rust does it automatically (Box implements the Deref trait)
}

Why would we need such a feature at all?

Because it opens up new, better possibilities for API design. One example is a recursive pattern where you can swap objects between calls. Consider this pseudocode:

trait SomeRecursiveTrait {
    // Takes ownership of the boxed object and returns the next one
    fn next(self: Box<Self>) -> Box<dyn SomeRecursiveTrait>;
}

...
impl SomeRecursiveTrait for NodeA {
    fn next(self: Box<Self>) -> Box<dyn SomeRecursiveTrait> {
        Box::new(NodeA::new(self.field1))
    }
}

impl SomeRecursiveTrait for NodeB {
    fn next(self: Box<Self>) -> Box<dyn SomeRecursiveTrait> {
        Box::new(NodeC::new(self.field2))
    }
}

This allows the implementor to:

  • express that the object is consumed on each next call,

  • hide the concrete type behind dyn SomeRecursiveTrait and rely on dynamic dispatch later (which allows, e.g., storing it in containers like Vec),

  • keep the trait object safe and next callable on a trait object (a by-value self cannot be moved out of a dyn Trait because its size is unknown at compile time, whereas Box<dyn Trait> has a known size, so it is Sized).
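The pattern can be made concrete with a small state machine. This is only a sketch: the State trait, the Idle, Running and Done types, and the run driver are hypothetical names chosen for illustration:

```rust
trait State {
    fn name(&self) -> &'static str;
    /// Consumes the current state and returns the next one.
    fn next(self: Box<Self>) -> Box<dyn State>;
}

struct Idle { ticks: u32 }
struct Running { ticks: u32 }
struct Done;

impl State for Idle {
    fn name(&self) -> &'static str { "Idle" }
    fn next(self: Box<Self>) -> Box<dyn State> {
        Box::new(Running { ticks: self.ticks + 1 })
    }
}

impl State for Running {
    fn name(&self) -> &'static str { "Running" }
    fn next(self: Box<Self>) -> Box<dyn State> {
        if self.ticks >= 2 {
            Box::new(Done)
        } else {
            Box::new(Running { ticks: self.ticks + 1 })
        }
    }
}

impl State for Done {
    fn name(&self) -> &'static str { "Done" }
    // Terminal state: hand back the same box instead of allocating a new one.
    fn next(self: Box<Self>) -> Box<dyn State> { self }
}

/// Hypothetical driver: advances the machine `steps` times and records the visited states.
fn run(steps: usize) -> Vec<&'static str> {
    let mut state: Box<dyn State> = Box::new(Idle { ticks: 0 });
    let mut visited = vec![state.name()];
    for _ in 0..steps {
        // The old state is moved into `next` and cannot be used afterwards.
        state = state.next();
        visited.push(state.name());
    }
    visited
}

fn main() {
    println!("{:?}", run(4)); // ["Idle", "Running", "Running", "Done", "Done"]
}
```

Each call to next moves the old box, so the compiler rejects any attempt to keep using a state that has already transitioned.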

Placement allocation

Unlike C++, where placement new (new(ptr) T()) is available, Rust does not have a placement allocation concept (yet). This can cause issues when one wants to create a big object on the heap. Consider this:

#[derive(Clone, Copy)]
struct SomeType {
    value: u32,
}

struct BigData {
    data: [SomeType; 32000],
}

impl BigData {
    fn new() -> Self {
        BigData {
            data: [SomeType { value: 0 }; 32000],
        }
    }
}

fn main() {
    let instance = Box::new(BigData::new());
}

What happens is that the BigData instance (about 125 KiB) is first created on the stack, and then, once Box has allocated memory, the instance is moved into the heap. An optimizing build may elide the stack copy, but this is not guaranteed. This will probably not cause issues on x86_64 targets with a Unix OS, but once you try to run it in a more constrained environment like QNX on aarch64, it will likely cause a stack overflow. One solution would be to increase the stack size for the particular thread where the instance is created, but this is probably not what the developer wants, as the data should live on the heap! Unfortunately, there is no out-of-the-box solution for this problem; however, proper use of the Box API still makes it possible. Let's have a look:

use std::mem::MaybeUninit;

#[derive(Clone, Copy)]
struct SomeType {
    value: u32,
}

struct BigData {
    data: [SomeType; 32000],
}

impl BigData {
    fn new() -> Self {
        BigData {
            data: [SomeType { value: 0 }; 32000],
        }
    }

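    fn placement_new(mut memory: Box<MaybeUninit<BigData>>) -> Box<Self> {
        unsafe {
            // SAFETY: `memory` is a valid, properly aligned allocation for `BigData`,
            // and every byte of it is written before `assume_init` is called.
            let data_ptr: *mut BigData = memory.as_mut_ptr(); // (3)
            let first_elem: *mut SomeType = &raw mut (*data_ptr).data as *mut SomeType;
            // In this case, we use an equivalent of memset to initialize our type.
            // The count is in elements (not bytes) and is given as a constant so that
            // no reference to the still-uninitialized array is created.
            std::ptr::write_bytes(first_elem, 0xAB, 32000); // (4)

            memory.assume_init() // (5)
        }
    }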
}

fn main() {
    let instance = Box::new(BigData::new()); // Uses stack memory

    let uninit_memory: Box<MaybeUninit<BigData>> = Box::new_uninit(); // (1)
    let instance_no_stack: Box<BigData> = BigData::placement_new(uninit_memory);
    // (2)
}

First, we create a Box that holds uninitialized memory (already allocated by the system allocator) ready to accept our type (1). A sharp eye will catch that our type is now not Box<BigData>, but Box<MaybeUninit<BigData>>. Explaining MaybeUninit would require a separate post in this series. For now, you only need to know that it represents a piece of memory that is not initialized yet, and that developers must initialize it before claiming it is. Filling that area with actual data has to be coded by us (2)! Inside placement_new, we obtain a pointer to the region (3), fill it with our desired values (4), and then claim that our data is ready to be used by calling assume_init (5). This way, we never put a single byte of BigData onto the stack. Those manipulations, especially the pointer dereference between points 3 and 4, are inherently unsafe. That's why they need to be wrapped in an unsafe block. This gives reviewers and readers a clear indication that this piece of code in placement_new needs careful checking of all invariants to keep the rest of the code safe.

In the end, the purpose of this unsafe block is to allow certain manipulations of data while containing all potentially unsafe effects, so that the result can be used safely in the rest of the code base. This means that Box<Self> still keeps its previous promises, such as not being null and being the only owner of the allocated memory.
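When the data layout can be changed, there is also a way to get a large heap array without any unsafe code. The sketch below is an alternative design, not the approach shown above: it assumes BigData may store a Box<[SomeType; LEN]> instead of an inline array, and new_on_heap and LEN are hypothetical names:

```rust
use std::convert::TryFrom;

#[derive(Clone, Copy, Debug, PartialEq)]
struct SomeType {
    value: u32,
}

const LEN: usize = 32_000;

/// Variant of `BigData` that keeps only a pointer inline; the array lives on the heap.
struct BigData {
    data: Box<[SomeType; LEN]>,
}

impl BigData {
    fn new_on_heap() -> Self {
        // `vec!` allocates the buffer on the heap and fills it there.
        let slice: Box<[SomeType]> = vec![SomeType { value: 0 }; LEN].into_boxed_slice();
        // Converting `Box<[T]>` into `Box<[T; N]>` fails only if the length differs from N.
        let data = match <Box<[SomeType; LEN]>>::try_from(slice) {
            Ok(array) => array,
            Err(_) => unreachable!("the vector was created with exactly LEN elements"),
        };
        BigData { data }
    }
}

fn main() {
    let big = BigData::new_on_heap();
    println!("{} elements, first = {:?}", big.data.len(), big.data[0]);
}
```

The trade-off is one extra level of indirection inside BigData, in exchange for keeping the whole construction in safe Rust.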

Summary

We have discussed the fundamental similarities and differences between std::unique_ptr and Box. As we can clearly see, thanks to the power of Rust, we can write the same low-level code as in C++, with the same overhead, while eliminating many fundamental problems during code development. This means that shipped code gains better quality, takes less time to review, and avoids whole classes of bugs that we would otherwise need to chase in production 😱.

Let me know what you think and what else you believe I should cover in this writing!

Notes

[1] - true as long as only safe Rust is used

Cpp to Rust developer journey

Part 1 of 2

This series is designed to guide C++ developers through the transition to Rust, highlighting the similarities they can leverage, the key differences they need to adapt to, and practical strategies for writing safe, efficient, and idiomatic Rust code.


I am Pawel, and you can also find me on GitHub, where I am a contributor to some well-known IPCs like iceoryx2 and new movements in the automotive world, such as S-CORE. Let's dive into Rust together!