A devil in Rust: Subtyping and Variance
Published:
At the first time I tried to learn the concept of variance, I failed to figure it out. The first reason was that I couldn’t combine the concept of lifetime. Another reason was that, it was difficult for me to apply the concept of “&'a mut T
is invariant over T
” to the real cases. Additionally, what a hell does “is invariant over” mean? Recently, I picked up this challenge again. After spending several days on reading articles and debugging, I finally figure something out, and I hope my explanation and experience could help more people who are stuck in the same places.
Subtype
I will not introduce lifetime in this blog post. The details of lifetime have been covered in official docs and rustonomicon. We already know that lifetime is a mark to help compilers figure out the cases they can’t understand. It’s just a mark, but it can’t affect the actual scope. For example, we use &'static T
to mark the type with 'static
lifetime; however, it is still dropped out of the scope {}
where we declared the type. Ok, back to the topic of subtype. It’s easy for us to understand the concept of “Subtype” in the language of inheritance such as Java. However, in our lovely Rust, subtype is only related to lifetime.
We already know that we can mark lifetime with the representation such as 'a
. But sometimes we can see two lifetime represented as 'a: 'b
. Rustonomicon[2] takes Cat: Animal
for example: Cat is a type of Animal. This is how subtype shows the relationship between two lifetimes: The lifetime 'a
is the subtype of lifetime 'b
. In other words, 'a
is at least longer or equal to 'b
.
How if the type is more complex than just T
? e.g., &'a mut T
, Vec<T>
, or fn(T)
. Compilers have the variance rules to derive the subtype relationship between these complex types.
Let’s get into the details of variance. Are you ready?
Variance
Following is the variance rules I mentioned above:
Actually I spent lots of time on staring at this table and don’t know what it means hahaha;) Take &'a mut T
and T
for example, it means that &'a mut T
is invariant over T
. This is the most important question I want to answer in this post, what does “invariant over” mean? In rustonomicon, it mentioned that variance is the attribute of the type constructor(F<T>
) over type(T
). We can imagine the type constructor of T
will return &'a mut T
to us.
We assume that a: b
, then there are three types of variance to show the relationship between F<a>
and F<b>
:
- covariant:
F<a>: F<b>
- contravariant:
F<b>: F<a>
- invariant: No relationship between
F<a>
amdF<b>
.
Now, how can we apply these rules to real cases?
// this is the example from rustonomicon
fn evil_feeder<T>(input: &mut T, val: T) {
*input = val;
}
fn main() {
let mut mr_snuggles: &'static str = "meow! :3";
{
let spike = String::from("bark! >:V");
let spike_str: &str = &spike;
evil_feeder(&mut mr_snuggles, spike_str);
}
println!("{}", mr_snuggles); // uaf?
}
Let’s analyze this example from the perspective of lifetime. From the signature of evil_feeder
, input
and val
are required to adopt the same based type. At call site, we pass &mut &'static str
and &'spike_str str
. Based on the rule &mut T
is invariant over T
, we can only pass exact type in the first argument, so T
could only be &'static str
. Based on the rule &'a T
is covariant over 'a
, we can pass the T
whose lifetime is the subtype of 'a
. Here comes the problem: From the first argument, compiler told us T
should be &'static str
, but we pass &'spike_str str
. 'spike_str
is not the subtype of 'static
, so compilation will fail.
Briefly summarize our steps here: We pass F<T>
. Look at the table, which one is our type constructor F
? After compiler adopts the constraint, how would it affect the other arguments?
Here comes another example[1]:
fn invariant<'a: 'b, 'b, 'c>(
sub: &'c mut Vec<&'a String>,
sup: &'c mut Vec<&'b String>,
) -> &'c mut Vec<&'b String> {
sub
}
It seems complex, but don’t worry. Let’s analyze it from the first line: 'a: 'b
. Then, we start from the type wrapped inside sub
and sup
.
- Based on the rule
&'a T
is covariant over'a
,&'a String: &'b String
. - Based on the rule
Vec<T>
is covariant overT
,Vec<&'a String>: Vec<&'b String>
. - Based on the rule
&'a mut T
is invariant overT
, no relationship between&'c mut Vec<&'a String>
and&'c mut Vec<&'b String>
.
Therefore, we can’t use sub: &'c mut Vec<&'a String>
to replace &'c mut Vec<&'b String>
as return type. Here, compilation will fail again.
Another example[1]:
struct S<T> {
f: fn(T),
}
fn test<'a>(
a: &mut S<&'a i32>,
b: &mut S<&'static i32>,
f1: fn(&'a i32),
f2: fn(&'static i32),
) {
a.f = f1;
a.f = f2;
}
First, we already know that &'static i32: &'a i32
.
- Based on the rule
fn(T)
is contravariant overT
,fn(&'a i32): fn(&'static i32)
. - At the line
a.f = f1
.a
is&mut S<&'a i32>
, soT
is&'a i32
.a.f
isfn(&'a i32)
, it is ok for us to assignf1: fn(&'a i32)
toa.f
. - At the line
a.f = f2
. It is not ok for us to assignf2: fn(&'static i32)
toa.f
.
Unbounded lifetime
Unsafe code usually produces lifetime out of air[4]. Deref raw pointer could produce unbounded lifetime. transmute
and transmute_copy
are two other offenders which could easily create unbounded lifetime. Here are some examples discussed in the zulip thread:
fn make_unbounded<'input, 'output>(input: &'input usize) -> &'output usize {
unsafe { std::mem::transmute(input) }
}
fn not_make_unbounded<'input>(input: &'input usize) -> &'input usize {
unsafe { std::mem::transmute(input) }
}
fn main() {
let x = Box::new(10);
let y = make_unbounded(&x);
// or
// let y = not_make_unbounded(&x);
drop(x);
println!("Dangling: {}", y);
}
In the code above, with function call make_unbounded(&x)
, it will print the dangling pointer y
in the last line. If changed to not_make_unbounded(&x)
, compiler will show that x
is already dropped and can’t be borrowed at the last line. The difference in not_make_unbounded
is that the lifetime of y
is bounded by (equal to) the lifetime of the reference to the x (and lifetime of &x
should be less than the lifetime of x
itself).
Conclusion
So why do we need subtype and variance?
Additionally, I recommend to read all the articles in reference. This post can be considered as my notes for these articles!
If you find something misunderstanding in the post or you want to discuss more cases with me, please don’t hesitate to contact me. Welcome to email me with the link on the sidebar.
Reference
- Some great examples from the link
- rustonomicon: Subtyping and Variance
- Common Rust Lifetime Misconceptions
- Unbounded lifetime