Split rmc::nondet into safe and unsafe functions(#607)#657
Split rmc::nondet into safe and unsafe functions(#607)#657celinval merged 11 commits intomodel-checking:mainfrom
Conversation
camshaft
left a comment
There was a problem hiding this comment.
Nice! I think this is the correct approach.
|
A few folks have brought up some concerns with the message Instead we decided to create two different functions, one that is safe and it should only represent valid values and another one that we will mark with unsafe that represents all possible bit value for the memory occupied by the object. |
Is there going to be a marker trait for the safe function? If so that should be |
The return value of rmc::nondet doesn't respect the invariant of a type. Until we have a better way to create symbolic variables, users must enforce the type invariant in their proof harness before using the value returned.
We decided to replace the old method rmc::nondet because its behavior wasn't clear. For some types it would respect the type invariants but not for others. Additionally, there was no way for the user to specify how to create safe non-deterministic values.
c9f3360 to
620c80e
Compare
Users will be able to control how RMC generate symbolic values to their variables. - `rmc::any::<T>()` will provide a safe way to generate values that respect any invariant specified to type T (Given by the implementation of Invariant type.) - `rmc::any_raw::<T>()` will provide any bitvalue possible for the memory layout of the type. This is an unsafe method and users must ensure correctness after this call. Note: I still need to add tests to the implementations of Invariant we provide.
| @@ -1 +1 @@ | |||
| ** 0 of 1 failed (1 iterations) | |||
| ** 0 of 2 failed (1 iterations) | |||
There was a problem hiding this comment.
What is going on here? This was supposed to have a single assertion 😆
There was a problem hiding this comment.
I know... we just can't keep this going, eh? We really need that report filter. This is due to a compiler sanity check. I could just use any_raw here to keep the 1 assertion.
There was a problem hiding this comment.
Makes sense, no need to change for any_raw in my opinion.
There was a problem hiding this comment.
My bad :( The purpose of this test was to make sure we don't get all the noisy successes for builtin functions, e.g.:
<builtin-library-calloc> function calloc
[calloc.assertion.1] assertion false: SUCCESS
<builtin-library-ceil> function ceil
[ceil.assertion.1] assertion false: SUCCESS
<builtin-library-ceilf> function ceilf
[ceilf.assertion.1] assertion false: SUCCESS
<builtin-library-copysign> function copysign
[copysign.assertion.1] assertion false: SUCCESS
...
which we got rid of with --drop-unused-functions. So to make the test more robust, we need to check if the output does not include those lines. Is there a way to do that with our testing infrastructure?
There was a problem hiding this comment.
At present, there is no way to do that. Not that it would be hard to, but I would wait until we get structured output to implement this kind of checks. The test is still ensuring there are only a couple of successful assertions so I would leave as is.
library/rmc/src/invariant.rs
Outdated
| { | ||
| #[inline(always)] | ||
| fn is_valid(&self) -> bool { | ||
| match self { |
There was a problem hiding this comment.
Is it possible that the generated discriminant/tag is invalid and the match would be undefined behavior? Or does the RMC backend always generate valid enum tags?
There was a problem hiding this comment.
I have updated the implementation to avoid the issue I believe you are talking about.
zhassan-aws
left a comment
There was a problem hiding this comment.
Looks great! What is unclear to me is my comment on the removal of the assumptions module.
library/rmc/src/invariant.rs
Outdated
| unsafe impl Invariant for u8 { | ||
| #[inline(always)] | ||
| fn is_valid(&self) -> bool { | ||
| true |
There was a problem hiding this comment.
Should we make this the default implementation in the trait definition to reduce boiler-plate code, or is that risky? If we do that, many of impl's would just be:
unsafe impl Invariant for u8 { }
unsafe impl Invariant for u16 { }
...There was a problem hiding this comment.
I'll generate these with a macro. I don't know if we should provide a default implementation that returns true. Is the macro option OK?
| fn new() -> Self { | ||
| let arr: [T; MAX_SLICE_LENGTH] = nondet(); | ||
| let slice_len: usize = nondet(); | ||
| let arr: [T; MAX_SLICE_LENGTH] = unsafe { any_raw() }; |
There was a problem hiding this comment.
I think we should be able to implement any for arrays, e.g.:
unsafe impl<T, const LEN: usize> Invariant for [T; LEN]
where
T: Invariant,
{
#[inline(always)]
fn is_valid(&self) -> bool {
let valid = true;
for i in *self {
valid &= i.is_valid();
}
valid
}
}?
There was a problem hiding this comment.
Yes. I am adding a variant of that. Thanks!
| false | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The assumptions module handled a lot more types than the ones implemented here, e.g. enums. Do we lose assumptions for those other types?
There was a problem hiding this comment.
My two cents: it would be inconvenient if the user needs to annotate all the enums in their code with #[derive(Invariant)]. Automatically handling that during codegen is more user friendly (assuming this was done in the assumptions module).
| } | ||
| } | ||
| rmc::nondet() | ||
| if rmc::any() { |
There was a problem hiding this comment.
Is this change maintaining the old behavior?
There was a problem hiding this comment.
From what I understood... any insight?
There was a problem hiding this comment.
If we want to replace this with just:
rmc::any()we would have to add an Invariant trait bound to T, is that correct? Not sure which option is better (assuming what I mentioned is correct).
There was a problem hiding this comment.
I see what you mean. This does remove some constraints from the proof. That said, the proof still succeeds, so I don't think there is any harm done.
I'll add the Invariant and use rmc::any since that is the prefeed method.
| /// | ||
| /// Implementations of Invariant traits must ensure that the current bit values of the given type | ||
| /// is valid and that all its invariants hold. | ||
| pub unsafe trait Invariant { |
There was a problem hiding this comment.
Does this trait itself need to be unsafe? I agree that it looks like it should be, but is there a way to do this without that?
There was a problem hiding this comment.
That was a suggestion from @camshaft which I thought made a lot of sense. So I would prefer keeping it.
Is there going to be a marker trait for the safe function? If so that should be
unsafeto implement (similar to Send and Sync).
There was a problem hiding this comment.
I was originally thinking it would be something like:
/// Types that implement this can be generated without any constraints or invariants
pub unsafe trait NondetTransmute { }
pub fn any<T: NondetTransmute>() -> T {
unsafe { any_raw() }
}You would then implement that for the primitives.
I'm trying to decide if the unsafe makes sense in its current form, though... My concern is the &self argument is completely unvalidated and without knowing the RMC compiler internals, may result in UB when using it.
Is there a reason the trait method doesn't return an owned value instead? It would make the trait safe to implement and each implementation may or may not need to use unsafe to generate a value:
pub trait Arbitrary {
fn any() -> Self;
}
impl Arbitrary for u8 {
fn any() -> Self {
unsafe { any_raw() }
}
}
impl Arbitrary for char {
fn any() -> Self {
let v = unsafe { any_raw() };
let v = char::from_u32(v);
assume(v.is_ok());
v.unwrap()
}
}
struct MyThing {
field: u8
}
impl Arbitrary for MyThing {
fn any() - Self {
Self { field: u8::any() }
}
}There was a problem hiding this comment.
Another concern is: in its current form, it's very easy to miss validating a field when implementing the trait:
struct MyThing {
field: char,
other: char,
}
impl Invariant for MyThing {
fn is_valid(&self) -> bool {
self.field.is_valid() // missing `other`!
}
}There was a problem hiding this comment.
Thanks for clarifying. This is another possibility which uses a builder pattern. I thought the Invariant approach was a bit cleaner and easier to use. It also adds a method that can be used in assert statements which I thought was nice. I do agree that it has the downside of having a potential invalid Self and I think the unsafe impl is proper in this case.
I think they are somewhat equivalent though. For an arbitrary enum, it would go from:
unsafe impl Invariant for MyEnum {
#[inline(always)]
fn is_valid(&self) -> bool {
matches!(*self, MyEnum::V1 | MyEnum::V2 | MyEnum::V3)
}
}to
impl Arbitrary for MyEnum {
#[inline(always)]
fn any() -> Self {
match u8::any() {
0 => MyEnum::V1,
1 => MyEnum::V2,
_ => MyEnum::V3,
}
}
}Maybe we should check which one performs better or even offer both.
There was a problem hiding this comment.
I understand the concern, and I debated between the two options for a long time. Maybe we can have a quick sync tomorrow to discuss the options.
Another possibility is to add both methods to the type with default implementations. Something like:
pub unsafe trait Arbitrary {
fn is_valid(&self) -> bool { true }
fn any() -> Self {
let s = unsafe { rmc::any_raw::<Self>() };
rmc::assume(s.is_valid());
s
}
}
pub fn any<T: Arbritary>() -> T {
T::any()
}Users can pick and choose which one to implement / use.
There was a problem hiding this comment.
Yeah I'm good to sync up whenever works for you!
| // TODO: This test should ideally work with nondeterminstic values but for | ||
| // for the moment it does not. |
There was a problem hiding this comment.
Is this comment still true?
| let table: [[u64; 1]; 1] = [[rmc::nondet::<u64>()]]; | ||
| table[0][rmc::nondet::<usize>()]; // EXPECTED FAIL | ||
| let table: [[u64; 1]; 1] = [[rmc::any::<u64>()]]; | ||
| table[0][rmc::any::<usize>()]; // EXPCECTED FAIL |
There was a problem hiding this comment.
Is the turbofish needed here?
| let input: &[u8] = &vec![ | ||
| rmc::nondet(), | ||
| rmc::nondet(), | ||
| rmc::nondet(), | ||
| rmc::nondet(), | ||
| rmc::nondet(), | ||
| rmc::nondet(), | ||
| rmc::nondet(), | ||
| rmc::nondet(), | ||
| rmc::any(), | ||
| rmc::any(), | ||
| rmc::any(), | ||
| rmc::any(), | ||
| rmc::any(), | ||
| rmc::any(), | ||
| rmc::any(), | ||
| rmc::any(), | ||
| ]; |
| // Should return a nondet string of up to n characters | ||
| // Currently RMC does not support strings | ||
| fn rmc::nondet_string(n: u32) -> String { | ||
| fn rmc::any_string(n: u32) -> String { |
There was a problem hiding this comment.
Should this belong in the prelude?
There was a problem hiding this comment.
We don't currently support generating strings. We should consider doing something similar to what @zhassan-aws did for slices. But I think this is out of the scope of this PR.
library/rmc/src/invariant.rs
Outdated
| // Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
| // SPDX-License-Identifier: Apache-2.0 OR MIT | ||
|
|
||
| /// This module introduces the Invariant trait and as well as implementation for commonly used types. |
There was a problem hiding this comment.
Are there any standard types missing here?
There was a problem hiding this comment.
From the list of primitive types here, the types missing are:
- never
- fn
- pointer
- reference
- slice
- str
- tuple
- unit
Except for tuple, I don't see a use case for invariant. The tuple is a bit more painful to generate since rust doesn't have variadic templates.
In the next revision, I will include array which I was missing.
The standard library have a lot of other types that we are not implementing yet. Things with pointers and references are a bit trickier to implement. I think having a builder pattern for them will make more sense.
Some typos Co-authored-by: Zyad Hassan <88045115+zhassan-aws@users.noreply.github.com> Co-authored-by: Adrian Palacios <73246657+adpaco-aws@users.noreply.github.com>
| rmc::expect_fail(v2.get() != 0, "Any raw may generate invalid value."); | ||
|
|
||
| let option = Some(v2); | ||
| rmc::expect_fail(option.is_some(), "Yep. This can happen due to niche optimization"); |
Conflicts:
- src/rmc-compiler/rustc_codegen_rmc/src/codegen/assumptions.rs
camshaft
left a comment
There was a problem hiding this comment.
I really like how this turned out! nice work!
…odel-checking#657) * Introduce rmc::any and rmc::any_raw and deprecate rmc::nondet We decided to replace the old method rmc::nondet because its behavior wasn't clear. For some types it would respect the type invariants but not for others. Additionally, there was no way for the user to specify how to create safe non-deterministic values. Users will be able to control how RMC generate symbolic values to their variables. - `rmc::any::<T>()` will provide a safe way to generate values that respect any invariant specified to type T (Given by the implementation of Arbitrary trait.) - `rmc::any_raw::<T>()` will provide any possible bit value for the memory layout of the type. This is an unsafe method and users must ensure correctness after this call. Co-authored-by: Zyad Hassan <88045115+zhassan-aws@users.noreply.github.com> Co-authored-by: Adrian Palacios <73246657+adpaco-aws@users.noreply.github.com>
* Introduce rmc::any and rmc::any_raw and deprecate rmc::nondet We decided to replace the old method rmc::nondet because its behavior wasn't clear. For some types it would respect the type invariants but not for others. Additionally, there was no way for the user to specify how to create safe non-deterministic values. Users will be able to control how RMC generate symbolic values to their variables. - `rmc::any::<T>()` will provide a safe way to generate values that respect any invariant specified to type T (Given by the implementation of Arbitrary trait.) - `rmc::any_raw::<T>()` will provide any possible bit value for the memory layout of the type. This is an unsafe method and users must ensure correctness after this call. Co-authored-by: Zyad Hassan <88045115+zhassan-aws@users.noreply.github.com> Co-authored-by: Adrian Palacios <73246657+adpaco-aws@users.noreply.github.com>
Description of changes:
The return value of rmc::nondet doesn't always respect the invariant of a type which could be confusing for the user. To fix the issue, we decided to provide two functions that create symbolic variables and one new trait that allows user to specify invariant of a type. The
rmc::any::<T>()function generates a value that must satisfy the type invariant (specified by the newrmc::Invarianttrait), while the unsafe function,rmc::any_raw::<T>()one generates an unconstrained return value which maps to any possible bit value assignment.Resolved issues:
Resolves #607
Call-outs:
rmc::nondet::<T>()for now so I can finish updating the tests in a follow up PR.Testing:
How is this change tested? I modified all occurrences of the function in our tests.
Is this a refactor change? Yes
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.