Skip to content
Merged
22 changes: 22 additions & 0 deletions library/rmc/src/arbitrary.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0 OR MIT

//! This module introduces the Arbitrary trait as well as implementation for the Invariant trait.
use crate::{any_raw, assume, Invariant};

/// This trait should be used to generate symbolic variables that represent any valid value of
/// its type.
pub trait Arbitrary {
fn any() -> Self;
}

impl<T> Arbitrary for T
where
T: Invariant,
{
fn any() -> Self {
let value = unsafe { any_raw::<T>() };
assume(value.is_valid());
value
}
}
130 changes: 130 additions & 0 deletions library/rmc/src/invariant.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0 OR MIT

//! This module introduces the Invariant trait as well as implementation for commonly used types.
use std::num::*;

/// Types that implement a check to ensure its value is valid and safe to be used. See
/// https://doc.rust-lang.org/stable/nomicon/what-unsafe-does.html for examples of valid values.
///
/// Implementations of Invariant traits must ensure that the current bit values of the given type
/// is valid and that all its invariants hold.
///
/// # Safety
///
/// This trait is unsafe since &self might represent an invalid value. The `is_valid()` function
/// must return `true` if and only if the invariant of its type is held.
pub unsafe trait Invariant {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this trait itself need to be unsafe? I agree that it looks like it should be, but is there a way to do this without that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a suggestion from @camshaft which I thought made a lot of sense. So I would prefer keeping it.

Is there going to be a marker trait for the safe function? If so that should be unsafe to implement (similar to Send and Sync).

Copy link
Copy Markdown
Contributor

@camshaft camshaft Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally thinking it would be something like:

/// Types that implement this can be generated without any constraints or invariants
pub unsafe trait NondetTransmute { }

pub fn any<T: NondetTransmute>() -> T {
    unsafe { any_raw() }
}

You would then implement that for the primitives.

I'm trying to decide if the unsafe makes sense in its current form, though... My concern is the &self argument is completely unvalidated and without knowing the RMC compiler internals, may result in UB when using it.

Is there a reason the trait method doesn't return an owned value instead? It would make the trait safe to implement and each implementation may or may not need to use unsafe to generate a value:

pub trait Arbitrary {
    fn any() -> Self;
}

impl Arbitrary for u8 {
    fn any() -> Self {
        unsafe { any_raw() }
    }
}

impl Arbitrary for char {
    fn any() -> Self {
        let v = unsafe { any_raw() };
        let v = char::from_u32(v);
        assume(v.is_ok());
        v.unwrap()
    }
}

struct MyThing {
    field: u8
}

impl Arbitrary for MyThing {
    fn any() - Self {
        Self { field: u8::any() }
    }
}

Copy link
Copy Markdown
Contributor

@camshaft camshaft Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another concern is: in its current form, it's very easy to miss validating a field when implementing the trait:

struct MyThing {
    field: char,
    other: char,
}

impl Invariant for MyThing {
    fn is_valid(&self) -> bool {
        self.field.is_valid() // missing `other`!
    }
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. This is another possibility which uses a builder pattern. I thought the Invariant approach was a bit cleaner and easier to use. It also adds a method that can be used in assert statements which I thought was nice. I do agree that it has the downside of having a potential invalid Self and I think the unsafe impl is proper in this case.

I think they are somewhat equivalent though. For an arbitrary enum, it would go from:

unsafe impl Invariant for MyEnum {
    #[inline(always)]
    fn is_valid(&self) -> bool {
        matches!(*self, MyEnum::V1 | MyEnum::V2 | MyEnum::V3)
    }
}

to

impl Arbitrary for MyEnum {
    #[inline(always)]
    fn any() -> Self {
        match u8::any() {
            0 => MyEnum::V1,
            1 => MyEnum::V2,
            _ => MyEnum::V3,
        }
    }
}

Maybe we should check which one performs better or even offer both.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the concern, and I debated between the two options for a long time. Maybe we can have a quick sync tomorrow to discuss the options.

Another possibility is to add both methods to the type with default implementations. Something like:

pub unsafe trait Arbitrary {
    fn is_valid(&self) -> bool { true }
    fn any() -> Self {
        let s = unsafe { rmc::any_raw::<Self>() };
        rmc::assume(s.is_valid());
        s
    } 
}

pub fn any<T: Arbritary>() -> T {
    T::any()
}

Users can pick and choose which one to implement / use.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm good to sync up whenever works for you!

/// Check if `&self` holds a valid value that respect the type invariant.
/// This function must return `true` if and only if `&self` is valid.
fn is_valid(&self) -> bool;
}

macro_rules! empty_invariant {
( $type: ty ) => {
unsafe impl Invariant for $type {
#[inline(always)]
fn is_valid(&self) -> bool {
true
}
}
};
}

empty_invariant!(u8);
empty_invariant!(u16);
empty_invariant!(u32);
empty_invariant!(u64);
empty_invariant!(u128);
empty_invariant!(usize);

empty_invariant!(i8);
empty_invariant!(i16);
empty_invariant!(i32);
empty_invariant!(i64);
empty_invariant!(i128);
empty_invariant!(isize);

// We do not constraint floating points values per type spec. Users must add assumptions to their
// verification code if they want to eliminate NaN, infinite, or subnormal.
empty_invariant!(f32);
empty_invariant!(f64);

empty_invariant!(());

unsafe impl Invariant for bool {
#[inline(always)]
fn is_valid(&self) -> bool {
let byte = u8::from(*self);
byte == 0 || byte == 1
}
}

/// Validate that a char is not outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
/// Ref: https://doc.rust-lang.org/stable/nomicon/what-unsafe-does.html
unsafe impl Invariant for char {
#[inline(always)]
fn is_valid(&self) -> bool {
// RMC translates char into i32.
let val = *self as i32;
val <= 0xD7FF || (val >= 0xE000 && val <= 0x10FFFF)
}
}

unsafe impl<T: Invariant, const N: usize> Invariant for [T; N] {
fn is_valid(&self) -> bool {
self.iter().all(|e| e.is_valid())
}
}

unsafe impl<T> Invariant for Option<T>
where
T: Invariant,
{
#[inline(always)]
fn is_valid(&self) -> bool {
if let Some(v) = self { v.is_valid() } else { matches!(*self, None) }
}
}

unsafe impl<T, E> Invariant for Result<T, E>
where
T: Invariant,
E: Invariant,
{
#[inline(always)]
fn is_valid(&self) -> bool {
if let Ok(v) = self {
v.is_valid()
} else if let Err(e) = self {
e.is_valid()
} else {
false
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assumptions module handled a lot more types than the ones implemented here, e.g. enums. Do we lose assumptions for those other types?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we do. For enums and structs, @danielsn and I were thinking about adding a custom derive: #726

For other cases, I would prefer being safe. I don't think we should infer anything based on the type of fields without any sort of user intervention.

Copy link
Copy Markdown
Contributor

@zhassan-aws zhassan-aws Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My two cents: it would be inconvenient if the user needs to annotate all the enums in their code with #[derive(Invariant)]. Automatically handling that during codegen is more user friendly (assuming this was done in the assumptions module).


macro_rules! nonzero_invariant {
( $type: ty ) => {
unsafe impl Invariant for $type {
#[inline(always)]
fn is_valid(&self) -> bool {
self.get() != 0
}
}
};
}

nonzero_invariant!(NonZeroU8);
nonzero_invariant!(NonZeroU16);
nonzero_invariant!(NonZeroU32);
nonzero_invariant!(NonZeroU64);
nonzero_invariant!(NonZeroU128);
nonzero_invariant!(NonZeroUsize);

nonzero_invariant!(NonZeroI8);
nonzero_invariant!(NonZeroI16);
nonzero_invariant!(NonZeroI32);
nonzero_invariant!(NonZeroI64);
nonzero_invariant!(NonZeroI128);
nonzero_invariant!(NonZeroIsize);
60 changes: 52 additions & 8 deletions library/rmc/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,13 @@
// SPDX-License-Identifier: Apache-2.0 OR MIT
#![feature(rustc_attrs)] // Used for rustc_diagnostic_item.

pub mod arbitrary;
pub mod invariant;
pub mod slice;

pub use arbitrary::Arbitrary;
pub use invariant::Invariant;

/// Creates an assumption that will be valid after this statement run. Note that the assumption
/// will only be applied for paths that follow the assumption. If the assumption doesn't hold, the
/// program will exit successfully.
Expand All @@ -13,7 +18,7 @@ pub mod slice;
/// The code snippet below should never panic.
///
/// ```rust
/// let i : i32 = rmc::nondet();
/// let i : i32 = rmc::any();
/// rmc::assume(i > 10);
/// if i < 0 {
/// panic!("This will never panic");
Expand All @@ -23,30 +28,69 @@ pub mod slice;
/// The following code may panic though:
///
/// ```rust
/// let i : i32 = rmc::nondet();
/// let i : i32 = rmc::any();
/// assert!(i < 0, "This may panic and verification should fail.");
/// rmc::assume(i > 10);
/// ```
#[inline(never)]
#[rustc_diagnostic_item = "RmcAssume"]
pub fn assume(_cond: bool) {}

/// This creates an unconstrained value of type `T`. You can assign the return value of this
/// This creates an symbolic *valid* value of type `T`. You can assign the return value of this
/// function to a variable that you want to make symbolic.
///
/// # Example:
///
/// In the snippet below, we are verifying the behavior of the function `fn_under_verification`
/// under all possible i32 input values.
/// under all possible `NonZeroU8` input values, i.e., all possible `u8` values except zero.
///
/// ```rust
/// let inputA = rmc::nondet::<i32>();
/// let inputA = rmc::any::<std::num::NonZeroU8>();
/// fn_under_verification(inputA);
/// ```
///
/// Note: This is a safe construct and can only be used with types that implement the `Arbitrary`
/// trait. The Arbitrary trait is used to build a symbolic value that represents all possible
/// valid values for type `T`.
#[inline(always)]
pub fn any<T: Arbitrary>() -> T {
T::any()
}

/// This function creates an unconstrained value of type `T`. This may result in an invalid value.
///
/// # Example:
///
/// In the snippet below, we are verifying the behavior of the function `fn_under_verification`
/// under all possible values of char, including invalid ones that are greater than char::MAX.
///
/// ```rust
/// let inputA = unsafe { rmc::any_raw::<char>() };
/// fn_under_verification(inputA);
/// ```
///
/// # Safety
///
/// This function is unsafe and it may represent invalid `T` values which can lead to many
/// undesirable undefined behaviors. Users must validate that the symbolic variable respects
/// the type invariant as well as any other constraint relevant to their usage. E.g.:
///
/// ```rust
/// let c = unsafe { rmc::any_raw::char() };
/// rmc::assume(char::from_u32(c as u32).is_ok());
/// ```
///
#[rustc_diagnostic_item = "RmcAnyRaw"]
#[inline(never)]
pub unsafe fn any_raw<T>() -> T {
unimplemented!("RMC any_raw")
}

/// This function has been split into a safe and unsafe functions: `rmc::any` and `rmc::any_raw`.
#[deprecated]
#[inline(never)]
#[rustc_diagnostic_item = "RmcNonDet"]
pub fn nondet<T>() -> T {
unimplemented!("RMC nondet")
pub fn nondet<T: Arbitrary>() -> T {
any::<T>()
}

/// Function used in tests for cases where the condition is not always true.
Expand Down
28 changes: 14 additions & 14 deletions library/rmc/src/slice.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0 OR MIT
use crate::{assume, nondet};
use crate::{any, any_raw, assume};
use core::ops::{Deref, DerefMut};

/// Given an array `arr` of length `LENGTH`, this function returns a **valid**
Expand All @@ -12,23 +12,23 @@ use core::ops::{Deref, DerefMut};
///
/// ```rust
/// let arr = [1, 2, 3];
/// let slice = rmc::slice::nondet_slice_of_array(&arr);
/// let slice = rmc::slice::any_slice_of_array(&arr);
/// foo(slice); // where foo is a function that takes a slice and verifies a property about it
/// ```
pub fn nondet_slice_of_array<T, const LENGTH: usize>(arr: &[T; LENGTH]) -> &[T] {
let (from, to) = nondet_range::<LENGTH>();
pub fn any_slice_of_array<T, const LENGTH: usize>(arr: &[T; LENGTH]) -> &[T] {
let (from, to) = any_range::<LENGTH>();
&arr[from..to]
}

/// A mutable version of the previous function
pub fn nondet_slice_of_array_mut<T, const LENGTH: usize>(arr: &mut [T; LENGTH]) -> &mut [T] {
let (from, to) = nondet_range::<LENGTH>();
pub fn any_slice_of_array_mut<T, const LENGTH: usize>(arr: &mut [T; LENGTH]) -> &mut [T] {
let (from, to) = any_range::<LENGTH>();
&mut arr[from..to]
}

fn nondet_range<const LENGTH: usize>() -> (usize, usize) {
let from: usize = nondet();
let to: usize = nondet();
fn any_range<const LENGTH: usize>() -> (usize, usize) {
let from: usize = any();
let to: usize = any();
assume(to <= LENGTH);
assume(from <= to);
(from, to)
Expand All @@ -38,12 +38,12 @@ fn nondet_range<const LENGTH: usize>() -> (usize, usize) {
/// between `0..=MAX_SLICE_LENGTH` and with non-deterministic content. This is
/// useful in situations where one wants to verify that all slices with any
/// content and with a length up to `MAX_SLICE_LENGTH` satisfy a certain
/// property. Use `nondet_slice` to create an instance of this struct.
/// property. Use `any_slice` to create an instance of this struct.
///
/// # Example:
///
/// ```rust
/// let slice: rmc::slice::NonDetSlice<u8, 5> = rmc::slice::nondet_slice();
/// let slice: rmc::slice::NonDetSlice<u8, 5> = rmc::slice::any_slice();
/// foo(&slice); // where foo is a function that takes a slice and verifies a property about it
/// ```
pub struct NonDetSlice<T, const MAX_SLICE_LENGTH: usize> {
Expand All @@ -53,8 +53,8 @@ pub struct NonDetSlice<T, const MAX_SLICE_LENGTH: usize> {

impl<T, const MAX_SLICE_LENGTH: usize> NonDetSlice<T, MAX_SLICE_LENGTH> {
fn new() -> Self {
let arr: [T; MAX_SLICE_LENGTH] = nondet();
let slice_len: usize = nondet();
let arr: [T; MAX_SLICE_LENGTH] = unsafe { any_raw() };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to implement any for arrays, e.g.:

unsafe impl<T, const LEN: usize> Invariant for [T; LEN]
where
    T: Invariant,
{
    #[inline(always)]
    fn is_valid(&self) -> bool {
        let valid = true;
        for i in *self {
            valid &= i.is_valid();
        }
        valid
    }
}

?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I am adding a variant of that. Thanks!

let slice_len: usize = any();
assume(slice_len <= MAX_SLICE_LENGTH);
Self { arr, slice_len }
}
Expand Down Expand Up @@ -82,6 +82,6 @@ impl<T, const MAX_SLICE_LENGTH: usize> DerefMut for NonDetSlice<T, MAX_SLICE_LEN
}
}

pub fn nondet_slice<T, const MAX_SLICE_LENGTH: usize>() -> NonDetSlice<T, MAX_SLICE_LENGTH> {
pub fn any_slice<T, const MAX_SLICE_LENGTH: usize>() -> NonDetSlice<T, MAX_SLICE_LENGTH> {
NonDetSlice::<T, MAX_SLICE_LENGTH>::new()
}
2 changes: 1 addition & 1 deletion rmc-docs/src/tutorial-first-steps.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ The second command opens that report in your default browser (on mac, on linux d
From this report, we can find the trace of the failure and filter through it to find the relevant line (at present time, an unfortunate amount of generated code is present in the trace):

```
let x: u32 = rmc::nondet();
let x: u32 = rmc::any();
x = 1023u
```

Expand Down
12 changes: 6 additions & 6 deletions rmc-docs/src/tutorial-kinds-of-failure.md
Original file line number Diff line number Diff line change
Expand Up @@ -101,26 +101,26 @@ Having run `rmc --visualize` and clicked on one of the failures to see a trace,

To navigate this trace to find the information you need, we recommend searching for things you expect to be somewhere in the trace:

1. Search the document for `rmc::nondet` or `variable_of_interest =` such as `size =`.
1. Search the document for `rmc::any` or `variable_of_interest =` such as `size =`.
We can use this to find out what example values lead to a problem.
In this case, where we just have a couple of `rmc::nondet` values in our proof harness, we can learn a lot just by seeing what these are.
In this case, where we just have a couple of `rmc::any` values in our proof harness, we can learn a lot just by seeing what these are.
In this trace we find (and the values you get may be different):

```
Step 23: Function main, File tests/bounds-check.rs, Line 43
let size: usize = rmc::nondet();
let size: usize = rmc::any();
size = 0ul

Step 27: Function main, File tests/bounds-check.rs, Line 45
let index: usize = rmc::nondet();
let index: usize = rmc::any();
index = 0ul

Step 36: Function main, File tests/bounds-check.rs, Line 43
let size: usize = rmc::nondet();
let size: usize = rmc::any();
size = 2464ul

Step 39: Function main, File tests/bounds-check.rs, Line 45
let index: usize = rmc::nondet();
let index: usize = rmc::any();
index = 2463ul
```

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,9 +34,9 @@ mod tests {
#[cfg(rmc)]
#[no_mangle]
fn main() {
let size: usize = rmc::nondet();
let size: usize = rmc::any();
rmc::assume(size < 4096);
let index: usize = rmc::nondet();
let index: usize = rmc::any();
let array: Vec<u32> = vec![0; size];
get_wrapped(index, &array);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@ fn find_midpoint(low: u32, high: u32) -> u32 {
#[cfg(rmc)]
#[no_mangle]
fn main() {
let a: u32 = rmc::nondet();
let b: u32 = rmc::nondet();
let a: u32 = rmc::any();
let b: u32 = rmc::any();
find_midpoint(a, b);
}
// ANCHOR_END: rmc
Loading