Split rmc::nondet into safe and unsafe functions(#607) by celinval · Pull Request #657 · model-checking/kani

celinval · 2021-11-19T05:59:47Z

Description of changes:

The return value of rmc::nondet doesn't always respect the invariant of a type which could be confusing for the user. To fix the issue, we decided to provide two functions that create symbolic variables and one new trait that allows user to specify invariant of a type. The rmc::any::<T>() function generates a value that must satisfy the type invariant (specified by the new rmc::Invariant trait), while the unsafe function, rmc::any_raw::<T>() one generates an unconstrained return value which maps to any possible bit value assignment.

Resolved issues:

Resolves #607

Call-outs:

I kept the rmc::nondet::<T>() for now so I can finish updating the tests in a follow up PR.
I am still adding tests to the implementations of the Invariant. I'll update the PR once that it's done.

Testing:

How is this change tested? I modified all occurrences of the function in our tests.
Is this a refactor change? Yes

Checklist

Each commit message has a non-empty body, explaining why the change was made
Methods or procedures are documented
Regression or unit tests are included, or existing tests cover the modified code
My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

camshaft

Nice! I think this is the correct approach.

celinval · 2021-11-23T22:42:01Z

A few folks have brought up some concerns with the message unsafe {} statement actually provides. Since for some cases RMC does provide assurance that the value is valid.

Instead we decided to create two different functions, one that is safe and it should only represent valid values and another one that we will mark with unsafe that represents all possible bit value for the memory occupied by the object.

camshaft · 2021-11-23T23:32:46Z

Instead we decided to create two different functions, one that is safe and it should only represent valid values and another one that we will mark with unsafe that represents all possible bit value for the memory occupied by the object.

Is there going to be a marker trait for the safe function? If so that should be unsafe to implement (similar to Send and Sync).

The return value of rmc::nondet doesn't respect the invariant of a type. Until we have a better way to create symbolic variables, users must enforce the type invariant in their proof harness before using the value returned.

We decided to replace the old method rmc::nondet because its behavior wasn't clear. For some types it would respect the type invariants but not for others. Additionally, there was no way for the user to specify how to create safe non-deterministic values.

Users will be able to control how RMC generate symbolic values to their variables. - `rmc::any::<T>()` will provide a safe way to generate values that respect any invariant specified to type T (Given by the implementation of Invariant type.) - `rmc::any_raw::<T>()` will provide any bitvalue possible for the memory layout of the type. This is an unsafe method and users must ensure correctness after this call. Note: I still need to add tests to the implementations of Invariant we provide.

src/test/rmc/BinOp_Offset/main_fail.rs

adpaco-aws · 2022-01-04T20:44:08Z

src/test/expected/one-assert/expected

@@ -1 +1 @@
-** 0 of 1 failed (1 iterations)
+** 0 of 2 failed (1 iterations)


What is going on here? This was supposed to have a single assertion 😆

I know... we just can't keep this going, eh? We really need that report filter. This is due to a compiler sanity check. I could just use any_raw here to keep the 1 assertion.

Makes sense, no need to change for any_raw in my opinion.

My bad :( The purpose of this test was to make sure we don't get all the noisy successes for builtin functions, e.g.:

<builtin-library-calloc> function calloc [calloc.assertion.1] assertion false: SUCCESS <builtin-library-ceil> function ceil [ceil.assertion.1] assertion false: SUCCESS <builtin-library-ceilf> function ceilf [ceilf.assertion.1] assertion false: SUCCESS <builtin-library-copysign> function copysign [copysign.assertion.1] assertion false: SUCCESS ...

which we got rid of with --drop-unused-functions. So to make the test more robust, we need to check if the output does not include those lines. Is there a way to do that with our testing infrastructure?

At present, there is no way to do that. Not that it would be hard to, but I would wait until we get structured output to implement this kind of checks. The test is still ensuring there are only a couple of successful assertions so I would leave as is.

library/rmc/src/invariant.rs

camshaft · 2022-01-04T20:57:19Z

library/rmc/src/invariant.rs

+{
+    #[inline(always)]
+    fn is_valid(&self) -> bool {
+        match self {


Is it possible that the generated discriminant/tag is invalid and the match would be undefined behavior? Or does the RMC backend always generate valid enum tags?

I have updated the implementation to avoid the issue I believe you are talking about.

zhassan-aws

Looks great! What is unclear to me is my comment on the removal of the assumptions module.

library/rmc/src/invariant.rs

zhassan-aws · 2022-01-04T19:28:22Z

library/rmc/src/invariant.rs

+unsafe impl Invariant for u8 {
+    #[inline(always)]
+    fn is_valid(&self) -> bool {
+        true


Should we make this the default implementation in the trait definition to reduce boiler-plate code, or is that risky? If we do that, many of impl's would just be:

unsafe impl Invariant for u8 { } unsafe impl Invariant for u16 { } ...

I'll generate these with a macro. I don't know if we should provide a default implementation that returns true. Is the macro option OK?

A macro is fine.

library/rmc/src/lib.rs

zhassan-aws · 2022-01-04T20:31:38Z

library/rmc/src/slice.rs

    fn new() -> Self {
-        let arr: [T; MAX_SLICE_LENGTH] = nondet();
-        let slice_len: usize = nondet();
+        let arr: [T; MAX_SLICE_LENGTH] = unsafe { any_raw() };


I think we should be able to implement any for arrays, e.g.:

unsafe impl<T, const LEN: usize> Invariant for [T; LEN] where T: Invariant, { #[inline(always)] fn is_valid(&self) -> bool { let valid = true; for i in *self { valid &= i.is_valid(); } valid } }

?

Yes. I am adding a variant of that. Thanks!

zhassan-aws · 2022-01-04T21:25:46Z

library/rmc/src/invariant.rs

+            false
+        }
+    }
+}


The assumptions module handled a lot more types than the ones implemented here, e.g. enums. Do we lose assumptions for those other types?

Yes, we do. For enums and structs, @danielsn and I were thinking about adding a custom derive: #726

For other cases, I would prefer being safe. I don't think we should infer anything based on the type of fields without any sort of user intervention.

My two cents: it would be inconvenient if the user needs to annotate all the enums in their code with #[derive(Invariant)]. Automatically handling that during codegen is more user friendly (assuming this was done in the assumptions module).

zhassan-aws · 2022-01-04T21:31:11Z

src/test/firecracker/virtio-block-parse/main.rs

            }
        }
-        rmc::nondet()
+        if rmc::any() {


Is this change maintaining the old behavior?

From what I understood... any insight?

If we want to replace this with just:

rmc::any()

we would have to add an Invariant trait bound to T, is that correct? Not sure which option is better (assuming what I mentioned is correct).

I see what you mean. This does remove some constraints from the proof. That said, the proof still succeeds, so I don't think there is any harm done.

I'll add the Invariant and use rmc::any since that is the prefeed method.

src/test/rmc/DynTrait/main.rs

src/test/rmc/Pointers_OtherTypes/main.rs

src/test/rmc/TypeInvariant/char.rs_ignore

danielsn · 2022-01-04T19:45:18Z

library/rmc/src/invariant.rs

+///
+/// Implementations of Invariant traits must ensure that the current bit values of the given type
+/// is valid and that all its invariants hold.
+pub unsafe trait Invariant {


Does this trait itself need to be unsafe? I agree that it looks like it should be, but is there a way to do this without that?

That was a suggestion from @camshaft which I thought made a lot of sense. So I would prefer keeping it.

Is there going to be a marker trait for the safe function? If so that should be unsafe to implement (similar to Send and Sync).

I was originally thinking it would be something like:

/// Types that implement this can be generated without any constraints or invariants pub unsafe trait NondetTransmute { } pub fn any<T: NondetTransmute>() -> T { unsafe { any_raw() } }

You would then implement that for the primitives.

I'm trying to decide if the unsafe makes sense in its current form, though... My concern is the &self argument is completely unvalidated and without knowing the RMC compiler internals, may result in UB when using it.

Is there a reason the trait method doesn't return an owned value instead? It would make the trait safe to implement and each implementation may or may not need to use unsafe to generate a value:

pub trait Arbitrary { fn any() -> Self; } impl Arbitrary for u8 { fn any() -> Self { unsafe { any_raw() } } } impl Arbitrary for char { fn any() -> Self { let v = unsafe { any_raw() }; let v = char::from_u32(v); assume(v.is_ok()); v.unwrap() } } struct MyThing { field: u8 } impl Arbitrary for MyThing { fn any() - Self { Self { field: u8::any() } } }

Another concern is: in its current form, it's very easy to miss validating a field when implementing the trait:

struct MyThing { field: char, other: char, } impl Invariant for MyThing { fn is_valid(&self) -> bool { self.field.is_valid() // missing `other`! } }

Thanks for clarifying. This is another possibility which uses a builder pattern. I thought the Invariant approach was a bit cleaner and easier to use. It also adds a method that can be used in assert statements which I thought was nice. I do agree that it has the downside of having a potential invalid Self and I think the unsafe impl is proper in this case.

I think they are somewhat equivalent though. For an arbitrary enum, it would go from:

unsafe impl Invariant for MyEnum { #[inline(always)] fn is_valid(&self) -> bool { matches!(*self, MyEnum::V1 | MyEnum::V2 | MyEnum::V3) } }

to

impl Arbitrary for MyEnum { #[inline(always)] fn any() -> Self { match u8::any() { 0 => MyEnum::V1, 1 => MyEnum::V2, _ => MyEnum::V3, } } }

Maybe we should check which one performs better or even offer both.

I understand the concern, and I debated between the two options for a long time. Maybe we can have a quick sync tomorrow to discuss the options.

Another possibility is to add both methods to the type with default implementations. Something like:

pub unsafe trait Arbitrary { fn is_valid(&self) -> bool { true } fn any() -> Self { let s = unsafe { rmc::any_raw::<Self>() }; rmc::assume(s.is_valid()); s } } pub fn any<T: Arbritary>() -> T { T::any() }

Users can pick and choose which one to implement / use.

Yeah I'm good to sync up whenever works for you!

library/rmc/src/invariant.rs

danielsn · 2022-01-04T19:51:07Z

src/test/stub-tests/HashSet/ignore-nondet.rs

    // TODO: This test should ideally work with nondeterminstic values but for
    // for the moment it does not.


Is this comment still true?

I don't know.

danielsn · 2022-01-04T23:00:27Z

src/test/rmc/BinOp_Offset/main_fail.rs

-    let table: [[u64; 1]; 1] = [[rmc::nondet::<u64>()]];
-    table[0][rmc::nondet::<usize>()]; // EXPECTED FAIL
+    let table: [[u64; 1]; 1] = [[rmc::any::<u64>()]];
+    table[0][rmc::any::<usize>()]; // EXPCECTED FAIL


Is the turbofish needed here?

danielsn · 2022-01-04T23:01:00Z

src/test/rmc/Cast/from_be_bytes.rs

    let input: &[u8] = &vec![
-        rmc::nondet(),
-        rmc::nondet(),
-        rmc::nondet(),
-        rmc::nondet(),
-        rmc::nondet(),
-        rmc::nondet(),
-        rmc::nondet(),
-        rmc::nondet(),
+        rmc::any(),
+        rmc::any(),
+        rmc::any(),
+        rmc::any(),
+        rmc::any(),
+        rmc::any(),
+        rmc::any(),
+        rmc::any(),
    ];


Could we use any_slice

src/test/rmc/Enum/result3.rs

danielsn · 2022-01-04T23:02:17Z

src/test/firecracker/micro-http-parsed-request/ignore-main.rs

 // Should return a nondet string of up to n characters
 // Currently RMC does not support strings
-fn rmc::nondet_string(n: u32) -> String {
+fn rmc::any_string(n: u32) -> String {


Should this belong in the prelude?

We don't currently support generating strings. We should consider doing something similar to what @zhassan-aws did for slices. But I think this is out of the scope of this PR.

danielsn · 2022-01-04T23:02:57Z

library/rmc/src/invariant.rs

+// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+// SPDX-License-Identifier: Apache-2.0 OR MIT
+
+/// This module introduces the Invariant trait and as well as implementation for commonly used types.


Are there any standard types missing here?

From the list of primitive types here, the types missing are:

never

fn

pointer

reference

slice

str

tuple

unit

Except for tuple, I don't see a use case for invariant. The tuple is a bit more painful to generate since rust doesn't have variadic templates.

In the next revision, I will include array which I was missing.

The standard library have a lot of other types that we are not implementing yet. Things with pointers and references are a bit trickier to implement. I think having a builder pattern for them will make more sense.

Some typos Co-authored-by: Zyad Hassan <88045115+zhassan-aws@users.noreply.github.com> Co-authored-by: Adrian Palacios <73246657+adpaco-aws@users.noreply.github.com>

zhassan-aws · 2022-01-05T22:45:40Z

src/test/rmc/Invariants/nonzero.rs

+        rmc::expect_fail(v2.get() != 0, "Any raw may generate invalid value.");
+
+        let option = Some(v2);
+        rmc::expect_fail(option.is_some(), "Yep. This can happen due to niche optimization");


This is surprising!

celinval · 2022-01-07T07:02:01Z

I created the following issues to capture the mitigation tasks for the drawback of the current approach:
#726: Create custom derive attribute.
#735: Validate all fields and enum variants are checked in the Invariant implementation.

Conflicts: - src/rmc-compiler/rustc_codegen_rmc/src/codegen/assumptions.rs

celinval · 2022-01-07T22:41:43Z

I added the Arbitrary trait with an implementation for the Invariant trait per @camshaft and @danielsn suggestion. Thanks!

camshaft

I really like how this turned out! nice work!

zhassan-aws

Looks great!

…odel-checking#657) * Introduce rmc::any and rmc::any_raw and deprecate rmc::nondet We decided to replace the old method rmc::nondet because its behavior wasn't clear. For some types it would respect the type invariants but not for others. Additionally, there was no way for the user to specify how to create safe non-deterministic values. Users will be able to control how RMC generate symbolic values to their variables. - `rmc::any::<T>()` will provide a safe way to generate values that respect any invariant specified to type T (Given by the implementation of Arbitrary trait.) - `rmc::any_raw::<T>()` will provide any possible bit value for the memory layout of the type. This is an unsafe method and users must ensure correctness after this call. Co-authored-by: Zyad Hassan <88045115+zhassan-aws@users.noreply.github.com> Co-authored-by: Adrian Palacios <73246657+adpaco-aws@users.noreply.github.com>

* Introduce rmc::any and rmc::any_raw and deprecate rmc::nondet We decided to replace the old method rmc::nondet because its behavior wasn't clear. For some types it would respect the type invariants but not for others. Additionally, there was no way for the user to specify how to create safe non-deterministic values. Users will be able to control how RMC generate symbolic values to their variables. - `rmc::any::<T>()` will provide a safe way to generate values that respect any invariant specified to type T (Given by the implementation of Arbitrary trait.) - `rmc::any_raw::<T>()` will provide any possible bit value for the memory layout of the type. This is an unsafe method and users must ensure correctness after this call. Co-authored-by: Zyad Hassan <88045115+zhassan-aws@users.noreply.github.com> Co-authored-by: Adrian Palacios <73246657+adpaco-aws@users.noreply.github.com>

celinval requested a review from a team as a code owner November 19, 2021 05:59

celinval mentioned this pull request Nov 19, 2021

Create an RMC crate and load it as part of rmc-rustc #597

Merged

4 tasks

camshaft approved these changes Nov 19, 2021

View reviewed changes

celinval added 2 commits January 3, 2022 15:55

Mark rmc::nondet as unsafe (model-checking#607)

36e4f39

The return value of rmc::nondet doesn't respect the invariant of a type. Until we have a better way to create symbolic variables, users must enforce the type invariant in their proof harness before using the value returned.

celinval force-pushed the issue-607 branch 2 times, most recently from c9f3360 to 620c80e Compare January 4, 2022 03:26

celinval changed the title ~~Mark rmc::nondet as unsafe (#607)~~ Split rmc::nondet into two functions, one that respect invariants and one that doesn't (#607) Jan 4, 2022

celinval changed the title ~~Split rmc::nondet into two functions, one that respect invariants and one that doesn't (#607)~~ Split rmc::nondet into two functions, one that respect type invariant and one that doesn't (#607) Jan 4, 2022

celinval force-pushed the issue-607 branch from 620c80e to 16e7ba0 Compare January 4, 2022 03:57

celinval changed the title ~~Split rmc::nondet into two functions, one that respect type invariant and one that doesn't (#607)~~ Split rmc::nondet into safe and unsafe functions(#607) Jan 4, 2022

celinval requested a review from zhassan-aws January 4, 2022 18:19

adpaco-aws reviewed Jan 4, 2022

View reviewed changes

camshaft reviewed Jan 4, 2022

View reviewed changes

Added tests and fixed a few things in the Invariant impl

460dd98

zhassan-aws reviewed Jan 4, 2022

View reviewed changes

danielsn reviewed Jan 4, 2022

View reviewed changes

celinval and others added 3 commits January 4, 2022 17:31

Apply suggestions from code review

ca85f7e

Some typos Co-authored-by: Zyad Hassan <88045115+zhassan-aws@users.noreply.github.com> Co-authored-by: Adrian Palacios <73246657+adpaco-aws@users.noreply.github.com>

Address a few comments

9358103

Add support to NonZero* types

2d9cc0a

zhassan-aws reviewed Jan 5, 2022

View reviewed changes

celinval added 2 commits January 6, 2022 22:00

Update tests per PR comments.

2277f34

Merge branch 'main' into issue-607

3f6eadd

celinval added 2 commits January 7, 2022 14:31

Add Arbitrary trait

502c84e

Merge remote-tracking branch 'origin/main' into issue-607

0f17d25

Conflicts: - src/rmc-compiler/rustc_codegen_rmc/src/codegen/assumptions.rs

camshaft approved these changes Jan 8, 2022

View reviewed changes

zhassan-aws approved these changes Jan 8, 2022

View reviewed changes

celinval merged commit 176a9c6 into model-checking:main Jan 10, 2022

		@@ -1 +1 @@
		** 0 of 1 failed (1 iterations)
		** 0 of 2 failed (1 iterations)

		// TODO: This test should ideally work with nondeterminstic values but for
		// for the moment it does not.

Conversation

celinval commented Nov 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description of changes:

Resolved issues:

Call-outs:

Testing:

Checklist

Uh oh!

camshaft left a comment

Choose a reason for hiding this comment

Uh oh!

celinval commented Nov 23, 2021

Uh oh!

camshaft commented Nov 23, 2021

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zhassan-aws left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zhassan-aws Jan 5, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

camshaft Jan 5, 2022 • edited by zhassan-aws Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

camshaft Jan 5, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

celinval commented Nov 19, 2021 •

edited

Loading

zhassan-aws Jan 5, 2022 •

edited

Loading

camshaft Jan 5, 2022 •

edited by zhassan-aws

Loading

camshaft Jan 5, 2022 •

edited

Loading