Symex: resolve pointer comparisons using the value-set#4444
Conversation
src/goto-symex/goto_symex.h
Outdated
| statet &, | ||
| bool keep_array); | ||
|
|
||
| exprt try_evaluate_pointer_comparisons( |
There was a problem hiding this comment.
Any suggestions for a better name?
src/goto-symex/goto_symex.h
Outdated
| const exprt &new_guard, | ||
| const namespacet &ns); | ||
|
|
||
| optionalt<exprt> build_reference_to_value_set_element( |
There was a problem hiding this comment.
This function might fit better somewhere else - perhaps in value_set_dereferencet, since it's mainly a wrapper around value_set_dereferencet::build_reference_to?
There was a problem hiding this comment.
Proooobably. No strong opinion.
src/goto-symex/symex_goto.cpp
Outdated
| op = try_evaluate_pointer_comparisons(std::move(op), value_set); | ||
| } | ||
| } | ||
| else if(condition.id() == ID_not) |
There was a problem hiding this comment.
anything else I should be recursing over, other than AND, OR and NOT?
There was a problem hiding this comment.
Oh wait, how about the arguments of a boolean-typed if_exprt
There was a problem hiding this comment.
(e.g. x == y ? z == w : a == b)
There was a problem hiding this comment.
Or indeed the first arg of an if_exprt in any context. Total recursion looking for == and != nodes might be the simplest thing.
There was a problem hiding this comment.
I 100% think these comments through before beginning to write FYI
src/goto-symex/symex_goto.cpp
Outdated
| { | ||
| symbol_exprt *symbol_expr = | ||
| expr_try_dynamic_cast<symbol_exprt>(condition.op0()); | ||
| exprt &constant = condition.op1(); |
There was a problem hiding this comment.
Currently it assumes the symbol is on the lhs and the constant is on the rhs. I will fix that up.
src/goto-symex/symex_main.cpp
Outdated
| return return_value; | ||
| } | ||
|
|
||
| optionalt<exprt> goto_symext::build_reference_to_value_set_element( |
There was a problem hiding this comment.
This code was in goto_symext::try_filter_value_sets and I've pulled it out into a function to reuse it. Perhaps I'll make that a separate commit.
smowton
left a comment
There was a problem hiding this comment.
Looks good except I believe the null case is broken, and at least needs tests.
| possibly_reachable_12 = 7; | ||
|
|
||
| int unconditionally_reachable_13; | ||
| if((ptr_to_a == &a && c == 3) || c == 0) |
There was a problem hiding this comment.
Bear in mind these constant comparisons with c may well be resolved very early as c is only assigned once and never address-taken.
There was a problem hiding this comment.
I've changed them to other pointer comparisons. And I now take the address of all the integers at the beginning, if that's a thing that symex cares about.
src/goto-symex/symex_goto.cpp
Outdated
| expr_try_dynamic_cast<symbol_exprt>(condition.op0()); | ||
| exprt &constant = condition.op1(); | ||
|
|
||
| if(symbol_expr && goto_symex_is_constantt()(constant)) |
There was a problem hiding this comment.
Handle the reverse case too (constant OP symbol)
src/goto-symex/symex_goto.cpp
Outdated
| if(symbol_expr && goto_symex_is_constantt()(constant)) | ||
| { | ||
| value_setst::valuest value_set_elements; | ||
| value_set.get_value_set( |
There was a problem hiding this comment.
Assert that symbol_expr is already L1 at least
src/goto-symex/symex_goto.cpp
Outdated
| to_ssa_expr(*symbol_expr).get_l1_object(), value_set_elements, ns); | ||
|
|
||
| INVARIANT( | ||
| constant.id() == ID_address_of || constant.id() == ID_NULL, |
There was a problem hiding this comment.
Surprising! Null is usually represented as an ID_constant with value = NULL -- suspect you need to add tests for this
src/goto-symex/symex_goto.cpp
Outdated
|
|
||
| if(!unknown_or_invalid_found) | ||
| { | ||
| if(!constant_found) |
There was a problem hiding this comment.
Give some brief comments setting out the logic here, as !constant_found and value_set_elements.size() == 1 don't immediately chime with "x in { a, b, c }" and "x in { x }" respectively.
618b074 to
1081b00
Compare
|
I've addressed all of @smowton's review comments on the draft PR and got CI working (touch wood). |
1081b00 to
eb55978
Compare
|
|
||
| // Equality "==" | ||
|
|
||
| // A non-null pointer and the correct value |
There was a problem hiding this comment.
correct -> matching and incorrect -> non-matching everywhere
|
|
||
| // Disequality "!=" | ||
|
|
||
| // A non-null pointer and the correct value |
| test::1::unreachable_12[^\s]+ = 7$ | ||
| ^warning: ignoring | ||
| -- | ||
| Pointer comparisons will be resolved in symex by a mixture of constant |
There was a problem hiding this comment.
Suggest adding conditions that check you check for everything you should, by making sure unreachable_1[3-9]) and unreachable[2-9][0-9] for example don't occur anywhere (i.e., if someone added a test but not a cross-check this regex would complain)
src/goto-symex/symex_goto.cpp
Outdated
|
|
||
| INVARIANT( | ||
| other_operand.id() == ID_address_of || | ||
| (typecast_expr && typecast_expr->op().id() == ID_address_of) || |
There was a problem hiding this comment.
If you're not sure this is simplified you might get redundant stacked typecast ops
src/goto-symex/symex_goto.cpp
Outdated
| expr_try_dynamic_cast<symbol_exprt>(symbol_expr); | ||
|
|
||
| if(!symbol_expr_lhs) | ||
| { |
There was a problem hiding this comment.
No need for braces with one-liner ifs
There was a problem hiding this comment.
I prefer to use them in all cases
src/goto-symex/symex_goto.cpp
Outdated
| /// \param value_set: The value-set for looking up what the symbol can point to | ||
| /// \param language_mode: The language mode | ||
| /// \param ns: A namespace | ||
| static bool try_evaluate_pointer_comparisons_base_case_with_check( |
There was a problem hiding this comment.
Instead of calling this twice, suggest something like https://github.com/diffblue/cbmc/blob/develop/src/goto-symex/goto_state.cpp#L67
src/goto-symex/symex_goto.cpp
Outdated
| /// \param value_set: The value-set for looking up what the symbol can point to | ||
| /// \param language_mode: The language mode | ||
| /// \param ns: A namespace | ||
| static void try_evaluate_pointer_comparisons_base_case( |
There was a problem hiding this comment.
How about try_evaluate_pointer_comparison, singular?
src/goto-symex/symex_goto.cpp
Outdated
| /// \param value_set: The value-set for looking up what the symbol can point to | ||
| /// \param language_mode: The language mode | ||
| /// \param ns: A namespace | ||
| static bool try_evaluate_pointer_comparisons_base_case_with_check( |
There was a problem hiding this comment.
I don't think there's much gained by the two functions (with_check vs. without) since they're both only called in one place
|
Note also that the last two commits break sharing everywhere, so we'll have to rethink how to do that. |
|
Suggested change: smowton@1daa2c6 That removes @owen-jones-diffblue note that also applies a couple of my suggestions above since it made it easier to code :) |
Yes that looks better. What I'm not sure about is the fact that the mutator function takes an |
eb55978 to
b20167c
Compare
|
Thanks @smowton, that's much better. I've cherry-picked it and squashed it in, along with fixes for all the things you mentioned. I did add an extra commit which makes There are three failing tests. The first is in test.desc and in vccs.desc. The other two are nondet_elements_longer_lists and nondet_elements_longer_lists_global, which are failing in the same way. I'm looking into that at the moment. |
|
For nondet_elements_longer_lists and nondet_elements_longer_lists_global, the test is failing to find these two lines: ``` |
|
@owen-jones-diffblue unsure about the unwinding assertions (yet) but the exception one is entirely desirable -- that 1 VCC remaining was a shortcoming which we have now fixed (symex can tell no exception escaped from the whole program). Just amend the test to check all conditions are symex'd away. |
|
@romainbrenguier I'm not sure I follow, what would your ideal interface look like? |
|
@owen-jones-diffblue Presumably your changes just result in more constant propagation, and thus those unwinding assertions are resolved by the simplifier already. #2574 does the same, i.e., removes those lines from the test spec. |
I would think having |
|
Ah sure, ok. On the one hand the tag is useful if you want to call a function that demands a |
b20167c to
44c25df
Compare
|
I've marked this as |
|
I've cherry-picked ##4454 and added a commit to use it. Before merging this should be cleaned up. |
|
cbmc CORE unit tests times: unchanged |
|
Large Webgoat example: 55 seconds to reach a particular checkpoint (down 5 seconds) |
96d2f30 to
13b6290
Compare
|
Romain's Sakai benchmark: One benchmark reduced around 15 seconds (from 145 seconds), the remainder tested showed negligible difference. |
|
Therefore I'm pretty confident this has no noticeable cost in context of the other fairly-expensive ops we perform in symex. |
13b6290 to
ff6002b
Compare
| const symbol_exprt &symbol_expr, | ||
| const exprt &other_operand, | ||
| const value_sett &value_set, | ||
| const irep_idt language_mode, |
src/goto-symex/symex_goto.cpp
Outdated
| /// \return If we were able to evaluate the condition as true or false then we | ||
| /// return that, otherwise we return an empty optionalt | ||
| static optionalt<renamedt<exprt, L2>> try_evaluate_pointer_comparison( | ||
| irep_idt operation, |
src/goto-symex/symex_goto.cpp
Outdated
| static optionalt<renamedt<exprt, L2>> try_evaluate_pointer_comparison( | ||
| const renamedt<exprt, L2> &renamed_expr, | ||
| const value_sett &value_set, | ||
| const irep_idt language_mode, |
src/goto-symex/symex_goto.cpp
Outdated
| static renamedt<exprt, L2> try_evaluate_pointer_comparisons( | ||
| renamedt<exprt, L2> condition, | ||
| const value_sett &value_set, | ||
| const irep_idt language_mode, |
allredj
left a comment
There was a problem hiding this comment.
✔️
Passed Diffblue compatibility checks (cbmc commit: ff6002b).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/111618837
value_set_dereferencet::valuet is only used as the return value of value_set_dereferencet::build_reference_to. valuet::value is an expression for the object pointed to, which is what's needed for dereferencing. In other contexts, e.g. filtering value-sets, what is needed is an expression for the pointer. This logic already existed in try_filter_value_set, but putting it directly in build_reference_to makes it easier to use it in other places and makes it clear that it didn't cope with typecasts properly.
Use the value-set to evaluate pointer comparisons as true or false where possible
At time of writing, develop fails on the following lines of the test: test::1::unconditionally_reachable_1[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_1[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_6[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_6[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_7[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_7[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_8[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_8[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_9[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_9[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_10[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_10[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_11[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_11[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_12[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_12[^\s]+\)$ [FAILED] test::1::unreachable_1[^\s]+ = 7$ [FAILED] test::1::unreachable_6[^\s]+ = 7$ [FAILED] test::1::unreachable_7[^\s]+ = 7$ [FAILED] test::1::unreachable_8[^\s]+ = 7$ [FAILED] test::1::unreachable_9[^\s]+ = 7$ [FAILED] test::1::unreachable_10[^\s]+ = 7$ [FAILED] test::1::unreachable_11[^\s]+ = 7$ [FAILED] test::1::unreachable_12[^\s]+ = 7$ [FAILED]
ff6002b to
213a1b2
Compare
allredj
left a comment
There was a problem hiding this comment.
✔️
Passed Diffblue compatibility checks (cbmc commit: 213a1b2).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/111750931
|
I'm afraid this actually causes spurious counterexamples being reported. I'll try to build a small regression test. |
|
Actually it's pretty simple: I believe the problem is that |
|
Discussion on Slack summary: will revert and re-submit with a fix |
This is to guard against the regression earlier introduced in diffblue#4444 (and reverted in diffblue#4656).
This allows us to evaluate boolean expressions (including GOTO conditions) that test pointer equality or inequality (particularly "ptr == null" or "!= null") where the value-set contains sufficient information. We can already handle simple cases where
ptrhas a single known alias via constant propagation and aliasing, but not cases whereptris non-constant, for example because it is drawn from an array.