Improve performance and fix limitations of flatten_byte_operators [blocks: #3725]#2068
Conversation
04bc573 to
a1a41af
Compare
smowton
left a comment
There was a problem hiding this comment.
Reviewed up to byte_operators.cpp:578, to continue tomorrow
| #endif | ||
|
|
||
| error().source_location = expr.find_source_location(); | ||
| error() << "member: component type does not match: " << subtype.pretty() |
There was a problem hiding this comment.
Recommend newlines before pretty for readability
| error().source_location = expr.find_source_location(); | ||
| error() << "member: component type does not match: " << subtype.pretty() | ||
| << " vs. " << expr.type().pretty() << eom; | ||
| throw 0; |
There was a problem hiding this comment.
I think this is an invariant, as it's our responsibility to make sure this is true, not a user error
|
|
||
| return bv; | ||
| #if 0 | ||
| std::cout << "DEBUG " << expr.pretty() << "\n"; |
There was a problem hiding this comment.
If we're going to commit this, replace DEBUG with a more useful key?
| mp_integer num_elements; | ||
| if(to_integer(max_bytes, num_elements) && to_integer(src_size, num_elements)) | ||
| { | ||
| throw "cannot unpack array/vector of non-const size:\n" + |
There was a problem hiding this comment.
It's something that may be fixable in some way, and really a limitation of the implementation rather than a bug. So I'm inclined to leave it in place.
|
|
||
| mp_integer offset_int = 0; | ||
|
|
||
| if(element_width > 0 && element_width % 8 == 0) |
There was a problem hiding this comment.
What would element_width == 0 mean?
| if(!to_integer(offset, offset_int)) | ||
| { | ||
| // compute offset as number of elements | ||
| offset_int /= el_bytes; |
|
|
||
| exprt src_simp = simplify_expr(src, ns); | ||
|
|
||
| for(mp_integer i = offset_int; i < num_elements; ++i) |
There was a problem hiding this comment.
Given the above units trouble, i -> element_idx?
There was a problem hiding this comment.
I think I've made a right mess here. Thanks for looking carefully, I'm working on cleanup.
|
|
||
| if(src_simp.id() != ID_array || i >= src_simp.operands().size()) | ||
| { | ||
| index = index_exprt(src_simp, from_integer(i, index_type())); |
| new_offset = from_integer(new_offset_int, offset.type()); | ||
| } | ||
|
|
||
| if(!to_integer(max_bytes, new_max_bytes_int)) |
There was a problem hiding this comment.
Comment before these what the intended adjustment is? At the moment it looks like you would reduce both lower and upper bounds if the access is not byte-aligned, whereas I might expect you to round the lower bound down and the upper bound up to ensure you include the desired range?
There was a problem hiding this comment.
No, byte-alignedness is tested above, there shouldn't be a need for rounding I believe.
There was a problem hiding this comment.
In that case member_offset_bits % 8 is always zero, as it starts at zero and is the incremental sum of several element_widths, each of which is itself a multiple of eight.
There was a problem hiding this comment.
Your comment really highlighted that there was a problem: member_offset_bits wasn't actually being used properly, this should have been a division, not modulo!
| UNREACHABLE; | ||
|
|
||
| // TODO: consider ways of dealing with vectors of unknown subtype | ||
| // size or with a subtype size that does not fit byte boundaries |
179f937 to
5b8938e
Compare
| has_subtype(expr.op0().type(), ID_pointer, ns)) | ||
| { | ||
| exprt tmp = lower_byte_update(expr, ns); | ||
| return convert_bv(tmp); |
There was a problem hiding this comment.
const exprt tmp or just one line return
| if(has_byte_operator(expr)) | ||
| { | ||
| exprt tmp=flatten_byte_operators(expr, ns); | ||
| exprt tmp = lower_byte_operators(expr, ns); |
| it++) | ||
| for(struct_union_typet::componentst::const_iterator it = components.begin(); | ||
| it != components.end(); | ||
| ++it) |
There was a problem hiding this comment.
Use ranged for if it is not used as an iterator inside the loop
| offset+=sub_width; | ||
| DATA_INVARIANT( | ||
| base_type_eq(subtype, expr.type(), ns), | ||
| "component type expected to match expression type:\n" + expr.pretty()); |
There was a problem hiding this comment.
I think it's better to avoid calling pretty in places where it will be called even if there is no error.
There was a problem hiding this comment.
I'm not sure I understand the "called even if there is no error" part - this would only be printed in case of a violated invariant?!
There was a problem hiding this comment.
oh yes sorry, you can ignore my comment, the macro is so that the second part does not need to be called.
| offset + sub_width <= struct_bv.size(), "member access outside struct"); | ||
|
|
||
| for(std::size_t i = 0; i < sub_width; i++) | ||
| bv[i] = struct_bv[offset + i]; |
There was a problem hiding this comment.
I think you could use bv.insert(bv.begin(), struct_bv + offset, struct_bv + offset + sub_width) instead
There was a problem hiding this comment.
Good call! This actually enables a further simplification of just doing return bvt(struct_bv.begin() + ...).
83a50d6 to
95f06fb
Compare
|
|
||
| mp_integer index = 0; | ||
| if( | ||
| !offset_i.is_constant() || root.id() != ID_array || |
There was a problem hiding this comment.
Readability: suggest break after each disjunct
| if(sub_size==1) | ||
| { | ||
| byte_extract_exprt byte_extract_expr( | ||
| src.id()==ID_byte_update_little_endian? |
There was a problem hiding this comment.
might as well use conventional operator spacing here
| src.op2(), t); | ||
| else | ||
| value_extended=src.op2(); | ||
| value_extended, unsignedbv_typet(width)); |
There was a problem hiding this comment.
Style: params all on one line, or one per line
smowton
left a comment
There was a problem hiding this comment.
Thanks for fixing the comments above; done reviewing with only nitpicks regarding the second half.
2af5790 to
afc73ba
Compare
byte_extract lowering of string constants [blocks: #2068]
7383a2f to
96d9f79
Compare
|
This (hopefully!) final part of the byte-operator rework is now ready for review. I didn't see a way to step-by-step fix the byte_update lowering code and instead rewrote it completely. I'm afraid this doesn't make for a particularly pleasant review of the first commit in here. Consequently it may be more helpful to subject this branch to extensive testing. It is perfectly possible that with some of the PRs depending on this one we still run into issues, but then at least the implementation should now be amenable to local fixes, depending on what input data types we are looking at. |
allredj
left a comment
There was a problem hiding this comment.
✔️
Passed Diffblue compatibility checks (cbmc commit: 96d9f79).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101248909
96d9f79 to
b05c96c
Compare
allredj
left a comment
There was a problem hiding this comment.
✔️
Passed Diffblue compatibility checks (cbmc commit: b05c96c).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101272202
| continue; | ||
| } | ||
|
|
||
| if(!non_const_update_bound.has_value()) |
There was a problem hiding this comment.
Note: it's better to avoid late decision making in general
There was a problem hiding this comment.
I have reorganised this code a bit, but I'm not sure whether this is the improvement you are looking for. (I have introduced an exprt update_value.)
| #include "bv_conversion_exceptions.h" | ||
| #include "bv_endianness_map.h" | ||
|
|
||
| bvt boolbvt::convert_byte_update(const byte_update_exprt &expr) |
There was a problem hiding this comment.
the fact that the method can throw an exception should probably be documented
allredj
left a comment
There was a problem hiding this comment.
✔️
Passed Diffblue compatibility checks (cbmc commit: 45c612f).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101349527
smowton
left a comment
There was a problem hiding this comment.
Some comments for what it's worth, but I'm not particularly confident of having accurately reviewed this because there's a lot of very similar edge-case code here. I'd suggest torturing the code with nasty edge-case byte-update operations (as unit tests I guess, since it's easier to be sure you're giving it the form of expression you expect), rather than relying on manual review here. If you're convinced you've already got good unit test coverage then lgtm.
| static exprt lower_byte_update_array_vector_non_const( | ||
| const byte_update_exprt &src, | ||
| const typet &subtype, | ||
| const array_exprt &value_as_byte_array, |
There was a problem hiding this comment.
PRECONDITION that this really is made of chars?
There was a problem hiding this comment.
Do you feel this would add much value? I can of course do it, but I think things would either not break or break in an obvious way if that ever were violated.
|
@romainbrenguier @smowton I believe I have incorporated all feedback. |
|
Restarted spurious CI failure -- is this good to go with @kroening or @peterschrammel's review? |
Yes, reviews would be much appreciated. |
| is_unbounded_array(expr.op().type()) || | ||
| has_subtype(expr.type(), ID_pointer, ns) || | ||
| has_subtype(expr.op().type(), ID_pointer, ns)) | ||
| { |
There was a problem hiding this comment.
What's the rationale for that, actually?
Is the alternative to always use the flattening code?
There was a problem hiding this comment.
The rationale for lowering byte-operators when pointers are involved? Given that there is another commit that does lowering when encountering a pointer I can check whether that's still necessary (without the latter it would certainly lead to an exception), it's a matter of whether the SMT back-end now needs this or not(edit: this code is not for SMT).
There was a problem hiding this comment.
Very good call, this is now unnecessary. Commit (for both byte extract and byte update) and commit message amended.
… types byte_update lowering now proceeds as follows: 1) Determine the size of the update, with the size of the object to be updated as an upper bound. We fail if neither can be determined. 2) Turn the update value into a byte array of the size determined above. 3) If the offset is not constant, turn the object into a byte array, and use a "with" expression to encode the update; else update the values in place. 4) Construct a new object.
A prior lowering pass may have removed them (for example, if they occur within unbounded arrays).
The array theory does not handle byte_{extract,update} operators. For pointers,
make sure the flattening back-end also uses lowering rather than failing with an
exception.
The locally implemented lowering was very incomplete.
These may be generated when dereferencing invalid pointers, and are just no-ops. Make sure we don't fail as size_of_expr would return nil_exprt() for those.
This removes the constraint on aligned member accesses. Includes factoring out of unpack_struct to avoid growing the size of unpack_rec even further.
allredj
left a comment
There was a problem hiding this comment.
✔️
Passed Diffblue compatibility checks (cbmc commit: 4ada4bf).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102380441
|
Let's merge this now -- in the long run, it may be the case that the bit-vector flattener in SMT2 possibly outperforms the lowerer for byte_update. |
allredj
left a comment
There was a problem hiding this comment.
This PR failed Diffblue compatibility checks (cbmc commit: 34a4de9).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102381986
Status will be re-evaluated on next push.
Common spurious failures:
-
the cbmc commit has disappeared in the mean time (e.g. in a force-push)
-
the author is not in the list of contributors (e.g. first-time contributors).
-
the compatibility was already broken by an earlier merge.
Depends on #2061 and #2063 and thus should not be merged just yet.