Skip to content

Improve performance and fix limitations of flatten_byte_operators [blocks: #3725]#2068

Merged
kroening merged 6 commits intodiffblue:developfrom
tautschnig:flatten-byte-ops
Feb 26, 2019
Merged

Improve performance and fix limitations of flatten_byte_operators [blocks: #3725]#2068
kroening merged 6 commits intodiffblue:developfrom
tautschnig:flatten-byte-ops

Conversation

@tautschnig
Copy link
Collaborator

@tautschnig tautschnig commented Apr 16, 2018

Depends on #2061 and #2063 and thus should not be merged just yet.

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed up to byte_operators.cpp:578, to continue tomorrow

#endif

error().source_location = expr.find_source_location();
error() << "member: component type does not match: " << subtype.pretty()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend newlines before pretty for readability

error().source_location = expr.find_source_location();
error() << "member: component type does not match: " << subtype.pretty()
<< " vs. " << expr.type().pretty() << eom;
throw 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an invariant, as it's our responsibility to make sure this is true, not a user error


return bv;
#if 0
std::cout << "DEBUG " << expr.pretty() << "\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to commit this, replace DEBUG with a more useful key?

mp_integer num_elements;
if(to_integer(max_bytes, num_elements) && to_integer(src_size, num_elements))
{
throw "cannot unpack array/vector of non-const size:\n" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, invariant?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's something that may be fixable in some way, and really a limitation of the implementation rather than a bug. So I'm inclined to leave it in place.


mp_integer offset_int = 0;

if(element_width > 0 && element_width % 8 == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would element_width == 0 mean?

if(!to_integer(offset, offset_int))
{
// compute offset as number of elements
offset_int /= el_bytes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offset_elements?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


exprt src_simp = simplify_expr(src, ns);

for(mp_integer i = offset_int; i < num_elements; ++i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the above units trouble, i -> element_idx?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've made a right mess here. Thanks for looking carefully, I'm working on cleanup.


if(src_simp.id() != ID_array || i >= src_simp.operands().size())
{
index = index_exprt(src_simp, from_integer(i, index_type()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index -> element?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

new_offset = from_integer(new_offset_int, offset.type());
}

if(!to_integer(max_bytes, new_max_bytes_int))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment before these what the intended adjustment is? At the moment it looks like you would reduce both lower and upper bounds if the access is not byte-aligned, whereas I might expect you to round the lower bound down and the upper bound up to ensure you include the desired range?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, byte-alignedness is tested above, there shouldn't be a need for rounding I believe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case member_offset_bits % 8 is always zero, as it starts at zero and is the incremental sum of several element_widths, each of which is itself a multiple of eight.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your comment really highlighted that there was a problem: member_offset_bits wasn't actually being used properly, this should have been a division, not modulo!

UNREACHABLE;

// TODO: consider ways of dealing with vectors of unknown subtype
// size or with a subtype size that does not fit byte boundaries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What currently happens?

@tautschnig tautschnig force-pushed the flatten-byte-ops branch 5 times, most recently from 179f937 to 5b8938e Compare April 16, 2018 22:36
has_subtype(expr.op0().type(), ID_pointer, ns))
{
exprt tmp = lower_byte_update(expr, ns);
return convert_bv(tmp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const exprt tmp or just one line return

if(has_byte_operator(expr))
{
exprt tmp=flatten_byte_operators(expr, ns);
exprt tmp = lower_byte_operators(expr, ns);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

it++)
for(struct_union_typet::componentst::const_iterator it = components.begin();
it != components.end();
++it)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ranged for if it is not used as an iterator inside the loop

offset+=sub_width;
DATA_INVARIANT(
base_type_eq(subtype, expr.type(), ns),
"component type expected to match expression type:\n" + expr.pretty());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to avoid calling pretty in places where it will be called even if there is no error.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the "called even if there is no error" part - this would only be printed in case of a violated invariant?!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yes sorry, you can ignore my comment, the macro is so that the second part does not need to be called.

offset + sub_width <= struct_bv.size(), "member access outside struct");

for(std::size_t i = 0; i < sub_width; i++)
bv[i] = struct_bv[offset + i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could use bv.insert(bv.begin(), struct_bv + offset, struct_bv + offset + sub_width) instead

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! This actually enables a further simplification of just doing return bvt(struct_bv.begin() + ...).

@tautschnig tautschnig force-pushed the flatten-byte-ops branch 2 times, most recently from 83a50d6 to 95f06fb Compare April 17, 2018 10:02

mp_integer index = 0;
if(
!offset_i.is_constant() || root.id() != ID_array ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Readability: suggest break after each disjunct

if(sub_size==1)
{
byte_extract_exprt byte_extract_expr(
src.id()==ID_byte_update_little_endian?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might as well use conventional operator spacing here

src.op2(), t);
else
value_extended=src.op2();
value_extended, unsignedbv_typet(width));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: params all on one line, or one per line

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the comments above; done reviewing with only nitpicks regarding the second half.

@tautschnig tautschnig force-pushed the flatten-byte-ops branch 3 times, most recently from 2af5790 to afc73ba Compare April 17, 2018 23:50
@tautschnig tautschnig removed their assignment Apr 18, 2018
tautschnig added a commit that referenced this pull request Feb 15, 2019
byte_extract lowering of string constants [blocks: #2068]
@tautschnig tautschnig force-pushed the flatten-byte-ops branch 2 times, most recently from 7383a2f to 96d9f79 Compare February 18, 2019 01:11
@tautschnig tautschnig assigned smowton and kroening and unassigned tautschnig Feb 18, 2019
@tautschnig
Copy link
Collaborator Author

This (hopefully!) final part of the byte-operator rework is now ready for review. I didn't see a way to step-by-step fix the byte_update lowering code and instead rewrote it completely. I'm afraid this doesn't make for a particularly pleasant review of the first commit in here. Consequently it may be more helpful to subject this branch to extensive testing. It is perfectly possible that with some of the PRs depending on this one we still run into issues, but then at least the implementation should now be amenable to local fixes, depending on what input data types we are looking at.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 96d9f79).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101248909

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: b05c96c).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101272202

continue;
}

if(!non_const_update_bound.has_value())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: it's better to avoid late decision making in general

Copy link
Collaborator Author

@tautschnig tautschnig Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reorganised this code a bit, but I'm not sure whether this is the improvement you are looking for. (I have introduced an exprt update_value.)

#include "bv_conversion_exceptions.h"
#include "bv_endianness_map.h"

bvt boolbvt::convert_byte_update(const byte_update_exprt &expr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fact that the method can throw an exception should probably be documented

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 45c612f).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101349527

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments for what it's worth, but I'm not particularly confident of having accurately reviewed this because there's a lot of very similar edge-case code here. I'd suggest torturing the code with nasty edge-case byte-update operations (as unit tests I guess, since it's easier to be sure you're giving it the form of expression you expect), rather than relying on manual review here. If you're convinced you've already got good unit test coverage then lgtm.

static exprt lower_byte_update_array_vector_non_const(
const byte_update_exprt &src,
const typet &subtype,
const array_exprt &value_as_byte_array,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PRECONDITION that this really is made of chars?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you feel this would add much value? I can of course do it, but I think things would either not break or break in an obvious way if that ever were violated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, no big deal

@tautschnig
Copy link
Collaborator Author

@romainbrenguier @smowton I believe I have incorporated all feedback.

@smowton
Copy link
Contributor

smowton commented Feb 22, 2019

Restarted spurious CI failure -- is this good to go with @kroening or @peterschrammel's review?

@tautschnig
Copy link
Collaborator Author

is this good to go with @kroening or @peterschrammel's review?

Yes, reviews would be much appreciated.

is_unbounded_array(expr.op().type()) ||
has_subtype(expr.type(), ID_pointer, ns) ||
has_subtype(expr.op().type(), ID_pointer, ns))
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for that, actually?
Is the alternative to always use the flattening code?

Copy link
Collaborator Author

@tautschnig tautschnig Feb 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale for lowering byte-operators when pointers are involved? Given that there is another commit that does lowering when encountering a pointer I can check whether that's still necessary (without the latter it would certainly lead to an exception), it's a matter of whether the SMT back-end now needs this or not(edit: this code is not for SMT).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good call, this is now unnecessary. Commit (for both byte extract and byte update) and commit message amended.

… types

byte_update lowering now proceeds as follows:
1) Determine the size of the update, with the size of the object to be
updated as an upper bound. We fail if neither can be determined.
2) Turn the update value into a byte array of the size determined above.
3) If the offset is not constant, turn the object into a byte array, and
use a "with" expression to encode the update; else update the values in
place.
4) Construct a new object.
A prior lowering pass may have removed them (for example, if they occur within
unbounded arrays).
The array theory does not handle byte_{extract,update} operators. For pointers,
make sure the flattening back-end also uses lowering rather than failing with an
exception.
The locally implemented lowering was very incomplete.
These may be generated when dereferencing invalid pointers, and are just no-ops.
Make sure we don't fail as size_of_expr would return nil_exprt() for those.
This removes the constraint on aligned member accesses. Includes factoring out
of unpack_struct to avoid growing the size of unpack_rec even further.
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 4ada4bf).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102380441

@kroening
Copy link
Collaborator

Let's merge this now -- in the long run, it may be the case that the bit-vector flattener in SMT2 possibly outperforms the lowerer for byte_update.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 34a4de9).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102381986
Status will be re-evaluated on next push.
Common spurious failures:

  • the cbmc commit has disappeared in the mean time (e.g. in a force-push)

  • the author is not in the list of contributors (e.g. first-time contributors).

  • the compatibility was already broken by an earlier merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants