Improve performance and fix limitations of flatten_byte_operators [blocks: #3725] by tautschnig · Pull Request #2068 · diffblue/cbmc

tautschnig · 2018-04-16T08:16:12Z

~~Depends on #2061 and #2063 and thus should not be merged just yet.~~

smowton

Reviewed up to byte_operators.cpp:578, to continue tomorrow

smowton · 2018-04-16T15:08:06Z

src/solvers/flattening/boolbv_member.cpp

+#endif
+
+        error().source_location = expr.find_source_location();
+        error() << "member: component type does not match: " << subtype.pretty()


Recommend newlines before pretty for readability

smowton · 2018-04-16T15:08:50Z

src/solvers/flattening/boolbv_member.cpp

+        error().source_location = expr.find_source_location();
+        error() << "member: component type does not match: " << subtype.pretty()
+                << " vs. " << expr.type().pretty() << eom;
+        throw 0;


I think this is an invariant, as it's our responsibility to make sure this is true, not a user error

smowton · 2018-04-16T15:10:12Z

src/solvers/flattening/boolbv_member.cpp

-
-        return bv;
+#if 0
+        std::cout << "DEBUG " << expr.pretty() << "\n";


If we're going to commit this, replace DEBUG with a more useful key?

smowton · 2018-04-16T15:17:04Z

src/solvers/lowering/byte_operators.cpp

+  mp_integer num_elements;
+  if(to_integer(max_bytes, num_elements) && to_integer(src_size, num_elements))
+  {
+    throw "cannot unpack array/vector of non-const size:\n" +


Again, invariant?

It's something that may be fixable in some way, and really a limitation of the implementation rather than a bug. So I'm inclined to leave it in place.

smowton · 2018-04-16T15:20:47Z

src/solvers/lowering/byte_operators.cpp

+
+  mp_integer offset_int = 0;
+
+  if(element_width > 0 && element_width % 8 == 0)


What would element_width == 0 mean?

smowton · 2018-04-16T15:25:38Z

src/solvers/lowering/byte_operators.cpp

+    if(!to_integer(offset, offset_int))
+    {
+      // compute offset as number of elements
+      offset_int /= el_bytes;


offset_elements?

smowton · 2018-04-16T15:26:07Z

src/solvers/lowering/byte_operators.cpp

+
+  exprt src_simp = simplify_expr(src, ns);
+
+  for(mp_integer i = offset_int; i < num_elements; ++i)


Given the above units trouble, i -> element_idx?

I think I've made a right mess here. Thanks for looking carefully, I'm working on cleanup.

smowton · 2018-04-16T15:26:46Z

src/solvers/lowering/byte_operators.cpp

+
+    if(src_simp.id() != ID_array || i >= src_simp.operands().size())
+    {
+      index = index_exprt(src_simp, from_integer(i, index_type()));


index -> element?

smowton · 2018-04-16T15:33:40Z

src/solvers/lowering/byte_operators.cpp

+          new_offset = from_integer(new_offset_int, offset.type());
+      }
+
+      if(!to_integer(max_bytes, new_max_bytes_int))


Comment before these what the intended adjustment is? At the moment it looks like you would reduce both lower and upper bounds if the access is not byte-aligned, whereas I might expect you to round the lower bound down and the upper bound up to ensure you include the desired range?

No, byte-alignedness is tested above, there shouldn't be a need for rounding I believe.

In that case member_offset_bits % 8 is always zero, as it starts at zero and is the incremental sum of several element_widths, each of which is itself a multiple of eight.

Your comment really highlighted that there was a problem: member_offset_bits wasn't actually being used properly, this should have been a division, not modulo!

smowton · 2018-04-16T15:40:43Z

src/solvers/lowering/byte_operators.cpp

+      UNREACHABLE;
+
+    // TODO: consider ways of dealing with vectors of unknown subtype
+    // size or with a subtype size that does not fit byte boundaries


What currently happens?

romainbrenguier · 2018-04-17T09:22:06Z

src/solvers/flattening/boolbv_byte_update.cpp

+    has_subtype(expr.op0().type(), ID_pointer, ns))
+  {
+    exprt tmp = lower_byte_update(expr, ns);
+    return convert_bv(tmp);


const exprt tmp or just one line return

romainbrenguier · 2018-04-17T09:22:31Z

src/solvers/flattening/boolbv_equality.cpp

    if(has_byte_operator(expr))
    {
-      exprt tmp=flatten_byte_operators(expr, ns);
+      exprt tmp = lower_byte_operators(expr, ns);


romainbrenguier · 2018-04-17T09:24:00Z

src/solvers/flattening/boolbv_member.cpp

-        it++)
+  for(struct_union_typet::componentst::const_iterator it = components.begin();
+      it != components.end();
+      ++it)


Use ranged for if it is not used as an iterator inside the loop

romainbrenguier · 2018-04-17T09:26:28Z

src/solvers/flattening/boolbv_member.cpp

-      offset+=sub_width;
+      DATA_INVARIANT(
+        base_type_eq(subtype, expr.type(), ns),
+        "component type expected to match expression type:\n" + expr.pretty());


I think it's better to avoid calling pretty in places where it will be called even if there is no error.

I'm not sure I understand the "called even if there is no error" part - this would only be printed in case of a violated invariant?!

oh yes sorry, you can ignore my comment, the macro is so that the second part does not need to be called.

romainbrenguier · 2018-04-17T09:29:37Z

src/solvers/flattening/boolbv_member.cpp

+        offset + sub_width <= struct_bv.size(), "member access outside struct");
+
+      for(std::size_t i = 0; i < sub_width; i++)
+        bv[i] = struct_bv[offset + i];


I think you could use bv.insert(bv.begin(), struct_bv + offset, struct_bv + offset + sub_width) instead

Good call! This actually enables a further simplification of just doing return bvt(struct_bv.begin() + ...).

smowton · 2018-04-17T13:55:01Z

src/solvers/lowering/byte_operators.cpp

+
+    mp_integer index = 0;
+    if(
+      !offset_i.is_constant() || root.id() != ID_array ||


Readability: suggest break after each disjunct

smowton · 2018-04-17T13:55:46Z

src/solvers/lowering/byte_operators.cpp

      if(sub_size==1)
      {
+        byte_extract_exprt byte_extract_expr(
+          src.id()==ID_byte_update_little_endian?


might as well use conventional operator spacing here

smowton · 2018-04-17T14:04:08Z

src/solvers/lowering/byte_operators.cpp

-          src.op2(), t);
-    else
-      value_extended=src.op2();
+          value_extended, unsignedbv_typet(width));


Style: params all on one line, or one per line

smowton

Thanks for fixing the comments above; done reviewing with only nitpicks regarding the second half.

byte_extract lowering of string constants [blocks: #2068]

tautschnig · 2019-02-18T01:17:50Z

This (hopefully!) final part of the byte-operator rework is now ready for review. I didn't see a way to step-by-step fix the byte_update lowering code and instead rewrote it completely. I'm afraid this doesn't make for a particularly pleasant review of the first commit in here. Consequently it may be more helpful to subject this branch to extensive testing. It is perfectly possible that with some of the PRs depending on this one we still run into issues, but then at least the implementation should now be amenable to local fixes, depending on what input data types we are looking at.

allredj

✔️
Passed Diffblue compatibility checks (cbmc commit: 96d9f79).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101248909

allredj

✔️
Passed Diffblue compatibility checks (cbmc commit: b05c96c).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101272202

src/solvers/lowering/byte_operators.cpp

romainbrenguier · 2019-02-18T09:27:54Z

src/solvers/lowering/byte_operators.cpp

+        continue;
+    }
+
+    if(!non_const_update_bound.has_value())


Note: it's better to avoid late decision making in general

I have reorganised this code a bit, but I'm not sure whether this is the improvement you are looking for. (I have introduced an exprt update_value.)

src/solvers/lowering/byte_operators.cpp

romainbrenguier · 2019-02-18T10:14:55Z

src/solvers/flattening/boolbv_byte_update.cpp

+#include "bv_conversion_exceptions.h"
 #include "bv_endianness_map.h"

 bvt boolbvt::convert_byte_update(const byte_update_exprt &expr)


the fact that the method can throw an exception should probably be documented

src/solvers/lowering/expr_lowering.h

allredj

✔️
Passed Diffblue compatibility checks (cbmc commit: 45c612f).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/101349527

smowton

Some comments for what it's worth, but I'm not particularly confident of having accurately reviewed this because there's a lot of very similar edge-case code here. I'd suggest torturing the code with nasty edge-case byte-update operations (as unit tests I guess, since it's easier to be sure you're giving it the form of expression you expect), rather than relying on manual review here. If you're convinced you've already got good unit test coverage then lgtm.

src/solvers/lowering/byte_operators.cpp

smowton · 2019-02-18T18:39:27Z

src/solvers/lowering/byte_operators.cpp

+static exprt lower_byte_update_array_vector_non_const(
+  const byte_update_exprt &src,
+  const typet &subtype,
+  const array_exprt &value_as_byte_array,


PRECONDITION that this really is made of chars?

Do you feel this would add much value? I can of course do it, but I think things would either not break or break in an obvious way if that ever were violated.

Sure, no big deal

src/solvers/lowering/byte_operators.cpp

tautschnig · 2019-02-19T20:53:14Z

@romainbrenguier @smowton I believe I have incorporated all feedback.

smowton · 2019-02-22T14:17:49Z

Restarted spurious CI failure -- is this good to go with @kroening or @peterschrammel's review?

tautschnig · 2019-02-22T14:38:40Z

is this good to go with @kroening or @peterschrammel's review?

Yes, reviews would be much appreciated.

kroening · 2019-02-26T16:19:08Z

src/solvers/flattening/boolbv_byte_extract.cpp

+    is_unbounded_array(expr.op().type()) ||
+    has_subtype(expr.type(), ID_pointer, ns) ||
+    has_subtype(expr.op().type(), ID_pointer, ns))
  {


What's the rationale for that, actually?
Is the alternative to always use the flattening code?

The rationale for lowering byte-operators when pointers are involved? Given that there is another commit that does lowering when encountering a pointer I can check whether that's still necessary (without the latter it would certainly lead to an exception)~~, it's a matter of whether the SMT back-end now needs this or not~~(edit: this code is not for SMT).

Very good call, this is now unnecessary. Commit (for both byte extract and byte update) and commit message amended.

… types byte_update lowering now proceeds as follows: 1) Determine the size of the update, with the size of the object to be updated as an upper bound. We fail if neither can be determined. 2) Turn the update value into a byte array of the size determined above. 3) If the offset is not constant, turn the object into a byte array, and use a "with" expression to encode the update; else update the values in place. 4) Construct a new object.

A prior lowering pass may have removed them (for example, if they occur within unbounded arrays).

The array theory does not handle byte_{extract,update} operators. For pointers, make sure the flattening back-end also uses lowering rather than failing with an exception.

The locally implemented lowering was very incomplete.

These may be generated when dereferencing invalid pointers, and are just no-ops. Make sure we don't fail as size_of_expr would return nil_exprt() for those.

This removes the constraint on aligned member accesses. Includes factoring out of unpack_struct to avoid growing the size of unpack_rec even further.

allredj

✔️
Passed Diffblue compatibility checks (cbmc commit: 4ada4bf).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102380441

kroening · 2019-02-26T19:08:25Z

Let's merge this now -- in the long run, it may be the case that the bit-vector flattener in SMT2 possibly outperforms the lowerer for byte_update.

allredj

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 34a4de9).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102381986
Status will be re-evaluated on next push.
Common spurious failures:

the cbmc commit has disappeared in the mean time (e.g. in a force-push)
the author is not in the list of contributors (e.g. first-time contributors).
the compatibility was already broken by an earlier merge.

tautschnig added the do not merge label Apr 16, 2018

tautschnig self-assigned this Apr 16, 2018

tautschnig requested review from chrisr-diffblue, kroening, martin-cs, peterschrammel, romainbrenguier and smowton as code owners April 16, 2018 08:16

tautschnig force-pushed the flatten-byte-ops branch 3 times, most recently from 04bc573 to a1a41af Compare April 16, 2018 12:50

smowton reviewed Apr 16, 2018

View reviewed changes

tautschnig force-pushed the flatten-byte-ops branch 5 times, most recently from 179f937 to 5b8938e Compare April 16, 2018 22:36

romainbrenguier reviewed Apr 17, 2018

View reviewed changes

tautschnig force-pushed the flatten-byte-ops branch 2 times, most recently from 83a50d6 to 95f06fb Compare April 17, 2018 10:02

smowton reviewed Apr 17, 2018

View reviewed changes

smowton approved these changes Apr 17, 2018

View reviewed changes

tautschnig force-pushed the flatten-byte-ops branch 3 times, most recently from 2af5790 to afc73ba Compare April 17, 2018 23:50

tautschnig removed their assignment Apr 18, 2018

tautschnig added a commit that referenced this pull request Feb 15, 2019

Merge pull request #4186 from tautschnig/byte-op-strings

99aabb4

byte_extract lowering of string constants [blocks: #2068]

tautschnig force-pushed the flatten-byte-ops branch 2 times, most recently from 7383a2f to 96d9f79 Compare February 18, 2019 01:11

tautschnig added the bugfix label Feb 18, 2019

tautschnig assigned smowton and kroening and unassigned tautschnig Feb 18, 2019

allredj reviewed Feb 18, 2019

View reviewed changes

tautschnig force-pushed the flatten-byte-ops branch from 96d9f79 to b05c96c Compare February 18, 2019 07:58

allredj reviewed Feb 18, 2019

View reviewed changes

romainbrenguier reviewed Feb 18, 2019

View reviewed changes

allredj reviewed Feb 18, 2019

View reviewed changes

smowton reviewed Feb 18, 2019

View reviewed changes

romainbrenguier approved these changes Feb 20, 2019

View reviewed changes

tautschnig mentioned this pull request Feb 21, 2019

Rearchitect var args support (making it actually work) [blocks: #4329] #4238

Merged

3 tasks

kroening reviewed Feb 26, 2019

View reviewed changes

tautschnig added 6 commits February 26, 2019 16:57

Do not introduce byte_extract_exprt during flattening

3c9900c

A prior lowering pass may have removed them (for example, if they occur within unbounded arrays).

Lower byte_{extract,update} over pointers and unbounded arrays

6ab3311

The array theory does not handle byte_{extract,update} operators. For pointers, make sure the flattening back-end also uses lowering rather than failing with an exception.

Use byte_update lowering in SMT2 conversion

f906b43

The locally implemented lowering was very incomplete.

Byte-operator lowering: support void-typed updates

ce49794

These may be generated when dereferencing invalid pointers, and are just no-ops. Make sure we don't fail as size_of_expr would return nil_exprt() for those.

Byte-operator lowering: support structs containing bit-fields

4ada4bf

This removes the constraint on aligned member accesses. Includes factoring out of unpack_struct to avoid growing the size of unpack_rec even further.

allredj reviewed Feb 26, 2019

View reviewed changes

kroening approved these changes Feb 26, 2019

View reviewed changes

allredj reviewed Feb 26, 2019

View reviewed changes


		mp_integer offset_int = 0;

		if(element_width > 0 && element_width % 8 == 0)


		exprt src_simp = simplify_expr(src, ns);

		for(mp_integer i = offset_int; i < num_elements; ++i)

Conversation

tautschnig commented Apr 16, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

smowton left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

smowton left a comment

Choose a reason for hiding this comment

Uh oh!

tautschnig commented Feb 18, 2019

Uh oh!

allredj left a comment

Choose a reason for hiding this comment

Uh oh!

allredj left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tautschnig Feb 19, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

tautschnig commented Apr 16, 2018 •

edited

Loading

tautschnig Feb 19, 2019 •

edited

Loading