Conversation
hannes-steffenhagen-diffblue
left a comment
There was a problem hiding this comment.
Looks ok to me (with the disclaimer that I don't really understand this)
| bool gdb_value_extractort::memory_scopet::contains( | ||
| const memory_addresst &point) const | ||
| { | ||
| size_t begin_int = std::strtoul(begin.address_string.c_str(), NULL, 0); |
There was a problem hiding this comment.
This is a unchecked conversion effectively (size_t is not necessarily the same size as unsigned long, even though it's often the case). I'd recommend using safe_string2size_t from string2int.h (same with all the other string to int conversions here).
There was a problem hiding this comment.
I wasn't sure if the safe-conversion would work on hex-numbers.
There was a problem hiding this comment.
They should, they use stoll which basically works the same way as the str* functions except it's easier to detect failures.
There was a problem hiding this comment.
Done and refactored so that the conversion is computed once in case of distance. See: 3812c28.
src/memory-analyzer/analyze_symbol.h
Outdated
|
|
||
| struct memory_scopet | ||
| { | ||
| memory_addresst begin; |
There was a problem hiding this comment.
Why not private, if these are not arbitrary (as the deleted default constructor implies)?
There was a problem hiding this comment.
Done, also added getters when necessary.
| optionalt<mp_integer> get_malloc_size(irep_idt name); | ||
|
|
||
| /// Build the pointee string for address \p point assuming it points to a | ||
| /// dynamic allocation of `n' elements each of size \p member_size. E.g.: |
There was a problem hiding this comment.
what's n here?
There was a problem hiding this comment.
T* point = (T*) malloc(sizeof(T) * n); .. that n.
| [this](const snapshot_pairt &left, const snapshot_pairt &right) { | ||
| return pointer_depth(left.second.symbol_expr().type()) < | ||
| pointer_depth(right.second.symbol_expr().type()); | ||
| if(refers_to(right.second.value, left.first)) |
There was a problem hiding this comment.
Can you explain what this is for?
There was a problem hiding this comment.
For example, say int* p = malloc(8); int *q = p+1; is the source code for which we want to take the memory snapshot. Then the snapshot will contain: int tmp[2]; int* p = tmp[0]; int* q = tmp[1]; And both p and q refer_to tmp. I sort the symbols based on this notion of reference so that when building the harness entry function, tmp is assigned before the two pointers.
| const std::string pointee; | ||
| const std::string character; | ||
| const optionalt<std::string> string; | ||
| memory_addresst address; |
There was a problem hiding this comment.
These fields are all related, so I don't think they should be both be constant and public at the same time.
There was a problem hiding this comment.
Well they are not const in the final version.
| if(pointer_value.pointee.find("+") != std::string::npos) | ||
| return true; | ||
|
|
||
| if(pointer_value.pointee.empty()) |
There was a problem hiding this comment.
why is this only done if the pointee is empty? What does that mean?
There was a problem hiding this comment.
Well pointer_value is the result of the gdb query get_memory. When the pointee is empty it means that gdb did not know what structure the pointer points to. This can happen if the pointer points to inside a dynamically allocated memory.
| bool gdb_value_extractort::memory_scopet::contains( | ||
| const memory_addresst &point) const | ||
| { | ||
| size_t begin_int = std::strtoul(begin.address_string.c_str(), NULL, 0); |
src/memory-analyzer/analyze_symbol.h
Outdated
| memory_scopet() = delete; | ||
| memory_scopet( | ||
| const memory_addresst &begin, | ||
| mp_integer byte_size, |
There was a problem hiding this comment.
const mp_integer &byte_size
src/memory-analyzer/analyze_symbol.h
Outdated
| memory_scopet( | ||
| const memory_addresst &begin, | ||
| mp_integer byte_size, | ||
| irep_idt name) |
| CHECK_RETURN(contains(point)); | ||
| size_t begin_int = std::strtoul(begin.address_string.c_str(), NULL, 0); | ||
| size_t point_int = std::strtoul(point.address_string.c_str(), NULL, 0); | ||
| return (point_int - begin_int) / member_size; |
There was a problem hiding this comment.
Shouldn't you first do CHECK_RETURN(point_int >= begin_int);?
There was a problem hiding this comment.
That was implied by contains. But I refactored this section a bit to avoid duplicate computation of the address-to-string conversion. See: 3812c28.
src/memory-analyzer/analyze_symbol.h
Outdated
| mp_integer byte_size; | ||
| irep_idt name; | ||
|
|
||
| memory_scopet() = delete; |
There was a problem hiding this comment.
I'm not sure that's necessary if you explicitly define some other constructor?
There was a problem hiding this comment.
You are right. Removed.
| // temporary object for it: get the type from symbol table, query gdb for | ||
| // value, allocate new object for it and then store into assignments | ||
| if(it == values.end()) | ||
| if(struct_symbol->symbol_expr().type().id() == ID_pointer) |
There was a problem hiding this comment.
This looks very weird?!
There was a problem hiding this comment.
What about struct_symbol->type.id() ? Changed.
| } | ||
|
|
||
| if(!found) | ||
| if(struct_symbol->symbol_expr().type().id() == ID_array) |
| CHECK_RETURN(maybe_struct_size.has_value()); | ||
| for(const auto &value_pair : values) | ||
|
|
||
| if(memory_map.count(struct_name) == 0) |
There was a problem hiding this comment.
Wrapped in a helper function has_known_memory_location.
| @@ -0,0 +1,13 @@ | |||
| FUTURE | |||
There was a problem hiding this comment.
Please explain (after another -- at the end of the file) what isn't supported here so far.
| tmp = \{ \.next=\(\(struct S \*\)0\) \}; | ||
| p = \&tmp; | ||
| st = \{ .next=\&st \}; | ||
| p = \&st; |
There was a problem hiding this comment.
Could you please squash this into the commit that made this possible?
There was a problem hiding this comment.
It was this one: 33d0a79. I will squash them together.
d803707 to
3812c28
Compare
3812c28 to
71ee2f1
Compare
785cf3a to
410fd84
Compare
|
@xbauch My apologies for taking this long! You might want to squash some commits and then just merge. Thank you for all the work!!! |
89f8805 to
5453ba7
Compare
In memory-snapshot-harness: so that pointer to objects are initialised after those objects.
no functional change
So that we can assume the implicit assignment operator.
One for value, one for register-value.
by breaking the execution at malloc entry and extracting the byte size and memory address from registers.
by returning the exact value rather than an estimate collected during preceding execution.
memory_scopet keeps the starting point and allocated size for a malloc(ed) site. We also include helper methods to query about dynamic allocation.
To account for the possibility of the pointee to be dynamically allocated.
Via dynamically_allocated and memory_map.
into a member method.
Unnecessary now that we enforce analysing those objects.
1: If pointee is known, return it's value instead. 2: malloc-size is precise and type-size has a wrapper 3: do not cast values from dynamically allocated pointers: we would loose the information about them being arrays.
1: Use memory map to get pointer-values (but double-check if gdb previously claimed that their pointees are themselves) 2: add more checks before casting.
instead of a vector to prevent duplications.
1: Casts are a part of the output 2: We can use some of the assertions previously left out (now that we get the dynamic allocation right).
Floating point numbers cannot be extracted for now.
rather than as a part of regression testing and only under non-MSVC environment.
2ab16c4 to
ba8beb5
Compare
allredj
left a comment
There was a problem hiding this comment.
✔️
Passed Diffblue compatibility checks (cbmc commit: ba8beb5).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/113721650
Enable tracking information about dynamically allocated memory from
memory-analyzeracross togoto-harnessvia memory snapshots. The knowledge about what addresses have been allocated dynamically (and the scope of these allocation in byte size) is collected by breaking the execution atmalloccall-sites and inspecting the registers of these function calls.