Skip to content

Adding preprocessing for functions of the Java Character library#640

Merged
kroening merged 2 commits intodiffblue:masterfrom
romainbrenguier:java-character-library-support
May 21, 2017
Merged

Adding preprocessing for functions of the Java Character library#640
kroening merged 2 commits intodiffblue:masterfrom
romainbrenguier:java-character-library-support

Conversation

@romainbrenguier
Copy link
Contributor

This preprocessing is meant to replace simple methods of the Character
Java in goto-programs by expressions on characters.
About 60% of the library is supported, but for some of this
methods are limited to ASCII characters.
This is meant to be run in conjunction with the string refinement so
it is called if the string-refine option is activated.
Note that the string-refine option is added in PR #429, but this PR can be merged independently.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No single-character indent, please (just don't indent public/private).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No whitespace after &

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing a list by value?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blank line at end-of-file, please.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments around those magic numbers would really be appreciated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather random as it applies across this PR: use (const) references wherever possible, both in parameters and within the functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put each cpp file on a separate line to facilitate merging

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assignment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why this has not been implemented yet?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assignment (several occurrences)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please comment on why this function is not implemented (several instances)

@romainbrenguier romainbrenguier force-pushed the java-character-library-support branch from e0cd4ee to b5cd96b Compare March 18, 2017 20:54
Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly minor changes - many about the use of references.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kroening @peterschrammel @romainbrenguier I am wondering whether the invoked code should go in java_bytecode. (Yes, I have asked a similar question before.) If that folder isn't a good fit (for it introduces a dependency on goto-programs from java_bytecode), then maybe it's time for some folder like java_in-processing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right it should probably go to java_bytecode, the dependencies in goto-programs are not strong here. I will investigate that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some changes to the code which allows it to be called during java bytecode conversion.
So it can be moved there. See the commits 5801fbf , 848983b and 4c1986a

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const typet &type=result.type(); If you delay the target->make_assignment() call until after code is constructed, you could even make all of the above const exprt &

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pass mp_integer by (const) reference

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto &i : list

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use disjunction from util/std_expr.h to avoid a deeply nested expression.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use references

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use references - see above

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this return; statement

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to catch or check additional function names (think about evolution of the Java SDK)? I'm thinking along the lines of "if the function call has prefix java::java.lang.Character." then it must be in the conversion_table or in some (to be created) conversion_not_supported table. Would that seem reasonable to do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to detect functions that we do not support, but I don't think it should stop CBMC if we encounter one, considering its result non-deterministic maybe enough for what we want to check. We could raise a warning in that case, but it's actually already done by CBMC: we get messages like **** WARNING: no body for function java::java.lang.Character.toUpperCase:(C)C for the functions we haven't translated.
The reasonable thing in my opinion would be to update the code with the new versions of the JDK. From what I have seen, there will not be much difference with the JDK9 (the only difference I see is a new codePointOf(String) function which is the inverse of getName).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some mixed indent of parameters - either seems ok, but consistency would be great.

@tautschnig
Copy link
Collaborator

With regard to the unsupported functions: won't the warning about missing function bodies continue to show up for all functions that are natively handled by the string solver?

@romainbrenguier romainbrenguier force-pushed the java-character-library-support branch 2 times, most recently from 636c274 to 848983b Compare March 21, 2017 15:06
@romainbrenguier
Copy link
Contributor Author

@tautschnig the functions that we support are converted to some fake functions calls (which are understood by the solver), and these don't show up in the warnings.
So we only see the non-supported functions in the warnings.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note: with C++11 you don't need the space in > > anymore.

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all that I can tell from just reading code: this looks really good. Thanks a lot for all the changes and refactoring!

@kroening
Copy link
Collaborator

This needs a rebase!

This preprocessing is meant to replace simple methods of the Character
Java library by expressions on characters during bytecode conversion.
About 60% of the library is supported, but for some of this
methods are limited to ASCII characters.
This is meant to be run in conjunction with the string refinement so
it is called if the string-refine option is activated.
@romainbrenguier romainbrenguier force-pushed the java-character-library-support branch from 848983b to 924e01b Compare May 21, 2017 09:31
@romainbrenguier
Copy link
Contributor Author

@kroening this has been rebased.

@kroening kroening merged commit 9ec55a3 into diffblue:master May 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants