C++: new query for futile arguments to C functions #790
|
Is the sample code really a good sample? As long as we don't see the definition of the |
|
As I see it, three things can go wrong around calls to
Is there any justification for declaring a C function with I think the query is right not to give an alert for case 3. It's wrong and dangerous, but it's such a common mistake that the more interesting results would drown in the noise. I think it's right to have an alert for case 2, but then the wording of the qhelp and query must change to accommodate it. They currently don't explain very well how |
|
@hmakholm points out that there's also a fourth case, which is the worst of them all and is not caught by this query:
I'm not certain of this, but there might also be a fifth case:
|
| @@ -0,0 +1,3 @@ | |||
| | test.c:7:3:7:5 | call to foo | This call has arguments, but $@ is not declared with any parameters. | test.c:1:6:1:8 | foo | foo | | |||
| | test.c:13:3:13:19 | call to not_yet_declared1 | This call has arguments, but $@ is not declared with any parameters. | test.c:13:3:13:3 | not_yet_declared1 | not_yet_declared1 | | |||
We could (now or in a follow-up PR) modify Function.getLocation() so that an implicit declaration is not used as the location to show for the function itself if there is an alternative non-implicit declaration, as in this case?
The query does not currently give an alert in this case, because the
I don't think so, but it's a very common practice in the wild.
There's also a 6th case, where there are as many arguments as parameters but there's a type mismatch that the compiler won't catch. Probably very rare but still possible; I suspect there are far more functions that are |
jbj
left a comment
Apart from a few comments, this PR LGTM. @rdmarsh2, please request review from @semmledocs-ac when the docs are done. A change note will also be needed.
| not_yet_declared1(1); // BAD | ||
| not_yet_declared2(1); // GOOD | ||
|
|
||
| declared_empty_defined_with(); // BAD |
Please mark as // BAD [NOT DETECTED]
|
|
||
| int x; | ||
| declared_empty_defined_with(&x); // BAD | ||
| declared_empty_defined_with(x, x); // BAD |
Please mark both of these as // BAD [NOT DETECTED]
Done; I think these (and the above instance) are best covered by a separate query.
| * @description A call to a function declared without parameters has arguments, which may indicate | ||
| * that the code does not follow the author's intent. | ||
| * @kind problem | ||
| * @problem.severity warning |
From the results you've shown so far, I think we can make it at least @precision high.
In addition to @precision, add @id and @tags attributes?
|
|
||
| <overview> | ||
| <p>A function is called with arguments despite having an empty parameter list. This may indicate | ||
| that the incorrect function is being called, or that the author misunderstood the function.</p> |
Please add text to explain the difference between () in C, (void) in C/C++, and () in C++.
|
Looks good so far. @rdmarsh2, please add precision and change note. @semmledocs-ac, please review the docs. |
|
|
||
| <p>In C, a function declared with an empty parameter list `()` is considered to have an unknown | ||
| parameter list, and therefore can be called with any set of arguments. To declare a function | ||
| which takes no arguments, you must use `(void)` as the parameter list in any forward declarations. |
I'm not sure what you mean by "forward" here.
Is "in any forward declarations" necessary in this sentence?
Yes; a forward declaration is a declaration that is not a definition (for functions, this means the declaration does not contain a function body)
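To make the term concrete, here is a minimal C sketch of a forward declaration followed later by the matching definition. The names `answer` and `twice_answer` are hypothetical and do not appear in this PR.

```c
/* Sketch only: `answer` and `twice_answer` are hypothetical names. */

/* Forward declaration: a declaration without a function body.
   The (void) states that the function takes no arguments. */
int answer(void);

/* This caller compiles with only the forward declaration in scope. */
int twice_answer(void) {
    return 2 * answer();
}

/* Definition: the same signature, this time with a body. */
int answer(void) {
    return 21;
}
```

Because the forward declaration uses `(void)`, a call such as `answer(1)` would be rejected at compile time.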
| <p>In C, a function declared with an empty parameter list `()` is considered to have an unknown | ||
| parameter list, and therefore can be called with any set of arguments. To declare a function | ||
| which takes no arguments, you must use `(void)` as the parameter list in any forward declarations. | ||
| In C++, either style of declaration will be considered to take no arguments.</p> |
"will be considered to take no arguments."
change to:
"indicates that the function accepts no arguments."
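As an illustration of the distinction the qhelp text draws, here is a minimal C sketch. The names `declared_empty` and `declared_void` echo the test file in this PR, but the return types, bodies, and the helper `call_both` are assumptions made so the snippet is self-contained.

```c
/* Sketch only: names echo the PR's test.c; return types and bodies are assumed. */

/* Empty parameter list. In C before C23 this leaves the parameters
   unspecified, so arguments at call sites are not checked.
   In C++ (and from C23 onwards) it means "no parameters". */
int declared_empty();

/* (void) means "no parameters" in both C and C++. */
int declared_void(void);

/* Hypothetical helper that calls both functions correctly. */
int call_both(void) {
    /* declared_empty(1, 2);  typically accepted by pre-C23 C compilers;
                              this is the pattern the query reports.   */
    /* declared_void(1);      rejected by C and C++ compilers alike.   */
    return declared_empty() + declared_void();
}

int declared_empty() { return 1; }
int declared_void(void) { return 2; }
```

The two problematic calls are left in comments so that the sketch compiles under any C or C++ standard.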
|
Add a reference to this query in one of the suite files - e.g. https://github.com/Semmle/ql/blob/master/cpp/config/suites/cpp/correctness ? |
In theory this query will produce no results on C++ code; in practice, I suspect the "cpp" suite is often run on code compiled as C, so it is likely to be worth running anyways.
|
LGTM |
|
I'm new to the thread and am trying to digest the discussion thus far. It would seem to me that the first order of business would be to work out a precise definition of when the diagnostic should be raised. So let me propose mine: "A call to a What I'm unsure of (since I'm new to all this) is the scope of the search for the definition of said function. Should it be confined to the same translation unit? To the entire project? Also, this should apply only to C, never C++. There is another case: we do find a definition of the function, but its parameters/return type is incompatible with our arguments. This would be the subject of a different QL query with different wording ("We were able to locate the definition of F, but its parameters do not match your argument list"...) |
|
I just tested to see whether such a query would overlap with compiler warnings, and I was surprised to find no compiler warnings even when the argument mismatch is within the same file. Here's an example:

    #include <stdio.h>
    int foo();
    int main() {
      printf("%d\n", foo());
      return 0;
    }
    int foo(int x) {
      return x;
    }

This program produces no warnings with Clang or GCC even when passing When it's that easy to pass too few parameters, I'd expect good results from a query that looks for this mistake. |
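For contrast, a sketch of the same shape as the example above, but with a prototype in the forward declaration. The helper `call_foo` and the argument value are assumptions for illustration. With the prototype in scope, the compiler can check the argument count at the call site.

```c
/* Sketch only: `call_foo` and the value 7 are hypothetical. */

/* Forward declaration with a prototype: the parameter list is known. */
int foo(int x);

int call_foo(void) {
    /* return foo();   would now fail to compile: too few arguments */
    return foo(7);
}

int foo(int x) {
    return x;
}
```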
|
@zlaski Your definition of what the analysis should do looks good to me in an ideal setting, but I think it could be difficult to get the "same types of parameters" part right. It can be fairly subtle which types are compatible on which platforms. Developers might argue that it's perfectly fine to cast their I hope the extractor will think of the three uses of the |
|
@jbj, I agree about leaving the types for later. I'll also leave the issue of multiple definitions; another query should (and maybe already does) address this. |
|
The c-extractor is behaving a bit strangely -- to me, anyway

    from FunctionCall fc, Function f
    where fc.getTarget() = f
    select fc, f

it produces the following .expected output:

    | test.c:7:3:7:16 | call to declared_empty | test.c:1:6:1:19 | declared_empty |
    | test.c:8:3:8:16 | call to declared_empty | test.c:1:6:1:19 | declared_empty |
    | test.c:9:3:9:15 | call to declared_void | test.c:2:6:2:18 | declared_void |
    | test.c:10:3:10:15 | call to declared_with | test.c:3:6:3:18 | declared_with |
    | test.c:12:3:12:12 | call to undeclared | test.c:12:3:12:3 | undeclared |
    | test.c:14:3:14:19 | call to not_yet_declared1 | test.c:14:3:14:3 | not_yet_declared1 |
    | test.c:14:3:14:19 | call to not_yet_declared1 | test.c:25:6:25:22 | not_yet_declared1 |
    | test.c:15:3:15:19 | call to not_yet_declared2 | test.c:15:3:15:3 | not_yet_declared2 |
    | test.c:15:3:15:19 | call to not_yet_declared2 | test.c:26:6:26:22 | not_yet_declared2 |
    | test.c:17:3:17:29 | call to declared_empty_defined_with | test.c:27:6:27:32 | declared_empty_defined_with |
    | test.c:18:3:18:29 | call to declared_empty_defined_with | test.c:27:6:27:32 | declared_empty_defined_with |
    | test.c:21:3:21:29 | call to declared_empty_defined_with | test.c:27:6:27:32 | declared_empty_defined_with |
    | test.c:22:3:22:29 | call to declared_empty_defined_with | test.c:27:6:27:32 | declared_empty_defined_with |
    | test.cpp:5:3:5:13 | call to cpp_varargs | test.cpp:1:6:1:16 | cpp_varargs |
    | test.cpp:6:3:6:13 | call to cpp_varargs | test.cpp:1:6:1:16 | cpp_varargs |

And so some I've been trying to cobble together FutileParams.ql that would correctly handle test.c, thus far unsuccessfully... |
|
We can't tell from that test output whether it's the call to |
|
I ran the query

    from FunctionCall fc, Function f
    where fc.getTarget() = f
    select fc, f, strictcount(fc.getTarget()), strictcount(f.getLocation())

and got

    | test.c:7:3:7:16 | call to declared_empty | test.c:1:6:1:19 | declared_empty | 1 | 1 |
    | test.c:8:3:8:16 | call to declared_empty | test.c:1:6:1:19 | declared_empty | 1 | 1 |
    | test.c:9:3:9:15 | call to declared_void | test.c:2:6:2:18 | declared_void | 1 | 1 |
    | test.c:10:3:10:15 | call to declared_with | test.c:3:6:3:18 | declared_with | 1 | 1 |
    | test.c:12:3:12:12 | call to undeclared | test.c:12:3:12:3 | undeclared | 1 | 1 |
    | test.c:14:3:14:19 | call to not_yet_declared1 | test.c:14:3:14:3 | not_yet_declared1 | 1 | 2 |
    | test.c:14:3:14:19 | call to not_yet_declared1 | test.c:25:6:25:22 | not_yet_declared1 | 1 | 2 |
    | test.c:15:3:15:19 | call to not_yet_declared2 | test.c:15:3:15:3 | not_yet_declared2 | 1 | 2 |
    | test.c:15:3:15:19 | call to not_yet_declared2 | test.c:26:6:26:22 | not_yet_declared2 | 1 | 2 |
    | test.c:17:3:17:29 | call to declared_empty_defined_with | test.c:27:6:27:32 | declared_empty_defined_with | 1 | 1 |
    | test.c:18:3:18:29 | call to declared_empty_defined_with | test.c:27:6:27:32 | declared_empty_defined_with | 1 | 1 |
    | test.c:21:3:21:29 | call to declared_empty_defined_with | test.c:27:6:27:32 | declared_empty_defined_with | 1 | 1 |
    | test.c:22:3:22:29 | call to declared_empty_defined_with | test.c:27:6:27:32 | declared_empty_defined_with | 1 | 1 |
    | test.cpp:5:3:5:13 | call to cpp_varargs | test.cpp:1:6:1:16 | cpp_varargs | 1 | 1 |
    | test.cpp:6:3:6:13 | call to cpp_varargs | test.cpp:1:6:1:16 | cpp_varargs | 1 | 1 |

and so it would indeed appear that we are dealing with unique |
|
It's not necessarily an issue that there are multiple locations. They should get merged into single alerts on LGTM even though they look funny in the qltest output. You can test your prototype queries in the LGTM query console to see roughly how alerts will look. If you find a need to exclude some of the locations, you can try selecting |
|
Should we treat undeclared functions ( |
|
@rdmarsh2, I have the QL query (along with updates to qhelp, test and test results) sitting on the zlaski-semmle:cpp340 branch. Is there a way for those to get pulled into the current PR, or should I create a new PR? |
|
It sounds right to treat undeclared functions the same as This PR (#790) is merged, so please make a new PR for your new query. |
|
New PR is #1136. Please disregard the closed PR above. |
This query addresses a corner-case of the C standard: functions declared in header files with empty parameter lists are considered to have an unknown parameter list, not an empty one. This means that arguments can be provided in calls that only have the declaration in scope, but will be ignored by the callee.
An LGTM test run is here. The one result looks correct.