C++: incorrect string type conversion #264
Cast between semantically different string types: char* from/to wchar_t*
NOTE: Please let me know if you want to use a different CWE than CWE-704.
|
I ran this on both Linux and Windows builds of ChakraCore. It finds 0 results on the Linux build and 59 on the Windows build, so I suspect we'll want changes so that it produces results on ChakraCore on LGTM.com. @geoffw0 I believe you're the most familiar with how string types are handled in ChakraCore on Linux, so I've assigned this PR to you. |
|
Thanks. |
|
Unfortunately, the ChakraCore build on LGTM.com is currently broken. There should be a fix for it in the next couple days, but I can send you the snapshot I've been testing with in the meantime. |
|
That will work. Thanks a lot. |
|
I tried building a snapshot of ChakraCore (https://github.com/microsoft/ChakraCore) for Windows, and I haven't been able to find any code that matches this new rule in it, nor in the Linux snapshot that Robert shared with me. |
|
I ran this query on 43 projects and was surprised that it found results on only two projects. On the Suspicious pointer scaling query we've had to assume that all casts to a char type are fine because people tend to convert to The results of this query also fall in this category. In fish-shell, the char pointers are used to copy wide chars into a block of memory returned from The |
| @@ -0,0 +1,3 @@ | |||
| LPWSTR pSrc; | |||
|
|
|||
| pSrc = (LPWSTR)"a"; No newline at end of file | |||
Please state this example in terms of wchar_t, which is a standard type.
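For illustration, a minimal sketch of how the flagged pattern might look when stated with the standard type `wchar_t` instead of `LPWSTR`. The function names `flagged` and `intended` are hypothetical and do not come from the PR; only the cast pattern does.

```cpp
// Hypothetical restatement of the snippet under review using wchar_t.

// BAD: reinterprets the bytes of a narrow string literal as wide characters.
// The returned pointer must not be read through as wchar_t.
const wchar_t *flagged() {
    return (const wchar_t *)"a";
}

// GOOD: a wide-character literal, which already has type const wchar_t[2].
const wchar_t *intended() {
    return L"a";
}
```

Only `intended()` yields a pointer that can safely be passed to wide-character functions.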
| <qhelp> | ||
|
|
||
| <overview> | ||
| <p>This rule indicates a potentially incorrect cast from/to an ANSI string (<code>char *</code>) to/from a Unicode string (<code>wchar_t *</code>).</p> |
I don't think it's accurate to say that char is ANSI and wchar_t is Unicode. That's what they're called in Windows APIs, but other platforms use char for 7-bit ASCII, 8-bit legacy charsets, and UTF-8 Unicode alike.
I suggest replacing "ANSI string" with "byte string" and "Unicode string" with "wide-character string". Also below.
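As a small sketch of the point about terminology: a `char` string can carry Unicode text when it holds UTF-8. The variable names below are hypothetical, chosen only for this example.

```cpp
#include <cstring>
#include <cwchar>

// U+00E9 (e with acute accent) written out as its two UTF-8 bytes
// in an ordinary byte string.
const char utf8_e_acute[] = "\xC3\xA9";

// The same character as a wide-character string: one wchar_t element.
const wchar_t wide_e_acute[] = L"\u00E9";

// Length in char elements (bytes) of the byte string.
std::size_t byte_length() { return std::strlen(utf8_e_acute); }

// Length in wchar_t elements of the wide-character string.
std::size_t wide_length() { return std::wcslen(wide_e_acute); }
```

Both strings encode Unicode text, so "byte string" and "wide-character string" describe the two types more accurately than "ANSI" and "Unicode".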
| </overview> | ||
|
|
||
| <recommendation> | ||
| <p>Do not explicitly casting ANSI strings to/from Unicode strings.</p> |
"casting" -> "cast"
Ideally we should give a recommendation that can make the alert go away and keep the code working. What is the typical remedy for this sort of error? For string literals we can recommend prepending L. For other cases, should we recommend calling an appropriate (platform-dependent) conversion function?
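A minimal sketch of the two remedies raised in this comment, assuming the standard C library function `std::mbstowcs` is an acceptable conversion routine (it interprets the input according to the current C locale; a platform-specific API may be preferable in real code). The helper names `wide_literal` and `widen` are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Remedy for string literals: prefix the literal with L instead of casting.
const wchar_t *wide_literal() {
    return L"a";
}

// Remedy for runtime strings: convert instead of casting.
// Returns an empty string if the input is not valid in the current locale.
std::wstring widen(const char *narrow) {
    // Every wide character consumes at least one byte of input,
    // so strlen is an upper bound on the number of wide characters.
    std::size_t max_len = std::strlen(narrow);
    std::wstring wide(max_len, L'\0');
    if (max_len == 0) {
        return wide;
    }
    std::size_t written = std::mbstowcs(&wide[0], narrow, max_len);
    if (written == static_cast<std::size_t>(-1)) {
        return std::wstring();  // invalid multibyte sequence
    }
    wide.resize(written);
    return wide;
}
```

Either form gives the callee a properly sized and terminated wide-character string, so the cast can be dropped without changing what the code is meant to do.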
| </recommendation> | ||
|
|
||
| <example> | ||
| <p>In the following example, an ANSI string literal (<code>"a"</code>) is casted as a Unicode string.</p> |
"casted as" -> "cast to"
|
Thanks a lot. I will work on all your suggestions and submit them ASAP. |
| @@ -0,0 +1,29 @@ | |||
| /** | |||
| * @name Cast between semantically different string types: char* from/to wchar_t* | |||
I'd like a shorter @name for consistency with our existing queries. I propose "Cast from char* to wchar_t*".
| * @description This rule indicates a potentially incorrect cast from/to an ANSI string (char *) to/from a Unicode string (wchar_t *). | |||
The @description is out of date with the recent changes. For consistency with our other queries, I suggest rephrasing it as follows: "Casting a byte string to a wide-character string is likely to yield a string that is incorrectly terminated or aligned. This can lead to undefined behavior, including buffer overruns."
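To make the "incorrectly terminated" part concrete, here is a small sketch comparing the storage of a narrow and a wide literal. It assumes `sizeof(wchar_t) > 1`, which holds on mainstream platforms but is not guaranteed by the standard; the constant names are hypothetical.

```cpp
#include <cstddef>

// A narrow literal "a" occupies two bytes: 'a' and a one-byte terminator.
constexpr std::size_t narrow_bytes = sizeof("a");

// A wide literal L"a" occupies two wchar_t elements: L'a' and a
// terminator that is sizeof(wchar_t) bytes wide.
constexpr std::size_t wide_bytes = sizeof(L"a");

// Assumption: wchar_t is wider than char. Under that assumption the narrow
// literal is too small to contain even one wide character plus a wide
// terminator, so code reading it as wchar_t would run past its end.
static_assert(narrow_bytes < wide_bytes,
              "assumes sizeof(wchar_t) > 1");
```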
Fixes ready.
jbj
left a comment
Thanks for the query and the fixes. I'll merge this now, and I look forward to seeing the results on lgtm.com. The next deployment should happen in a week or two.
|
Sorry I didn't get around to looking at this last week. The new query looks good, and we really appreciate the tests and qhelp alongside. |