C++: incorrect string type conversion #264

raulgarciamsft · 2018-10-01T17:28:07Z

Cast between semantically different string types: char* from/to wchar_t*
NOTE: Please let me know if you want to use a different CWE than CWE-704

rdmarsh2 · 2018-10-01T17:56:31Z

I ran this on both Linux and Windows builds of ChakraCore. It finds 0 results on the Linux build and 59 on the Windows build, so I suspect we'll want changes so that it produces results on ChakraCore on LGTM.com.

@geoffw0 I believe you're the most familiar with how string types are handled in ChakraCore on Linux, so I've assigned this PR to you.

raulgarciamsft · 2018-10-01T18:02:21Z

Thanks.
Please let me know if there are any instructions I should follow to access ChakraCore for Linux on LGTM (feel free to send the instructions to me privately via email).

rdmarsh2 · 2018-10-01T18:09:31Z

Unfortunately, the ChakraCore build on LGTM.com is currently broken. There should be a fix for it in the next couple days, but I can send you the snapshot I've been testing with in the meantime.

raulgarciamsft · 2018-10-01T18:11:47Z

That will work. Thanks a lot.

raulgarciamsft · 2018-10-01T22:21:20Z

I tried building a snapshot of ChakraCore (https://github.com/microsoft/ChakraCore) for Windows, and I haven't been able to find any code that matches this new rule in it, nor in the Linux snapshot that Robert shared with me.
Do you have any hints on which files/lines I should look at?
Thanks

jbj · 2018-10-02T10:07:15Z

I ran this query on 43 projects and was surprised that it found results on only two projects. On the Suspicious pointer scaling query we've had to assume that all casts to a char type are fine because people tend to convert to char * and then do a memcpy-like operation.

The results of this query also fall in this category. In fish-shell, the char pointers are used to copy wide chars into a block of memory returned from malloc where data of various types is manually laid out. In pwsafe, such a cast is used to take the SHA1 hash of a wide-char string. It seems reasonable to me that a SHA1 function takes a char * argument, and it also seems reasonable to want to call that function with an array of elements where there is no padding.
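The wide-to-byte direction described above can be sketched as follows. This is only an illustration: `byte_sum` is a hypothetical stand-in for a byte-oriented hash such as a SHA1 routine, and `sum_wide` is a made-up wrapper, neither taken from fish-shell or pwsafe.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a hash function with a byte-oriented interface.
inline std::uint32_t byte_sum(const unsigned char *data, std::size_t len) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum += data[i];
    }
    return sum;
}

// Viewing wide characters as raw bytes is well defined: unsigned char may
// alias any object, and the elements of the string are contiguous with no padding.
inline std::uint32_t sum_wide(const std::wstring &s) {
    return byte_sum(reinterpret_cast<const unsigned char *>(s.data()),
                    s.size() * sizeof(wchar_t));
}
```

The length is passed explicitly in bytes, so nothing depends on finding a terminator through the `unsigned char *` view.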

The @description and qhelp suggest that the primary use case for this query is to catch conversions from char * to wchar_t *. What do you say to ignoring conversions that go the other way? I ran that version of the query and found that it gave fewer and better results. The results on pwsafe look dodgy: I can't see how the memory allocated to an unsigned char[] will be suitably aligned for wchar_t *. Maybe the qhelp should specifically mention alignment.
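A minimal sketch of the alignment concern, assuming a raw `unsigned char` buffer that holds wide-character data. The helper names `aligned_for_wchar` and `read_wide` are hypothetical and are not part of the query or of pwsafe.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// True if p is suitably aligned to be accessed as a wchar_t object.
inline bool aligned_for_wchar(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(wchar_t) == 0;
}

// Copies n wide characters out of a byte buffer without ever forming a
// wchar_t* into that buffer, so the buffer's alignment does not matter.
inline std::wstring read_wide(const unsigned char *bytes, std::size_t n) {
    std::wstring out(n, L'\0');
    if (n != 0) {
        std::memcpy(&out[0], bytes, n * sizeof(wchar_t));
    }
    return out;
}
```

An `unsigned char[]` only guarantees alignment of 1, so casting its address to `wchar_t *` may produce a misaligned pointer; copying with `memcpy` sidesteps that.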

jbj · 2018-10-02T11:00:44Z

cpp/ql/src/Security/CWE/CWE-704/WcharCharConversion.cpp

@@ -0,0 +1,3 @@
+LPWSTR pSrc;
+
+pSrc = (LPWSTR)"a";


Please state this example in terms of wchar_t, which is a standard type.
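For illustration, the example might look roughly like this when written with the standard type. The wrapper function and the two size constants are hypothetical additions, included only so the fragment compiles on its own.

```cpp
#include <cstddef>

// BAD: the narrow literal "a" occupies two bytes ('a' and '\0'). Reading it
// through wchar_t* searches for a wide terminator that may lie past the array.
inline wchar_t *bad_cast() {
    wchar_t *pSrc;
    pSrc = (wchar_t *)"a";
    return pSrc;  // must not be dereferenced
}

// Sizes of the narrow and wide spellings of the same literal, in bytes.
constexpr std::size_t narrow_bytes = sizeof("a");
constexpr std::size_t wide_bytes = sizeof(L"a");
```

Because `wide_bytes` is `2 * sizeof(wchar_t)`, the wide literal is always larger than the two-byte narrow literal the cast points at.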

jbj · 2018-10-02T11:11:04Z

cpp/ql/src/Security/CWE/CWE-704/WcharCharConversion.qhelp

+<qhelp>
+
+<overview>
+  <p>This rule indicates a potentially incorrect cast from/to an ANSI string (<code>char *</code>) to/from a Unicode string (<code>wchar_t *</code>).</p>


I don't think it's accurate to say that char is ANSI and wchar_t is Unicode. That's what they're called in Windows APIs, but other platforms use char for 7-bit ASCII, 8-bit legacy charsets, and UTF-8 Unicode alike.

I suggest replacing "ANSI string" with "byte string" and "Unicode string" with "wide-character string". Also below.

jbj · 2018-10-02T11:17:26Z

cpp/ql/src/Security/CWE/CWE-704/WcharCharConversion.qhelp

+</overview>
+
+<recommendation>
+  <p>Do not explicitly casting ANSI strings to/from Unicode strings.</p>


"casting" -> "cast"

jbj · 2018-10-02T11:27:50Z

cpp/ql/src/Security/CWE/CWE-704/WcharCharConversion.qhelp

+</overview>
+
+<recommendation>
+  <p>Do not explicitly casting ANSI strings to/from Unicode strings.</p>


Ideally we should give a recommendation that can make the alert go away and keep the code working. What is the typical remedy for this sort of error? For string literals we can recommend prepending L. For other cases, should we recommend calling an appropriate (platform-dependent) conversion function?
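One possible shape for such a recommendation, as a sketch only: a wide literal for the literal case, and the standard `std::mbstowcs` for runtime strings. The helper name `widen` is hypothetical, the conversion depends on the current C locale, and platform-specific functions (for example `MultiByteToWideChar` on Windows) may be the better choice in real code.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// For literals: write a wide literal instead of casting a narrow one.
inline const wchar_t *wide_literal() {
    return L"a";
}

// For runtime strings: convert instead of casting. Returns an empty string
// if the input is not valid in the current C locale.
inline std::wstring widen(const char *narrow) {
    // A multibyte string never yields more wide characters than it has bytes.
    std::vector<wchar_t> buf(std::strlen(narrow) + 1);
    std::size_t n = std::mbstowcs(buf.data(), narrow, buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        return std::wstring();
    }
    return std::wstring(buf.data(), n);
}
```

Both alternatives produce properly terminated and aligned `wchar_t` data, which is what lets the alert go away while keeping the code working.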

jbj · 2018-10-02T11:28:28Z

cpp/ql/src/Security/CWE/CWE-704/WcharCharConversion.qhelp

+</recommendation>
+
+<example>
+<p>In the following example, an ANSI string literal (<code>"a"</code>) is casted as a Unicode string.</p>


"casted as" -> "cast to"

raulgarciamsft · 2018-10-02T17:12:25Z

Thanks a lot. I will work on all your suggestions and submit them ASAP.

jbj · 2018-10-03T06:32:31Z

cpp/ql/src/Security/CWE/CWE-704/WcharCharConversion.ql

@@ -0,0 +1,29 @@
+/**
+ * @name Cast between semantically different string types: char* from/to wchar_t*


I'd like a shorter @name for consistency with our existing queries. I propose "Cast from char* to wchar_t*".

jbj · 2018-10-03T06:58:00Z

cpp/ql/src/Security/CWE/CWE-704/WcharCharConversion.ql

@@ -0,0 +1,29 @@
+/**
+ * @name Cast between semantically different string types: char* from/to wchar_t*
+ * @description This rule indicates a potentially incorrect cast from/to an ANSI string (char *) to/from a Unicode string (wchar_t *). 


The @description is out of date with the recent changes. For consistency with our other queries, I suggest rephrasing it as follows: "Casting a byte string to a wide-character string is likely to yield a string that is incorrectly terminated or aligned. This can lead to undefined behavior, including buffer overruns."

Fixes ready.

jbj

Thanks for the query and the fixes. I'll merge this now, and I look forward to seeing the results on lgtm.com. The next deployment should happen in a week or two.

geoffw0 · 2018-10-08T11:14:38Z

Sorry I didn't get around to looking at this last week. The new query looks good, and we really appreciate the tests and qhelp alongside.

raulgarciamsft and others added 2 commits October 1, 2018 10:25

C++ : cpp/incorrect-string-type-conversion

253b8d1

Cast between semantically different string types: char* from/to wchar_t* NOTE: Please let me know if you want to use a different CWE than CWE-704

Merge branch 'master' into users/raulga/c6276

99e6708

rdmarsh2 assigned geoffw0 Oct 1, 2018

jbj reviewed Oct 2, 2018

raulgarciamsft added 2 commits October 2, 2018 11:17

Updates based on feedback

230724c

Merge operation

492b511

jbj reviewed Oct 3, 2018

Chnaging the @name & @description.

3873cbd

jbj approved these changes Oct 4, 2018

jbj merged commit 4720c5a into github:master Oct 4, 2018

geoffw0 added a commit to geoffw0/ql that referenced this pull request Oct 8, 2018

CPP: Change note for github#264.

4fb6611

geoffw0 mentioned this pull request Oct 8, 2018

CPP: Additional change notes. #289

Merged

raulgarciamsft deleted the users/raulga/c6276 branch October 4, 2019 16:58

aibaars pushed a commit that referenced this pull request Oct 14, 2021

Merge pull request #264 from github/hmac-dependabot

4cbd848

Enable dependabot on the Rust projects

smowton pushed a commit to smowton/codeql that referenced this pull request Apr 16, 2022

Merge pull request github#264 from github/kotlin-sam-conversion

4f56d88

Extract SAM lambda conversion

MathiasVP added a commit to MathiasVP/ql that referenced this pull request Aug 10, 2025

Merge pull request github#264 from microsoft/simple-type-sanitizers

f8bdfa4

PS: Add simple type-based sanitizer to SQL injection query
