Skip to content

Conversation

@johandahlberg
Copy link
Contributor

It seems that perhaps it is necessary to change the charset to UTF-8 in two places to get the body correctly encoded (exactly why this is so is a little bit unclear to me, I solved this by trail and error). I also added that the subject should be utf-8 encoded, to make sure that non-ascii characters are allowed there.

@Kami
Copy link
Member

Kami commented Feb 5, 2019

Thanks. I will have a look and merge those changes directly into my branch.

cat <<EOF
--${boundary}
Content-Type: ${CONTENT_TYPE}
Content-Type: ${CONTENT_TYPE}; charset=UTF-8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add it here because I thought we don't need it here. This section refers to attachments which should alrady be base64 encoded so it should only contain ascii characters.

Having said that, if it works fine and has no negative side affects, I'm fine with keeping it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it is a bit odd, but when I did not include this, I got garbled non-ascii characters in the e-mail. Not sure why though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: I was wrong, this part also refers to the html version of the body. The section above refers to the attachments so it's correct to have charset here as well.

TO: ${TO}
FROM: ${FROM}
SUBJECT: ${SUBJECT}
SUBJECT: =?UTF-8?q?${SUBJECT}?=
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this works fine with all content types - text/html (default) and text/plain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correctly based on this, https://tools.ietf.org/html/rfc1342, the encoding of the header is separate from the encoding of the message (which is why this needs to be specified separately). I've tested text/plain and 'text/html' and both seem to work fine.

TO: ${TO}
FROM: ${FROM}
SUBJECT: ${SUBJECT}
SUBJECT: =?UTF-8?q?${SUBJECT}?=
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we might want to base64 encode the subject and change q to B.

q specifies subject is encoded using quoted-printable encoding which is not the case so it probably won't work correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I've updated the PR here.

@Kami Kami merged commit d163a15 into StackStorm:fix_unicode_issue_in_serialize_positional_arguments Feb 5, 2019
@Kami
Copy link
Member

Kami commented Feb 5, 2019

LGTM, thanks 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants