@RandomNoun7

This change allows the Get-FileEncoding cmdlet to run on PowerShell Core
and thus on Linux.

The -Encoding parameter of Get-Content has changed in Core so that 'byte'
is no longer a valid value.

The 'Sort' alias for Sort-Object has also been removed in the latest
versions of PS Core. This change also stops using the 'Select' alias,
even though it hasn't actually been removed yet, because there is
some discussion around removing all of these aliases from the default
PS Core environments, so it's best to get rid of it now.

PowerShell/PowerShell#5870

@RandomNoun7 RandomNoun7 force-pushed the get-fileencoding-ps-core-fixes branch from 36ad67c to a15517b Compare April 17, 2019 22:00
@RandomNoun7
Author

RandomNoun7 commented Apr 17, 2019

Forgive me if you already know this, but you can't see the diff because the file is encoded in UTF-16LE. If you want to convert the file to UTF-8 in a new commit, or if you would like me to do so in a different PR, I'm happy to wait for that and rebase this change on top of it.

In the meantime the changes are as follows.

Line 2056 from:

foreach($encodingLength in $encodingLengths | Sort -Descending)

To:

foreach($encodingLength in $encodingLengths | Sort-Object -Descending)

Line 2058 from:

$bytes = Get-Content -encoding byte -readcount $encodingLength $path | Select -First 1

To:

$bytes = Get-Content -raw -readcount $encodingLength $path | Select-Object -First 1
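For comparison, here is a minimal sketch of a version-aware byte read that stays with Get-Content. It relies on the -AsByteStream switch that PowerShell 6 and later provide in place of '-Encoding Byte'. The function name Read-FirstBytes and its parameters are hypothetical and not part of this PR, and the sketch has not been run against the script being changed here.

```powershell
# Hypothetical helper, not part of the PR: pick the byte-read option by PowerShell version.
function Read-FirstBytes {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [int] $Count
    )

    if ($PSVersionTable.PSVersion.Major -ge 6) {
        # PowerShell 6+ dropped '-Encoding Byte' in favor of the -AsByteStream switch.
        Get-Content -LiteralPath $Path -AsByteStream -TotalCount $Count
    }
    else {
        # Windows PowerShell 5.1 and earlier still accept '-Encoding Byte'.
        Get-Content -LiteralPath $Path -Encoding Byte -TotalCount $Count
    }
}
```

The branch that does not apply is never bound at runtime, so the same source should load on both editions. Whether this is preferable to a direct .NET read is a judgment call for the maintainers.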

@RandomNoun7
Author

I think this needs further testing. It looks like this call to Get-Content may be stripping the byte order mark from files that start with one.

This change allows the Get-FileEncoding cmdlet to run on PowerShell Core
and thus on Linux.

The -Encoding parameter of Get-Content has changed in Core so that 'byte'
is no longer a valid value.

Unfortunately I had to read the bytes directly through .NET, since
Get-Content strips a BOM if one exists. The -raw switch only ignores
line endings.

The 'Sort' alias for Sort-Object has also been removed in the latest
versions of PS Core. This change also stops using the 'Select' alias,
even though it hasn't actually been removed yet, because there is
some discussion around removing all of these aliases from the default
PS Core environments, so it's best to get rid of it now.

PowerShell/PowerShell#5870
@RandomNoun7 RandomNoun7 force-pushed the get-fileencoding-ps-core-fixes branch from a15517b to d759724 Compare April 17, 2019 23:30
@RandomNoun7
Author

Ok, this version works in PS 5.1 and PS Core on Windows and Linux.

Unfortunately I had to bypass Get-Content and read the bytes through .NET. The -raw switch on Get-Content only ignores line endings; the cmdlet still returns only what it considers the usable data, so it strips the BOM if one exists, which breaks the encoding matcher.

The new call on line 2058 is:

$bytes = [system.io.file]::ReadAllBytes($path) | Select-Object -first $encodingLength

Reading all bytes isn't the greatest strategy, since it loads the whole file just to inspect the first few bytes, but the alternative of manually handling a stream seems a little heavy-handed for an example like this.
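For reference, the stream-based alternative mentioned above could look roughly like the sketch below. It opens the file with [System.IO.File]::OpenRead and reads at most the requested number of bytes, so large files are not loaded in full. The function name Get-LeadingBytes and its parameters are hypothetical and not part of this PR.

```powershell
# Hypothetical helper, not part of the PR: read at most $Count bytes from the start of a file.
function Get-LeadingBytes {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [int] $Count
    )

    # Resolve to a full path because .NET's current directory can differ from PowerShell's location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $buffer = New-Object byte[] $Count
        $total = 0
        # Stream.Read may return fewer bytes than requested, so loop until full or end of file.
        while ($total -lt $Count) {
            $read = $stream.Read($buffer, $total, $Count - $total)
            if ($read -eq 0) { break }
            $total += $read
        }
        if ($total -eq 0) { return ,([byte[]]@()) }
        # Trim the buffer when the file is shorter than $Count.
        if ($total -lt $Count) { return ,([byte[]]$buffer[0..($total - 1)]) }
        return ,$buffer
    }
    finally {
        # Always release the file handle, even if the read throws.
        $stream.Dispose()
    }
}

# Possible call site, assuming $path and $encodingLength as in the script:
# $bytes = Get-LeadingBytes -Path $path -Count $encodingLength
```

This keeps the BOM intact like ReadAllBytes does, at the cost of about twenty extra lines, which is the trade-off weighed above.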

Ready for merge.
