question

JohnRounds-0053 asked IanXue-MSFT edited

PowerShell - Get-Content: Asian characters show as question marks

Hi,
I'm using PowerShell to read an HTML file and put it in an email as the body.
Everything (images, etc.) works fine except Asian characters, which show up as question marks.
I'm trying this:
$body = Get-Content -Path $bodyPath -Raw -Encoding Unicode

If I open the HTML file directly, the characters show up fine.

Any suggestions would be appreciated.
Thanks!

windows-server-powershell

AndreasBaumgarten answered

Hi @JohnRounds-0053 ,

Could you please try -Encoding UTF8?


(If the reply was helpful please don't forget to upvote and/or accept as answer, thank you)

Regards
Andreas Baumgarten


cooldadtx answered

Is the HTML actually Unicode? Unicode can represent pretty much every written language, but a file isn't necessarily stored in a Unicode encoding just because it contains non-English text. MBCS (multi-byte character set) encodings have been around for decades and are also used for non-English languages. I should also point out that Unicode is not a single on-disk format: in PowerShell, -Encoding Unicode means UTF-16 (little-endian), and UTF-8 and UTF-32 are also available. If you pick the wrong one, it'll mangle the text.
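To make the "wrong encoding mangles the text" point concrete, here is a minimal sketch. It works on an in-memory sample string (my own example, not the asker's file): it encodes the string as UTF-8, then decodes the same bytes three ways. The variable names are just for illustration.

```powershell
# Sample text with CJK characters, built from code points so that the
# encoding of this script file itself doesn't matter.
$text = 'Hello ' + [char]0x65E5 + [char]0x672C + [char]0x8A9E

# Encode as UTF-8, the way many HTML files are stored on disk.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($text)

# Decoding with the matching encoding round-trips the text.
$right = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)

# Decoding the same bytes as UTF-16 LE (what -Encoding Unicode means)
# pairs up unrelated bytes and produces garbage.
$wrong = [System.Text.Encoding]::Unicode.GetString($utf8Bytes)

# ASCII cannot represent the CJK characters, so each one is replaced by '?'.
$asciiBytes = [System.Text.Encoding]::ASCII.GetBytes($text)
$ascii = [System.Text.Encoding]::ASCII.GetString($asciiBytes)

'UTF-8 decode matches original  : {0}' -f ($right -eq $text)
'UTF-16 decode matches original : {0}' -f ($wrong -eq $text)
'After an ASCII round trip      : {0}' -f $ascii
```

Literal question marks, as in the last line, usually mean the characters passed through an encoding that could not represent them. A wrong decode tends to produce unrelated symbols instead.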

For an HTML document, the charset being used is supposed to be declared in a meta tag in the head of the HTML. Browsers use it to determine which character set to decode the page with. Look at that meta element and use the corresponding encoding when getting the content. For example, if the charset is set to utf-8 then use utf8, or possibly utf8BOM instead (which is only accepted by PowerShell 6 and later), as documented for the -Encoding parameter of Get-Content.
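As a rough sketch of that approach, the snippet below pulls the charset name out of the file and reads the file with the matching .NET encoding. Get-HtmlCharset is a hypothetical helper I made up for illustration, the temp file stands in for the real HTML file, and the utf-8 fallback is an assumption. It will not find the meta tag in a UTF-16 file, and legacy code page names may not resolve in every runtime.

```powershell
function Get-HtmlCharset {
    # Hypothetical helper: returns the charset name declared in the HTML, or $null.
    param([Parameter(Mandatory)][string]$Path)

    # .NET file APIs don't follow PowerShell's current location, so resolve first.
    $full  = (Resolve-Path -LiteralPath $Path).ProviderPath
    $bytes = [System.IO.File]::ReadAllBytes($full)

    # View the first few KB as Latin-1 so every byte maps to exactly one char.
    # The meta tag itself is plain ASCII in any ASCII-compatible encoding.
    $count = [Math]::Min($bytes.Length, 4096)
    $head  = [System.Text.Encoding]::GetEncoding('iso-8859-1').GetString($bytes, 0, $count)

    if ($head -match 'charset\s*=\s*["'']?\s*([A-Za-z0-9_\-]+)') {
        return $Matches[1]
    }
    return $null
}

# Demo file standing in for the real HTML (UTF-8 without BOM, two CJK characters).
$sample = [System.IO.Path]::GetTempFileName()
$html = '<html><head><meta charset="utf-8"></head><body>' +
        [char]0x4F60 + [char]0x597D + '</body></html>'
[System.IO.File]::WriteAllText($sample, $html, (New-Object System.Text.UTF8Encoding($false)))

$charset = Get-HtmlCharset -Path $sample
if (-not $charset) { $charset = 'utf-8' }   # assumed fallback when nothing is declared

# Read the whole file with the encoding the document declares.
$encoding = [System.Text.Encoding]::GetEncoding($charset)
$body = [System.IO.File]::ReadAllText($sample, $encoding)

Remove-Item -LiteralPath $sample
"Declared charset: $charset"
```

With Get-Content itself, the equivalent is to pass the matching -Encoding value (for example UTF8 for a utf-8 charset) together with -Raw.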

If your encoding is correct for the HTML charset then look at the text as it appears in PS to see if it is correct there. If it is correct in PS then the issue is with the sending to the mail client and/or how the mail client renders it.
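One way to do both checks is sketched below. The thread doesn't say how the mail is sent, so using System.Net.Mail.MailMessage is an assumption on my part, and the $body string is a stand-in for the text read from the file. The sketch first lists the code points of the non-ASCII characters, because the console font can show "?" or boxes even when the string is intact. It then builds a message with an explicit UTF-8 body encoding. Nothing is sent.

```powershell
# Stand-in for the $body that was read from the HTML file.
$body = '<p>' + [char]0x4F60 + [char]0x597D + '</p>'

# List the code points of the non-ASCII characters. Real CJK code points
# mean the read was fine; literal '?' characters (U+003F) in place of the
# text would mean it was already lost when the file was read.
$nonAscii   = $body.ToCharArray() | Where-Object { [int]$_ -gt 127 }
$codePoints = $nonAscii | ForEach-Object { 'U+{0:X4}' -f [int]$_ }
$codePoints -join ' '

# Build the message with the encoding stated explicitly instead of
# relying on a default.
$message = New-Object System.Net.Mail.MailMessage
$message.Subject         = 'Encoding test'
$message.SubjectEncoding = [System.Text.Encoding]::UTF8
$message.Body            = $body
$message.BodyEncoding    = [System.Text.Encoding]::UTF8
$message.IsBodyHtml      = $true

$bodyEncodingName = $message.BodyEncoding.WebName
"Body will be sent as: $bodyEncodingName"

# Actually sending would go through System.Net.Mail.SmtpClient with your
# own server settings; that part is left out here.
$message.Dispose()
```

If the script uses Send-MailMessage instead, its -Encoding parameter plays the same role. In Windows PowerShell that parameter defaults to ASCII, which would turn CJK characters into question marks, so passing -Encoding UTF8 along with -BodyAsHtml is worth trying.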


JohnRounds-0053 answered

Thanks!
I've tried both. utf8 didn't change anything, and utf8BOM came up with an error: the argument is null or empty.

I'll see what PS shows in the text.

This same HTML file was used with a previous app that sent it through email. We're taking it away from that app and using PowerShell to send it ourselves. So I assume the HTML file is OK, since it is the same file. But I'll check it before it heads to email.

Appreciate the suggestions.


IanXue-MSFT answered IanXue-MSFT edited

Hi,

It could be a font issue. Try setting the font of the PowerShell console to SimSun-ExtB.


Best Regards,
Ian Xue


