Hi All,
Windows 10, Windows PowerShell 5.1
We have local HTML documents produced with an authoring tool (MadCap Flare), which generates well-formatted HTML. Unfortunately, it adds non-printing HTML characters (&nbsp;) within <code> tags, which then cause issues when users copy and paste text into another application. I want to run a post-processing PowerShell script that parses the HTML, pulls out the <code> tags, and replaces the text within those tags, but ONLY within those tags.
I have tried to create an HTMLFile object to see if I could pull out the text within the tags using the following code, but it returns nothing. I have even tried some very basic HTML and just looked at <p> tags, but that still returns nothing.
$Source = Get-Content -path example.html -raw
$HTML = New-Object -Com "HTMLFile"
$HTML.IHTMLDocument2_write($Source)
$HTML.all.tags("code") | % InnerText
I was thinking that perhaps I could convert the HTML to XML using ConvertTo-Xml, which seems to work, but then I'm not sure where to go next.
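For reference, the goal described above (changing text only inside <code> elements and nowhere else) can be sketched without an HTML parser at all. This is a rough illustration only, under several assumptions: the function name Repair-CodeTagText is hypothetical, the unwanted characters are assumed to be non-breaking spaces (written as &nbsp;, &#160;, &#xA0; or the literal character), <code> elements are assumed not to be nested, and the file is assumed to be UTF-8. A regular expression is not a real HTML parser, so this may not cope with unusual markup.

```powershell
# Hypothetical helper (not part of any module): replaces non-breaking spaces
# with ordinary spaces, but only inside <code>...</code> elements.
function Repair-CodeTagText {
    param(
        [Parameter(Mandatory)]
        [string]$Html
    )

    # (?s) lets '.' match line breaks, (?i) ignores case, and '.*?' stops at
    # the first closing tag. Assumes <code> elements are never nested.
    $pattern = '(?si)(<code\b[^>]*>)(.*?)(</code>)'

    # Called once per matched <code> element; the opening and closing tags are
    # passed through unchanged and only the inner text is modified.
    [System.Text.RegularExpressions.MatchEvaluator]$evaluator = {
        param($m)
        # Assumption: these are the forms the non-breaking space takes.
        $inner = $m.Groups[2].Value -replace '&nbsp;|&#160;|&#xA0;|\u00A0', ' '
        $m.Groups[1].Value + $inner + $m.Groups[3].Value
    }

    [regex]::Replace($Html, $pattern, $evaluator)
}

# Possible usage (assumes example.html is UTF-8 and may be overwritten):
# $Source = Get-Content -Path example.html -Raw -Encoding UTF8
# Repair-CodeTagText -Html $Source | Set-Content -Path example.html -Encoding UTF8
```

Text outside <code> elements is never passed to the inner replacement, so any &nbsp; used elsewhere in the page stays as it is.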
I tried to upload some example HTML, but these forums will not allow me to do that, either inline or as a txt file :(
I even tried Dropbox and OneDrive links, but both are denied :(

