mrontzthedev-1624 asked RichMatheisen-8856 edited

Is this even possible???

I am stuck on a pretty significant issue, one that even Google doesn't seem to have an answer for. Here is my situation: I have a program that takes an HTML file as input and converts the tags into AMP-valid format. For some reason, after conversion, it bunches all the code onto a single line, so I have to go in, scroll to each tag, and press [Enter] to move the tag onto a new line. My question is: how the heck do I write a mini-script that can run after the conversion to do this one simple job? For the life of me, I cannot figure it out. Someone please help!

windows-server-powershell

RichMatheisen-8856 answered

I'm assuming that "AMP format" is a modified form of HTML? If that's true, see if something like this works for you:

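 # Create the HTMLFile COM object and feed it the whole file as one string
 $HTML = New-Object -Com "HTMLFile"
 $src = Get-Content c:\junk\x.html -Raw
 $HTML.IHTMLDocument2_write($src)
 # Write the document as re-serialized by the parser to a new file
 $HTML.documentElement.outerHTML |
     Out-File c:\junk\NewX.html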

Note that COM is known for being persnickety, and the HTMLFile COM object uses (IIRC) the Internet Explorer HTML parser, so be prepared for possible parsing problems!

Another choice may be the HTML Agility Pack. It's not something I've used, but it seems to be better than that COM stuff. Here's an example using PowerShell: html-agility-pack-rocks-your-screen-scraping-world



IanXue-MSFT answered RichMatheisen-8856 edited

Hi,

If it's an HTML file, you can try this:

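 # Avoid the name $input: it is a PowerShell automatic variable
 $inputPath = "C:\temp\input.html"
 $outputPath = "C:\temp\output.html"
 # Insert a CRLF before every "<" that does not start a closing tag
 (Get-Content -Path $inputPath -Raw) -replace "<(?!/)","`r`n<" | Out-File -FilePath $outputPath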
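
To see what that replacement does before running it against a real file, here is a minimal sketch on an in-memory string. The sample markup and the variable names are made up for illustration:

```powershell
# Hypothetical one-line sample, standing in for the converter's output
$sample = '<html><body><p>Hi</p></body></html>'

# Same pattern as above: CRLF before each "<" that is not followed by "/"
$broken = $sample -replace '<(?!/)', "`r`n<"

# Split on CRLF to inspect the resulting lines
$lines = $broken -split "`r`n"
$lines
```

The first element is empty because the very first tag also gets a line break in front of it, and closing tags stay attached to whatever precedes them. Calling `.TrimStart()` on the result should drop the leading blank line if it is unwanted.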


Best Regards,
Ian Xue


The problem with naïvely inserting a CRLF before every "<" is that you may encounter a lone "<" character inside a tag or in a quoted string.

For example:

<em title="PW<UW">this code</em>
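
One way around that, as a rough sketch rather than a real HTML parser, is to match each opening tag as a whole, treating quoted attribute values as single units, and put the line break in front of the match. The pattern and variable names below are my own assumptions. It will still be fooled by a "<" in comments, script blocks, or unescaped text:

```powershell
# Hypothetical one-line sample with a "<" inside a quoted attribute value
$html = '<p><em title="PW<UW">this code</em></p>'

# An opening tag: "<", a letter, then any run of double-quoted strings,
# single-quoted strings, or characters that are neither quotes nor ">",
# up to the closing ">"
$openTag = '<[A-Za-z](?:"[^"]*"|''[^'']*''|[^''">])*>'

# Put a CRLF in front of each whole opening tag ($0 is the full match),
# then remove the blank line created before the first tag
$result = ($html -replace $openTag, "`r`n`$0").TrimStart()
$result
```

Because the match consumes the entire tag, including `title="PW<UW"`, the "<" inside the quotes is never considered as the start of a tag. For anything beyond simple, well-formed markup, a real parser is likely the safer choice.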

