PowerShell - Copy strings from some HTML tags to other HTML tags

Suzana Eree 811 Reputation points
2021-03-10T09:48:50.13+00:00

I have this

<ul id="myNavigation">
    <li><a href="https://my-website.com/page-1.html" title="Page 1">Page 1 (34)</a></li>
</ul>

I want to copy the LINK, TITLE and NUMBER into this new markup, replacing the values already there:

<div class="categories-name">
   <a href="https://my-website.com/page-66.html" title="Page 66">
   <p class="font-16 color-grey text-capitalize"><i class="fa fa-angle-right font-14 color-blue mr-1"></i> Page 66 <span>27</span> </p>
  </a>
</div>

I wonder whether something like this can be done in PowerShell.

The output should be:

<div class="categories-name">
   <a href="https://my-website.com/page-1.html" title="Page 1">
   <p class="font-16 color-grey text-capitalize"><i class="fa fa-angle-right font-14 color-blue mr-1"></i> Page 1 <span>34</span> </p>
  </a>
</div>

Accepted answer
  1. Ian Xue (Shanghai Wicresoft Co., Ltd.) 30,361 Reputation points Microsoft Vendor
    2021-03-11T07:41:26.037+00:00

    Hi @Suzana Eree ,

    You can implement the loop like this:

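    $file1 = 'D:\temp\file1.html'
    $file2 = 'D:\temp\file2.html'
    $result = 'D:\temp\result.html'
    $link = @()
    $title = @()
    $number = @()
    # Read file1 in chunks that end at each </li> and collect the link,
    # title and number of every list item
    Get-Content -Path $file1 -Delimiter '</li>' | ForEach-Object {
        if ($_ -match '(?<=href=").+?(?=")') { $link += $Matches[0] }
        if ($_ -match '(?<=title=").+?(?=")') { $title += $Matches[0] }
        if ($_ -match '(?<=\()\d+(?=\))') { $number += $Matches[0] }
    }
    # Read file2 in chunks that end at each </div>; @() keeps a single chunk
    # as an array so that $content[$i] is a chunk, not a character
    $content = @(Get-Content -Path $file2 -Delimiter '</div>')
    for ($i = 0; $i -lt $content.Count; $i++) {
        # Find the values currently in chunk $i
        if ($content[$i] -match '(?<=href=").+?(?=")') { $link2 = $Matches[0] }
        if ($content[$i] -match '(?<=title=").+?(?=")') { $title2 = $Matches[0] }
        if ($content[$i] -match '(?<=<span>)\d+(?=</span>)') { $number2 = $Matches[0] }
        # Escape the old values so that '.' and similar characters match literally,
        # and replace the number only inside <span>...</span>
        $content[$i] -replace [regex]::Escape($link2), $link[$i] -replace [regex]::Escape($title2), $title[$i] -replace "(?<=<span>)$number2(?=</span>)", $number[$i] | Out-File -FilePath $result -Append
    }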
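
As a possible alternative, here is a minimal sketch that builds each output block from a template instead of patching the values already in file2. It assumes every target block uses exactly the markup shown in the question, and the names `$source`, `$pattern`, `$template` and `$blocks` are made up for this illustration. The input is an inline here-string rather than file1.html, so the snippet runs without any files.

```powershell
# Inline stand-in for file1.html (assumption: same structure as in the question)
$source = @'
<ul id="myNavigation">
<li><a href="https://my-website.com/page-1.html" title="Page 1">Page 1 (34)</a></li>
<li><a href="https://my-website.com/page-2.html" title="Page 2">Page 2 (29)</a></li>
</ul>
'@

# One pattern with named groups captures link, title, visible text and number per item
$pattern = '<a href="(?<link>[^"]+)" title="(?<title>[^"]+)">(?<text>.+?) \((?<number>\d+)\)</a>'

# Target markup from the question; {0}=link, {1}=title, {2}=text, {3}=number
$template = @'
<div class="categories-name">
   <a href="{0}" title="{1}">
   <p class="font-16 color-grey text-capitalize"><i class="fa fa-angle-right font-14 color-blue mr-1"></i> {2} <span>{3}</span> </p>
  </a>
</div>
'@

# Emit one filled-in block per match
$blocks = foreach ($m in [regex]::Matches($source, $pattern)) {
    $template -f $m.Groups['link'].Value, $m.Groups['title'].Value,
                 $m.Groups['text'].Value, $m.Groups['number'].Value
}

$blocks
```

With this approach the number of blocks follows file1, so the two files no longer have to contain the same number of entries. To save the result, `$blocks` could be piped to `Out-File -FilePath $result`.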
    

    Best Regards,
    Ian Xue

    ============================================

    If the Answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


3 additional answers

  1. Ian Xue (Shanghai Wicresoft Co., Ltd.) 30,361 Reputation points Microsoft Vendor
    2021-03-10T15:02:10.89+00:00

    Hi @Suzana Eree ,

    You can try some regexes like the ones below:

    $file1 = 'D:\temp\file1.html'
    $file2 = 'D:\temp\file2.html'
    $result = 'D:\temp\result.html'
    # Extract the new link, title and number from file1
    Get-Content -Path $file1 | ForEach-Object {
        if ($_ -match '(?<=href=").+?(?=")') { $link = $Matches[0] }
        if ($_ -match '(?<=title=").+?(?=")') { $title = $Matches[0] }
        if ($_ -match '(?<=\()\d+(?=\))') { $number = $Matches[0] }
    }
    # Find the values currently in file2
    $content = Get-Content -Path $file2
    $content | ForEach-Object {
        if ($_ -match '(?<=href=").+?(?=")') { $link2 = $Matches[0] }
        if ($_ -match '(?<=title=").+?(?=")') { $title2 = $Matches[0] }
        if ($_ -match '(?<=<span>)\d+(?=</span>)') { $number2 = $Matches[0] }
    }
    # Swap old for new; escaping makes '.' and similar characters match literally,
    # and the number is replaced only inside <span>...</span>
    $content -replace [regex]::Escape($link2), $link -replace [regex]::Escape($title2), $title -replace "(?<=<span>)$number2(?=</span>)", $number | Out-File -FilePath $result
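
To see what the three patterns pick up, here is a small sketch that runs them against the single `<li>` line from the question. The variable `$source` is only an illustration. The lookbehind `(?<=...)` and lookahead `(?=...)` require the surrounding text to be present but keep it out of the match, so `$Matches[0]` holds just the value.

```powershell
# Sample line taken from the question
$source = '<li><a href="https://my-website.com/page-1.html" title="Page 1">Page 1 (34)</a></li>'

# Each pattern matches only the text between its delimiters
$link   = if ($source -match '(?<=href=").+?(?=")')  { $Matches[0] }
$title  = if ($source -match '(?<=title=").+?(?=")') { $Matches[0] }
$number = if ($source -match '(?<=\()\d+(?=\))')     { $Matches[0] }

"$link | $title | $number"
# Output: https://my-website.com/page-1.html | Page 1 | 34
```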
    

    Best Regards,
    Ian Xue


  2. Suzana Eree 811 Reputation points
    2021-03-10T15:33:05.567+00:00

    Hi @Ian Xue (Shanghai Wicresoft Co., Ltd.),

    Thank you, great answer! But do you perhaps have a complete solution for the case where there are many lines? Both files have the same number of entries and the same structure; only the strings that need to be parsed differ. I believe a loop is needed. Something like THIS or THIS

    <ul id="myNavigation">  
    <li><a href="https://my-website.com/page-1.html" title="Page 1">Page 1 (34)</a></li>  
    <li><a href="https://my-website.com/page-2.html" title="Page 2">Page 2 (29)</a></li>  
    <li><a href="https://my-website.com/page-3.html" title="Page-3">Page 3 (11)</a></li>  
    ....  
    <li><a href="https://my-website.com/page-40.html" title="Page-4">Page 4 (54)</a></li>  
    </ul>  
    

    AND THE SECOND PART:

    <div class="categories-name">  
            <a href="https://my-website.com/page-66.html" title="Page 66">  
            <p class="font-16 color-grey text-capitalize"><i class="fa fa-angle-right font-14 color-blue mr-1"></i> Page 66 <span>27</span> </p>  
         </a>  
    </div>  
    <div class="categories-name">  
         <a href="https://my-website.com/page-67.html" title="Page 67">  
         <p class="font-16 color-grey text-capitalize"><i class="fa fa-angle-right font-14 color-blue mr-1"></i> Page 67 <span>24</span> </p>  
    </a>  
    </div>  
    <div class="categories-name">  
        <a href="https://my-website.com/page-68.html" title="Page 68">  
        <p class="font-16 color-grey text-capitalize"><i class="fa fa-angle-right font-14 color-blue mr-1"></i> Page 68 <span>07</span> </p>  
        </a>  
    </div>  
    .....  
    <div class="categories-name">  
        <a href="https://my-website.com/page-100.html" title="Page 100">  
        <p class="font-16 color-grey text-capitalize"><i class="fa fa-angle-right font-14 color-blue mr-1"></i> Page 100 <span>67</span> </p>  
    </a>  
    </div>  
    

  3. Suzana Eree 811 Reputation points
    2021-03-11T07:49:16.493+00:00

    Thanks, it works very well.

    But the HTML files must be created with UTF-8 encoding, otherwise I get this error:

    powershell https://snipboard.io/aD7sAm.jpg

    notepad++ https://snipboard.io/PixD9F.jpg
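
The screenshots are not reproduced here, so the following is only a sketch under the assumption that the error comes from an encoding mismatch. In Windows PowerShell 5.1, `Out-File` writes UTF-16 LE by default, and `Get-Content` reads files without a byte order mark using the system's ANSI code page. Passing `-Encoding UTF8` explicitly on both sides avoids relying on those defaults. The temp file name and the sample text are hypothetical.

```powershell
# Hypothetical temp file so the sketch does not touch real data
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-demo.html'

# Sample text containing one non-ASCII character (U+0103)
$html = "<p>Pagin$([char]0x103) 1</p>"

# Write and read with the same explicit encoding
$html | Out-File -FilePath $path -Encoding UTF8
$roundTrip = Get-Content -Path $path -Encoding UTF8

# True when the text survives the round trip unchanged
$roundTrip -eq $html

Remove-Item -Path $path
```

Applied to the accepted answer, this would mean adding `-Encoding UTF8` to its `Get-Content` and `Out-File` calls.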
