AJ-AJ asked AJ-AJ edited

PowerShell to clean up a file

Hi there,

I need help with a PowerShell script that cleans up an input file and writes the result to an output file.

The contents of the input.txt file are:

abcd
cdef
["abcd"]
["ab
cd"]

 ["efgh"]

xyz

Expected OUTPUT.txt file

abcd
cdef
["abcd"]
["abcd"]

 ["efgh"]

xyz

The cleanup rule I have in mind is this: if a line contains an opening square bracket but does not end with a closing square bracket, concatenate the next line to the current line.

I would appreciate your help with the script.

Thanks.

windows-server-powershell

RichMatheisen-8856 answered AJ-AJ commented

Assuming you don't have a sequence like this that extends over more than two adjacent lines:

[ab
cd
ef]

Then this should work:


 $file = "c:\junk\test.txt"
 $BufferedLine = $null
    
 Get-Content $file |
     ForEach-Object {
         if ($_ -match "^[^[].*[^]]$") {
             # line doesn't begin with "[" or end with "]"
             $_                                  # -- line is okay, just return it
         }
         elseif ($_ -match "^\[.*\]$") {
             # line begins with "[" and ends with "]"
             $_                                  # -- line is okay, just return it
         }
         elseif ($_ -match "^\[.*[^]]$") {       # line begins with "[" but doesn't end with "]"
             $BufferedLine = $_                  # remember line as beginning
         }
         elseif ($_ -match "^[^[].*\]$") {
             # line doesn't begin with "[" but ends with "]"
             if ($BufferedLine) {                # AND there's a preceding line awaiting closure
                 $BufferedLine += $_             # concatenate with contents of previous line
                 $BufferedLine                   # return completed line
                 $BufferedLine = $null           # and forget the value
             }
             else {
                 $_                              # nothing is awaiting closure, return the line unchanged
             }
         }
         else {
             $_                                  # empty or one-character line, return it unchanged
         }
     }
 if ($BufferedLine) {
     $BufferedLine                           # return the last line if necessary
 }



Thank you for your help. I'm afraid the code above skipped that line and a few other lines altogether.

Input.txt

abcd
cdef
["abcd"]
["ab
cd"]

["efgh"]
xyz

Output from the code above (a few lines are missing)

abcd
cdef
["ab
xyz

Expected Output.txt

abcd
cdef
["abcd"]
["abcd"]

["efgh"]
xyz

Greatly appreciated


Having a file attached to your original post to work with, rather than a copy/paste (or at least a copy/paste made with the "Code Sample" editor), would have been helpful.

It wasn't obvious that there were empty lines in your dataset.

I added another conditional statement to the code I posted earlier. The data you posted now returns this:

 abcd
 cdef    
 ["abcd"]
 ["abcd"]
    
 ["efgh"]
 xyz
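
If the two-line assumption stated earlier does not hold, a buffer that keeps accumulating until the closing bracket arrives is one option. The following is only a sketch, not the code from this answer: `Join-BracketedLines` is a hypothetical helper name, the sample lines are made up, and it applies the rule from the question literally (a line that contains "[" and does not end with "]" is joined with the lines that follow until one ends with "]").

```powershell
# Hypothetical helper (not part of the thread's original code).
# Joins a line that opens a "[" without closing it to the following
# lines until a line ending in "]" is seen.
function Join-BracketedLines {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$Line
    )
    begin { $pending = $null }
    process {
        if ($null -ne $pending) {
            # A "[" is still open: append this line to the buffer.
            $pending += $Line
            if ($Line.TrimEnd().EndsWith(']')) {
                $pending            # bracket closed, emit the joined line
                $pending = $null
            }
        }
        elseif ($Line.Contains('[') -and -not ($Line.TrimEnd().EndsWith(']'))) {
            $pending = $Line        # start buffering
        }
        else {
            $Line                   # nothing to join, pass the line through
        }
    }
    end {
        # Emit an unterminated buffer rather than dropping it silently.
        if ($null -ne $pending) { $pending }
    }
}

# Made-up sample, including a sequence that spans three lines.
$sample = 'abcd', '["abcd"]', '["ab', 'cd"]', '', ' ["efgh"]', '[ab', 'cd', 'ef]', 'xyz'
$result = $sample | Join-BracketedLines
$result
```

For a real file, the same function could sit between `Get-Content` and `Set-Content`. Note that a line such as `x[1] + y` would also start buffering under this literal rule, so the condition may need tightening for other data.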
AJ-AJ (in reply to RichMatheisen-8856):

Rich, you are the man. I appreciate your help. I agree, I should have attached the input file. My bad. I can't thank you enough for the help above.

At this point I would like to mark your reply as the answer, but I'm not sure whether that will freeze the thread.

If I may bother you with one more scenario that I found just now: I have attached a revised input file, 154523-input.txt, which contains a few other special characters and lines that should not be disturbed. How can I look only for the patterns [" and "] when concatenating rows? The rest of the lines should be left unchanged and written to the output file as is (see 154533-expectedoutput.txt).

I'm still playing around with the match criteria from the code above.


AJ-AJ answered AJ-AJ edited

The following gives the expected result.

$file = Get-Content -Path "c:\junk\test.txt" -Raw
$file = $file -ireplace '(?<match1>\[[^\]]*)\r\n(?<match2>[^\]]*\])','${match1}${match2}'
$file | Out-File -FilePath "C:\Junk\Results.txt"
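
A pattern like this can be checked without touching any files by running it against an in-memory string. The sketch below is a variation rather than the exact line above: the sample text is rebuilt from the question, `\r?\n` is used so that both CRLF and LF line endings are accepted, and line breaks are excluded from the character classes so that only two adjacent lines are ever joined. All of these are assumptions made for illustration.

```powershell
# Sample rebuilt from the question, joined with Windows (CRLF) line endings.
$lines = 'abcd', 'cdef', '["abcd"]', '["ab', 'cd"]', '', ' ["efgh"]', '', 'xyz'
$text  = $lines -join "`r`n"

# (\[[^\]\r\n]*)     a "[" followed by characters that are neither "]" nor a line break
# \r?\n              the line break to remove
# (?=[^\]\r\n]*\])   lookahead: the next line must reach a "]"
$pattern = '(\[[^\]\r\n]*)\r?\n(?=[^\]\r\n]*\])'

$joined = $text -replace $pattern, '$1'

# Split back into lines to inspect the result.
$outLines = $joined -split '\r?\n'
$outLines
```

Because the lookahead does not consume the second line, the replacement only has to put back the first capture group.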


To match an opening [" (square bracket followed by a quote) anywhere in a line that does not also contain the closing "] (quote followed by a square bracket):

$file = $file -ireplace '(?<match1>\[\"[^\]]*)\r\n(?<match2>[^\]]*\"\])','${match1}${match2}'
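
To see how a quote-delimited match leaves other bracketed lines alone, here is a small in-memory sketch. The sample lines are invented (the attached files are not reproduced here), and the pattern is an alternative formulation rather than the line above: it removes the line break after any line that contains [" with no "] following it on that same line.

```powershell
# Invented sample: only the [" ... "] pairs should be joined.
$lines = 'a[1] = b', '["abcd"]', '["ab', 'cd"]', 'x ["ef', 'gh"] y', '[plain', 'text]'
$text  = $lines -join "`r`n"

# \["                    the opening marker
# (?:(?!"\])[^\r\n])*    rest of the line, as long as no "] starts here
# \r?\n                  the line break to remove
$pattern = '(\["(?:(?!"\])[^\r\n])*)\r?\n'

$joined   = $text -replace $pattern, '$1'
$outLines = $joined -split '\r?\n'
$outLines
```

Lines such as `a[1] = b` or `[plain` contain a bracket without the quote, so they never match and are written out unchanged. This sketch does not verify that the following line actually contains "], which may or may not suit the real data.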
