Regex for string cleaning

Kuler Master 246 Reputation points
2022-07-14T12:32:57.963+00:00

Hi there,

I have the following string that needs to be cleaned of one- and two-character words, except for the single letters M and F.

EL EL EL EL M EL EL FU S FF EL EL EL E EL

I tried the following expression, but it didn't quite work:

var output = Regex.Replace(input, @"\b\w{2}\s", "").Trim();  

I ended up with "M S E EL".

Could you help me end up with only M?

It would be greatly appreciated. Thanks.

Tags: ASP.NET Core, .NET CLI, C#, VB

Accepted answer
  1. Viorel 112.5K Reputation points
    2022-07-14T13:36:21.243+00:00

    To consider letters only, try this: @"\b(\p{L}{2}|[^MF\s])\b\s*".
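
To see what this pattern does, here is a minimal C# sketch that applies it to the inputs from this thread. The class name `AcceptedPatternDemo` and the helper `Clean` are hypothetical names chosen for illustration; only the pattern itself and the `Regex.Replace(...).Trim()` call come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AcceptedPatternDemo
{
    // Pattern quoted from the accepted answer:
    //   \p{L}{2}  : exactly two letters (digits such as "23" no longer qualify)
    //   [^MF\s]   : any single non-whitespace character except M or F
    //   \b ... \b : the match must start and end on a word boundary
    //   \s*       : also consume the whitespace that follows the removed word
    private const string Pattern = @"\b(\p{L}{2}|[^MF\s])\b\s*";

    // Hypothetical helper: removes every match, then trims the ends.
    public static string Clean(string input) =>
        Regex.Replace(input, Pattern, "").Trim();

    public static void Main()
    {
        Console.WriteLine(Clean("EL EL EL EL M EL EL FU S FF EL EL EL E EL")); // M
        Console.WriteLine(Clean("Test (23%)"));                                // Test (23%)
        Console.WriteLine(Clean("Jacob Has A Pen,"));                          // Jacob Has Pen,
    }
}
```

A few caveats if you adapt this: `[^MF\s]` is case-sensitive, so a lowercase standalone `m` or `f` is removed; it also still matches a standalone single digit, so `Test (5%)` would become `Test (%)`; and because `\s*` also matches line breaks, lines of a multi-line input can run together unless each line is cleaned separately.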


4 additional answers

  1. Viorel 112.5K Reputation points
    2022-07-14T12:42:18.343+00:00

    Check this pattern: @"\b(\w{2,}|[^MF\s])\b\s*".


  2. Kuler Master 246 Reputation points
    2022-07-14T13:00:09.21+00:00

    Hmm, I thought it worked, but then I realized that it also removes regular words completely, e.g.:

    Test (string)
    Jacob Has A Pen,
    EL EL EL EL M EL EL FU S FF EL EL EL E EL
    I do not have a Pen blah blah
    EL EL EL

    I end up with the following:

    ()
    ,
    M
    {two empty lines here}

    Any idea how to leave the other text alone and remove only the one- and two-character words, while keeping the single letters M and F? Thank you again.


  3. Viorel 112.5K Reputation points
    2022-07-14T13:24:30.267+00:00

    For 1- and 2-character words, try another pattern: @"\b(\w{2}|[^MF\s])\b\s*".

    If there are specific aspects, show details.
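
The difference between the patterns tried so far in this thread can be checked with a small C# sketch. It runs the original attempt, the `\w{2,}` pattern, and the `\w{2}` pattern on inputs quoted above. The class name `EarlierPatternsDemo`, the constant names, and the helper `Clean` are hypothetical; the patterns and inputs are taken from the posts.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EarlierPatternsDemo
{
    // Original attempt: exactly two word characters followed by ONE whitespace
    // character, so one-letter words and the last word (no trailing space) survive.
    public const string OriginalAttempt = @"\b\w{2}\s";

    // First suggestion: \w{2,} means "two or more", so whole words are removed too.
    public const string TwoOrMore = @"\b(\w{2,}|[^MF\s])\b\s*";

    // Second suggestion: \w{2} means "exactly two"; \w also matches digits.
    public const string ExactlyTwo = @"\b(\w{2}|[^MF\s])\b\s*";

    // Hypothetical helper: removes every match of the given pattern, then trims.
    public static string Clean(string input, string pattern) =>
        Regex.Replace(input, pattern, "").Trim();

    public static void Main()
    {
        const string codes = "EL EL EL EL M EL EL FU S FF EL EL EL E EL";
        const string sentence = "Jacob Has A Pen,";

        Console.WriteLine(Clean(codes, OriginalAttempt));   // M S E EL
        Console.WriteLine(Clean(sentence, TwoOrMore));      // ,
        Console.WriteLine(Clean(sentence, ExactlyTwo));     // Jacob Has Pen,
        Console.WriteLine(Clean(codes, ExactlyTwo));        // M
        Console.WriteLine(Clean("Test (23%)", ExactlyTwo)); // Test (%)
    }
}
```

The last line shows a side effect of `\w`: it matches digits as well as letters, so a two-digit number is treated like a two-character word.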


  4. Kuler Master 246 Reputation points
    2022-07-14T13:32:22.73+00:00

    One more rule and I swear I will stop. How can numbers be left alone, e.g. so that Test (23%) is ignored entirely?
    Currently I get Test (%).
    Thank you so much.
