I’m trying to split a text into a list of words, where a word should contain only letters.
I tried the pattern [^a-zA-Z]+ as shown below:
var regex = new Regex(@"[^a-zA-Z]+", RegexOptions.Singleline | RegexOptions.Compiled);
var words = regex.Split(text).Where(w => !string.IsNullOrEmpty(w));
When the input is "This is a t3st", it returns ["This", "is", "a", "t", "st"], but I’m looking for ["This", "is", "a"] as the result.
I implemented it this way, without a regex:
var words = text.Split(' ', StringSplitOptions.RemoveEmptyEntries)
.Where(str => str.All(char.IsLetter))
.ToList();
However, I’m looking for a regex solution.
>Solution :
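I don’t know C# in particular, but this should work. It should be matched against the string rather than used to split it. The pattern is written in free-spacing mode (in .NET that appears to be RegexOptions.IgnorePatternWhitespace), where unescaped spaces are ignored, so the literal spaces are written as [ ]:
(?<=^|[ ])       # Preceded by the beginning of the string or a space
(?:              # Group:
  (?=[a-zA-Z])   #   the next character is a letter...
  .              #   ...so match that character
)+               # 1 or more times
(?=[ ]|$)        # Followed by a space or the end of the string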
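For the C# side, here is a minimal sketch of how such a pattern could be applied with Regex.Matches. It uses a condensed form of the same idea, [a-zA-Z]+ between the two lookarounds, instead of the commented version. The names WordExtractor and ExtractWords are made up for illustration, and it assumes words are separated by plain spaces only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordExtractor
{
    // A run of ASCII letters that is preceded by the start of the string or a
    // space, and followed by a space or the end of the string.
    // (Hypothetical helper; only the space character counts as a separator.)
    private static readonly Regex LetterWords = new Regex(
        @"(?<=^|[ ])[a-zA-Z]+(?=[ ]|$)",
        RegexOptions.Compiled);

    public static List<string> ExtractWords(string text) =>
        LetterWords.Matches(text)
                   .Cast<Match>()
                   .Select(m => m.Value)
                   .ToList();

    public static void Main()
    {
        var words = ExtractWords("This is a t3st");
        Console.WriteLine(string.Join(", ", words)); // This, is, a
    }
}
```

Because the matches are collected instead of splitting on non-letters, a token such as t3st is skipped as a whole rather than being broken into "t" and "st". If tabs or newlines can also separate words, the [ ] in both lookarounds would probably need to become \s.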