The following pattern matches a line that starts with ‘v’ followed by an arbitrary number of floats:
using System;
using System.IO;
using System.Text.RegularExpressions;

const RegexOptions options = RegexOptions.Compiled | RegexOptions.Singleline | RegexOptions.CultureInvariant;
var regex = new Regex(@"^\s*v((?:\s+)[-+]?\b\d*\.?\d+\b)+$", options);
const string text = @"
v +0.5 +0.5 +0.5 0.0 1.0 1.0
v +0.5 -0.5 -0.5 1.0 0.0 1.0
v -0.5 +0.5 -0.5 1.0 1.0 0.0
v -0.5 -0.5 +0.5 0.0 0.0 0.0
";
using var reader = new StringReader(text);
for (var s = reader.ReadLine(); s != null; s = reader.ReadLine())
{
    // Skip the blank lines at the start and end of the verbatim string.
    if (string.IsNullOrWhiteSpace(s))
        continue;
    var match = regex.Match(s);
    if (match.Success)
    {
        // Group 1 is repeated, so every number is stored as a separate capture.
        foreach (Capture capture in match.Groups[1].Captures)
        {
            Console.WriteLine($"'{capture.Value}'");
        }
    }
}
It works as expected except that it includes the leading space before a number:
' +0.5'
' +0.5'
' +0.5'
' 0.0'
' 1.0'
' 1.0'
...
Question:
How can I ignore the leading space for each captured number?
Solution:
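You could change the regex so that the whitespace is matched outside the capture group instead of being captured together with the number.
The part (?:\s+) is the same as just \s+, and because every number is then followed by either whitespace or the end of the line, you can omit the trailing word boundary \b.
Note that in C# \d matches any Unicode decimal digit, not only [0-9]. Use [0-9] if only ASCII digits should match.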
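The note about \d can be checked with a small sketch. The class name DigitDemo and the sample character are chosen here purely for illustration; the sketch assumes the default regex options, under which \d follows the Unicode decimal digit category.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitDemo
{
    public static void Main()
    {
        // U+0663 ARABIC-INDIC DIGIT THREE belongs to the Unicode category Nd.
        const string arabicIndicThree = "\u0663";

        // \d accepts any Unicode decimal digit by default.
        Console.WriteLine(Regex.IsMatch(arabicIndicThree, @"^\d$"));    // True

        // [0-9] accepts ASCII digits only.
        Console.WriteLine(Regex.IsMatch(arabicIndicThree, @"^[0-9]$")); // False
        Console.WriteLine(Regex.IsMatch("3", @"^[0-9]$"));              // True
    }
}
```

If the input is known to contain only ASCII digits, the two forms behave the same for this pattern.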
The resulting pattern is:
^\s*v(?:\s+([-+]?\b\d*\.?\d+))+$
The line in C# would be:
var regex = new Regex(@"^\s*v(?:\s+([-+]?\b\d*\.?\d+))+$", options);
Output
'+0.5'
'+0.5'
'+0.5'
'0.0'
'1.0'
'1.0'
'+0.5'
'-0.5'
'-0.5'
'1.0'
'0.0'
'1.0'
'-0.5'
'+0.5'
'-0.5'
'1.0'
'1.0'
'0.0'
'-0.5'
'-0.5'
'+0.5'
'0.0'
'0.0'
'0.0'
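With the leading space gone, each capture can be handed straight to a number parser. The following is a minimal sketch of that step, not part of the original answer: the class VertexLineParser and its Parse method are hypothetical names, and it assumes the numbers use ASCII digits with '.' as the decimal separator.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class VertexLineParser
{
    // Same pattern and options as in the answer above.
    private static readonly Regex VertexLine = new Regex(
        @"^\s*v(?:\s+([-+]?\b\d*\.?\d+))+$",
        RegexOptions.Compiled | RegexOptions.Singleline | RegexOptions.CultureInvariant);

    // Returns the numbers on a "v ..." line, or an empty list if the line does not match.
    public static List<double> Parse(string line)
    {
        var values = new List<double>();
        var match = VertexLine.Match(line);
        if (!match.Success)
            return values;

        // Group 1 holds one capture per number, without the leading whitespace.
        foreach (Capture capture in match.Groups[1].Captures)
        {
            values.Add(double.Parse(capture.Value, NumberStyles.Float, CultureInfo.InvariantCulture));
        }
        return values;
    }

    public static void Main()
    {
        var values = Parse("v +0.5 -0.5 -0.5 1.0 0.0 1.0");
        // Format with the invariant culture so the output does not depend on the machine's locale.
        Console.WriteLine(string.Join(" ", values.ConvertAll(v => v.ToString(CultureInfo.InvariantCulture))));
        // Prints: 0.5 -0.5 -0.5 1 0 1
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps the result independent of the current culture, which matters on systems where ',' is the decimal separator.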