I have a file that is formatted this way:
{2000}000000012199{3100}123456789*{3320}110009558*{3400}9876
54321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX
78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLAS
TX 73920**
Basically, the number in curly brackets denotes the field, and it is followed by the value for that field. For example, {2000} is the field for "Amount", and its value is 121.99 (implied decimal). {3100} is the field for "AccountNumber", and its value is 123456789*.
I am trying to figure out a way to split the file into "records" and each record would contain the record type (the value in the curly brackets) and record value, but I don’t see how.
How do I do this without a loop going through each character in the input?
>Solution :
This regular expression should get you going:
\{\d+\}[^{]+
- \{ matches a literal {
- \d+ matches 1 or more digits ("a number")
- \} matches a literal }
- [^{]+ matches 1 or more characters that are not an opening {
It assumes that the values themselves cannot contain an opening curly brace. If they can, you need to be more clever, e.g. @"\{\d+\}(?:\\{|[^{])+", which also accepts a brace escaped with a backslash (there are likely better ways).
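To make the escaped-brace variant concrete, here is a minimal sketch. It assumes a literal brace inside a value is written as \{; the question does not say whether or how braces are escaped, so treat that convention, and the names EscapedBraceDemo and Split, as hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapedBraceDemo
{
    // Assumption: a literal '{' inside a value is escaped as "\{".
    // The alternation tries the escaped brace first, then any non-brace character.
    private static readonly Regex Field = new Regex(@"\{\d+\}(?:\\{|[^{])+");

    public static string[] Split(string text)
    {
        MatchCollection matches = Field.Matches(text);
        var result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Value;
        }
        return result;
    }

    public static void Main()
    {
        // Prints "{123}a\{b" and then "{456}xyz".
        foreach (string field in Split(@"{123}a\{b{456}xyz"))
        {
            Console.WriteLine(field);
        }
    }
}
```

With the simpler pattern, the first field would be cut off at the escaped brace, and the leftover text would not match as a field at all.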
Create a Regex instance and have it match against the text. Each "field" will be a separate match:
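// Requires: using System; using System.Text.RegularExpressions;
var text = @"{123}abc{456}xyz";
var regex = new Regex(@"\{\d+\}[^{]+", RegexOptions.Compiled);
// Declare the loop variable as Match: on .NET Framework, MatchCollection only enumerates as object.
foreach (Match match in regex.Matches(text)) {
    Console.WriteLine(match.Value); // same as match.Groups[0].Value
}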
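Since the question asks for the record type and the record value separately, named capture groups can split each match into its two parts. The following is a sketch rather than a definitive implementation: the names FieldParser, Parse and ToAmount are made up for illustration, and the two implied decimal places are taken from the "Amount" example in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class FieldParser
{
    // "tag" captures the digits inside the braces;
    // "value" captures everything up to the next opening brace.
    private static readonly Regex Field =
        new Regex(@"\{(?<tag>\d+)\}(?<value>[^{]+)", RegexOptions.Compiled);

    public static List<KeyValuePair<string, string>> Parse(string text)
    {
        var records = new List<KeyValuePair<string, string>>();
        foreach (Match m in Field.Matches(text))
        {
            records.Add(new KeyValuePair<string, string>(
                m.Groups["tag"].Value, m.Groups["value"].Value));
        }
        return records;
    }

    // Assumption: amounts carry two implied decimal places.
    public static decimal ToAmount(string raw)
    {
        return decimal.Parse(raw, CultureInfo.InvariantCulture) / 100m;
    }

    public static void Main()
    {
        var text = "{2000}000000012199{3100}123456789*{3600}CTR";
        foreach (var record in Parse(text))
        {
            Console.WriteLine(record.Key + " = " + record.Value);
        }
        // Prints 121.99
        Console.WriteLine(ToAmount("000000012199").ToString(CultureInfo.InvariantCulture));
    }
}
```

Note that [^{]+ also matches line breaks, so if a value is wrapped across lines in the file, the captured value will contain the line break.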