Regular Expression to extract JSON objects from array

I’m working on a custom JSON deserializer in Java and would like to create an ArrayList of the objects specified in a .json file. For example, given the following file:

[
    {
        "name": "User1",
        "gender": "M"
    },
    {
        "name": "User2",
        "gender": "F"
    }
]

(…) I want my Java program to create two objects of class User, each holding the corresponding fields.

I managed to do this when the file contains a single value (no JSON array, just one object between {} with some key-value pairs), but a list is more complicated. My plan is to split the JSON array into its elements, apply my single-object parsing algorithm to each one, and add the results to an ArrayList.

My idea should work, but I’m not sure how to split this array of JSON objects properly using Java’s String.split() method, and I’m not good enough with regular expressions to come up with a suitable one.

I thought about splitting with content.split("},") and then appending the missing } to each element except the last, but this would also split inside the members of my JSON elements if they contain nested objects.
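
A minimal sketch of that failure mode, assuming a hypothetical nested "address" field that is not in the sample file above (the class name NaiveSplitDemo is also just for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class NaiveSplitDemo {
    // Splits on the literal text "}," as proposed above.
    public static List<String> naiveSplit(String content) {
        return Arrays.asList(content.split("\\},"));
    }

    public static void main(String[] args) {
        // Hypothetical input: the first element contains a nested object.
        String content =
              "{\"name\":\"User1\",\"address\":{\"city\":\"X\"},\"gender\":\"M\"},"
            + "{\"name\":\"User2\",\"gender\":\"F\"}";
        // The nested object's closing "}," is treated as an element boundary too.
        naiveSplit(content).forEach(System.out::println);
    }
}
```

For this input the split yields three fragments instead of two, because the "}," that closes the nested object is indistinguishable from the one that separates the two users.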

My question is: what regex would let Java split my JSON array into its individual JSON elements?

I can remove the brackets from the beginning and end of the file myself, since that only takes simple String manipulation. What I want is a String[] array in which each element contains one of my two users together with their data.

Expected output:

String1: { "name": "User1", "gender": "M" }
String2: { "name": "User2", "gender": "F" }

Solution:

If the file is pretty-printed exactly as in your question, with each top-level object indented by four spaces, you can use:

(?sm)(?<=^    )\{.*?(?<=^    )}

Here’s some test code:

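import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import static java.util.stream.Collectors.toList;

String input = "[\n" +
        "    {\n" +
        "        \"name\": \"User1\",\n" +
        "        \"gender\": \"M\"\n" +
        "    },\n" +
        "    {\n" +
        "        \"name\": \"User2\",\n" +
        "        \"gender\": \"F\"\n" +
        "    }\n" +
        "]";
// (?s): '.' also matches newlines; (?m): '^' matches at the start of every line.
// Matcher.results() requires Java 9 or later.
List<String> jsonObjects = Pattern.compile("(?sm)(?<=^    )\\{.*?(?<=^    )}")
  .matcher(input).results()
  .map(MatchResult::group)
  .map(str -> str.replaceAll("\\s+", "")) // remove all whitespace, including any inside string values
  .collect(toList());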

Output:

{"name":"User1","gender":"M"}
{"name":"User2","gender":"F"}
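
The regex above depends on the exact indentation. If the input may not be pretty-printed, or elements may contain nested objects, one alternative is to skip regex and scan the text while counting brace depth. The following is a minimal sketch of that idea, not a full JSON parser: the class and method names (JsonArraySplitter, splitObjects) are hypothetical, it assumes the array elements are objects, and it does not validate the input.

```java
import java.util.ArrayList;
import java.util.List;

public class JsonArraySplitter {
    /** Returns the text of each outermost {...} object found in the input. */
    public static List<String> splitObjects(String json) {
        List<String> objects = new ArrayList<>();
        int depth = 0;            // current nesting level of { }
        int start = -1;           // index where the current outermost object began
        boolean inString = false; // true while inside a "..." literal
        boolean escaped = false;  // true if the previous char was a backslash in a string

        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Braces inside string values must not affect the depth.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    objects.add(json.substring(start, i + 1));
                }
            }
        }
        return objects;
    }

    public static void main(String[] args) {
        // Hypothetical input with a nested object and braces inside a string value.
        String input = "[{\"name\":\"User1\",\"address\":{\"city\":\"X\"}},"
                     + " {\"name\":\"a},{b\",\"gender\":\"F\"}]";
        splitObjects(input).forEach(System.out::println);
    }
}
```

Each returned string can then be passed to the existing single-object parsing routine. Because the scan ignores the surrounding [ and ], there is no need to strip the brackets first.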