Collect part of string in a List in Java

I have a long String that can contain many <img> tags.
I need to collect each complete image tag, from its start (<img src=") to its close (">), in a List.

I wrote a regex ("<img.*?\">" with the g and m flags) for selecting these, but I don’t know how to collect them all in a List.

For example:

final String regex = "<img.*?\\\">";
final String string = "Hello World <img src=\"https://dummyimage.com/300.png/09f/777\"> \nMy Name <img src=\"https://dummyimage.com/300.png/09f/ff2\"> Random Text\nHello\nHello Random <img src=\"https://dummyimage.com/300.png/09f/888\"> \nMy Name <img src=\"https://dummyimage.com/300.png/09f/2ff\">adaad\n";
final String replace = "";

final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);

final String result = matcher.replaceAll(replace); // Here, how can I collect all the image tags in a list

Solution:

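Use Matcher.results() (available since Java 9), which returns a Stream<MatchResult> with one element per match. The Stream.toList() call used below requires Java 16; on older versions, use collect(Collectors.toList()) instead.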

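In the regular expression, key the match on the angle brackets rather than on the quotation mark: [^<]+ cannot cross the opening < of the next tag, so each match contains only one tag. Because [^<]+ is greedy, a match extends to the last > before the next <; if a stray > can appear in the text right after a tag, <img[^>]+> is the tighter pattern: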

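// Imports needed by this snippet
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

// Matches "<img", then one or more characters other than '<', then '>'
public static final Pattern IMG_TAG =
    Pattern.compile("<img[^<]+>");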
final String string = "Hello World <img src=\"https://dummyimage.com/300.png/09f/777\"> \nMy Name <img src=\"https://dummyimage.com/300.png/09f/ff2\"> Random Text\nHello\nHello Random <img src=\"https://dummyimage.com/300.png/09f/888\"> \nMy Name <img src=\"https://dummyimage.com/300.png/09f/2ff\">adaad\n";
    
List<String> imageTags = IMG_TAG.matcher(string).results()
    .map(MatchResult::group)
    .toList();
        
imageTags.forEach(System.out::println);

Output:

<img src="https://dummyimage.com/300.png/09f/777">
<img src="https://dummyimage.com/300.png/09f/ff2">
<img src="https://dummyimage.com/300.png/09f/888">
<img src="https://dummyimage.com/300.png/09f/2ff">
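
If Matcher.results() is not available (it was added in Java 9), the same list can be built with a plain Matcher.find() loop. This is a minimal sketch, not part of the original answer: the class name ImgTagCollector and the method name collectImgTags are hypothetical, and the pattern is the one used in the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative and not from the answer.
class ImgTagCollector {

    // Same pattern as in the answer: "<img", then anything except '<', then '>'.
    private static final Pattern IMG_TAG = Pattern.compile("<img[^<]+>");

    // Collects every substring matched by IMG_TAG, in order of appearance.
    static List<String> collectImgTags(String input) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = IMG_TAG.matcher(input);
        while (matcher.find()) {        // advance to the next match, if any
            tags.add(matcher.group());  // group() returns the whole matched text
        }
        return tags;
    }

    public static void main(String[] args) {
        String text = "Hello <img src=\"https://dummyimage.com/300.png/09f/777\"> and "
                + "<img src=\"https://dummyimage.com/300.png/09f/ff2\"> done";
        // Prints each collected tag on its own line.
        collectImgTags(text).forEach(System.out::println);
    }
}
```

The loop does the same work as results().map(MatchResult::group): each successful find() moves the matcher to the next match, and group() returns its text.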
