I have HTML that contains links such as the following:
<a href="https://www.test.com/help">a</a><br/>
...
other html text
...
<a href="https://www.test.com/help2">b</a><br/>
...
other html text
...
<a href="https://www.test.com/help3">c</a>
How can I use a regex to replace every . character with a space inside each href value? The expected result is:
<a href="https://www test com/help">a</a><br/>
<a href="https://www test com/help2">b</a><br/>
<a href="https://www test com/help3">c</a>
I am trying the following, but it does not work:
string textToHandle = "html text here";
string newUrl = string.Empty;
Regex rg = new Regex("(?<=(href=\"))[.\\s\\S]*?(?=(\"))", RegexOptions.IgnoreCase);
MatchCollection collection = rg.Matches(textToHandle);
foreach (Match match in collection)
{
string u = match.Groups[0].Value;
newUrl = rg.Replace(textToHandle, u.Replace(".", " "));
Console.WriteLine(newUrl);
}
Console.WriteLine(newUrl);
>Solution :
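Your loop does not work because each call to rg.Replace(textToHandle, ...) replaces every href value in the original text with the single value taken from the current match, so all links end up with the same URL. To replace every . character with a space inside each href attribute, change the pattern to (?<=href="")[^""]*(?="") (shown as it is written inside a C# verbatim string, where "" stands for one double quote) and call Regex.Replace() with a MatchEvaluator, which computes the replacement separately for each match: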
using System;
using System.Text.RegularExpressions;

string textToHandle = "<a href=\"https://www.test.com/help3\">c</a>";
string newUrl = string.Empty;
// Updated regex pattern to match href values
string pattern = @"(?<=href="")[^""]*(?="")";
Regex rg = new Regex(pattern, RegexOptions.IgnoreCase);
// Match evaluator to replace '.' with ' '
MatchEvaluator evaluator = match => match.Value.Replace(".", " ");
// Replace all occurrences in the input text
newUrl = rg.Replace(textToHandle, evaluator);
Console.WriteLine(newUrl);
Console.ReadKey();
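The snippet above uses a single link as input. As a minimal sketch of how the same pattern behaves on input shaped like the question's (several links with other markup in between), the example below wraps the call in a helper. The name ReplaceDotsInHrefs and the sample markup are assumptions made for illustration, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input modeled on the question: several links with other markup between them.
string html =
    "<a href=\"https://www.test.com/help\">a</a><br/>\n" +
    "<p>other.html.text</p>\n" +
    "<a href=\"https://www.test.com/help2\">b</a><br/>\n" +
    "<a href=\"https://www.test.com/help3\">c</a>";

Console.WriteLine(ReplaceDotsInHrefs(html));

// Hypothetical helper: rewrites only the text between href=" and the closing quote.
static string ReplaceDotsInHrefs(string input)
{
    return Regex.Replace(
        input,
        @"(?<=href="")[^""]*(?="")",
        match => match.Value.Replace(".", " "), // evaluated once per href value
        RegexOptions.IgnoreCase);
}
```

Because the evaluator runs once per match, each href keeps its own path (help, help2, help3), and dots outside href values, such as those in other.html.text, are left alone.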
Regex Explanation
(?<=href="") : Positive lookbehind to ensure the match is preceded by href=".
[^""]* : Match any character except " (double quote), zero or more times.
(?="") : Positive lookahead to ensure the match is followed by a double quote ".
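To see what the three parts select in practice, the following sketch lists the matches instead of replacing them. The input string is made up for illustration; the second link uses an uppercase HREF to show the effect of RegexOptions.IgnoreCase.

```csharp
using System;
using System.Text.RegularExpressions;

// Made-up input: one lowercase href and one uppercase HREF.
string html = "<a href=\"https://www.test.com/help\">a</a> <a HREF=\"page.html\">b</a>";

// Same pattern as in the answer; IgnoreCase lets the lookbehind accept HREF too.
var hrefValue = new Regex(@"(?<=href="")[^""]*(?="")", RegexOptions.IgnoreCase);

foreach (Match m in hrefValue.Matches(html))
{
    // Lookarounds are zero-width, so m.Value holds only the text between the quotes.
    Console.WriteLine(m.Value);
}
```

Only the two attribute values are printed. The href=" prefix and the closing quote are checked by the lookarounds but are not part of the match, which is why the replacement never touches them.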