I'm looking for a Notepad++ regex search string that will find both regular ASCII and non-ASCII characters in the same string.
I currently use two or three separate finds to track down these non-ASCII characters.
The text to search for always sits between square brackets [ ]. Examples:
[Alfirimë Nóri]
[Sadoc panting]
[Eärien shouting]
[Queen Míriel] One, two, three,
[The Stranger gasping]
I need a SINGLE regex search string that finds both ASCII and non-ASCII characters. It also needs to find the strings that have NO non-ASCII characters.
Find 1: \[([A-Z]*(?:(?:\h*|-)[A-Z0-9.#&',íáéíóôúüñÁÉÍÓÚÜÑÇçåø][a-z]*)*)\]
This one only finds strings that contain non-ASCII characters:
Find 2: \[([A-Z]*(?:(?:\h*|-)[^\x00-\x7F][a-z]*)*)\]
I noticed that Find 2 above did not find the í in "Míriel".
Is it possible to have a SINGLE regex search string that finds both ASCII and non-ASCII characters, and that also finds strings that have NO non-ASCII characters?
Any improvements in the above would be greatly appreciated.
Thanks in advance
Edit: Find 2 didn’t find "Míriel" because the first name/word encountered didn’t have non-ASCII characters in it.
>Solution :
You can use
\[(?=[^][A-Za-z]*[A-Za-z])(?=[^][]*[^[:^alpha:]A-Za-z])[^][]*]
See the regex demo. NOTE: to match the strings between [...] that do not contain non-ASCII letters, replace the second (?= with (?!:
\[(?=[^][A-Za-z]*[A-Za-z])(?![^][]*[^[:^alpha:]A-Za-z])[^][]*]
See this regex demo.
Details:
\[ - a [ char
(?=[^][A-Za-z]*[A-Za-z]) - after zero or more chars other than ASCII letters, [ and ], there must be an ASCII letter
(?=[^][]*[^[:^alpha:]A-Za-z]) - after zero or more chars other than [ and ], there must be a non-ASCII letter
[^][]* - zero or more chars other than [ and ]
] - a ] char.
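To try the two-lookahead idea outside Notepad++, here is a minimal Python sketch. It is an adaptation, not the exact patterns above: Python's re module does not support POSIX bracket classes such as [:^alpha:], so this sketch assumes "non-ASCII letter" can be approximated as (?![\x00-\x7F])[^\W\d_] (a word character that is not a digit, not an underscore, and not ASCII). The names SAMPLE, WITH_NON_ASCII and ASCII_ONLY are made up for the illustration, and the brackets inside character classes are escaped, which Python accepts without warnings.

```python
import re

# Sample lines taken from the question.
SAMPLE = (
    "[Alfirimë Nóri]\n"
    "[Sadoc panting]\n"
    "[Eärien shouting]\n"
    "[Queen Míriel] One, two, three,\n"
    "[The Stranger gasping]\n"
)

INNER = r"[^\[\]]*"                             # any chars other than [ and ]
HAS_ASCII_LETTER = r"(?=[^\[\]A-Za-z]*[A-Za-z])"  # an ASCII letter before the closing ]
# Assumption: approximates "non-ASCII letter" in Python's re syntax.
NON_ASCII_LETTER = r"(?![\x00-\x7F])[^\W\d_]"

# Bracketed strings that contain at least one non-ASCII letter.
WITH_NON_ASCII = re.compile(
    r"\[" + HAS_ASCII_LETTER
    + r"(?=" + INNER + NON_ASCII_LETTER + r")"
    + INNER + r"\]"
)

# Bracketed strings with no non-ASCII letter: the second lookahead is negated.
ASCII_ONLY = re.compile(
    r"\[" + HAS_ASCII_LETTER
    + r"(?!" + INNER + NON_ASCII_LETTER + r")"
    + INNER + r"\]"
)

print(WITH_NON_ASCII.findall(SAMPLE))
# ['[Alfirimë Nóri]', '[Eärien shouting]', '[Queen Míriel]']
print(ASCII_ONLY.findall(SAMPLE))
# ['[Sadoc panting]', '[The Stranger gasping]']
```

Because both lookaheads stop at the first [ or ], they can only inspect the text of the current bracket pair, which is why a non-ASCII letter in a later word such as "Míriel" is still detected.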
