regular expression property title case (Lt)

I use the property Lt to match a capitalized letter at the start of a word (title case).

My regular expression (tested on regex101.com) consists only of the property \p{Lt}, and my test string is Title Case.

The result is: no match. The properties Ll and Lu give correct results. What is the reason for this behavior?

Solution:

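\p{Lt} only matches the Unicode letters from the Lt (Titlecase_Letter) category. These are single code points for digraphs and for Greek capitals with prosgegrammeni, not ordinary capital letters such as T or C, which belong to Lu: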

U+01C5   Dž   Latin Capital Letter D with Small Letter Z with Caron
U+01C8   Lj   Latin Capital Letter L with Small Letter J
U+01CB   Nj   Latin Capital Letter N with Small Letter J
U+01F2   Dz   Latin Capital Letter D with Small Letter Z
U+1F88   ᾈ   Greek Capital Letter Alpha with Psili and Prosgegrammeni
U+1F89   ᾉ   Greek Capital Letter Alpha with Dasia and Prosgegrammeni
U+1F8A   ᾊ   Greek Capital Letter Alpha with Psili and Varia and Prosgegrammeni
U+1F8B   ᾋ   Greek Capital Letter Alpha with Dasia and Varia and Prosgegrammeni
U+1F8C   ᾌ   Greek Capital Letter Alpha with Psili and Oxia and Prosgegrammeni
U+1F8D   ᾍ   Greek Capital Letter Alpha with Dasia and Oxia and Prosgegrammeni
U+1F8E   ᾎ   Greek Capital Letter Alpha with Psili and Perispomeni and Prosgegrammeni
U+1F8F   ᾏ   Greek Capital Letter Alpha with Dasia and Perispomeni and Prosgegrammeni
U+1F98   ᾘ   Greek Capital Letter Eta with Psili and Prosgegrammeni
U+1F99   ᾙ   Greek Capital Letter Eta with Dasia and Prosgegrammeni
U+1F9A   ᾚ   Greek Capital Letter Eta with Psili and Varia and Prosgegrammeni
U+1F9B   ᾛ   Greek Capital Letter Eta with Dasia and Varia and Prosgegrammeni
U+1F9C   ᾜ   Greek Capital Letter Eta with Psili and Oxia and Prosgegrammeni
U+1F9D   ᾝ   Greek Capital Letter Eta with Dasia and Oxia and Prosgegrammeni
U+1F9E   ᾞ   Greek Capital Letter Eta with Psili and Perispomeni and Prosgegrammeni
U+1F9F   ᾟ   Greek Capital Letter Eta with Dasia and Perispomeni and Prosgegrammeni
U+1FA8   ᾨ   Greek Capital Letter Omega with Psili and Prosgegrammeni
U+1FA9   ᾩ   Greek Capital Letter Omega with Dasia and Prosgegrammeni
U+1FAA   ᾪ   Greek Capital Letter Omega with Psili and Varia and Prosgegrammeni
U+1FAB   ᾫ   Greek Capital Letter Omega with Dasia and Varia and Prosgegrammeni
U+1FAC   ᾬ   Greek Capital Letter Omega with Psili and Oxia and Prosgegrammeni
U+1FAD   ᾭ   Greek Capital Letter Omega with Dasia and Oxia and Prosgegrammeni
U+1FAE   ᾮ   Greek Capital Letter Omega with Psili and Perispomeni and Prosgegrammeni
U+1FAF   ᾯ   Greek Capital Letter Omega with Dasia and Perispomeni and Prosgegrammeni
U+1FBC   ᾼ   Greek Capital Letter Alpha with Prosgegrammeni
U+1FCC   ῌ   Greek Capital Letter Eta with Prosgegrammeni
U+1FFC   ῼ   Greek Capital Letter Omega with Prosgegrammeni
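
To check this outside a regex tester, here is a minimal sketch in Python using the standard unicodedata module. The helper name titlecase_letters is made up for illustration; the sketch only inspects general categories and does not run a regex.

```python
import sys
import unicodedata

# Print the general category of each letter in the question's test string.
for ch in "Title Case":
    if ch.isalpha():
        print(ch, unicodedata.category(ch))


def titlecase_letters():
    """Return every code point whose general category is Lt (hypothetical helper)."""
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Lt"
    ]


lt = titlecase_letters()
print(len(lt), "".join(lt))
```

The capitals T and C are reported as Lu and the remaining letters as Ll, which is why \p{Lt} finds nothing in Title Case.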

See the regex demo.

What you want is \b\p{Lu}: this regex matches any uppercase letter that is not immediately preceded by a word char.

See the regex demo.

Depending on the contexts in which you want to match the uppercase letter, the regex can also look like

  • (?<!\p{L})\p{Lu} – an uppercase letter not immediately preceded by any letter
  • (?<!\S)\p{Lu} – an uppercase letter not immediately preceded by a non-whitespace char, i.e. at the start of the string or right after whitespace.
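
The difference between the three variants can be sketched in Python. The standard re module has historically not accepted \p{...} (the third-party regex package does), so the sketch below emulates the lookbehinds with unicodedata rather than running the patterns themselves. The names uppercase_starts, is_word_char, is_letter and is_non_space are made up for illustration, and is_word_char relies on Python's own definition of \w, which may differ slightly from other engines.

```python
import re
import unicodedata


def uppercase_starts(text, blocked_by):
    """Indices of Lu letters whose preceding char does not satisfy blocked_by."""
    hits = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) != "Lu":
            continue
        # Start of string counts as "not preceded by anything".
        if i == 0 or not blocked_by(text[i - 1]):
            hits.append(i)
    return hits


def is_word_char(ch):
    # Emulates \b\p{Lu}: the previous char must not be a word char.
    return re.fullmatch(r"\w", ch) is not None


def is_letter(ch):
    # Emulates (?<!\p{L})\p{Lu}: the previous char must not be a letter.
    return unicodedata.category(ch).startswith("L")


def is_non_space(ch):
    # Emulates (?<!\S)\p{Lu}: the previous char must be whitespace.
    return not ch.isspace()


for name, check in [("word", is_word_char), ("letter", is_letter), ("non-space", is_non_space)]:
    print(name, uppercase_starts("Title Case", check))
```

For Title Case all three variants find T and C at indices 0 and 6. They diverge on input such as x1Y or (Z), where the preceding char is a digit or punctuation.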