I’m trying to remove all instances of "universiti", "university", "university of", "college", and "college of" in my query but the "university of" and "college of" only replaces the first word and not second i.e. "of".
This is what I’ve tried
SELECT Name
, TRIM( REGEXP_REPLACE(Name, '(universiti|university|university\\sof|college|college\\sof)', '') ) AS re
FROM grid
Name |re |
---------------------------------------------------------------------+---------------------------------------------------------------------+
Australian National University |Australian National |
Monash University |Monash |
University of Queensland |of Queensland |
Macquarie University |Macquarie |
UNSW Sydney |UNSW Sydney |
Newcastle University |Newcastle |
University of Wollongong |of Wollongong |
University of Melbourne |of Melbourne |
University of Tasmania |of Tasmania |
University of Adelaide |of Adelaide |
James Cook University |James Cook |
University of Western Australia |of Western Australia |
University of Sydney |of Sydney |
Flinders University |Flinders |
How can I get rid of "of" too?
>Solution :
You can use
SELECT Name,
TRIM( REGEXP_REPLACE(Name, '\\b(universit[yi]|college)(\\s+of\\b\\s*)?', '') ) AS re
FROM grid
See the regex demo. Details:
\b– a word boundary(universit[yi]|college)–university,universiti,college(\s+of\b\s*)?– an optional sequence of\s+– one or more whitespacesof– anofword\b– a word boundary (so thatofcould not be a prefix of a longer word)\s*– zero or more whitespaces.
Testing with
create table grid (Name varchar(320));
insert into grid (Name) values ("University of Queensland");
insert into grid (Name) values ("Macquarie University");
insert into grid (Name) values ("UNSW Sydney");
the output is
