I have a text column in my Postgres table and I want to remove any non-alphabetical characters and split it by a space (so I can use each word to do a search later).
I can remove the characters and split it successfully however my rows have empty results. I cannot have any empty words:
SELECT
asset.id,
string_to_array(TRIM(regexp_replace(asset.title, '[^a-zA-Z ]', '', 'g')), ' ')
FROM
assets asset
eg.
Hello world! becomes {Hello,world}
but also Some Result V1.0 - 3D Model becomes {Some,Result,V,,D,Model
How do I filter my array to remove empty words?
>Solution :
Try using a multicharacter regular expression
Try with this:
SELECT
asset.id,
string_to_array(TRIM(regexp_replace(asset.title, '[^a-zA-Z]+', ' ', 'g')), ' ')
FROM
assets asset
I just removed the whitespace from the regex in order to capture every non-alphabet character. Then, the + indicates one or more coincidences, which ensures than any consecutive non-alphabetical character will get replaced with a single space. Finally, as you already do the trim, your split by whitespace will work.