I have column values that are in between starting of the underscore() and ending with an underscore().
I am trying to see how to extract a value between 2 underscores (_). for example, "xxxx_Whats your number 23345_xxxxx".
I want to discard everything before and after underscore(_).
Any help is greatly appreciated.
>Solution :
REGEXP_SUBSTRING, using a grouping match ( ) and turning on sub-matches 'e', and selecting the first match.. then stating you want to see an underscore, and then many not underscores, and then a underscore.
select
column1,
regexp_substr(column1, '_([^_]*)_',1,1,'e')
from values
('xxxx_Whats your number 23345_xxxxx')
gives:
| COLUMN1 | REGEXP_SUBSTR(COLUMN1, ‘([^]*)_’,1,1,’E’) |
|---|---|
| xxxx_Whats your number 23345_xxxxx | Whats your number 23345 |
hmm, you mention discard before and after, thus if you want to include the underscore you will need to move them into the grouping brackets:
select
column1,
regexp_substr(column1, '_([^_]*)_',1,1,'e') as exclude_underscore,
regexp_substr(column1, '(_[^_]*_)',1,1,'e') as include_underscore
from values
('xxxx_Whats your number 23345_xxxxx'),
('has no first underscore_xxxxx'),
('xxx_has no last underscore'),
('nothing between__the underscores');
| COLUMN1 | EXCLUDE_UNDERSCORE | INCLUDE_UNDERSCORE |
|---|---|---|
| xxxx_Whats your number 23345_xxxxx | Whats your number 23345 | _Whats your number 23345_ |
| has no first underscore_xxxxx | null | null |
| xxx_has no last underscore | null | null |
| nothing between__the underscores | __ |
then you might also want, atleast 1 character between the underscrores, and thus should change the * to a + or a {n,}