Databricks SQL regex_extract/split field before and after specific wordword

I am working on Databricks SQL and trying to parse strings into separate columns.

Here is an example that is working for me:

regexp_extract('100-200', '(\\d+)-(\\d+)', 1)
returns: 100

My specific example is splitting a column called match_name (i.e Detroit Lions at New York Jets) so I can get home and away team into their own separate columns (i.e AWAY = Detroit Lions, HOME = New York Jets)

Here is some code I have tried to implement against the match_name column in my SQL query to attempt to split at the word "at" – none of it currently works

split(match_name,'[at]', 1),
regexp_extract(match_name,'(\\d+)at(\\d+)', 1),

Any help is appreciated!

>Solution :

I’ve run into this frustration with Databricks as well, as I thought split_part was a valid function. At any rate, here’s one way to use regexp_extract…

with my_table as (
select 'Team A at Team B' as game union all
select 'Atlanta at Dallas'
select game, 
 regexp_extract(game, "(.*) at (.*)", 1) as away_team, 
 regexp_extract(game, "(.*) at (.*)", 2) as home_team
from my_table


game away_team home_team
Team A at Team B Team A Team B
Atlanta at Dallas Atlanta Dallas

Leave a Reply