Is there a way to search a struct column in Spark?

In Spark Scala, I have a DataFrame where one of the columns is a struct type (the state column in the table below):

JobID  State
1      {"life_cycle_state": "RUNNING", "state_message": "In run"}
2      {"life_cycle_state": "INTERNAL_ERROR", "state_message": "Notebook not found"}

Now I want to filter the DataFrame to the rows whose state column contains ERROR.

The line of code I’ve tried:

val errorJobRunsDF = df.filter(col("state").rlike("ERROR")).select("JobID")

It failed because rlike doesn’t work on a struct type, with the error below:

Cannot resolve '`state` RLIKE 'ERROR'' due to data type mismatch: argument 1 requires string type, however, '`state`' is of struct<life_cycle_state:string,state_message:string> type.

Kindly suggest some workarounds.

>Solution:

In Spark, if you have a DataFrame with a struct column, you can filter on a specific field inside the struct by referencing it with dot notation (structColumn.fieldName) in filter().

from pyspark.sql.functions import col

# Replace 'search_value' with the value you want to search for
search_value = "INTERNAL_ERROR"

# Filter on the struct field using dot notation and keep only the JobID column
filtered_df = df.filter(col("state.life_cycle_state") == search_value).select("JobID")

# Show the filtered DataFrame
filtered_df.show()
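
Since the question uses Scala, here is a rough equivalent, assuming the df and schema from the question; the key change is applying rlike to the string field inside the struct rather than to the struct column itself:

import org.apache.spark.sql.functions.col

// Apply rlike to the string field inside the struct, not the struct column;
// "ERROR" matches any life_cycle_state containing it, e.g. INTERNAL_ERROR
val errorJobRunsDF = df
  .filter(col("state.life_cycle_state").rlike("ERROR"))
  .select("JobID")

errorJobRunsDF.show()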

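If you want to search across every field of the struct at once, one workaround is to serialize the struct to a JSON string with to_json and apply rlike to that; a minimal sketch under the same assumptions:

import org.apache.spark.sql.functions.{col, to_json}

// Serialize the whole struct to a JSON string so rlike can scan every field,
// including state_message, in a single pass
val anyFieldErrorDF = df
  .filter(to_json(col("state")).rlike("ERROR"))
  .select("JobID")

anyFieldErrorDF.show()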