In Spark Scala, I have a DataFrame where one of the columns is of struct type (the state column in the table below).
Now I want to filter the DataFrame for rows where the state column contains ERROR.
The line of code I tried:
val errorJobRunsDF = df.filter(col("state").rlike("ERROR")).select("JobID")
and it failed because rlike doesn't work on struct types, with the error below:
Cannot resolve '`state` RLIKE 'ERROR'' due to data type mismatch: argument 1 requires string type, however, '`state`' is of struct&lt;life_cycle_state:string,state_message:string&gt; type.
Kindly suggest some workarounds.
In Spark, if you have a DataFrame with a struct column, you can't apply string functions like rlike to the struct itself. Instead, filter on a string field inside the struct by qualifying the column name with dot notation (e.g. "myStruct.field1") in filter().
```python
from pyspark.sql.functions import col

# Replace 'search_value' with the value you want to search for
search_value = "search_value"

# Filter the DataFrame on a field inside the struct using dot notation
filtered_df = df.filter(col("myStruct.field1") == search_value)

# Show the filtered DataFrame
filtered_df.show()
```