How to convert a dictionary in string format to a tabular DataFrame in Scala?

I have a method that returns a string whose value looks like a dictionary. For example, the return type is String and the returned value is:

{"firstName":"bb288e8ff56b","lastName":"ae4863bdae026314"}

I want to convert this to a DataFrame with two columns, firstName and lastName.

For now I am only able to store it as a single column in a DataFrame using .toDF().

val df = Seq(returnString).toDF("record")

Can someone help with this?

>Solution :

You can use the from_json function from Spark’s functions package to parse the JSON string into a struct:

import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import spark.implicits._

val jsonString = """{"firstName":"bb288e8ff56b","lastName":"ae4863bdae026314"}"""

val df = Seq(jsonString).toDF("record")

val schema = StructType(
  Seq(
    StructField("firstName", StringType),
    StructField("lastName", StringType)
  )
)

val parsedDf = df
  .select(from_json(col("record"), schema).as("parsed"))
  .select("parsed.firstName", "parsed.lastName")

parsedDf.show()

+------------+----------------+
|   firstName|        lastName|
+------------+----------------+
|bb288e8ff56b|ae4863bdae026314|
+------------+----------------+
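
If the set of keys is not known ahead of time, one alternative is to let Spark infer the schema rather than declaring a StructType by hand. The sketch below is not part of the answer above; it assumes Spark 2.2 or later (where DataFrameReader.json accepts a Dataset[String]) and builds its own local SparkSession, which you would skip in spark-shell or a notebook where `spark` already exists. The names inferredDf and the app name are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration; reuse the existing `spark` if you have one.
val spark = SparkSession.builder()
  .appName("infer-json-schema")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val jsonString = """{"firstName":"bb288e8ff56b","lastName":"ae4863bdae026314"}"""

// Each element of the Dataset[String] is parsed as one JSON document,
// and the column names and types are inferred from the data.
val inferredDf = spark.read.json(Seq(jsonString).toDS())

inferredDf.printSchema()
inferredDf.show()
```

Inference needs an extra pass over the data, so when the fields are fixed, as in the question, the explicit schema with from_json is usually the better choice.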