How to convert timestamp to AWS data lake s3 timestamp

In my DataFrame I have a timestamp column with values in the format "%Y-%m-%d %H:%M:%S". I want to save this DataFrame to Parquet using:
df.write.partitionBy('p_timestamp').mode("overwrite").parquet('output/proto.parquet')
and I want the timestamp partition saved in the AWS data lake S3 timestamp format: "%Y-%m-%dT%H-%M"

I tried:

df.select(F.col("p_timestamp"), F.date_format(F.col("p_timestamp"), "%Y-%m-%dT%H-%M").alias("date_format")).show()

but I got an error:
Illegal pattern character 'T'

I would be grateful for any tips.

Solution:

I think the format string you are looking for is:

"yyyy-MM-dd'T'HH-mm"

You get the error because the T is not quoted with single quotes (') to mark it as a literal. Note that Spark's date_format expects Java-style datetime pattern letters (yyyy, MM, dd, HH, mm), not strftime-style % directives.
You can consult Spark's Datetime Patterns for Formatting and Parsing if you need something slightly different.

Small example:

import pyspark.sql.functions as F
import pyspark.sql.types as T
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
schema = T.StructType([T.StructField("Time", T.StringType(), False)])
df = spark.createDataFrame([('1970-09-01 02:02:10',)], schema=schema)
df.withColumn("Convert", F.date_format(F.col("Time"), "yyyy-MM-dd'T'HH-mm")).show()

Output:

+-------------------+----------------+
|               Time|         Convert|
+-------------------+----------------+
|1970-09-01 02:02:10|1970-09-01T02-02|
+-------------------+----------------+
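
Applied back to the original question, a minimal sketch (assuming the p_timestamp column and output path from the question, and using a hypothetical helper column name p_timestamp_fmt) could derive the formatted value first and then partition by it:

import pyspark.sql.functions as F

# Derive a partition key in the desired "yyyy-MM-dd'T'HH-mm" form,
# then partition the Parquet output by that column.
partitioned = df.withColumn(
    "p_timestamp_fmt",
    F.date_format(F.col("p_timestamp"), "yyyy-MM-dd'T'HH-mm")
)
partitioned.write.partitionBy("p_timestamp_fmt").mode("overwrite").parquet("output/proto.parquet")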