Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Case when for statement with multiple grouped conditions converted from Pyspark

I am converting a PySpark dataframe into SQL and am having a hard time converting

.withColumn("portalcount", when(((F.col("tCounts") == 3) & (F.col("Type1").contains("pizza"))) & ((~(F.col("Type1").contains("singleside"))) | (~(F.col("Type1").contains("side")))), 2)
           .when(((F.col("tCounts") == 3) & (F.col("Type1").contains("pizza"))) & ((F.col("Type1").contains("singleside")) | (F.col("Type1").contains("side"))), 1)

to

CASE
WHEN (tCounts = 3 AND Type1 IN 'pizza') AND (Type1 NOT IN 'singleside' OR Type1 NOT IN 'side') then 2
WHEN (tCounts = 3 AND Type1 IN 'pizza') AND (Type1 IN 'singleside' OR Type1 IN 'side') then 1
END portalcount

The rest of my query runs but not with the grouped case when statement. I am getting a parse syntax error. Should this be restructured as a nested case when statement? I’ve tried many versions of grouping the parentheses but I haven’t had success.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

In SQL IN needs a list of elements, so you need parenthesis around the elements

CASE
WHEN (tCounts = 3 AND Type1 IN ('pizza')) AND (Type1 NOT IN ('singleside','side') ) then 2
WHEN (tCounts = 3 AND Type1 IN ('pizza')) AND (Type1 IN ('singleside','side') ) then 1
END portalcount
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading