How to select the scala dataframe column with special character in it?

I am reading a JSON file where the key contains some special characters, e.g. [{ "ABB/aws:1.0/CustomerId:2.0": [{ "id": 20, "namehash": "de8cfcde-95c5-47ac-a544-13db50557eaa" }] }]. I am creating a Scala DataFrame and then trying to select the column "ABB/aws:1.0/CustomerId:2.0" using spark.sql. That's when it complains about the special characters. The DataFrame looks like this > Solution: Use backtick…
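The backtick suggestion above can be illustrated with a small sketch. It is written in Python rather than Scala so that it runs without a Spark installation; `quote_column` is a hypothetical helper and `customers` is a made-up view name, neither of which comes from the original post. The sketch only builds the query string, on the understanding that Spark SQL treats a backtick-quoted identifier literally and that an embedded backtick is escaped by doubling it.

```python
def quote_column(name: str) -> str:
    """Wrap a column name in backticks so Spark SQL reads it as one identifier.

    Hypothetical helper for illustration; an embedded backtick is doubled.
    """
    return "`" + name.replace("`", "``") + "`"


# Column name taken from the JSON key in the question.
column = "ABB/aws:1.0/CustomerId:2.0"

# "customers" is an assumed temp-view name, not part of the original post.
query = f"SELECT {quote_column(column)} FROM customers"
print(query)  # SELECT `ABB/aws:1.0/CustomerId:2.0` FROM customers
```

With a live session, the resulting string could be passed to `spark.sql(query)`, and the quoted name should also work in `df.select(...)`, where the dots in `1.0` would otherwise be read as nested-field access.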

Back-ticks in DataFrame.colRegex?

For PySpark, I find back-ticks enclosing regular expressions for DataFrame.colRegex() here, here, and in this SO question. Here is the example from the DataFrame.colRegex doc string:

    df = spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["Col1", "Col2"])
    df.select(df.colRegex("`(Col1)?+.+`")).show()
    +----+
    |Col2|
    +----+
    |   1|
    |   2|
    |   3|
    +----+

The answer to the SO question doesn't show…

map columns of two dataframes based on array intersection of their individual columns and based on highest common element match Pyspark/Pandas

I have a dataframe df1 like this:

    A   B
    AA  [a,b,c,d]
    BB  [a,f,g,c]
    CC  [a,b,l,m]

And another one, df2, like:

    C   D
    XX  [a,b,c,n]
    YY  [a,m,r,s]
    UU  [e,h,I,j]

I want to find out and map column C of df2 with column A of df1 based on the highest element match between the items of…
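A minimal sketch of the matching idea, assuming the truncated question means that each row of df2 should be mapped to the df1 row whose array (column B) shares the most elements with its own array (column D). It uses plain Python dictionaries instead of PySpark or pandas so that it runs anywhere; `best_match` is a hypothetical helper, and the tie-breaking rule (first key seen wins) and the `None` result for zero overlap are assumptions, not part of the original post.

```python
from typing import Dict, List, Optional

# The two frames from the question, keyed by columns A and C respectively.
df1: Dict[str, List[str]] = {
    "AA": ["a", "b", "c", "d"],
    "BB": ["a", "f", "g", "c"],
    "CC": ["a", "b", "l", "m"],
}
df2: Dict[str, List[str]] = {
    "XX": ["a", "b", "c", "n"],
    "YY": ["a", "m", "r", "s"],
    "UU": ["e", "h", "I", "j"],
}


def best_match(items: List[str], candidates: Dict[str, List[str]]) -> Optional[str]:
    """Return the candidate key with the largest intersection, or None if no overlap."""
    wanted = set(items)
    best_key, best_size = None, 0
    for key, values in candidates.items():
        size = len(wanted & set(values))
        if size > best_size:  # strict '>' keeps the first key on ties
            best_key, best_size = key, size
    return best_key


mapping = {c: best_match(d, df1) for c, d in df2.items()}
print(mapping)  # {'XX': 'AA', 'YY': 'CC', 'UU': None}
```

The same logic could be ported to pandas with `apply`, or to PySpark with a cross join and `array_intersect`, although that goes beyond what the excerpt shows.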

RuntimeError: Java gateway process exited before sending its port number after setting JAVA_HOME

I'm trying to start PySpark using VSCode, but I am getting the following errors:

    Java not found and JAVA_HOME environment variable is not set.
    Install Java and set JAVA_HOME to point to the Java installation directory.
    Traceback (most recent call last):
      File "c:\Users\Erevos\Desktop\Pyspark\LearnSpark.py", line 5, in <module>
        spark = SparkSession.builder.appName("MyApp").getOrCreate()
      File "C:\Users\Erevos\AppData\Local\Programs\Python\Python310\lib\site-packages\pyspark\sql\session.py", line 477, in…
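Since the error message says that Java was not found and JAVA_HOME is not set, one way to narrow the problem down is to check what the Python process itself can see before building the SparkSession. The sketch below is an assumption about how such a check might look, not the accepted answer from the post; `find_java` is a hypothetical helper that looks under `JAVA_HOME/bin` first and then falls back to the `PATH`.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def find_java(env: Mapping[str, str]) -> Optional[str]:
    """Return the path of a Java executable visible through env, or None.

    Hypothetical diagnostic helper: checks JAVA_HOME/bin first, then PATH.
    """
    java_home = env.get("JAVA_HOME")
    if java_home:
        for exe in ("java.exe", "java"):  # Windows name first, then Unix name
            candidate = Path(java_home) / "bin" / exe
            if candidate.is_file():
                return str(candidate)
    # If env has no PATH entry, shutil.which falls back to the process PATH.
    return shutil.which("java", path=env.get("PATH"))


java = find_java(os.environ)
if java is None:
    print("No Java found: JAVA_HOME is unset or wrong, and java is not on PATH.")
else:
    print(f"Java found at {java}")
```

If the check reports nothing even though JAVA_HOME was set in the system settings, the VSCode terminal may have been started before the change and still carry the old environment; restarting the editor, or assigning `os.environ["JAVA_HOME"]` in the script before the SparkSession is created, are common things to try.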