I’d like to create a DataFrame in Spark with Scala, like this:
| col_1 | col_2 | col_3 | .. | col_2048 |
|---|---|---|---|---|
| 0.123 | 0.234 | … | … | 0.323 |
| 0.345 | 0.456 | … | … | 0.534 |
Then I want to add an extra column of ArrayType that gathers the data from all 2048 columns into a single column:
| col_1 | col_2 | col_3 | .. | col_2048 | array_col |
|---|---|---|---|---|---|
| 0.123 | 0.234 | … | … | 0.323 | [0.123, 0.234, …, 0.323] |
| 0.345 | 0.456 | … | … | 0.534 | [0.345, 0.456, …, 0.534] |
>Solution :

You can collect the existing column names with `df.columns`, map each name to a `Column`, and pass them all to the `array` function:

```scala
import org.apache.spark.sql.functions.{array, col}

val result = df.withColumn("array_col", array(df.columns.map(col): _*))
result.show()
```

Since `df.columns` returns the columns in their current order, `array_col` preserves the `col_1` … `col_2048` ordering.
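For completeness, here is a minimal end-to-end sketch. It assumes a local `SparkSession` and uses a 3-column DataFrame as a stand-in for the 2048-column one; the object name, session settings, and sample values are illustrative, not part of the original question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array, col}

object ArrayColExample extends App {
  // Local session for demonstration; in a cluster job the session is usually provided.
  val spark = SparkSession.builder()
    .master("local[*]")
    .appName("array-col-example")
    .getOrCreate()
  import spark.implicits._

  // Small stand-in for the 2048-column DataFrame.
  val df = Seq(
    (0.123, 0.234, 0.323),
    (0.345, 0.456, 0.534)
  ).toDF("col_1", "col_2", "col_3")

  // Capture the column list before adding the new column, so that
  // array_col contains only the original columns.
  val originalCols = df.columns.map(col)
  val withArray = df.withColumn("array_col", array(originalCols: _*))

  withArray.show(truncate = false)
  spark.stop()
}
```

Capturing `df.columns` before calling `withColumn` also guards against accidentally including `array_col` itself if the expression is later reused on the augmented DataFrame.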