Change columns values to rows (pivot data frame)

Context: I have been doing some data transformation to a data frame and I currently have it formated like this:

            sample       REF   ALT   #
1                a         A    G   1.0
2                a         C    T   4.0
3                a       GGG  AAC   1.0
5                b         A    G   1.0
6                b         C    T   4.0

and I’d like to pivot this data frame to have the data displayed this way:

1          REF       A    C    GGG    C
2          ALT       G    T    AAC    A
3           a        1    4     1     0
4           b        1    4     0     0

I was trying to use pivot

final_data = data_no_na.pivot(
columns=("sample", "REF", "ALT"), values="#").reset_index()

but it is not quite there yet.

How could I do this transformation?

Thank you in advance, any help / link to numpy or pandas documentation is very welcome as I’m fairly new to python.

>Solution :

You could play with the index and unstack:

(df.set_index(['sample', 'REF', 'ALT'])
   ['#']
   .unstack(['REF', 'ALT'], fill_value=0)
 )

Output:

REF       A    C  GGG
ALT       G    T  AAC
sample               
a       1.0  4.0  1.0
b       1.0  4.0  0.0

Leave a Reply