Context: I have been doing some data transformation to a data frame and I currently have it formated like this:
sample REF ALT #
1 a A G 1.0
2 a C T 4.0
3 a GGG AAC 1.0
5 b A G 1.0
6 b C T 4.0
and I’d like to pivot this data frame to have the data displayed this way:
1 REF A C GGG C
2 ALT G T AAC A
3 a 1 4 1 0
4 b 1 4 0 0
…
I was trying to use pivot
final_data = data_no_na.pivot(
columns=("sample", "REF", "ALT"), values="#").reset_index()
but it is not quite there yet.
How could I do this transformation?
Thank you in advance, any help / link to numpy or pandas documentation is very welcome as I’m fairly new to python.
>Solution :
You could play with the index and unstack
:
(df.set_index(['sample', 'REF', 'ALT'])
['#']
.unstack(['REF', 'ALT'], fill_value=0)
)
Output:
REF A C GGG
ALT G T AAC
sample
a 1.0 4.0 1.0
b 1.0 4.0 0.0