I have a CSV file with multiple columns. One of the columns holds a string that mixes data types: letters and floats. These are the product name and the price, e.g. Coffee – 2.50, Tea – 3.00, etc.
However, I cannot figure out how to separate the price (float) from the string. I believe putting it into dictionary format is best, i.e. {Product(str):Price(float)}.
Column example:
"Large Flavoured iced latte – Caramel – 3.25, Regular Flavoured iced latte – Hazelnut – 2.75, Regular Flavoured iced latte – Caramel – 2.75, Large Flavoured iced latte – Hazelnut – 3.25, Regular Flavoured latte – Hazelnut – 2.55, Regular Flavoured iced latte – Hazelnut – 2.75"
I tried:
my_list=[i.split(',') for i in my_list]
print(my_list)
But after this I have a list like the one below, and I do not know how to process the elements further:
[['Large Flavoured iced latte - Caramel - 3.25', ' Regular Flavoured iced latte - Hazelnut - 2.75', ' Regular Flavoured iced latte - Caramel - 2.75', ' Large Flavoured iced latte - Hazelnut - 3.25', ' Regular Flavoured latte - Hazelnut - 2.55', ' Regular Flavoured iced latte - Hazelnut - 2.75']]
Thank you in advance
>Solution :
Using re.findall is one approach:
import re

inp = "Large Flavoured iced latte - Caramel - 3.25, Regular Flavoured iced latte - Hazelnut - 2.75, Regular Flavoured iced latte - Caramel - 2.75, Large Flavoured iced latte - Hazelnut - 3.25, Regular Flavoured latte - Hazelnut - 2.55, Regular Flavoured iced latte - Hazelnut - 2.75"
# Lazily capture each product name, then the price that follows its last " - "
d = dict(re.findall(r'(.*?)\s*-\s*(\d+(?:\.\d+)?),?\s*', inp))
print(d)
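This prints the following (wrapped here for readability). Note that the prices are still strings, and the repeated product appears only once because dictionary keys are unique: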
{'Large Flavoured iced latte - Caramel': '3.25',
'Regular Flavoured iced latte - Hazelnut': '2.75',
'Regular Flavoured latte - Hazelnut': '2.55',
'Regular Flavoured iced latte - Caramel': '2.75',
'Large Flavoured iced latte - Hazelnut': '3.25'}
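If you want the {Product(str):Price(float)} shape from the question, the same idea can be extended to convert each price with float(). The sketch below is only one way to do it: the names ITEM_RE and parse_prices are made up for illustration, and the character class [-–] is an assumption that the real column may use either a plain hyphen or an en dash as the separator, as the example in the question does.

```python
import re

# Hypothetical helper names; the pattern assumes items look like "<name> - <price>"
# and are separated by commas. The dash may be a hyphen or an en dash.
ITEM_RE = re.compile(r'\s*(.+?)\s*[-–]\s*(\d+(?:\.\d+)?)\s*(?:,|$)')


def parse_prices(text):
    """Return {product_name: price_as_float} for one cell of the column."""
    return {name: float(price) for name, price in ITEM_RE.findall(text)}


cell = "Large Flavoured iced latte - Caramel - 3.25, Regular Flavoured latte - Hazelnut - 2.55"
prices = parse_prices(cell)
print(prices)
```

As in the original approach, a product that appears more than once in the same cell ends up as a single key, with the last price seen winning. If you need to keep every occurrence, collect the (name, price) pairs in a list instead of a dict.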