I have a dataframe with 10,000 rows that looks like the sample below:
import pandas as pd
import numpy as np

df = pd.DataFrame(
    np.array([['facebook', '15', 'women tennis'],
              ['facebook', '20', 'men basketball'],
              ['facebook', '30', 'club'],
              ['apple', '10', 'vice president'],
              ['apple', '100', 'swimming contest']]),
    columns=['firm', 'id', 'text'])
I'd like to save each firm's rows to a separate JSON file, with the file name written as "firm.json" (e.g. facebook.json). The same goes for the other firms, such as Apple.
Sorry, I'm still a beginner with pandas. Is there an effective way to do this?
>Solution :
You can do:
json_cols = df.columns.drop('firm').tolist()
json_records = df.groupby('firm')[json_cols].apply(
    lambda x: x.to_json(orient='records'))
Then for ‘facebook’:
facebook_json = json_records['facebook']
'[{"id":"15","text":"women tennis"},{"id":"20","text":"men basketball"},{"id":"30","text":"club"}]'
and for 'apple':
apple_json = json_records['apple']
'[{"id":"10","text":"vice president"},{"id":"100","text":"swimming contest"}]'
To save all of them at once, iterate over the Series (note: Series.iteritems was removed in pandas 2.0; use items instead):
for firm, records in json_records.items():
    with open(f"{firm}.json", "w") as file:
        file.write(records)