I want to write a loop that generates 1000 t-test results, each from random samples drawn from two different populations.
My loop does basically what is required; the only issue is that I would like to append the printed values to a DataFrame.
import numpy as np
import pandas as pd
from scipy import stats

results = pd.DataFrame({'Effect Size':[], 'p-value':[]})
for i in range(1000):
    sample1 = np.random.normal(0,1,1000)
    sample2 = np.random.normal(.05,1,1000)
    effect_size, pvalue = stats.ttest_ind(a=sample1, b=sample2, equal_var=True)
    results = pd.DataFrame(print(effect_size,pvalue))
results.head()
However, the output I get is this:
-1.6143890836641985 0.10660095803269495
-2.0260421693695845 0.0428931041087038
-2.7052945035320413 0.006882349977869199
-0.650014611610562 0.5157575104187226
0.35589181647004076 0.721959156357101
-1.8580323211600547 0.0633114210246122
-2.1346234965598185 0.03291315538511747
-1.5619392256304192 0.11846067349115201
-1.4286159705357937 0.15327094637955832
-2.5338588520198324 0.011357254651096133
-1.125224663298795 0.2606289939128222
-1.8130036805024503 0.06998125666628215
-0.0350581349501468 0.9720368863172242
-0.14942653694599559 0.881232154213759
-1.3726021387765257 0.17003011697766837
-0.391077951258786 0.6957813156125576
-1.8118048538852072 0.07016643231973188
What I want is to store those two values in two separate columns of the DataFrame I created above. Any solutions?
>Solution :
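This should work: get rid of print and assign each row with .loc instead. print() only writes to the console and returns None, so pd.DataFrame(print(...)) replaces results with an empty DataFrame on every iteration.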
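import numpy as np
import pandas as pd
from scipy import stats

results = pd.DataFrame({'Effect Size':[], 'p-value':[]})
for i in range(1000):
    sample1 = np.random.normal(0,1,1000)
    sample2 = np.random.normal(.05,1,1000)
    # ttest_ind returns the t statistic and the p-value
    effect_size, pvalue = stats.ttest_ind(a=sample1, b=sample2, equal_var=True)
    # write the two values into row i, adding the row if it does not exist yet
    results.loc[i,:] = [effect_size,pvalue]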
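Growing a DataFrame one row at a time with .loc works, but it gets slow as the number of iterations rises. A possible alternative, sketched below, is to collect the rows in a plain list and build the DataFrame once at the end. The function name simulate_ttests and its parameters (n_runs, n_obs, shift, seed) are made up for illustration, and the seeded generator is an assumption added so the run is reproducible. The column is still called 'Effect Size' to match the question, although the value stored there is the t statistic.

```python
import numpy as np
import pandas as pd
from scipy import stats


def simulate_ttests(n_runs=1000, n_obs=1000, shift=0.05, seed=0):
    """Run n_runs two-sample t-tests and return one row per run.

    Hypothetical helper: the name and parameters are not from the question.
    """
    rng = np.random.default_rng(seed)  # seeded generator (assumption)
    rows = []
    for _ in range(n_runs):
        sample1 = rng.normal(0, 1, n_obs)
        sample2 = rng.normal(shift, 1, n_obs)
        t_stat, pvalue = stats.ttest_ind(a=sample1, b=sample2, equal_var=True)
        # one dict per run; the keys become the column names
        rows.append({'Effect Size': t_stat, 'p-value': pvalue})
    # build the DataFrame once, after the loop
    return pd.DataFrame(rows)


results = simulate_ttests()
print(results.head())
```

Appending to a list is cheap, so the cost of constructing the DataFrame is paid only once instead of on every iteration.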