I have a csv file with something like this column
| Comulative |
|---|
| -1 |
| -3 |
| -4 |
| 1 |
| 2 |
| 5 |
| -1 |
| -4 |
| -8 |
| 1 |
| 3 |
| 5 |
| 10 |
I would like to add an internal column counting the number of sign shifts
To get something like this
| Comulative | Score |
|---|---|
| -1 | 1 |
| -3 | 2 |
| -4 | 3 |
| 1 | 1 |
| 2 | 2 |
| 5 | 3 |
| -1 | 1 |
| -4 | 2 |
| -8 | 3 |
| 1 | 1 |
| 3 | 2 |
| 5 | 3 |
| 10 | 4 |
In my original csv file, the Comulative column usually does not change the sign from about 100 to 500 lines here , for clarity , it changes so often !
Can you tell me how to do it better ?
>Solution :
Get the sign with numpy.sign, then use a custom groupby with cumcount:
# get sign
s = np.sign(df['Comulative'])
# group by consecutive signs
group = s.ne(s.shift()).cumsum()
# enumerate
df['Score'] = s.groupby(group).cumcount().add(1)
NB. if you want to consider 0 as part of the positive numbers, use s = df['Comulative'].gt(0).
output:
Comulative Score
0 -1 1
1 -3 2
2 -4 3
3 1 1
4 2 2
5 5 3
6 -1 1
7 -4 2
8 -8 3
9 1 1
10 3 2
11 5 3
12 10 4