Thanks for any help…I’m hopeful this is pretty simple for someone out there and I’ve explained it well enough.
I have two data frames represent point in time views of some data. They will be generated regularly, and will always have the same structure. I’ve been able to scrub them down to the two columns I care about – SECURITY and CLASS.
Ultimately, I am trying to generate a list of SECURITIES for which the CLASS is different in Dataframe 2 than in Dataframe 1 (Dataframe 1 was generated the prior day, Dataframe 2, the current day). The list of SECURITIES will never change.
For example:
Dataframe 1
| SECURITY | CLASS |
|---|---|
| Sec 1 | Stock |
| Sec 2 | Stock |
| Sec 3 | Bond |
| Sec 4 | Stock |
Datframe 2
| SECURITY | CLASS |
|---|---|
| Sec 1 | Stock |
| Sec 2 | Preferred |
| Sec 3 | Futures Option |
| Sec 4 | Stock |
And I’m trying to get to something like below, in a csv or text file:
- Sec 2 has been modified from "Stock" to "Preferred"
- Sec 3 has been modified from "Bond" to "Futures Option"
I’d be happy to post my most recent attempts, but honestly my python is embarrassing, but if that’s how it’s done here I can do so. I’ve been on and off this site for the past 10-12 hours but just can’t seem to get it working the way I believe it can/should.
Thank you for your time and any help.
>Solution :
Try:
x = df1.merge(df2, on="SECURITY")
for s, x, y in zip(x.SECURITY, x.CLASS_x, x.CLASS_y):
if x != y:
print(f'{s} has been modified from "{x}" to "{y}"')
Prints:
Sec 2 has been modified from "Stock" to "Preferred"
Sec 3 has been modified from "Bond" to "Futures Option"