Data frame has 1,050,000 rows.
Input: (a pandas dataframe column)
UserImage
https://play-lh.googleusercontent.com/a/AItbvmkI4RoZOTFftgRqwJ0QVl-OqLw0PXFRQsQmzPwayQ=mo
https://play-lh.googleusercontent.com/EGemoI2NTXmTsBVtJqk8jxF9rh8ApRWfsIMQSt2uE4OcpQqbFu7f7NbTK05lx80nuSijCz7sc3a277R67g
https://play-lh.googleusercontent.com/a-/AFdZucpr-V6JJAWHdTjxYVPa15fmQC7pWl5Xd5StFt1E
Output:
UserIDs
AItbvmkI4RoZOTFftgRqwJ0QVl-OqLw0PXFRQsQmzPwayQ
EGemoI2NTXmTsBVtJqk8jxF9rh8ApRWfsIMQSt2uE4OcpQqbFu7f7NbTK05lx80nuSijCz7sc3a277R67g
AFdZucpr-V6JJAWHdTjxYVPa15fmQC7pWl5Xd5StFt1E
>Solution :
This looks like a perfect use case for a regex:
df['UserIDs'] = df['UserImage'].str.extract(r'^.*/([^/=]+)[^/]*$')
Or, if you want to keep only word characters (letters, digits, underscore) and hyphens:
df['UserIDs'] = df['UserImage'].str.extract(r'^.*/([-\w]+)[^/]*$')
output:
UserImage \
0 https://play-lh.googleusercontent.com/a/AItbvm...
1 https://play-lh.googleusercontent.com/EGemoI2N...
2 https://play-lh.googleusercontent.com/a-/AFdZu...
UserIDs
0 AItbvmkI4RoZOTFftgRqwJ0QVl-OqLw0PXFRQsQmzPwayQ
1 EGemoI2NTXmTsBVtJqk8jxF9rh8ApRWfsIMQSt2uE4OcpQ...
2 AFdZucpr-V6JJAWHdTjxYVPa15fmQC7pWl5Xd5StFt1E
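For reference, here is a minimal self-contained sketch of the first regex applied to the three sample URLs from the question. The frame is rebuilt by hand, the name PATTERN is only illustrative, and expand=False is an added assumption so that the single capture group comes back as a Series instead of a one-column DataFrame.

```python
import pandas as pd

# Sample rows copied from the question; the real frame has about 1,050,000 rows.
df = pd.DataFrame({
    "UserImage": [
        "https://play-lh.googleusercontent.com/a/AItbvmkI4RoZOTFftgRqwJ0QVl-OqLw0PXFRQsQmzPwayQ=mo",
        "https://play-lh.googleusercontent.com/EGemoI2NTXmTsBVtJqk8jxF9rh8ApRWfsIMQSt2uE4OcpQqbFu7f7NbTK05lx80nuSijCz7sc3a277R67g",
        "https://play-lh.googleusercontent.com/a-/AFdZucpr-V6JJAWHdTjxYVPa15fmQC7pWl5Xd5StFt1E",
    ]
})

# ^.*/      greedy: consume everything up to the last "/"
# ([^/=]+)  capture the run of characters that are neither "/" nor "="
# [^/]*$    allow a trailing suffix such as "=mo" up to the end of the string
PATTERN = r"^.*/([^/=]+)[^/]*$"  # hypothetical name, not from the original answer

# expand=False (an assumption, see above) yields a Series for a single group.
df["UserIDs"] = df["UserImage"].str.extract(PATTERN, expand=False)

print(df["UserIDs"].tolist())
```

Rows that do not match the pattern come back as NaN rather than raising, so it may be worth checking df["UserIDs"].isna().sum() after running this on the full column.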