
Frequency of strings (comma separated) in Python

by MR

January 28, 2022

I’m trying to find the frequency of each string in the "Select Investors" field of the table on this page: https://www.cbinsights.com/research-unicorn-companies

Is there a way to count how often each of the comma-separated strings appears?

For example, how often does the term "Sequoia Capital China" show up?

>Solution:

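import pandas as pd

# Extract data (pd.read_html returns a list of DataFrames, one per HTML table)
url = "https://www.cbinsights.com/research-unicorn-companies"
df = pd.read_html(url)
first_df = df[0]
column = "Select Investors"

# Collect every investor name, lowercased and stripped of whitespace
all_investor = []
for i in first_df[column]:
    all_investor += [s.strip() for s in str(i).lower().split(',')]

# Calculate frequency once per distinct name, in order of first appearance.
# Note: this is a substring match, so "sequoia capital" also counts rows
# that list "sequoia capital china" or "sequoia capital india".
for string in dict.fromkeys(all_investor):
    frequency = first_df[column].apply(
        lambda x: string in str(x).lower()).sum()
    print(string, frequency)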

Output:
andreessen horowitz 41
new enterprise associates 21
battery ventures 14
index ventures 30
dst global 19
ribbit capital 8
forerunner ventures 4
crosslink capital 4
homebrew 2
sequoia capital 115
thoma bravo 3
softbank 50
tencent holdings 28
lightspeed india partners 4
sequoia capital india 25
ggv capital 14
....
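
The counts above come from a substring test, so the figure for "sequoia capital" also includes every row that lists "sequoia capital china" or "sequoia capital india". If exact names are wanted instead, one possible sketch is to split each cell and count whole names with `collections.Counter`. The helper name `investor_counts` and the sample rows are hypothetical, made up for illustration; in practice the function could be fed `first_df["Select Investors"]`.

```python
from collections import Counter


def investor_counts(cells):
    """Count exact investor names across comma-separated cells.

    Each name is counted at most once per cell (i.e. once per company).
    """
    counts = Counter()
    for cell in cells:
        # Skip missing values such as None or NaN
        if not isinstance(cell, str):
            continue
        names = {name.strip().lower() for name in cell.split(",")}
        names.discard("")  # ignore empty pieces from stray commas
        counts.update(names)
    return counts


# Hypothetical sample rows standing in for first_df["Select Investors"]
sample = [
    "Sequoia Capital China, SIG Asia Investments, Sina Weibo",
    "Sequoia Capital, Thoma Bravo, Softbank",
    "Sequoia Capital China, Tencent Holdings",
    None,
]

counts = investor_counts(sample)
print(counts["sequoia capital china"])  # 2
print(counts["sequoia capital"])        # 1
```

With exact matching, "sequoia capital" and "sequoia capital china" are tallied separately, and `counts.most_common(10)` would list the ten most frequent names.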
