Two Letter Bigram in Pandas Dataframe

I am having trouble finding a way to get every two-letter combination in a string in a DataFrame. Everything I have found deals with words rather than letters. Below is the expected output.

    string    output
    hello     he, el, ll, lo
    world     wo, or, rl,

I have tried both lines below:

    df['bigram'] = list(zip(df['string'], df['string'][1:]))

This generated the error ValueError: Length of values (15570)…
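
The excerpt is truncated, so the following is only a minimal sketch of one way to build letter bigrams, not the answer from the original post. It slices each string into overlapping two-character pieces and joins them with commas. The helper name `letter_bigrams` is hypothetical, and the sample frame holds only the two strings shown in the expected output.

```python
import pandas as pd


def letter_bigrams(text):
    """Return overlapping two-letter slices of text, joined by ', '."""
    # text[i:i + 2] yields "he", "el", "ll", "lo" for "hello".
    return ", ".join(text[i:i + 2] for i in range(len(text) - 1))


df = pd.DataFrame({"string": ["hello", "world"]})

# Apply row by row: each string gets its own bigram list, so the
# result always has the same length as the DataFrame.
df["bigram"] = df["string"].apply(letter_bigrams)
print(df)
```

Working per row avoids the length mismatch in the attempted `zip`, which pairs whole rows of the column with each other instead of pairing letters inside one string.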

Pandas: astype(int) converts all values to -2147483648

I am converting large numbers to int using astype().

    import pandas as pd
    x = [56668772800.0, 55899926600.0, 55038007900.0, 58073681300.0, 58224458500.0]
    df = pd.DataFrame(x, columns=["pol_number"])
    df.pol_number = df.pol_number.astype(int)
    df

All five values have become -2147483648. I assume the numbers are too big for type int, but long int and big int don't compile.

Solution: …
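
The excerpt cuts off before the solution, so what follows is an assumption rather than the original answer. On platforms where `astype(int)` resolves to a 32-bit integer (for example Windows with NumPy releases before 2.0), values above 2147483647 cannot be represented. A minimal sketch that requests a 64-bit type explicitly:

```python
import pandas as pd

x = [56668772800.0, 55899926600.0, 55038007900.0, 58073681300.0, 58224458500.0]
df = pd.DataFrame(x, columns=["pol_number"])

# "int64" names the width explicitly instead of relying on the
# platform-dependent meaning of the built-in int.
df["pol_number"] = df["pol_number"].astype("int64")
print(df)
print(df["pol_number"].dtype)
```

Assigning through `df["pol_number"]` rather than `df.pol_number` is a stylistic choice here; both forms update an existing column.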

How to find if all the columns in a dataframe are object dtype?

Let's say I have a DataFrame called data, and I want to check whether every single column in it has object dtype and use that as an if condition. Example:

    describe = data.describe()
    (if condition to find all the columns are 'object'):
        agg = data.agg(['a', 'b', 'c'])
        if not agg.empty:
            describe = pd.concat([describe, agg])
    describe =…
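
One way to express the condition, shown as a sketch rather than the answer from the original post, is to compare `DataFrame.dtypes` against `object` and reduce with `.all()`. The sample frames and the variable names `all_object` and `mixed` are made up for illustration.

```python
import pandas as pd

# dtype=object is passed explicitly so the example does not depend on
# how a given pandas version infers the dtype of string columns.
data = pd.DataFrame({"a": ["x", "y"], "b": ["p", "q"]}, dtype=object)

# data.dtypes is a Series with one dtype per column; the comparison
# gives a boolean Series, and .all() is True only if every entry is.
all_object = bool((data.dtypes == object).all())

# Adding a numeric column makes the check fail.
mixed = data.assign(n=[1, 2])
mixed_all_object = bool((mixed.dtypes == object).all())

if all_object:
    print("every column is object dtype")
print(all_object, mixed_all_object)
```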

Flag element in groupby if it is equal to its successor in next row

I have the following df:

    df = pd.DataFrame(
        {'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
         'value': ['pot', 'pot', 'jebus', 'pot', 'jebus', 'pot', 'pot', 'jebus', 'jebus']})

What I want to do is identify whether an id contains repeated values, but only when a row is followed by another row with the same value. So if I have 'pot' and after that 'pot' again, I want to flag both as True.…
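
Since the excerpt is cut off, the full requirement is not visible. The sketch below assumes the goal is a boolean column that is True for every row whose value equals the value in the adjacent row (next or previous) within the same id. The column name `flag` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
     "value": ["pot", "pot", "jebus", "pot", "jebus", "pot", "pot", "jebus", "jebus"]})

grouped = df.groupby("id")["value"]

# shift(-1) aligns each row with the next row of the same id,
# shift(1) with the previous one. Rows at a group edge get a missing
# value, which never compares equal.
same_as_next = df["value"] == grouped.shift(-1)
same_as_prev = df["value"] == grouped.shift(1)

# A row is flagged if it starts or ends a run of equal neighbours.
df["flag"] = same_as_next | same_as_prev
print(df)
```

Shifting within the groupby rather than on the whole column keeps the last row of one id from being compared with the first row of the next id.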

Correctly save list to CSV file

Using the Selenium library, I get some information (a table from the site) and try to save this table to a CSV file.

    elements_xpach_url = browser.find_element(By.XPATH, '/html/body/div[2]/span[2]/table/tbody[2]')
    values = re.split('\n', elements_xpach_url.text)
    df = pd.DataFrame(values)
    df.to_csv('csv.csv', index=False, header=False, sep=';')

In the final file, I cannot get the semicolon (;) to act as the separator; instead…
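
The excerpt ends before describing the actual output, so this is a guess at the cause rather than a confirmed diagnosis. Splitting the element text only on newlines gives one string per row, so the DataFrame has a single column and `sep` has nothing to separate. The sketch below assumes the cells within a row are separated by whitespace and uses a made-up `table_text` in place of the Selenium element text.

```python
import io

import pandas as pd

# Hypothetical stand-in for elements_xpach_url.text.
table_text = "1 Alice 30\n2 Bob 25"

# Split into rows first, then split each row into cells, so the
# DataFrame gets one column per cell.
rows = [line.split() for line in table_text.split("\n")]
df = pd.DataFrame(rows)

# An in-memory buffer stands in for 'csv.csv' to keep the example
# free of side effects.
buffer = io.StringIO()
df.to_csv(buffer, index=False, header=False, sep=";")
csv_text = buffer.getvalue()
print(csv_text)
```

If the cells themselves can contain spaces, whitespace splitting is not enough; reading the individual cell elements of the table would be a more reliable source of column boundaries.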