Detect language in a pandas column in Python

I would like to detect the language of each value in a pandas column in Python, then write the detected language code to a new column in the DataFrame. Below is the code I tried, but I got an error. Please help.

Thank you.

  import pandas as pd
  from textblob import TextBlob

  data = {'text': ["It is a good option", "Better to have this way", "es un portal informático para geeks", "は、ギーク向けのコンピューターサイエンスポータルです"]}
  # Create DataFrame
  df = pd.DataFrame(data)
  # Get the language
  for i in df['text']:
      # Language detection
      df['lang'] = TextBlob(i)

>Solution:

You can use the langdetect library for language detection. Install it first:

pip install langdetect

import pandas as pd
from langdetect import detect

data = {'text': ["It is a good option", "Better to have this way", "es un portal informático para geeks", "は、ギーク向けのコンピューターサイエンスポータルです"]}

# Create the DataFrame
df = pd.DataFrame(data)

# Apply detect() to each row and store the language code (e.g. 'en', 'es', 'ja')
df['lang'] = df['text'].apply(detect)
