Detect language in a pandas column in Python

I would like to detect the language of each text in a pandas column and write the resulting language code to a new column of the same DataFrame. Below is the code I tried, but it raises an error. Please help.

Thank you.

  import pandas as pd
  from textblob import TextBlob

  data = {'text': ["It is a good option", "Better to have this way", "es un portal informático para geeks", "は、ギーク向けのコンピューターサイエンスポータルです"]}
  # Create DataFrame
  df = pd.DataFrame(data)

  # Get the language
  for i in df['text']:
      # Language detection
      df['lang'] = TextBlob(i)

>Solution:

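You can use the langdetect library for language detection. Note that the loop in the question tries to assign a TextBlob object to the whole `lang` column on every pass instead of computing a language code per row; applying a detection function to each row avoids that.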

*pip install langdetect*

import pandas as pd
from langdetect import detect

data = {'text': ["It is a good option", "Better to have this way", "es un portal informático para geeks", "は、ギーク向けのコンピューターサイエンスポータルです"]}

df = pd.DataFrame(data)

# Run detect() on every row and store the returned language code (e.g. 'en', 'es', 'ja')
df['lang'] = df['text'].apply(detect)