Count words without periods and commas in Python

I would like to count how often each word occurs in a text file in Python.

My text is this:

Aragon is an autonomous community in northeastern Spain. The capital of Aragon is Zaragoza, which is also the most populous city in the autonomous community. Covering an area of 47720 km2, the region's terrain ranges from permanent glaciers through verdant valleys, rich pastures and orchards to the arid steppe plains of the central lowlands. Aragon is home to many rivers, most notably the Ebro, Spain's largest river, which flows west to east throughout the region through the province of Zaragoza. It is also home to the highest mountains in the Pyrenees.

I tried the following code:

from collections import Counter

# Open the file in a with-block so it is closed automatically.
with open("data/aragon.txt", "r") as file:
    wordcount = Counter(file.read().split())
for item in wordcount.items():
    print("{}\t{}".format(*item))

But the problem is that the output is not in any order. I would like the most frequent word at the top and the least frequent at the bottom, and I don't want entries such as "Ebro," or "Spain." with a trailing comma or period, just the word itself.

How can I fix that?

>Solution :

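You can use a regular expression to match only the words, which leaves out periods and commas, and Counter.most_common() to list the counts from highest to lowest.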

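from collections import Counter
import re
# \w+ matches runs of word characters, so "Ebro," is counted as "Ebro".
with open("data/aragon.txt", "r") as file:
    wordcount = Counter(re.findall(r"\w+", file.read()))
# most_common() yields (word, count) pairs, highest count first.
for item in wordcount.most_common():
    print("{}\t{}".format(*item))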
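
One thing to watch with \w+ is that it splits "Spain's" into "Spain" and "s", and it counts "The" and "the" separately. If that matters for your text, a possible variation is sketched below. It assumes that lowercasing is acceptable and that apostrophes are the plain ' character; the names count_words and WORD_RE are made up for this example and were not in the original answer.

```python
import re
from collections import Counter

# Hypothetical helper pattern: a run of word characters, optionally
# followed by apostrophe-joined parts, so "spain's" stays one token.
WORD_RE = re.compile(r"\w+(?:'\w+)*")

def count_words(text):
    """Count words case-insensitively, ignoring surrounding punctuation."""
    return Counter(WORD_RE.findall(text.lower()))

sample = ("Aragon is home to the Ebro, Spain's largest river. "
          "The capital of Aragon is Zaragoza.")
counts = count_words(sample)

# Print the three most frequent words, highest count first.
for word, count in counts.most_common(3):
    print("{}\t{}".format(word, count))
```

To run it on the file, pass file.read() to count_words instead of the sample string.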