Bayesian Classifier And Filtering

09.14.2005
(Also copied and modified from seb's code at http://sebsauvage.net/python/snyppets/)
Reverend is a free Bayesian classifier module for Python, available from http://sourceforge.net/project/reverend/
Here's how to train the classifier:
from reverend.thomas import Bayes
g = Bayes()    # guesser
g.train('french','La souris est rentrée dans son trou.')
g.train('english','my tailor is rich.')
g.train('french','Je ne sais pas si je viendrai demain.')
g.train('english','I do not plan to update my website soon.')
Then use it to guess the language:
>>> print g.guess('Jumping out of cliffs it not a good idea.')
[('english', 0.99990000000000001), ('french', 9.9999999999988987e-005)]
# 99.99% English

>>> print g.guess('Demain il fera très probablement chaud.')
[('french', 0.99990000000000001), ('english', 9.9999999999988987e-005)]
# 99.99% French
You can train it on more languages in the same way. You can also train it
to classify text by kind rather than by language.
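To give a rough idea of what train and guess are doing, here is a minimal sketch of a naive Bayes text classifier in plain Python 3. It is an illustration only: the TinyBayes class, its simple tokenizer, the uniform prior over labels, and the add-one smoothing are all assumptions made for this example, not Reverend's actual code, so its scores will not match the ones shown above.

```python
import math
import re
from collections import Counter, defaultdict


class TinyBayes(object):
    """Hypothetical stand-in for illustration; not Reverend's implementation."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocabulary = set()                  # every word seen in training

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase, then split on non-word characters.
        return re.findall(r"\w+", text.lower())

    def train(self, label, text):
        words = self.tokenize(text)
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def guess(self, text):
        words = self.tokenize(text)
        log_scores = {}
        for label, counts in self.word_counts.items():
            total = sum(counts.values())
            # Uniform prior over labels; add-one smoothing so that a word
            # never seen for this label does not zero out the whole score.
            log_scores[label] = sum(
                math.log((counts[w] + 1.0) / (total + len(self.vocabulary)))
                for w in words
            )
        if not log_scores:
            return []
        # Turn log scores into probabilities that sum to 1.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(weights.values())
        return sorted(((label, w / norm) for label, w in weights.items()),
                      key=lambda pair: pair[1], reverse=True)


g = TinyBayes()
g.train('french', 'La souris est rentrée dans son trou.')
g.train('english', 'my tailor is rich.')
g.train('french', 'Je ne sais pas si je viendrai demain.')
g.train('english', 'I do not plan to update my website soon.')

# The best guess comes first, as in the Reverend output.
print(g.guess('Demain il fera très probablement chaud.'))
print(g.guess('Jumping out of cliffs it not a good idea.'))
```

The labels are arbitrary strings, so the same sketch works for any set of categories, not just languages. With only four training sentences the margins in this sketch are much narrower than the 99.99% reported by Reverend, which combines word probabilities differently.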