Bayesian Classifier And Filtering
(Also copied and modified from seb's code <a href=http://sebsauvage.net/python/snyppets/>here</a>)
Reverend is a free Bayesian classifier module for Python. <a href=http://sourceforge.net/project/reverend/>Get it here</a>
Here's how we train the classifier, one labelled sentence at a time:
from reverend.thomas import Bayes
g = Bayes() # guesser
g.train('french','La souris est rentrée dans son trou.')
g.train('english','my tailor is rich.')
g.train('french','Je ne sais pas si je viendrai demain.')
g.train('english','I do not plan to update my website soon.')
Then use it to guess the language of new sentences (Python 2 interactive session):
>>> print g.guess('Jumping out of cliffs it not a good idea.')
[('english', 0.99990000000000001), ('french', 9.9999999999988987e-005)]
# 99.99% English
>>> print g.guess('Demain il fera très probablement chaud.')
[('french', 0.99990000000000001), ('english', 9.9999999999988987e-005)]
# 99.99% French
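As the output above shows, guess() returns a list of (label, probability) pairs. A small helper can pick the winning label from such a list. The sketch below is an illustration only: best_label, threshold and default are hypothetical names that are not part of Reverend, and the sample list is hand-written in the same shape as the output above so the code runs without Reverend installed.

```python
def best_label(guesses, threshold=0.5, default=None):
    """Return the highest-scoring label from (label, probability) pairs.

    Falls back to `default` when the list is empty or when the best
    probability is below `threshold`.
    """
    if not guesses:
        return default
    label, score = max(guesses, key=lambda pair: pair[1])
    return label if score >= threshold else default


# Hand-written sample in the same shape as the guess() output above.
sample = [('english', 0.9999), ('french', 0.0001)]
print(best_label(sample))
```

With a real classifier you would presumably call best_label(g.guess(text)); the threshold lets you treat low-confidence guesses as "unknown" instead of forcing a label.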
You can train it with more languages by calling train() with additional labels. The labels do not have to be languages: you can also train it to classify text by kind.
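To make the idea of classifying by kind of text concrete, here is a standalone sketch of a tiny naive Bayes classifier with the same train/guess shape, applied to spam versus ham. It is not Reverend's code and probably does not match its scoring method; TinyBayes, the tokenizer and the training sentences are all made up for illustration, and the smoothing is plain add-one.

```python
import math
import re
from collections import Counter, defaultdict


class TinyBayes(object):
    """Minimal multinomial naive Bayes with add-one smoothing (illustrative)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of texts seen
        self.vocabulary = set()                  # every word seen in training

    @staticmethod
    def tokenize(text):
        # Lowercase and split on runs of word characters.
        return re.findall(r"\w+", text.lower())

    def train(self, label, text):
        words = self.tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def guess(self, text):
        """Return (label, probability) pairs, most likely label first."""
        if not self.doc_counts:
            return []
        words = self.tokenize(text)
        total_docs = float(sum(self.doc_counts.values()))
        log_scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = float(sum(counts.values()) + len(self.vocabulary))
            score = math.log(n_docs / total_docs)  # log prior
            for word in words:
                score += math.log((counts[word] + 1.0) / denom)
            log_scores[label] = score
        # Turn log scores into probabilities that sum to 1.
        top = max(log_scores.values())
        weights = dict((l, math.exp(s - top)) for l, s in log_scores.items())
        norm = sum(weights.values())
        pairs = [(l, w / norm) for l, w in weights.items()]
        return sorted(pairs, key=lambda pair: pair[1], reverse=True)


c = TinyBayes()
c.train('spam', 'Win a free prize now, click here')
c.train('spam', 'Cheap pills, free shipping, buy now')
c.train('ham', 'Meeting moved to Monday, see agenda attached')
c.train('ham', 'Can you review my patch before the release')
print(c.guess('free prize, click now')[0][0])  # prints spam
```

Scores are summed as logarithms to avoid underflow on longer texts, then normalized so the returned probabilities add up to 1. With only four training sentences the numbers mean little; the point is the train-then-guess workflow.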





