Monday Morning Math: Naive Bayes

Good morning! Last week I talked about Bayes’ theorem, which is a way of using the probability of B (assuming that you already know A) to find the probability of A (assuming that you already know B). As an example, you can use the probability that a person with a disease gets a positive test for that disease to find the probability that a person with a positive test actually has the disease, and (still and always surprising to me) those are not the same.
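To see that flip in action, here's Bayes' theorem with some made-up numbers (the prevalence and test accuracy below are hypothetical, chosen just to show how different the two probabilities can be):

```latex
% Bayes' theorem: P(A | B) = P(B | A) P(A) / P(B)
% Hypothetical numbers: 1% of people have the disease, the test catches
% 99% of true cases, and it gives a false positive 5% of the time.
P(\text{disease} \mid +)
  = \frac{P(+ \mid \text{disease})\,P(\text{disease})}{P(+)}
  = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.05 \times 0.99}
  \approx 0.17
```

So even though the test catches 99% of true cases, a positive result here only means about a 17% chance of actually having the disease, because so many more people are healthy than sick.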

It turns out that Bayes’ theorem can also be used to determine whether an email is spam! Here’s how it works. The email in question is made up of a bunch of words, and in the actual email, order matters. But for this process, all the words are treated as independent, just a bunch of words in a pile; that simplifying assumption is what’s behind the word “naive”. The Naive Bayes algorithm looks at this bunch of words, figures out the probability that a piece of spam has those words in it, and then uses Bayes’ theorem to turn that around and find the probability that an email with those words is spam! The math involved is about one step more complicated than Bayes’ theorem, maybe two (something called Laplace smoothing plays a role; see the sketch below), but it’s still the same basic idea of flipping probabilities around, in a modern application!
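If you’d like to see the idea in code, here’s a minimal sketch in Python. This is my illustration rather than anything from the newsletter’s source: the tiny training “corpus” and the word-counting helpers are all made up, but the classifier itself is the standard Naive Bayes recipe with Laplace smoothing.

```python
from collections import Counter
from math import log

# Hypothetical training data, purely for illustration.
spam_docs = ["win money now", "free money offer", "win free prize"]
ham_docs  = ["meeting at noon", "lunch money tomorrow", "see you at the meeting"]

def word_counts(docs):
    """Count how often each word appears across a list of documents."""
    return Counter(w for d in docs for w in d.split())

spam_counts = word_counts(spam_docs)
ham_counts  = word_counts(ham_docs)
vocab = set(spam_counts) | set(ham_counts)

def log_prob(email, counts, prior):
    """Log P(class) + sum over words of log P(word | class)."""
    total = sum(counts.values())
    score = log(prior)
    for w in email.split():
        # Laplace smoothing: add 1 to every word count so that a word
        # never seen in training doesn't zero out the whole product.
        score += log((counts[w] + 1) / (total + len(vocab)))
    return score

def classify(email):
    n_spam, n_ham = len(spam_docs), len(ham_docs)
    p_spam = n_spam / (n_spam + n_ham)   # prior P(spam)
    p_ham  = n_ham  / (n_spam + n_ham)   # prior P(ham)
    spam_score = log_prob(email, spam_counts, p_spam)
    ham_score  = log_prob(email, ham_counts, p_ham)
    return "spam" if spam_score > ham_score else "ham"

print(classify("free money"))        # likely "spam"
print(classify("meeting tomorrow"))  # likely "ham"
```

Working with logarithms of the probabilities (rather than multiplying the raw probabilities together) is the usual trick here: it turns a long product of small numbers into a sum, which avoids numerical underflow.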

Thanks, S, for sharing this with me!
Sources: “Speech and Language Processing” by Daniel Jurafsky and James H. Martin, as explained by S.
