java - Understanding algorithm - Multinomial Naive Bayes -

- April 15, 2014

I have recently been introduced to the Naive Bayes classification method (Multinomial NB), with reference to how it is described by Michael Sipser in his book "The Theory of Computation".

I am looking at the algorithm described for both training and applying Multinomial NB, which is presented as follows:

However, I'm at a loss when interpreting certain aspects of the algorithm. For instance, in TrainMultinomialNB(C, D) on line 6:

What does concatenate_text_of_all_docs_in_class(D, c) do?

So far, I understand it as follows. Suppose we have three documents in each of the classes "movies" and "songs":

movies
    doc1 = "big fish"
    doc2 = "big lebowski"
    doc3 = "mystic river"

songs
    doc1 = "purple rain"
    doc2 = "crying in rain"
    doc3 = "anaconda"

After applying concatenate_text_of_all_docs_in_class(D, c), would I be left with the following strings?

    String concatenatedMovies = "big fish big lebowski mystic river";
    String concatenatedSongs = "purple rain crying in rain anaconda";

Is this right? Any help in understanding this is highly appreciated.
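For illustration, here is a minimal Java sketch of the concatenation step as read in the question. It assumes each document is a plain string and that documents are joined with a single space. The class name `ConcatenateDocs`, the method `concatenateTextOfAllDocsInClass`, and the `docsByClass` map are hypothetical names chosen for this sketch, not part of the pseudocode.

```java
import java.util.List;
import java.util.Map;

public class ConcatenateDocs {
    // Hypothetical helper: joins all documents of one class into a single
    // space-separated string (an assumed reading of the pseudocode step).
    static String concatenateTextOfAllDocsInClass(
            Map<String, List<String>> docsByClass, String className) {
        return String.join(" ", docsByClass.get(className));
    }

    public static void main(String[] args) {
        // Toy data taken from the question.
        Map<String, List<String>> docsByClass = Map.of(
                "movies", List.of("big fish", "big lebowski", "mystic river"),
                "songs", List.of("purple rain", "crying in rain", "anaconda"));

        // prints "big fish big lebowski mystic river"
        System.out.println(concatenateTextOfAllDocsInClass(docsByClass, "movies"));
        // prints "purple rain crying in rain anaconda"
        System.out.println(concatenateTextOfAllDocsInClass(docsByClass, "songs"));
    }
}
```

The joined string is only a convenience: what matters afterwards is how often each term occurs in it.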

In the end, you want to be able to classify a text based on its content. You want to be able to say whether it is about songs or movies, etc.
In order to do that with Bayes (or any other method), you first use training data to build a model.

First, you create the priors (docs in class / total docs) on line 5. Then you compute the conditional probabilities (the probability of the word "fish" given the class movies, the probability of the word "rain" given the class songs), on lines 7-10. You simply divide the occurrences of a term by the total number of terms in the class (plus some smoothing -> +1). That is why you concatenate: to be able to count the occurrences of a term in the class.
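Written out, and assuming the "+1" refers to the usual add-one (Laplace) smoothing, the two estimates can be sketched as follows. The symbols are introduced here for illustration: $N_c$ is the number of training documents in class $c$, $N$ the total number of documents, $T_{ct}$ the number of occurrences of term $t$ in the concatenated text of class $c$, and $V$ the vocabulary of all distinct terms. Note that adding 1 to every term count also enlarges the denominator by $|V|$.

```latex
\hat{P}(c) = \frac{N_c}{N},
\qquad
\hat{P}(t \mid c)
  = \frac{T_{ct} + 1}{\sum_{t' \in V} \left( T_{ct'} + 1 \right)}
  = \frac{T_{ct} + 1}{\left( \sum_{t' \in V} T_{ct'} \right) + |V|}
```

With the example from the question, the concatenated movies text has 6 terms and the vocabulary has 10 distinct terms, so $\hat{P}(\text{movies}) = 3/6$ and $\hat{P}(\text{big} \mid \text{movies}) = (2 + 1)/(6 + 10) = 3/16$.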
In the end, you plug these values into the Bayes formula and can categorize an unknown document as movies, songs, and so on. More on Wikipedia.
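To make the steps concrete, here is a minimal Java sketch of training and applying the classifier on the toy data from the question. It is an illustration under stated assumptions rather than the exact pseudocode: documents are split on whitespace, add-one smoothing is used, log probabilities are summed to avoid underflow, and terms not seen during training are skipped. The class name `MultinomialNaiveBayes` and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MultinomialNaiveBayes {
    private final Map<String, Double> logPrior = new HashMap<>();
    private final Map<String, Map<String, Double>> logCondProb = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();

    // Training: priors from document counts, conditional probabilities
    // from term counts in the concatenated text of each class.
    public void train(Map<String, List<String>> docsByClass) {
        int totalDocs = 0;
        for (List<String> docs : docsByClass.values()) {
            totalDocs += docs.size();
            for (String doc : docs) {
                for (String term : doc.split("\\s+")) {
                    vocabulary.add(term);
                }
            }
        }
        for (Map.Entry<String, List<String>> entry : docsByClass.entrySet()) {
            String className = entry.getKey();
            List<String> docs = entry.getValue();
            // Prior: docs in class / total docs.
            logPrior.put(className, Math.log((double) docs.size() / totalDocs));

            // Concatenate so term occurrences can be counted per class.
            String[] terms = String.join(" ", docs).split("\\s+");
            Map<String, Integer> counts = new HashMap<>();
            for (String term : terms) {
                counts.merge(term, 1, Integer::sum);
            }

            // Add-one smoothing: (count + 1) / (terms in class + |V|).
            Map<String, Double> probs = new HashMap<>();
            for (String term : vocabulary) {
                double p = (counts.getOrDefault(term, 0) + 1.0)
                        / (terms.length + vocabulary.size());
                probs.put(term, Math.log(p));
            }
            logCondProb.put(className, probs);
        }
    }

    // Applying: pick the class with the highest log prior plus summed
    // log conditional probabilities of the document's terms.
    public String classify(String document) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String className : logPrior.keySet()) {
            double score = logPrior.get(className);
            for (String term : document.split("\\s+")) {
                Double lp = logCondProb.get(className).get(term);
                if (lp != null) {
                    score += lp; // terms outside the vocabulary are skipped
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = className;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        MultinomialNaiveBayes nb = new MultinomialNaiveBayes();
        nb.train(Map.of(
                "movies", List.of("big fish", "big lebowski", "mystic river"),
                "songs", List.of("purple rain", "crying in rain", "anaconda")));
        System.out.println(nb.classify("big river"));   // prints "movies"
        System.out.println(nb.classify("purple rain")); // prints "songs"
    }
}
```

With so little training data, some inputs score equally for both classes (for example "big rain"), and the winner then depends on iteration order, so a real application would need far more documents per class.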
