Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask

Re: Naive Bayes Classifier Using Laplacian Smoothing

by BrowserUk (Pope)
on Nov 02, 2011 at 19:49 UTC ( #935483=note: print w/replies, xml ) Need Help??

in reply to Naive Bayes Classifier Using Laplacian Smoothing

Let me know what you guys think.

The first thing that leapt of the page for me was:

sub size(){ # setSize() # Returns the number of unique elements in bag. my $self = shift; my %dictionary = %{$self->{DICTIONARY}}; # remove empty string key created by concat operation #delete $self->{DICTIONARY}{''}; return scalar keys (%dictionary); }

Copying an entire hash from a reference into a local hash just to return how many keys it has is insanely wasteful. Especially as you are calling this over and over again in a sort comparator.

Coded as:

sub size { # setSize() # Returns the number of unique elements in bag. return scalar keys %{ shift() }; }

Will speed up your sort by orders of magnitude.

Also, you should not be using -- and would be getting a messages telling you so if you were using strict & warnings -- an empty prototype sub size() { on a subroutine that takes parameters.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Naive Bayes Classifier Using Laplacian Smoothing
by talwyn (Monk) on Nov 02, 2011 at 21:02 UTC

    As I said this is a rough see-how /if it works implementation. That said. I do appreciate the input I was having lots of trouble with references. Making a 'real' hash was only thing that was working for me.

    I'll see if I can get your revision to work for me. I did not notice any hit at N < = 200 words in the dictionary so its probably something I'll run into when I implement persistence and actually save the learned data to become the next round of training data.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://935483]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (4)
As of 2017-12-17 10:52 GMT
Find Nodes?
    Voting Booth?
    What programming language do you hate the most?

    Results (463 votes). Check out past polls.