Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things

Re: Naive Bayes Classifier Using Laplacian Smoothing

by BrowserUk (Pope)
on Nov 02, 2011 at 19:49 UTC ( #935483=note: print w/ replies, xml ) Need Help??

in reply to Naive Bayes Classifier Using Laplacian Smoothing

Let me know what you guys think.

The first thing that leapt of the page for me was:

sub size(){ # setSize() # Returns the number of unique elements in bag. my $self = shift; my %dictionary = %{$self->{DICTIONARY}}; # remove empty string key created by concat operation #delete $self->{DICTIONARY}{''}; return scalar keys (%dictionary); }

Copying an entire hash from a reference into a local hash just to return how many keys it has is insanely wasteful. Especially as you are calling this over and over again in a sort comparator.

Coded as:

sub size { # setSize() # Returns the number of unique elements in bag. return scalar keys %{ shift() }; }

Will speed up your sort by orders of magnitude.

Also, you should not be using -- and would be getting a messages telling you so if you were using strict & warnings -- an empty prototype sub size() { on a subroutine that takes parameters.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Comment on Re: Naive Bayes Classifier Using Laplacian Smoothing
Select or Download Code
Replies are listed 'Best First'.
Re^2: Naive Bayes Classifier Using Laplacian Smoothing
by talwyn (Monk) on Nov 02, 2011 at 21:02 UTC

    As I said this is a rough see-how /if it works implementation. That said. I do appreciate the input I was having lots of trouble with references. Making a 'real' hash was only thing that was working for me.

    I'll see if I can get your revision to work for me. I did not notice any hit at N < = 200 words in the dictionary so its probably something I'll run into when I implement persistence and actually save the learned data to become the next round of training data.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://935483]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (18)
As of 2015-11-24 22:32 GMT
Find Nodes?
    Voting Booth?

    What would be the most significant thing to happen if a rope (or wire) tied the Earth and the Moon together?

    Results (664 votes), past polls