
Re: Naive Bayes Classifier Using Laplacian Smoothing

by BrowserUk (Pope)
on Nov 02, 2011 at 19:49 UTC (#935483)


in reply to Naive Bayes Classifier Using Laplacian Smoothing

Let me know what you guys think.

The first thing that leapt off the page for me was:

sub size(){
    # setSize()
    # Returns the number of unique elements in bag.
    my $self = shift;
    my %dictionary = %{$self->{DICTIONARY}};
    # remove empty string key created by concat operation
    #delete $self->{DICTIONARY}{''};
    return scalar keys (%dictionary);
}

Copying an entire hash from a reference into a local hash just to return how many keys it has is insanely wasteful, especially as you are calling this over and over again in a sort comparator.

Coded as:

sub size {
    # setSize()
    # Returns the number of unique elements in bag.
    # Counts the keys through the reference; nothing is copied.
    return scalar keys %{ shift()->{DICTIONARY} };
}

That will speed up your sort, potentially by orders of magnitude for a large dictionary, because each call no longer copies every key and value.
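If you want to measure the difference rather than take it on trust, something like the following rough sketch should do. It assumes the dictionary lives under $self->{DICTIONARY} as in the original code; the names size_copy and size_ref and the 10_000-word dictionary are made up for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical bag with a 10_000-word dictionary (size chosen for illustration).
my $bag = { DICTIONARY => { map { ( "word$_" => 1 ) } 1 .. 10_000 } };

# Copies the whole hash before counting, as the original size() does.
sub size_copy {
    my $self       = shift;
    my %dictionary = %{ $self->{DICTIONARY} };
    return scalar keys %dictionary;
}

# Counts the keys through the reference, with no copy.
sub size_ref {
    my $self = shift;
    return scalar keys %{ $self->{DICTIONARY} };
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    copy => sub { size_copy($bag) },
    ref  => sub { size_ref($bag) },
} );
```

The gap should widen as the dictionary grows, since only the copying version does work proportional to the number of keys.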

Also, you should not be using -- and would be getting a message telling you so if you were using strict & warnings -- an empty prototype sub size() { on a subroutine that takes parameters.
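A small sketch of what the empty prototype does in practice. The sub name size_with_proto and the two-word bag are hypothetical. Note that Perl ignores prototypes on method calls, so whether you actually see a complaint depends on how the sub is called; a plain function call made after the declaration is rejected at compile time.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the original method, declared with an empty prototype.
sub size_with_proto() {
    return scalar keys %{ $_[0]{DICTIONARY} };
}

# Bless into the current package so the method call below resolves to the sub above.
my $bag = bless { DICTIONARY => { spam => 1, ham => 2 } }, __PACKAGE__;

# Method calls bypass prototypes, so this works despite the ().
my $n = $bag->size_with_proto;
print "method call returned $n\n";

# A plain function call with an argument violates the () prototype.
# The string eval lets us catch the compile-time error instead of dying.
my $ok  = eval q{ size_with_proto($bag); 1 };
my $err = $@;
print "function call failed: $err" unless $ok;
```

Dropping the () from the declaration makes both call styles behave the same way.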


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: Naive Bayes Classifier Using Laplacian Smoothing
by talwyn (Monk) on Nov 02, 2011 at 21:02 UTC

    As I said, this is a rough see-how/if-it-works implementation. That said, I do appreciate the input. I was having lots of trouble with references, and making a 'real' hash was the only thing that was working for me.

    I'll see if I can get your revision to work for me. I did not notice any hit at N <= 200 words in the dictionary, so it's probably something I'll run into when I implement persistence and actually save the learned data to become the next round of training data.
