Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re: Naive Bayes Classifier Using Laplacian Smoothing

by BrowserUk (Pope)
on Nov 02, 2011 at 19:49 UTC ( #935483=note: print w/ replies, xml ) Need Help??


in reply to Naive Bayes Classifier Using Laplacian Smoothing

Let me know what you guys think.

The first thing that leapt of the page for me was:

sub size(){ # setSize() # Returns the number of unique elements in bag. my $self = shift; my %dictionary = %{$self->{DICTIONARY}}; # remove empty string key created by concat operation #delete $self->{DICTIONARY}{''}; return scalar keys (%dictionary); }

Copying an entire hash from a reference into a local hash just to return how many keys it has is insanely wasteful. Especially as you are calling this over and over again in a sort comparator.

Coded as:

sub size { # setSize() # Returns the number of unique elements in bag. return scalar keys %{ shift() }; }

Will speed up your sort by orders of magnitude.

Also, you should not be using -- and would be getting a messages telling you so if you were using strict & warnings -- an empty prototype sub size() { on a subroutine that takes parameters.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Comment on Re: Naive Bayes Classifier Using Laplacian Smoothing
Select or Download Code
Replies are listed 'Best First'.
Re^2: Naive Bayes Classifier Using Laplacian Smoothing
by talwyn (Monk) on Nov 02, 2011 at 21:02 UTC

    As I said this is a rough see-how /if it works implementation. That said. I do appreciate the input I was having lots of trouble with references. Making a 'real' hash was only thing that was working for me.

    I'll see if I can get your revision to work for me. I did not notice any hit at N < = 200 words in the dictionary so its probably something I'll run into when I implement persistence and actually save the learned data to become the next round of training data.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://935483]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (14)
As of 2015-07-28 09:27 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (254 votes), past polls