
Re: Naive Bayes Classifier Using Laplacian Smoothing

by BrowserUk (Pope)
on Nov 02, 2011 at 19:49 UTC

in reply to Naive Bayes Classifier Using Laplacian Smoothing

Let me know what you guys think.

The first thing that leapt off the page for me was:

    sub size(){
        # setSize()
        # Returns the number of unique elements in bag.
        my $self = shift;
        my %dictionary = %{$self->{DICTIONARY}};
        # remove empty string key created by concat operation
        #delete $self->{DICTIONARY}{''};
        return scalar keys (%dictionary);
    }

Copying an entire hash from a reference into a local hash just to return how many keys it has is insanely wasteful, especially as you are calling this over and over again in a sort comparator: every single comparison pays for a full copy of the dictionary.

Coded as:

    sub size {
        # setSize()
        # Returns the number of unique elements in bag.
        # Counts the keys through the reference; nothing is copied.
        return scalar keys %{ $_[0]{DICTIONARY} };
    }

That should speed up your sort considerably, potentially by orders of magnitude once the dictionary gets large.
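To make the difference concrete, here is a minimal sketch. The Bag package, its constructor, and the size_copy / size_ref method names are made up for illustration; only the DICTIONARY key comes from the code above. Both methods return the same count, but only the first duplicates the hash on every call.

```perl
use strict;
use warnings;

package Bag;    # hypothetical stand-in for the original class

sub new {
    my ($class, @words) = @_;
    my %dictionary;
    $dictionary{$_}++ for @words;    # word => occurrence count
    return bless { DICTIONARY => \%dictionary }, $class;
}

# Copies the whole dictionary into a new hash, then counts its keys.
sub size_copy {
    my $self = shift;
    my %dictionary = %{ $self->{DICTIONARY} };
    return scalar keys %dictionary;
}

# Counts the keys directly through the reference; no copy is made.
sub size_ref {
    my $self = shift;
    return scalar keys %{ $self->{DICTIONARY} };
}

package main;

my $bag = Bag->new(qw(spam ham spam eggs));
printf "copy: %d, ref: %d\n", $bag->size_copy, $bag->size_ref;
# prints "copy: 3, ref: 3"
```

If you want numbers for your own data, the core Benchmark module (cmpthese) can time the two methods against a dictionary of realistic size.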

Also, you should not be using -- and would be getting messages telling you so if you were using strict & warnings -- an empty prototype, sub size() {, on a subroutine that takes parameters.
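A minimal sketch of what the empty prototype does in practice. The Bag package and the size_proto name are hypothetical. As far as I know, Perl ignores prototypes on method calls, so the mismatch can go unnoticed there; a plain function call that passes an argument is rejected at compile time, which the string eval below is used to catch.

```perl
use strict;
use warnings;

package Bag;    # hypothetical stand-in for the original class

sub new {
    my ($class, %words) = @_;
    return bless { DICTIONARY => {%words} }, $class;
}

# The empty prototype declares that this sub takes no arguments,
# yet the body reads its first argument.
sub size_proto() {
    return scalar keys %{ $_[0]{DICTIONARY} };
}

package main;

my $bag = Bag->new(spam => 1, ham => 2);

# Method call: the prototype is not consulted, so this works.
my $n = $bag->size_proto;
print "method call: $n\n";

# Function call with an argument: compiled at run time inside a
# string eval so that the compile error can be captured in $@.
my $ok  = eval q{ Bag::size_proto($bag); 1 };
my $err = $@;
print "function call rejected: $err" unless $ok;
```

Dropping the parentheses from the declaration (sub size {) removes the contradiction either way.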

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Naive Bayes Classifier Using Laplacian Smoothing
by talwyn (Monk) on Nov 02, 2011 at 21:02 UTC

    As I said, this is a rough see-how/if-it-works implementation. That said, I do appreciate the input. I was having lots of trouble with references; making a 'real' hash was the only thing that was working for me.

    I'll see if I can get your revision to work for me. I did not notice any hit at N <= 200 words in the dictionary, so it's probably something I'll run into when I implement persistence and actually save the learned data to become the next round of training data.
