PerlMonks  

Re^3: arabic alphabet ... how to deal with?

by derby (Abbot)
on Feb 12, 2009 at 16:19 UTC (#743383)


in reply to Re^2: arabic alphabet ... how to deal with?
in thread arabic alphabet ... how to deal with?

What about the infile? It needs to be opened as UTF-8 too.
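As a rough illustration of opening the input as UTF-8, here is a minimal sketch. The helper name `read_utf8_lines` is hypothetical and not part of the original code; the sketch assumes the files really are UTF-8 encoded and relies only on Perl's standard `:encoding(UTF-8)` I/O layer.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line of a UTF-8 text file as decoded
# character strings rather than raw bytes.
sub read_utf8_lines {
    my ($path) = @_;
    open(my $fh, '<:encoding(UTF-8)', $path)
        or die "Error opening $path: $!\n";
    my @lines = <$fh>;
    close($fh);
    chomp(@lines);    # strip the trailing newline from each line
    return @lines;
}

# Encode output as UTF-8 too, so printing decoded Arabic text does not
# raise "Wide character in print" warnings.
binmode(STDOUT, ':encoding(UTF-8)');
```

Both files would then be read the same way, for example `my @stopwords = read_utf8_lines($ARGV[1]);`, so that the strings being compared are decoded consistently.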

-derby


Re^4: arabic alphabet ... how to deal with?
by Anonymous Monk on Feb 12, 2009 at 16:37 UTC
    It is UTF-8 as well. That does not change the results, and it also gives some errors regarding wide characters, even when I change my code to this:
    #!/usr/bin/perl
    use Lingua::AR::Word::Encode;
    use Encode::Arabic;

    open (STOPWORDS, $ARGV[1]) || die "Error opening the stopwords file\n";
    $count = 0;
    while ($word = <STOPWORDS>) {
        $word=Lingua::AR::Word::encode($word);
        chop($word);
        $stopword[$count] = lc($word);
        $count++;
    }
    close(STOPWORDS);

    open (INFILE , $ARGV[0]) || die "Error opening the input file\n";
    while ($line = <INFILE>) {
        $line=Lingua::AR::Word::encode($line);
        chop($line);
        @entry = split(/ /, $line);
        $i = 0;
        while ($entry[$i]) {
            $found = 0;
            $j = 0;
            while (($j<=$count) && ($found==0)) {
                if (lc($entry[$i]) eq $stopword[$j]) {
                    $found = 1;
                }
                $j++;
            }
            if ($found == 0) {
                print "$entry[$i]\n ";
            }
            $i++;
        }
    }
    close(INFILE);
    It still does not work; it does not remove my stop words :(
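    For comparison, the nested-loop lookup above could be written as a hash lookup. This is only a sketch under stated assumptions: one stop word per line, whitespace-separated tokens, and both inputs already decoded to character strings. The names `build_stopword_set` and `remove_stopwords` are hypothetical, and the sketch does not use Lingua::AR::Word. Arabic script has no letter case, so `lc` is left out.

```perl
use strict;
use warnings;

# Build a lookup table from a list of stop-word lines (already decoded).
sub build_stopword_set {
    my (@words) = @_;
    my %set;
    for my $w (@words) {
        $w =~ s/^\s+|\s+$//g;    # trim whitespace, including a stray \r
        next if $w eq '';        # skip blank lines
        $set{$w} = 1;
    }
    return \%set;
}

# Return the tokens of a line that are not in the stop-word set.
sub remove_stopwords {
    my ($set, $line) = @_;
    # split ' ' splits on runs of whitespace and ignores leading whitespace
    return grep { !exists $set->{$_} } split ' ', $line;
}
```

    A line would then be filtered with something like `print join(' ', remove_stopwords($set, $line)), "\n";`. Trimming each stop word matters because a leftover carriage return or space makes an exact string comparison fail even when the visible text matches.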
