Reading Files

by perlguru22 (Acolyte)
on Nov 02, 2012 at 05:22 UTC (#1001912=perlquestion)
perlguru22 has asked for the wisdom of the Perl Monks concerning the following question:

I have an assignment where we have to create a simple spell checker. We have one file called words.txt that contains about 4,000 words, and another file called text.txt that contains random words, some of which are misspelled. This is my program:
open INFILE, "words.txt" or die "can't open file $!";
while ($word = <INFILE>) {
    chomp($word);
    $dict{$word}=1;
}
while ($word = <>) {
    chomp($word);
    @words=split//,$word;
    if(!exists $dict{$word}) {
        print "$word is mispelled\n";
    }
}
The problem I am having is that my text.txt file has several words on each line, for example: more cat lose pat red persan. When I run the program it prints the words together, like "more cat lose is mispelled", instead of breaking them apart and checking each one. Thanks, any help or advice would be useful =)

Replies are listed 'Best First'.
Re: Reading Files
by Kenosis (Priest) on Nov 02, 2012 at 06:11 UTC

    You've received excellent suggestions on how to split on whitespace to get the words out of lines.

    The following is more than requested, but perhaps it will assist your scripting:

    • Always begin your scripts with use strict; use warnings;
    • Use lexically-scoped variables (my), including as file handles
    • Use the three-argument form of open
    • You can omit the parentheses when using chomp in your script (but not in all scripts)
    • The second chomp isn't necessary, since you'll be splitting on whitespace
    • You need to enclose your if(!exists $dict{$word}) ... within a for loop that iterates through the words in @words

    Given the above, consider the following (it's been run through perltidy):

    use strict;
    use warnings;

    my %dict;

    open my $infile, '<', 'words.txt' or die "can't open file $!";
    while ( my $word = <$infile> ) {
        chomp $word;
        $dict{$word} = 1;
    }
    close $infile;

    while ( my $line = <> ) {
        my @words = split /\s+/, $line;
        for my $word (@words) {
            if ( !exists $dict{$word} ) {
                print "$word is mispelled\n";
            }
        }
    }
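One caveat with split /\s+/ in the script above: if a line of text.txt begins with whitespace, the first field returned is an empty string, which would then be reported as misspelled. The sketch below compares it with the special pattern ' ', which skips leading whitespace. The sample line is made up for illustration and is not from the original thread.

```perl
use strict;
use warnings;

# Hypothetical input line that starts with two spaces.
my $line = "  more cat\n";

# /\s+/ yields an empty leading field when the line starts with whitespace.
my @with_regex = split /\s+/, $line;

# The special pattern ' ' skips leading whitespace before splitting.
my @with_space = split ' ', $line;

print scalar(@with_regex), "\n";    # 3 fields: '', 'more', 'cat'
print scalar(@with_space), "\n";    # 2 fields: 'more', 'cat'
```

If the input never has leading whitespace, both forms behave the same way here.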

    I hope this is helpful.

Re: Reading Files
by kielstirling (Scribe) on Nov 02, 2012 at 05:46 UTC
    Hi, add \s+ to your regex to split on whitespace
    perl -MData::Dumper -e 'print Dumper split /\s+/, shift' "do cat fr"

        Yes, like that if your input is split's default argument $_.
        Or in your case:
        my @words = split/\s+/,$word;
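To see why the empty pattern in the original script does not produce words, here is a small sketch comparing split // with split /\s+/ on part of the sample line from the question. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Part of the sample line from the question.
my $line = "more cat lose";

# An empty pattern splits between every character, spaces included.
my @chars = split //, $line;

# /\s+/ splits on runs of whitespace, giving whole words.
my @words = split /\s+/, $line;

print scalar(@chars), "\n";         # 13 single characters
print join( '|', @words ), "\n";    # more|cat|lose
```

Note that the original script also looked up $word (the whole line) rather than the elements of @words, so the split result was never used.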

        If you tell me, I'll forget.
        If you show me, I'll remember.
        If you involve me, I'll understand.
        --- Author unknown to me
Dictionary load improvement
by space_monk (Chaplain) on Nov 02, 2012 at 17:36 UTC
    On a slight tangent, a faster way of reading your input may be to read the entire "dictionary" in one go, and then map the entries into the %dict hash....
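The post's own code is not shown here, so the following is only a guess at what such an approach could look like: read all lines in list context, chomp them together, and build the hash with map. An in-memory string stands in for words.txt so the sketch runs on its own. Its contents are made up, and whether this is actually faster than the line-by-line loop would need measuring.

```perl
use strict;
use warnings;

# Stand-in for words.txt (hypothetical contents) so the sketch is runnable.
my $sample = "cat\nmore\nred\n";
open my $infile, '<', \$sample or die "can't open in-memory file: $!";

# Read every line at once in list context, then strip the newlines.
chomp( my @entries = <$infile> );
close $infile;

# Map each word to a true value to build the lookup hash.
my %dict = map { $_ => 1 } @entries;

for my $word (qw(cat persan)) {
    print "$word: ", ( exists $dict{$word} ? "found" : "missing" ), "\n";
}
```

With a real file, the open line would be replaced by something like open my $infile, '<', 'words.txt', as in the earlier reply.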

Node Type: perlquestion [id://1001912]
Approved by Ratazong