PerlMonks  

putting text into array word by word

by jms53 (Monk)
on Jan 09, 2012 at 18:14 UTC (#947041=perlquestion)
jms53 has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to take an input file and separate it word by word into an array.

My current code:

open FILE, "+>", "input.txt" or die $!;
print "file loaded \n";
while (<FILE>) {
    @all_words = split('', $_);
    push @all_words, @words;
}

Thank you

Re: putting text into array word by word
by toolic (Chancellor) on Jan 09, 2012 at 18:22 UTC
Re: putting text into array word by word
by roboticus (Canon) on Jan 09, 2012 at 18:22 UTC

    jms53:

    Try:

    # Use read mode instead of write and append mode
    open FILE, "<", "input.txt" or die $!;
    print "file loaded \n";
    my @all_words;
    while (<FILE>) {
        # the words from *this* line
        my @words = split('', $_);
        # add to the complete word list
        push @all_words, @words;
    }

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Here is my full code:

      #! /usr/bin/perl -w
      use strict;

      my $i = 0;
      my $element;
      my @words;
      my @all_words;
      open FILE, "<", "input.txt" or die $!;
      print "file loaded \n";
      while (<FILE>) {
          # the words from *this* line
          @words = split('', $_);
          push @all_words, @words;
      }
      print "table loaded \n";
      foreach $element (@words) {
          print "$element";
      }

      The "file loaded" and "table loaded" messages appear; however, the elements are not printed afterwards.

        jms53:

        Try printing from @all_words instead of @words.

        Update: s/allwords/all_words/, added code tags.

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

        You also probably want split(' ', $_); (to split on whitespace), not split('', $_);.  Or simply split; (which has the same effect).

        BTW, your </code > tag doesn't work, because you have a space in between the angle bracket and the tag name...
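To make the split('') versus split(' ') point above concrete, here is a minimal sketch. The sample line is made up for illustration; nothing here comes from the original input.txt.

```perl
use strict;
use warnings;

# Hypothetical sample line (note the double space before "brown").
my $line = "the quick  brown fox\n";

# An empty pattern splits between every character.
my @chars = split('', $line);

# A single-space string is special-cased: split on runs of
# whitespace and skip any leading whitespace.
my @words = split(' ', $line);

# With no arguments, split does the same thing to $_.
my @also_words;
for ($line) {
    @also_words = split;
}

print scalar(@chars), " single characters\n";
print scalar(@words), " words: @words\n";
```

With split('') every element is one character, which is why the original loop never produced whole words.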

Re: putting text into array word by word
by Marshall (Prior) on Jan 09, 2012 at 18:27 UTC
    In the "open", I would not use the file mode '+>' if you only intend to read the file.

    #!/usr/bin/perl -w
    use strict;

    open FILE, '<', "input.txt" or die "unable to open input.txt";
    # the $! variable adds the O/S specific info, but it is
    # not always that useful.
    my @all_words;
    while (<FILE>) {
        my @these_words = split(' ', $_);
        foreach my $this_word (@these_words) {
            push @all_words, $this_word;
        }
    }

    # to count the words:
    use Data::Dumper;
    my %words;
    while (<FILE>) {
        my @these_words = split(' ', $_);
        foreach my $this_word (@these_words) {
            $words{$this_word}++;
        }
    }
    print Dumper \%words;

      No foreach loop required. You can push several values at once: push @all_words, @these_words;
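A small sketch of that point, with made-up array contents: push takes a list, so a whole array can be appended in one call.

```perl
use strict;
use warnings;

# Illustrative data only.
my @all_words   = ('one');
my @these_words = ('two', 'three');

# push flattens its list argument, so every element of
# @these_words is appended in a single call.
push @all_words, @these_words;

print "@all_words\n";    # prints "one two three"
```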

      I have added  print "$this_word";

      and

       print $this_word;

      in the foreach loop, yet nothing is printed

        Your file "open mode" is wrong, use '<' for read-only.
        #!/usr/bin/perl -w
        use strict;

        my @words;
        my @all_words;
        open FILE, "<", "input.txt" or die "unable to open input.txt $!";
        print "file loaded \n";
        while (<FILE>) {
            @words = split(' ', $_);
            push @all_words, @words;
        }
        foreach my $word (@all_words) {
            print "$word\n";
        }
        __END__
        # if you want to count the words
        # then that is different - use a hash
        # table of "word => count"

        # split(/\s+/, $_) differs from the special case split(' ', $_)
        # (which is also what a bare split does) only
        # in how it handles a "null" field at the beginning of the line
        my %words;
        while (<FILE>) {
            my @words = split;
            foreach my $word (@words) {
                $words{$word}++;
            }
        }
        === or ===
        my %words;
        while (<FILE>) {
            foreach my $word (split) {
                $words{$word}++;
            }
        }
        === or ===
        my %words;
        while (<FILE>) {
            $words{$_}++ foreach (split);
        }
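The remark about the "null" field at the beginning of the line can be seen with a short sketch. The sample line is an assumption chosen to start with whitespace.

```perl
use strict;
use warnings;

# Hypothetical line that starts with whitespace.
my $line = "  leading spaces here\n";

# Special case: leading whitespace is skipped.
my @special = split(' ', $line);

# Plain regex: the leading whitespace produces an empty first field.
my @regex = split(/\s+/, $line);

print scalar(@special), " fields from split(' ')\n";
print scalar(@regex),   " fields from split(/\\s+/)\n";
print "first regex field is empty\n" if $regex[0] eq '';
```

For word lists the special case is usually the more convenient one, since no empty element has to be filtered out afterwards.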
Re: putting text into array word by word
by ansh batra (Friar) on Jan 09, 2012 at 18:29 UTC
    Firstly, your file open mode is wrong: use < instead of +>.
    Secondly, the split function's first parameter should be \s.
    Use this code:
    #! /usr/bin/perl
    open(FILE,"<input.txt");
    print "file loaded \n";
    my @lines=<FILE>;
    close(FILE);
    my @all_words;
    foreach my $line(@lines) {
        @temp_arr=split('\s',$line);
        push(@all_words,@temp_arr);
    }
    print "@all_words\n";
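One caveat with splitting on '\s', sketched with a made-up line: the pattern matches a single whitespace character, so two spaces in a row leave an empty element in the result, whereas split(' ') treats a run of whitespace as one separator.

```perl
use strict;
use warnings;

# Hypothetical line with a double space and a tab.
my $line = "one  two\tthree\n";

# '\s' is used as a regex matching ONE whitespace character,
# so the double space yields an empty string between "one" and "two".
my @single = split('\s', $line);

# The special-case ' ' pattern collapses runs of whitespace.
my @runs = split(' ', $line);

print scalar(@single), " fields from split('\\s')\n";
print scalar(@runs),   " fields from split(' ')\n";
```

If the '\s' form is kept, the empty elements can be dropped afterwards with something like grep { length } @single.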

      I have tried your code, and although I get no syntax errors, it does not print the array.

      I am using the following code:

      #! /usr/bin/perl -w
      use strict;

      open(FILE,"<input.txt");
      print "file loaded \n";
      my @lines=<FILE>;
      my @temp_arr;
      close(FILE);
      my @all_words;
      foreach my $line(@lines) {
          @temp_arr=split('\s',$line);
          push(@all_words,@temp_arr);
      }
      print "@all_words\n";
        Check the file you are reading your input from.
        Since you opened it with the +> mode earlier, Perl has deleted its contents.
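The truncation described above is easy to reproduce on a scratch file. This is only a sketch: it uses File::Temp so that no real input.txt is touched, and the file contents are invented.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file with some text in it.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "some words here\n";
close $tmp_fh or die "close failed: $!";

my $size_before = (stat $path)[7];

# "+>" opens for reading and writing, but truncates the file first.
open my $fh, '+>', $path or die "cannot open $path: $!";
close $fh;

my $size_after = (stat $path)[7];

print "before: $size_before bytes, after: $size_after bytes\n";
```

After the '+>' open the size is zero, which is why the later runs with '<' found nothing to print until the input file was refilled.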
Re: putting text into array word by word
by tobyink (Abbot) on Jan 09, 2012 at 21:23 UTC

    Easy...

    use strict;
    use autodie;
    open my $infile, '<', '/home/tai/tmp/sm-error-report.txt';
    my @words = split /\W+/, do { local $/ = <$infile> };
    print "$_\n" foreach @words;
      Thanks for reminding me of the importance of commenting code.

      (I totally had no difficulty with your regexp)

      But I agree that not only is there more than one way to do it, there are also some ways which require more experience ;)

        The only thing I used that could be accused of being slightly obscure is:

        do { local $/ = <$infile> }

        This is a fairly commonly used idiom for reading an entire file into a single string, but exactly how it works is somewhat obscure.

        Firstly, when you call <$filehandle>, Perl reads a single line from the file $filehandle. See perldoc -f readline

        Secondly, the variable $/ is used by Perl's file reading function to indicate what character to use as a line terminator (technically it's called the record separator). So normally, $/ is set to "\n". If you set $/ to undef, then Perl won't treat any characters as line terminators, so the readline function will simply read the entire remainder of the file. See perldoc -f readline and perldoc perlvar

        So, just based on the above knowledge, we can slurp the entire contents of a filehandle into a string like this:

        $/ = undef;
        my $string = <$filehandle>;

        But actually, what if other parts of our code rely on $/ being set to "\n"? We don't want to permanently undefine it.

        my $old_terminator = $/;
        $/ = undef;
        my $string = <$filehandle>;
        $/ = $old_terminator;

        Because temporarily changing the value of a variable is such a common need, Perl provides a shortcut. The local keyword allows you to set a new temporary value for a variable for a particular code block, such that the variable's old value will be automatically restored at the end of the block. See perldoc -f local. So our code becomes:

        my $string;
        {
            local $/ = undef;
            $string = <$filehandle>;
        }

        But = undef is redundant, because local on its own already leaves the localized variable undefined (just as a newly declared scalar is undefined). So now we have:

        my $string;
        {
            local $/;
            $string = <$filehandle>;
        }
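As a quick check that local really does restore $/ at the end of the block, here is a sketch that reads from an in-memory filehandle. The in-memory string stands in for a real file and is purely an assumption for the demo.

```perl
use strict;
use warnings;

# Two-line sample text, opened as an in-memory file.
my $text = "line one\nline two\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my $string;
{
    local $/;            # $/ is undef inside this block
    $string = <$fh>;     # so one read returns both lines
}
close $fh;

# Outside the block the record separator is back to its default.
print "separator restored\n" if defined $/ && $/ eq "\n";
print "read everything\n"    if $string eq $text;
```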

        Now, the do block allows Perl to run a block of code and return the result of the last statement. See perlsyn. So our code can become:

        my $string = do { local $/; <$filehandle> };

        The last simplification relies on the fact that in the following statement:

        local $/ = <$filehandle>

        Perl does things in this order:

        1. Localizes $/, setting it to undef.
        2. Reads the file - the entire file because $/ is undef.
        3. Performs the assignment.

        Thus we end up with the situation where you can read the entire contents of an open file handle into a string using:

        my $string = do { local $/ = <$infile> };

        Now, of course I could have included the entire explanation above as a comment, but I try to stick to a policy of never writing comments which are longer than the code itself.
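Putting the pieces together, a self-contained sketch of "slurp, then split into words" might look like the following. It uses the slightly longer do { local $/; <$fh> } form and an in-memory filehandle with invented text, so it runs without any particular file on disk.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for the real input file.
my $text = "The quick brown fox,\njumps over the lazy dog.\n";

# Open a read-only handle on the in-memory string.
open my $infile, '<', \$text or die "cannot open in-memory file: $!";

# Slurp: with $/ localized (and therefore undef), one read returns everything.
my $contents = do { local $/; <$infile> };
close $infile;

# Split on runs of non-word characters, as in the reply above.
my @words = split /\W+/, $contents;

print "$_\n" foreach @words;
```

Splitting on /\W+/ also strips punctuation, which split(' ') would leave attached to the words; which of the two is preferable depends on what counts as a "word" for the task.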
