PerlMonks  

Processing words in a file.

by brtch (Initiate)
on Feb 12, 2013 at 05:09 UTC ( #1018295=perlquestion )
brtch has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I have a file with lines in this format: xxx yy yy:yy:yy xxx/xxxxx xxx xxxx xxx xxx where x is a letter and y is a digit. CATCH: The only things that are standard across the file are the spaces and the / character; i.e., the first word can be 3, 4, or 5 characters long. Can you help me find Perl code that takes each of the strings in a line and compares it with strings I have, e.g., checks whether the first string in the line == Feb?

Re: Processing words in a file.
by vinoth.ree (Parson) on Feb 12, 2013 at 05:39 UTC

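    I guess you need the first word of each line from the file:

    use strict;
    use warnings;

    open my $fh, '<', 'filename.txt' or die "Cannot open file: $!";
    while (my $line = <$fh>) {
        # Capture the leading run of word characters, if there is one.
        my ($first_word) = $line =~ /^(\w+)/;
        next unless defined $first_word;
        if ($first_word eq 'your string') {
            # Your wish.
        }
    }
    close $fh;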
Re: Processing words in a file.
by frozenwithjoy (Curate) on Feb 12, 2013 at 07:10 UTC
    Are you interested in checking all strings or just the first string? Could you include a few lines from an actual file and indicate your desired output? Thanks!
      Interested in all strings....
        Also, I need a condition like this: if $first_word == xyz, followed by if $second_word == abc. As the lines all have the same format, I want the loop to run on a per-line basis.
Re: Processing words in a file.
by kcott (Abbot) on Feb 12, 2013 at 07:24 UTC

    G'day brtch,

    Welcome to the monastery.

    Providing a little more context to your question would have been preferable. I'm assuming the first three fields are: month day hours:minutes:seconds. I'll leave you to extrapolate from there.

    Perl has different operators for string and numerical comparisons. '==' is the numerical equality operator; 'eq' is for strings. See perlop for all the different operators; perlop - Equality Operators specifically discusses '==' and 'eq'.
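
    A minimal sketch of the difference, for illustration only. The helper name is_word and the sample values 'Feb' and 'Mar' are assumptions, not part of the original question:

```perl
use strict;
use warnings;

# Hypothetical helper: true when $word is exactly the string $wanted.
sub is_word {
    my ($word, $wanted) = @_;
    return defined $word && $word eq $wanted;   # 'eq' compares as strings
}

my $first_word = 'Feb';
print "string match\n" if is_word($first_word, 'Feb');

# '==' compares as numbers. Non-numeric strings such as 'Feb' and 'Mar'
# both numify to 0, so '==' would wrongly treat them as equal.
{
    no warnings 'numeric';   # silenced here only to demonstrate the pitfall
    my ($this, $that) = ('Feb', 'Mar');
    print "numeric '==' treats Feb and Mar as equal\n" if $this == $that;
}
```

    With warnings enabled (and not silenced as above), Perl reports "Argument ... isn't numeric" for such a comparison, which is usually the first hint that 'eq' was intended.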

    How you go about breaking up your line for comparison will depend on how much detail you want (e.g. do you want to look at 'yy:yy:yy' as a whole or are you interested in the subfields). I see two main options you might pursue: using the split function or using a regular expression.

    Using split can be as simple as:

    $ perl -Mstrict -Mwarnings -E '
        my $line = q{xxx yy yy:yy:yy xxx/xxxxx xxx xxxx xxx xxx};
        my @fields = split / / => $line;
        say $fields[2];
    '
    yy:yy:yy

    The problem with this level of simplicity is when further down your code you hit $fields[7] and have to backtrack to determine which field index 7 refers to. Ways around this include giving symbolic names to the indices or capturing each field into a meaningfully named variable:

    $ perl -Mstrict -Mwarnings -E '
        use constant { MONTH => 0, DAY => 1, TIME => 2, };
        my $line = q{xxx yy yy:yy:yy xxx/xxxxx xxx xxxx xxx xxx};
        my @fields = split / / => $line;
        say $fields[TIME];
    '
    yy:yy:yy

    $ perl -Mstrict -Mwarnings -E '
        my $line = q{xxx yy yy:yy:yy xxx/xxxxx xxx xxxx xxx xxx};
        # A limit of 4 keeps everything after the time together in $rest.
        my ($month, $day, $time, $rest) = split / /, $line, 4;
        say $time;
    '
    yy:yy:yy
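
    Putting split together with the per-line loop asked for earlier in the thread might look like the following sketch. The helper line_matches, the sample lines, and the values 'Feb' and '12' are all made up for illustration; an in-memory filehandle stands in for the real file:

```perl
use strict;
use warnings;

# Hypothetical helper: split one line on single spaces and compare the
# first two fields, as strings, against the wanted values.
sub line_matches {
    my ($line, $want_first, $want_second) = @_;
    chomp $line;
    my ($first, $second) = split / /, $line;
    return 0 unless defined $first && defined $second;
    return ($first eq $want_first && $second eq $want_second) ? 1 : 0;
}

# Made-up sample data standing in for the real file.
my $data = join '', map { "$_\n" } (
    'Feb 12 05:09:00 abc/defgh one two three four',
    'March 1 12:34:56 abc/defgh one two three four',
);

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
while (my $line = <$fh>) {
    # Each iteration handles exactly one line of input.
    print $line if line_matches($line, 'Feb', '12');
}
close $fh;
```

    To read a real file instead, replace \$data with the file name in the open call.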

    If you want to get at the subfields, then a regular expression solution might be better:

    $ perl -Mstrict -Mwarnings -E '
        my $line = q{xxx 1 12:34:56 xxx/xxxxx xxx xxxx xxx xxx};
        my $line_re = qr{^(\w+) (\d+) (\d+):(\d+):(\d+) (.*)};
        my ($month, $day, $hour, $min, $sec, $rest) = $line =~ m{$line_re};
        say $hour;
    '
    12

    All of those parts in parentheses are called Capture Groups. The link I've provided discusses these (as well as Named Capture Groups which I'll leave you to research if you're interested).
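
    As a starting point for Named Capture Groups, here is a small sketch. It assumes Perl 5.10 or later and the month/day/time layout guessed at above; the helper parse_line and the sample line are hypothetical:

```perl
use 5.010;
use strict;
use warnings;

# Assumed layout: month day hh:mm:ss rest-of-line.
my $line_re = qr{^(?<month>\w+) (?<day>\d+) (?<hour>\d+):(?<min>\d+):(?<sec>\d+) (?<rest>.*)};

# Hypothetical helper: returns a hash ref of the named fields,
# or nothing (undef in scalar context) when the line does not match.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ $line_re;
    my %fields = %+;    # copy %+ before another match overwrites it
    return \%fields;
}

my $fields = parse_line('Feb 12 05:09:00 abc/defgh one two three four');
print "$fields->{month} $fields->{hour}\n" if $fields;
```

    The names replace positional variables, so a later comparison can read $fields->{month} eq 'Feb' instead of relying on the order of the captures.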

    -- Ken

      Thanks, Monks. The issue got resolved.
