
Section of $string into new array

by GeorgMN (Acolyte)
on Mar 17, 2014 at 17:18 UTC (#1078633=perlquestion)
GeorgMN has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks, I need to seek your help again. What I am trying to accomplish is to read through a text file and grab values from strings (values from a certain, non-fixed line position in a file, at a non-fixed index position). What I would like is to end up with a new array containing the values I am trying to match. I seem to be unable to break the strings into pieces in either @ or $ format. Can you please help me? Thank you in advance!

#!/usr/bin/perl -w
use warnings;
use strict;
#use diagnostics;
use Data::Dumper qw (Dumper);

my $inputfile = $ARGV[0];
my @filedata;
my $pattern = /^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\/[0-9]?[0-9]$/;
my @temps;
my $counter = 0;

open (INPUT, "<$inputfile") or die "no file for proccessing provided or existing";
while (<INPUT>) {
    push (@filedata, $_);
}
foreach (@filedata) {
    if ($_ =~ /$pattern/) {
        $counter++;
        #my $temp = (split ' ', $_);
        push (@temps, (split /\s/,$_));
    }
}
close INPUT;
#print "@temps\n";
#
foreach (@temps) {
    if ($_ =~ /$pattern/) {
        print "$_\n";
    }
}
print "Found $counter matches!\n";

Replies are listed 'Best First'.
Re: Section of $string into new array
by Eily (Parson) on Mar 17, 2014 at 17:56 UTC

    I won't give you the whole solution, just tell you what went wrong (at least what I saw) :

    Reading through your file

    while (<INPUT>) { push @array, $_; }
    This does the same thing as : @array = <INPUT>, but since you are processing your file line by line, you don't have to use an array. So in your case you can do:
    while (<INPUT>) {
        if (/$pattern/) {    # matches work on $_ by default
            # do something with $_
        }
    }
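A minimal sketch of that line-by-line loop, for illustration only. The sample text, the in-memory filehandle standing in for the real input file, and the simplified placeholder pattern are all assumptions, not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical sample data; an in-memory filehandle stands in for the real file.
my $text = "10.0.0.0/8\nnot an address\n192.168.1.0/24\n";

# Simplified placeholder pattern (does not range-check the octets).
my $pattern = qr/^\d{1,3}(?:\.\d{1,3}){3}\/\d{1,2}$/;

open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my @matches;
while (my $line = <$fh>) {
    chomp $line;                                  # drop the trailing newline
    push @matches, $line if $line =~ $pattern;    # keep only matching lines
}
close $fh;

print "$_\n" for @matches;
print "Found ", scalar(@matches), " matches!\n";
```

No intermediate array of all lines is needed: each line is tested as it is read, and only the matching ones are kept.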

    Your $pattern variable

    Your $pattern variable is the empty string, because $pattern = /regex/ means "match /regex/ against the default variable $_, then put the result in $pattern". Since $_ is still undefined at that point, the match fails and the result is false (the empty string). This will work if you just make $pattern a string (or a qr// expression in more advanced Perl).
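The difference can be seen in a small sketch. The sample strings and the simplified pattern are assumptions chosen for illustration; $_ is given a value here only so the demonstration runs without an "uninitialized" warning.

```perl
use strict;
use warnings;

my $line = "10.1.2.3/24";

# Assigning a bare /.../ runs the match against $_ immediately
# and stores the result of the match, not the pattern.
my $result;
{
    local $_ = "no address here";
    $result = /^\d+\.\d+\.\d+\.\d+\/\d+$/;    # failed match: $result is ''
}

# qr// compiles the regex and stores the pattern itself.
my $pattern = qr/^\d+\.\d+\.\d+\.\d+\/\d+$/;

print "bare match stored: '$result'\n";
print "qr// pattern matches $line\n" if $line =~ $pattern;
```

An empty $pattern is also why the original script appeared to match everything: interpolating an empty string into /$pattern/ does not give a regex that rejects lines.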

    While programming, if you need to paste the same thing four times in a row, you are doing it wrong. In a regexp, you can repeat a part with the {n} or {n,m} quantifier syntax. Something like /([0-9][0-5]\.){3}/ # three times a two-digit number followed by a dot. Or you can interpolate an inner pattern into the bigger one with something like:

    $inner = '([0-5][0-9]|[0-9][1-8])';
    $outer = "$inner\\.$inner\\.$inner/[0-6]";    # \\. gives a literal dot in the regex
    # Can also be written:
    # $outer = join('\.', ($inner) x 3) . '/[0-6]';
    This is not the best solution, but it's probably both simple and effective enough for your needs. It still means writing the same thing several times (unless you use the alternative syntax), but it's far easier to read, and if a change is required you only have to make it in one place.
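As a hedged sketch of the same idea using qr// and a {3} quantifier: the octet alternation below is the one from the question, written once. The variable names and the test strings are assumptions made for illustration.

```perl
use strict;
use warnings;

# One octet, 0-255, written once (same alternation as in the question).
my $octet = qr/(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/;

# Four octets separated by literal dots, then a /prefix of one or two digits.
my $cidr = qr/^$octet(?:\.$octet){3}\/[0-9]{1,2}$/;

for my $candidate ('192.168.1.0/24', '256.1.1.1/8', '10.0.0/8') {
    printf "%-16s %s\n", $candidate,
        $candidate =~ $cidr ? 'matches' : 'no match';
}
```

Only the first candidate matches: the second has an octet above 255 and the third has only three octets.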

    Getting the parts (here : numbers) of your string

    Then, to get the content of your match, you can either use captures (read perlrequick) or keep the split. But here you are splitting on whitespace (that's what \s means in a regex), while your pattern doesn't allow whitespace, which means the lines you want to split can't contain any. You can split on non-digit characters instead with: split /\D/, $_;
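Both options side by side, as a small sketch. The sample line is an assumption, and the capture pattern is deliberately simpler than the one in the question.

```perl
use strict;
use warnings;

my $line = '192.168.1.0/24';

# Option 1: in list context, a match returns the contents of its capture groups.
my @captured = $line =~ m{^(\d+)\.(\d+)\.(\d+)\.(\d+)/(\d+)$};

# Option 2: split on any single non-digit character (here the dots and the slash).
my @pieces = split /\D/, $line;

print "captured: @captured\n";
print "split:    @pieces\n";
```

For this input both arrays hold the same five numbers. Captures also validate the shape of the line, while split only cuts it up.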


    Using Data::Dumper is a good thing, but here you're not doing anything with it. Do print the contents of your variables when in doubt; that is how you'll be able to solve your problem by yourself, faster than you would by waiting for people to read your code.
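For example, a minimal sketch of dumping the result of the split from the question. The input line is a made-up assumption with a doubled space in it.

```perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);

# Hypothetical input line; note the two spaces after the address.
my $line  = "10.0.0.0/8  via 10.0.0.1";
my @temps = split /\s/, $line;

# Dumper shows every element, including ones that print as nothing.
print Dumper(\@temps);
```

The dump makes the empty element produced by the doubled space visible, which a plain print "@temps\n" would hide.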

    And one last piece of advice: you can use \d instead of [0-9].

Re: Section of $string into new array
by AnomalousMonk (Chancellor) on Mar 17, 2014 at 17:58 UTC

    This may be something like what you want. It's usually best for maintainability and comprehensibility to build up complex regexes from simpler components. And then there's also Regexp::Common (see esp. Regexp::Common::net).

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $octet = qr{ 25 [0-5] | 2 [0-4] \d | [01]? \d [0-9]? }xms;
     my $ip    = qr{ (?<! \d) $octet (?: \. $octet){3} (?! \d) }xms;
     my $addr  = qr{ $ip / \d{1,2} (?! \d) }xms;
     ;;
     my $s = 'x y z ' . '9999.9.9.999/1 z' ;
     ;;
     my @addrs = $s =~ m{ $addr }xmsg;
     printf qq{'$_' } for @addrs;
     print '';
     ;;
     ;;
     use Regexp::Common qw(net);
     ;;
     my @common = $s =~ m{ (?<! \d) $RE{net}{IPv4} / \d{1,2} (?! \d) }xmsg;
     printf qq{'$_' } for @common;
    "
    '' '' '' '' '' ''

    Update: Added Regexp::Common::net example.
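A core-Perl sketch of the same build-up-from-components approach, written as a script rather than a one-liner. The input string is a hypothetical sample (the test data in the post above appears incomplete), and the sketch leaves out Regexp::Common so that it has no non-core dependencies.

```perl
use strict;
use warnings;

# Components, from smallest to largest. /x allows the whitespace for readability.
my $octet = qr{ 25[0-5] | 2[0-4]\d | [01]?\d\d? }x;
my $addr  = qr{ (?<!\d) $octet (?: \. $octet ){3} / \d{1,2} (?!\d) }x;

# Hypothetical input: addresses at arbitrary positions, plus one invalid candidate.
my $s = 'route 10.0.0.0/8 via eth0, also 192.168.1.0/24 and 999.1.1.1/8';

# With /g in list context, the match returns every non-overlapping hit.
my @addrs = $s =~ m{ $addr }xg;

print "'$_'\n" for @addrs;
```

The lookarounds keep a match from starting or ending in the middle of a longer run of digits, so 999.1.1.1/8 is rejected rather than partially matched.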

Re: Section of $string into new array
by Lennotoecom (Pilgrim) on Mar 17, 2014 at 17:59 UTC
    Could you provide a sample of the original data? A couple of lines would be fine (replace any sensitive information with something similar but meaningless). And what exactly do you want to extract from it?
Re: Section of $string into new array
by GeorgMN (Acolyte) on Mar 18, 2014 at 12:08 UTC
    Thank you all for your help. I will try this out. G

Node Type: perlquestion [id://1078633]
Front-paged by Arunbear