Parsing data in xlsx

Rahul Gupta has asked for the wisdom of the Perl Monks concerning the following question:


Replies are listed 'Best First'.
Re: regex optional word match by moritz (Cardinal) on Jul 26, 2012 at 09:43 UTC
What data do you need to extract? What have you tried, and what problems did you have?
Perl 6 - the future is here, just unevenly distributed
Re^2: regex optional word match by Rahul Gupta (Sexton) on Jul 26, 2012 at 10:03 UTC
I have tried this regular expression:

`$_ =~ m/^REMOTE\s+\[(.*?)\]\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+\[(.*?)\]\s+(.*?)sec\s+(.*)MBytes\s+(.*)Mbits\/sec/`

It gives data from both strings. But String1:

`"REMOTE Mon Jul 16 21:49:33 2012 @@ ueh1 TNT 20490 1916 0.0- 1.0 sec 0.33 MBytes 2.74 Mbits/sec 6.056 ms 0/ 233 (0%)";`

has two extra values at the end, `6.056 ms` and `0/ 233 (0%)`, and I have to make them optional. Please help me with this problem. Thanks.
Re^3: regex optional word match by prashantktyagi (Scribe) on Jul 26, 2012 at 10:18 UTC
Can you be clearer about the problem? And please format the query; follow the Writeup Formatting Tips.
Re^4: regex optional word match by Rahul Gupta (Sexton) on Jul 26, 2012 at 10:54 UTC
Re^5: regex optional word match by marto (Cardinal) on Jul 26, 2012 at 10:59 UTC
Some notes below your chosen depth have not been shown here
Re: regex optional word match by kcott (Archbishop) on Jul 26, 2012 at 12:53 UTC
I wasn't entirely sure which of the square brackets you wanted to capture. Just swap instances of `( \[ [^]]+ \] ) \s+` with `\[ ( [^]]+ ) \] \s+` (and vice versa) to suit.

```perl
#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

my $s1 = "REMOTE [Mon Jul 16 21:49:33 2012] @@ [ueh1] [TNT] [20490] [1916] 0.0- 1.0 sec 0.33 MBytes 2.74 Mbits/sec 6.056 ms 0/ 233 (0%)";
my $s2 = "REMOTE [Mon Jul 16 21:49:34 2012] @@ [pdn1] [SSH] [20499] [3] 1.0- 2.0 sec 0.34 MBytes 2.86 Mbits/sec";

my @strings = ($s1, $s2);

my $re = qr{
    \A
    REMOTE \s+
    \[ ( [^]]+ ) \] \s+
    ( @@ ) \s+
    ( \[ [^]]+ \] ) \s+
    ( \[ [^]]+ \] ) \s+
    ( \[ [^]]+ \] ) \s+
    \[ ( [^]]+ ) \] \s+
    ( .+? ) \s+ sec \s+
    ( .+? ) \s+ MBytes \s+
    ( .+? ) \s+ Mbits/sec
    (?>
        \s+ ( .+? ) \s+ ms \s+
        ( .* )
    |
    )
    \z
}x;

for (@strings) {
    say for ('-' x 60, $_, '-' x 60);
    m{$re};
    say for grep { defined } ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11);
}
```

Output:

```
$ pm_str_remote_re.pl
------------------------------------------------------------
REMOTE [Mon Jul 16 21:49:33 2012] @@ [ueh1] [TNT] [20490] [1916] 0.0- 1.0 sec 0.33 MBytes 2.74 Mbits/sec 6.056 ms 0/ 233 (0%)
------------------------------------------------------------
Mon Jul 16 21:49:33 2012
@@
[ueh1]
[TNT]
[20490]
1916
0.0- 1.0
0.33
2.74
6.056
0/ 233 (0%)
------------------------------------------------------------
REMOTE [Mon Jul 16 21:49:34 2012] @@ [pdn1] [SSH] [20499] [3] 1.0- 2.0 sec 0.34 MBytes 2.86 Mbits/sec
------------------------------------------------------------
Mon Jul 16 21:49:34 2012
@@
[pdn1]
[SSH]
[20499]
3
1.0- 2.0
0.34
2.86
```

-- Ken
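A possible variation on the pattern above, offered as a sketch rather than as Ken's method: the optional tail can also be written as a non-capturing group followed by `?`, and named captures (Perl 5.10 or later) avoid counting `$1` through `$11`. The helper name `parse_remote` and the capture names (`date`, `node`, `type`, `num1`, `num2`, `interval`, `mbytes`, `mbits`, `ms`, `rest`) are placeholders invented for illustration; the thread does not say what the fields mean. The sketch assumes the bracketed input format used in the post above.

```perl
#!/usr/bin/env perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper (not from the thread): parse one bracketed REMOTE line.
# Returns a hash reference of the named captures, or undef if nothing matched.
sub parse_remote {
    my ($line) = @_;
    return unless $line =~ m{
        \A REMOTE \s+
        \[ (?<date> [^]]+ ) \] \s+ \@\@ \s+
        \[ (?<node> [^]]+ ) \] \s+
        \[ (?<type> [^]]+ ) \] \s+
        \[ (?<num1> [^]]+ ) \] \s+
        \[ (?<num2> [^]]+ ) \] \s+
        (?<interval> .+? ) \s+ sec \s+
        (?<mbytes>   \S+ ) \s+ MBytes \s+
        (?<mbits>    \S+ ) \s+ Mbits/sec
        (?:                          # optional tail: "<n> ms <rest>"
            \s+ (?<ms>   \S+ ) \s+ ms
            \s+ (?<rest> .+  )
        )?
        \s* \z
    }x;
    my %rec = %+;    # %+ holds only the named groups that actually captured
    return \%rec;
}

# Single-quoted sample lines, copied from the post above.
my @lines = (
    'REMOTE [Mon Jul 16 21:49:33 2012] @@ [ueh1] [TNT] [20490] [1916] 0.0- 1.0 sec 0.33 MBytes 2.74 Mbits/sec 6.056 ms 0/ 233 (0%)',
    'REMOTE [Mon Jul 16 21:49:34 2012] @@ [pdn1] [SSH] [20499] [3] 1.0- 2.0 sec 0.34 MBytes 2.86 Mbits/sec',
);

for my $line (@lines) {
    my $rec = parse_remote($line) or next;
    print join(', ', map { "$_=$rec->{$_}" } sort keys %$rec), "\n";
}
```

Because `%+` lists only the groups that took part in the match, a caller can test `exists $rec->{ms}` to tell the long form of the line from the short one.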
Re^2: regex optional word match by Rahul Gupta (Sexton) on Jul 27, 2012 at 05:28 UTC
Thanks, it worked for me :) :)
Re: regex optional word match by brx (Pilgrim) on Jul 26, 2012 at 11:38 UTC
update: input data changed by OP (brackets added); consider kcott's answer: Re: regex optional word match

-----

If you really-really want to use a big regex, see:

```perl
$line =~ m/^REMOTE\s+(.*?)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(.*?)sec\s+(.*)MBytes\s+(.*)Mbits\/sec(.*)/s;
print "{$10}"; #$10 should contain at least "\n" except if you 'chomp $line'
```

This regex becomes very complex if 'MBytes' could be 'Bytes', etc. But here is another way with split (because the lines seem to be well formatted, with whitespace as the separator).

```perl
#!perl
use strict;
use warnings;

while (my $line = <DATA>) {
    chomp $line;
    my @parts = split /\s+/,$line;
#REMOTE Mon Jul 16 21:49:33 2012 @@ ueh1 TNT 20490 1916 0.0- 1.0 sec 0.33 MBytes 2.74 Mbits/sec 6.056 ms 0/ 233 (0%)
#0      1   2   3  4        5    6  7    8   9     10   11   12  13  14   15     16   17        18    19 20 21  22
    next if $parts[0] ne 'REMOTE';
    print "IN: $line\n";
    print "\tOUT: ",join " ",@parts[1..5,8,12,13];
    print " ", join " ",@parts[18,19] if $#parts>=19;
    print "\n";
}
__DATA__
REMOTE Mon Jul 16 21:49:33 2012 @@ ueh1 TNT 20490 1916 0.0- 1.0 sec 0.33 MBytes 2.74 Mbits/sec 6.056 ms 0/ 233 (0%)
REMOTE Mon Jul 16 21:49:34 2012 @@ pdn1 SSH 20499 3 1.0- 2.0 sec 0.34 MBytes 2.86 Mbits/sec
LOCAL Mon Jul 16 21:49:34 2012 @@ nada
```

Output:

```
IN: REMOTE Mon Jul 16 21:49:33 2012 @@ ueh1 TNT 20490 1916 0.0- 1.0 sec 0.33 MBytes 2.74 Mbits/sec 6.056 ms 0/ 233 (0%)
        OUT: Mon Jul 16 21:49:33 2012 TNT 1.0 sec 6.056 ms
IN: REMOTE Mon Jul 16 21:49:34 2012 @@ pdn1 SSH 20499 3 1.0- 2.0 sec 0.34 MBytes 2.86 Mbits/sec
        OUT: Mon Jul 16 21:49:34 2012 SSH 2.0 sec
```

update: /s modifier in regex

English is not my mother tongue. Les tongues de ma mère sont "made in France".
