Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Parsing Data in xlsx file

by Rahul Gupta (Sexton)
on Aug 01, 2012 at 18:01 UTC ( #984855=perlquestion: print w/ replies, xml ) Need Help??
Rahul Gupta has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone.

I have some data in text file as given below

------|----------|--------|-------------------------|--------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | ------|----------|--------|-------------------------|--------| 1 | 17:20:59 | ueh1 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------| 2 | 17:20:59 | ueh2 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------|
file contains same data 5 times. now, i need to parse the same thing in xlsx file. in following format
S.No TimeStamp UE-Name Test-Config UL/D +L 1 17:20:59 ueh1 Sanity_004_0004_24_00 DL + 2 17:20:59 ueh2 Sanity_004_0004_24_00 DL +

can u please help me out to resolve this problem.

thanks in advance

Comment on Parsing Data in xlsx file
Select or Download Code
Re: Parsing Data in xlsx file
by moritz (Cardinal) on Aug 01, 2012 at 18:05 UTC
      Hi Guys, I have this data in text file
      ------|----------|--------|-------------------------|--------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | ------|----------|--------|-------------------------|--------| 1 | 18:24:38 | -NA- | Sanity_001_0001_10_00 | System | ------|----------|--------|-------------------------|--------| 2 | 18:31:34 | ueh1 | Sanity_002_0002_00_11 | UL | ------|----------|--------|-------------------------|--------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | UL | ------|----------|--------|-------------------------|--------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | DL | ------|----------|--------|-------------------------|--------| 4 | 18:38:30 | ueh1 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------| 5 | -NA- | -NA- | Sanity_001_0001_10_00 | System | ------|----------|--------|-------------------------|--------| 6 | 18:31:34 | ueh2 | Sanity_002_0002_00_11 | UL | ------|----------|--------|-------------------------|--------| 7 | 18:23:49 | ueh2 | Sanity_003_0003_06_09 | UL | ------|----------|--------|-------------------------|--------| 7 | 18:23:48 | ueh2 | Sanity_003_0003_06_09 | DL | ------|----------|--------|-------------------------|--------| 8 | 18:38:30 | ueh2 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------|
      i have tried this thing all text file data is in @executioncontents array
      my %data; foreach (@executioncontents) { my $currLine = $_; my @Report = split(/\|/, $currLine); my $arraysize = $#Report; if ($arraysize == 11) { foreach (@Report) { if ($_ =~ m/--/) { } else { my $timestamp = $Report[1]; my $testcase = $Report[4]; my $uename = $Report[2]; my $keyvalues =$timestamp."_".$uename."_".$testcas +e; $data{$keyvalues}{serial_number} = $Report[0]; $data{$keyvalues}{time_Stamp} = $Report[1]; $data{$keyvalues}{ue_name} = $Report[2]; $data{$keyvalues}{test_config} = $Report[3]; $data{$keyvalues}{ul_dl} = $Report[4]; $data{$keyvalues}{status} = $Report[5]; $data{$keyvalues}{jiter} = $Report[6]; $data{$keyvalues}{throughput} = $Report[7]; $data{$keyvalues}{lost} = $Report[8]; $data{$keyvalues}{bandwidth} = $Report[9]; $data{$keyvalues}{duration} = $Report[10]; $data{$keyvalues}{failed_reason} = $Report[11]; } } } } readhash(\%data); sub readhash { my $reff = shift; my %xyz = %$reff; my $workbook = Excel::Writer::XLSX->new( "Data"); my $worksheet = $workbook->add_worksheet("data"); my @array; my $rows = 14; foreach my $key (sort %xyz) { if ($xyz{$key}{ue_name} =~ m/ue|NA/) { push(@array, $xyz{$key}{ue_name}); $worksheet->write($rows, 0, $xyz{$key}{serial_number}); $worksheet->write($rows, 1, $xyz{$key}{time_Stamp}); $worksheet->write($rows, 2, $xyz{$key}{ue_name}); $worksheet->write($rows, 3, $xyz{$key}{test_config}); $worksheet->write($rows, 4, $xyz{$key}{ul_dl}); $worksheet->write($rows, 5, $xyz{$key}{status}); $worksheet->write($rows, 6, $xyz{$key}{jiter}); $worksheet->write($rows, 7, $xyz{$key}{throughput}); $worksheet->write($rows, 8, $xyz{$key}{lost}); $worksheet->write($rows, 9, $xyz{$key}{bandwidth}); $worksheet->write($rows, 10, $xyz{$key}{duration}); $worksheet->write($rows, 11, $xyz{$key}{failed_reason}); $rows++; } } }

      It parse the data in xlsx file in differnt manner from the way it is in acutal file

      5 -NA- -NA- Sanity_001_0001_10_00 Syst +em 3 18:23:48 ueh1 Sanity_003_0003_06_09 DL + 3 18:23:48 ueh1 Sanity_003_0003_06_09 UL + 7 18:23:48 ueh2 Sanity_003_0003_06_09 DL + 7 18:23:49 ueh2 Sanity_003_0003_06_09 UL + 1 18:24:38 -NA- Sanity_001_0001_10_00 Syst +em 2 18:31:34 ueh1 Sanity_002_0002_00_11 UL + 6 18:31:34 ueh2 Sanity_002_0002_00_11 UL + 4 18:38:30 ueh1 Sanity_004_0004_24_00 DL + 8 18:38:30 ueh2 Sanity_004_0004_24_00 DL + S.No TimeStamp UE-Name Test-Config UL/D +L

      Please anyone can help me to parse the data in xlsx file the way it is in actal file.

      Thanks in advance

        Somehow your parsing code and your data don't seem to be related at all. For example your code checks if the data has 11 columns, but your data only has 5 columns. So no wonder it doesn't work. (Your data sample has 11 interesting rows, but I guess that's more of a coincidence.

        I'd approach the problem like this:

        use strict; use warnings; use 5.010; while (<DATA>) { # skip separator lines next if /^----/; my @columns = split /\s*\|\s*/; # remove leading space $columns[0] =~ s/^\s+//; # now do something with @columns # you'll want to do something else here, not just # printing to the console say join ',', @columns; } __DATA__ ------|----------|--------|-------------------------|--------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | ------|----------|--------|-------------------------|--------| 1 | 18:24:38 | -NA- | Sanity_001_0001_10_00 | System | ------|----------|--------|-------------------------|--------| 2 | 18:31:34 | ueh1 | Sanity_002_0002_00_11 | UL | ------|----------|--------|-------------------------|--------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | UL | ------|----------|--------|-------------------------|--------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | DL | ------|----------|--------|-------------------------|--------| 4 | 18:38:30 | ueh1 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------| 5 | -NA- | -NA- | Sanity_001_0001_10_00 | System | ------|----------|--------|-------------------------|--------| 6 | 18:31:34 | ueh2 | Sanity_002_0002_00_11 | UL | ------|----------|--------|-------------------------|--------| 7 | 18:23:49 | ueh2 | Sanity_003_0003_06_09 | UL | ------|----------|--------|-------------------------|--------| 7 | 18:23:48 | ueh2 | Sanity_003_0003_06_09 | DL | ------|----------|--------|-------------------------|--------| 8 | 18:38:30 | ueh2 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------|

        Rahul Gupta:

        After glancing at your code, I'd suggest you sanitize it a bit. By sanitize, I don't mean "clean up"--I mean "make less insane". If you name a function readhash I would expect that it would do something like, say, reading a hash from some source. I certainly wouldn't expect it to write a spreadsheet. Perhaps naming it something like "write_spreadsheet" would be less confusing?

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

Re: Parsing Data in xlsx file
by suaveant (Parson) on Aug 01, 2012 at 18:06 UTC
    Looks like you need to parse, so Spreadsheet::XLSX

                    - Ant
                    - Some of my best work - (1 2 3)

Re: Parsing Data in xlsx file
by Kenosis (Priest) on Aug 02, 2012 at 16:34 UTC

    Perhaps the following will work for you:

    use Modern::Perl; use Excel::Writer::XLSX; my ( $rowNumber, %headingsSeen ) = 1; my $workbook = Excel::Writer::XLSX->new('data.xlsx'); my $worksheet = $workbook->add_worksheet(); while (<DATA>) { next if /^[-#+]/ or ( /S.No/ and $headingsSeen{/S.No/}++ ); $worksheet->write( 'A' . $rowNumber++ , [ grep s/\A\s*(.*?)\s*\z/$1/, ( split /\|/ )[ 0 .. 11 ] ] ); } __DATA__ ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | Status +| Jitter ms | Throughput Mb/s | lost % | Bandwidth Mb/ +s | Duration Sec | Fail-Reason | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 1 | 18:24:38 | -NA- | Sanity_001_0001_10_00 | System | Passed +| -NA- | -NA- | -NA- | -NA- + | -NA- | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 2 | 18:31:34 | ueh1 | Sanity_002_0002_00_11 | UL | Passed +| 0.14 of 10.00 | 4.38 of 3.97 | 0.00 of 10.00 | 4.38 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | UL | Passed +| 0.16 of 10.00 | 4.00 of 3.62 | 0.00 of 10.00 | 4.00 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | DL | Passed +| 3.37 of 10.00 | 5.18 of 4.67 | 0.00 of 10.00 | 5.17 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 4 | 18:38:30 | ueh1 | Sanity_004_0004_24_00 | DL | Passed +| 1.34 of 10.00 | 13.72 of 12.40 | 0.00 of 10.00 | 13.70 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 5 | -NA- | -NA- | Sanity_001_0001_10_00 | System | -NA- +| -NA- | -NA- | -NA- | -NA- + | -NA- | Micro-Step | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 6 | 18:31:34 | ueh2 | Sanity_002_0002_00_11 | UL | Passed +| 0.15 of 10.00 | 4.38 of 3.97 | 0.00 of 10.00 | 4.38 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 7 | 18:23:49 | ueh2 | Sanity_003_0003_06_09 | UL | Passed +| 0.16 of 10.00 | 4.00 of 3.62 | 0.00 of 10.00 | 4.00 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 7 | 18:23:48 | ueh2 | Sanity_003_0003_06_09 | DL | Passed +| 3.38 of 10.00 | 5.18 of 4.67 | 0.00 of 10.00 | 5.17 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 8 | 18:38:30 | ueh2 | Sanity_004_0004_24_00 | DL | Passed +| 1.32 of 10.00 | 13.72 of 12.40 | 0.00 of 10.00 | 13.70 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| # Total Test-Cases = '4' # Total Passed-Cases = '4' # Total Failed-Cases = '0' ++++++ SUMMARY ++++++ ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | Status +| Jitter ms | Throughput Mb/s | lost % | Bandwidth Mb/ +s | Duration Sec | Fail-Reason | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 1 | 18:24:38 | -NA- | Sanity_001_0001_10_00 | System | Passed +| -NA- | -NA- | -NA- | -NA- + | -NA- | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 2 | 18:31:34 | ueh1 | Sanity_002_0002_00_11 | UL | Passed +| 0.14 of 10.00 | 4.38 of 3.97 | 0.00 of 10.00 | 4.38 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | UL | Passed +| 0.16 of 10.00 | 4.00 of 3.62 | 0.00 of 10.00 | 4.00 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | DL | Passed +| 3.37 of 10.00 | 5.18 of 4.67 | 0.00 of 10.00 | 5.17 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 4 | 18:38:30 | ueh1 | Sanity_004_0004_24_00 | DL | Passed +| 1.34 of 10.00 | 13.72 of 12.40 | 0.00 of 10.00 | 13.70 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 5 | -NA- | -NA- | Sanity_001_0001_10_00 | System | -NA- +| -NA- | -NA- | -NA- | -NA- + | -NA- | Micro-Step | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 6 | 18:31:34 | ueh2 | Sanity_002_0002_00_11 | UL | Passed +| 0.15 of 10.00 | 4.38 of 3.97 | 0.00 of 10.00 | 4.38 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 7 | 18:23:49 | ueh2 | Sanity_003_0003_06_09 | UL | Passed +| 0.16 of 10.00 | 4.00 of 3.62 | 0.00 of 10.00 | 4.00 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 7 | 18:23:48 | ueh2 | Sanity_003_0003_06_09 | DL | Passed +| 3.38 of 10.00 | 5.18 of 4.67 | 0.00 of 10.00 | 5.17 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 8 | 18:38:30 | ueh2 | Sanity_004_0004_24_00 | DL | Passed +| 1.32 of 10.00 | 13.72 of 12.40 | 0.00 of 10.00 | 13.70 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| # Total Test-Cases = '4' # Total Passed-Cases = '4' # Total Failed-Cases = '0'

    Output (copied from generated data.xlsx spreadsheet):

    S.No TimeStamp UE-Name Test-Config UL/DL Status Jitt +er ms Throughput Mb/s lost % Bandwidth Mb/s Duration Sec + Fail-Reason 1 18:24:38 -NA- Sanity_001_0001_10_00 System Passed +-NA- -NA- -NA- -NA- -NA- -NA- 2 18:31:34 ueh1 Sanity_002_0002_00_11 UL Passed 0.14 + of 10.00 4.38 of 3.97 0.00 of 10.00 4.38 360 -NA- 3 18:23:48 ueh1 Sanity_003_0003_06_09 UL Passed 0.16 + of 10.00 4.00 of 3.62 0.00 of 10.00 4 360 -NA- 3 18:23:48 ueh1 Sanity_003_0003_06_09 DL Passed 3.37 + of 10.00 5.18 of 4.67 0.00 of 10.00 5.17 360 -NA- 4 18:38:30 ueh1 Sanity_004_0004_24_00 DL Passed 1.34 + of 10.00 13.72 of 12.40 0.00 of 10.00 13.7 360 -NA- 5 -NA- -NA- Sanity_001_0001_10_00 System -NA- -NA- + -NA- -NA- -NA- -NA- Micro-Step 6 18:31:34 ueh2 Sanity_002_0002_00_11 UL Passed 0.15 + of 10.00 4.38 of 3.97 0.00 of 10.00 4.38 360 -NA- 7 18:23:49 ueh2 Sanity_003_0003_06_09 UL Passed 0.16 + of 10.00 4.00 of 3.62 0.00 of 10.00 4 360 -NA- 7 18:23:48 ueh2 Sanity_003_0003_06_09 DL Passed 3.38 + of 10.00 5.18 of 4.67 0.00 of 10.00 5.17 360 -NA- 8 18:38:30 ueh2 Sanity_004_0004_24_00 DL Passed 1.32 + of 10.00 13.72 of 12.40 0.00 of 10.00 13.7 360 -NA- 1 18:24:38 -NA- Sanity_001_0001_10_00 System Passed +-NA- -NA- -NA- -NA- -NA- -NA- 2 18:31:34 ueh1 Sanity_002_0002_00_11 UL Passed 0.14 + of 10.00 4.38 of 3.97 0.00 of 10.00 4.38 360 -NA- 3 18:23:48 ueh1 Sanity_003_0003_06_09 UL Passed 0.16 + of 10.00 4.00 of 3.62 0.00 of 10.00 4 360 -NA- 3 18:23:48 ueh1 Sanity_003_0003_06_09 DL Passed 3.37 + of 10.00 5.18 of 4.67 0.00 of 10.00 5.17 360 -NA- 4 18:38:30 ueh1 Sanity_004_0004_24_00 DL Passed 1.34 + of 10.00 13.72 of 12.40 0.00 of 10.00 13.7 360 -NA- 5 -NA- -NA- Sanity_001_0001_10_00 System -NA- -NA- + -NA- -NA- -NA- -NA- Micro-Step 6 18:31:34 ueh2 Sanity_002_0002_00_11 UL Passed 0.15 + of 10.00 4.38 of 3.97 0.00 of 10.00 4.38 360 -NA- 7 18:23:49 ueh2 Sanity_003_0003_06_09 UL Passed 0.16 + of 10.00 4.00 of 3.62 0.00 of 10.00 4 360 -NA- 7 18:23:48 ueh2 Sanity_003_0003_06_09 DL Passed 3.38 + of 10.00 5.18 of 4.67 0.00 of 10.00 5.17 360 -NA- 8 18:38:30 ueh2 Sanity_004_0004_24_00 DL Passed 1.32 + of 10.00 13.72 of 12.40 0.00 of 10.00 13.7 360 -NA-

    Hope this helps!

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://984855]
Approved by moritz
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (8)
As of 2014-09-16 05:01 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite cookbook is:










    Results (156 votes), past polls