http://www.perlmonks.org?node_id=984859


in reply to Parsing Data in xlsx file

Perlmonks is not a code writing service. We can help you solve your problem, but we won't solve it for you.

Which part are you having trouble with? What code have you written so far?

Excel::Writer::XLSX can write XLSX files for you, which is the hardest part of the task.

Replies are listed 'Best First'.
Re^2: Parsing Data in xlsx file
by Rahul Gupta (Sexton) on Aug 02, 2012 at 08:16 UTC
    Hi Guys, I have this data in text file
    ------|----------|--------|-------------------------|--------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | ------|----------|--------|-------------------------|--------| 1 | 18:24:38 | -NA- | Sanity_001_0001_10_00 | System | ------|----------|--------|-------------------------|--------| 2 | 18:31:34 | ueh1 | Sanity_002_0002_00_11 | UL | ------|----------|--------|-------------------------|--------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | UL | ------|----------|--------|-------------------------|--------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | DL | ------|----------|--------|-------------------------|--------| 4 | 18:38:30 | ueh1 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------| 5 | -NA- | -NA- | Sanity_001_0001_10_00 | System | ------|----------|--------|-------------------------|--------| 6 | 18:31:34 | ueh2 | Sanity_002_0002_00_11 | UL | ------|----------|--------|-------------------------|--------| 7 | 18:23:49 | ueh2 | Sanity_003_0003_06_09 | UL | ------|----------|--------|-------------------------|--------| 7 | 18:23:48 | ueh2 | Sanity_003_0003_06_09 | DL | ------|----------|--------|-------------------------|--------| 8 | 18:38:30 | ueh2 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------|
    i have tried this thing all text file data is in @executioncontents array
    my %data; foreach (@executioncontents) { my $currLine = $_; my @Report = split(/\|/, $currLine); my $arraysize = $#Report; if ($arraysize == 11) { foreach (@Report) { if ($_ =~ m/--/) { } else { my $timestamp = $Report[1]; my $testcase = $Report[4]; my $uename = $Report[2]; my $keyvalues =$timestamp."_".$uename."_".$testcas +e; $data{$keyvalues}{serial_number} = $Report[0]; $data{$keyvalues}{time_Stamp} = $Report[1]; $data{$keyvalues}{ue_name} = $Report[2]; $data{$keyvalues}{test_config} = $Report[3]; $data{$keyvalues}{ul_dl} = $Report[4]; $data{$keyvalues}{status} = $Report[5]; $data{$keyvalues}{jiter} = $Report[6]; $data{$keyvalues}{throughput} = $Report[7]; $data{$keyvalues}{lost} = $Report[8]; $data{$keyvalues}{bandwidth} = $Report[9]; $data{$keyvalues}{duration} = $Report[10]; $data{$keyvalues}{failed_reason} = $Report[11]; } } } } readhash(\%data); sub readhash { my $reff = shift; my %xyz = %$reff; my $workbook = Excel::Writer::XLSX->new( "Data"); my $worksheet = $workbook->add_worksheet("data"); my @array; my $rows = 14; foreach my $key (sort %xyz) { if ($xyz{$key}{ue_name} =~ m/ue|NA/) { push(@array, $xyz{$key}{ue_name}); $worksheet->write($rows, 0, $xyz{$key}{serial_number}); $worksheet->write($rows, 1, $xyz{$key}{time_Stamp}); $worksheet->write($rows, 2, $xyz{$key}{ue_name}); $worksheet->write($rows, 3, $xyz{$key}{test_config}); $worksheet->write($rows, 4, $xyz{$key}{ul_dl}); $worksheet->write($rows, 5, $xyz{$key}{status}); $worksheet->write($rows, 6, $xyz{$key}{jiter}); $worksheet->write($rows, 7, $xyz{$key}{throughput}); $worksheet->write($rows, 8, $xyz{$key}{lost}); $worksheet->write($rows, 9, $xyz{$key}{bandwidth}); $worksheet->write($rows, 10, $xyz{$key}{duration}); $worksheet->write($rows, 11, $xyz{$key}{failed_reason}); $rows++; } } }

    It parse the data in xlsx file in differnt manner from the way it is in acutal file

    5 -NA- -NA- Sanity_001_0001_10_00 Syst +em 3 18:23:48 ueh1 Sanity_003_0003_06_09 DL + 3 18:23:48 ueh1 Sanity_003_0003_06_09 UL + 7 18:23:48 ueh2 Sanity_003_0003_06_09 DL + 7 18:23:49 ueh2 Sanity_003_0003_06_09 UL + 1 18:24:38 -NA- Sanity_001_0001_10_00 Syst +em 2 18:31:34 ueh1 Sanity_002_0002_00_11 UL + 6 18:31:34 ueh2 Sanity_002_0002_00_11 UL + 4 18:38:30 ueh1 Sanity_004_0004_24_00 DL + 8 18:38:30 ueh2 Sanity_004_0004_24_00 DL + S.No TimeStamp UE-Name Test-Config UL/D +L

    Please anyone can help me to parse the data in xlsx file the way it is in actal file.

    Thanks in advance

      Somehow your parsing code and your data don't seem to be related at all. For example your code checks if the data has 11 columns, but your data only has 5 columns. So no wonder it doesn't work. (Your data sample has 11 interesting rows, but I guess that's more of a coincidence.

      I'd approach the problem like this:

      use strict; use warnings; use 5.010; while (<DATA>) { # skip separator lines next if /^----/; my @columns = split /\s*\|\s*/; # remove leading space $columns[0] =~ s/^\s+//; # now do something with @columns # you'll want to do something else here, not just # printing to the console say join ',', @columns; } __DATA__ ------|----------|--------|-------------------------|--------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | ------|----------|--------|-------------------------|--------| 1 | 18:24:38 | -NA- | Sanity_001_0001_10_00 | System | ------|----------|--------|-------------------------|--------| 2 | 18:31:34 | ueh1 | Sanity_002_0002_00_11 | UL | ------|----------|--------|-------------------------|--------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | UL | ------|----------|--------|-------------------------|--------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | DL | ------|----------|--------|-------------------------|--------| 4 | 18:38:30 | ueh1 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------| 5 | -NA- | -NA- | Sanity_001_0001_10_00 | System | ------|----------|--------|-------------------------|--------| 6 | 18:31:34 | ueh2 | Sanity_002_0002_00_11 | UL | ------|----------|--------|-------------------------|--------| 7 | 18:23:49 | ueh2 | Sanity_003_0003_06_09 | UL | ------|----------|--------|-------------------------|--------| 7 | 18:23:48 | ueh2 | Sanity_003_0003_06_09 | DL | ------|----------|--------|-------------------------|--------| 8 | 18:38:30 | ueh2 | Sanity_004_0004_24_00 | DL | ------|----------|--------|-------------------------|--------|

        Sorry that's by mistake, i put only 5 columns here in actual data i have 11 columns.if it works for 5 columns then for sure it will work for 11 colums

        . so please help me .

        Sorry, that's by mistake, I put only 5 columns here, in actual data i have 11 columns,i think if it works for 5 columns than for sure it will work for 11 columns also.

        please help me to resolve this problem.
        hi, For more details this is my actual file.
        ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | Status +| Jitter ms | Throughput Mb/s | lost % | Bandwidth Mb/ +s | Duration Sec | Fail-Reason | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 1 | 18:24:38 | -NA- | Sanity_001_0001_10_00 | System | Passed +| -NA- | -NA- | -NA- | -NA- + | -NA- | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 2 | 18:31:34 | ueh1 | Sanity_002_0002_00_11 | UL | Passed +| 0.14 of 10.00 | 4.38 of 3.97 | 0.00 of 10.00 | 4.38 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | UL | Passed +| 0.16 of 10.00 | 4.00 of 3.62 | 0.00 of 10.00 | 4.00 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | DL | Passed +| 3.37 of 10.00 | 5.18 of 4.67 | 0.00 of 10.00 | 5.17 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 4 | 18:38:30 | ueh1 | Sanity_004_0004_24_00 | DL | Passed +| 1.34 of 10.00 | 13.72 of 12.40 | 0.00 of 10.00 | 13.70 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 5 | -NA- | -NA- | Sanity_001_0001_10_00 | System | -NA- +| -NA- | -NA- | -NA- | -NA- + | -NA- | Micro-Step | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 6 | 18:31:34 | ueh2 | Sanity_002_0002_00_11 | UL | Passed +| 0.15 of 10.00 | 4.38 of 3.97 | 0.00 of 10.00 | 4.38 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 7 | 18:23:49 | ueh2 | Sanity_003_0003_06_09 | UL | Passed +| 0.16 of 10.00 | 4.00 of 3.62 | 0.00 of 10.00 | 4.00 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 7 | 18:23:48 | ueh2 | Sanity_003_0003_06_09 | DL | Passed +| 3.38 of 10.00 | 5.18 of 4.67 | 0.00 of 10.00 | 5.17 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 8 | 18:38:30 | ueh2 | Sanity_004_0004_24_00 | DL | Passed +| 1.32 of 10.00 | 13.72 of 12.40 | 0.00 of 10.00 | 13.70 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| # Total Test-Cases = '4' # Total Passed-Cases = '4' # Total Failed-Cases = '0' ++++++ SUMMARY ++++++ ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| S.No | TimeStamp| UE-Name| Test-Config | UL/DL | Status +| Jitter ms | Throughput Mb/s | lost % | Bandwidth Mb/ +s | Duration Sec | Fail-Reason | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 1 | 18:24:38 | -NA- | Sanity_001_0001_10_00 | System | Passed +| -NA- | -NA- | -NA- | -NA- + | -NA- | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 2 | 18:31:34 | ueh1 | Sanity_002_0002_00_11 | UL | Passed +| 0.14 of 10.00 | 4.38 of 3.97 | 0.00 of 10.00 | 4.38 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | UL | Passed +| 0.16 of 10.00 | 4.00 of 3.62 | 0.00 of 10.00 | 4.00 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 3 | 18:23:48 | ueh1 | Sanity_003_0003_06_09 | DL | Passed +| 3.37 of 10.00 | 5.18 of 4.67 | 0.00 of 10.00 | 5.17 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 4 | 18:38:30 | ueh1 | Sanity_004_0004_24_00 | DL | Passed +| 1.34 of 10.00 | 13.72 of 12.40 | 0.00 of 10.00 | 13.70 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 5 | -NA- | -NA- | Sanity_001_0001_10_00 | System | -NA- +| -NA- | -NA- | -NA- | -NA- + | -NA- | Micro-Step | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 6 | 18:31:34 | ueh2 | Sanity_002_0002_00_11 | UL | Passed +| 0.15 of 10.00 | 4.38 of 3.97 | 0.00 of 10.00 | 4.38 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 7 | 18:23:49 | ueh2 | Sanity_003_0003_06_09 | UL | Passed +| 0.16 of 10.00 | 4.00 of 3.62 | 0.00 of 10.00 | 4.00 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 7 | 18:23:48 | ueh2 | Sanity_003_0003_06_09 | DL | Passed +| 3.38 of 10.00 | 5.18 of 4.67 | 0.00 of 10.00 | 5.17 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| 8 | 18:38:30 | ueh2 | Sanity_004_0004_24_00 | DL | Passed +| 1.32 of 10.00 | 13.72 of 12.40 | 0.00 of 10.00 | 13.70 + | 360 | -NA- | ------|----------|--------|-------------------------|--------|-------- +|-----------------|-----------------|-----------------|-------------- +---|---------------|-------------| # Total Test-Cases = '4' # Total Passed-Cases = '4' # Total Failed-Cases = '0'

        same data is coming twice or sometimes thrice.

        Thanks

      Rahul Gupta:

      After glancing at your code, I'd suggest you sanitize it a bit. By sanitize, I don't mean "clean up"--I mean "make less insane". If you name a function readhash I would expect that it would do something like, say, reading a hash from some source. I certainly wouldn't expect it to write a spreadsheet. Perhaps naming it something like "write_spreadsheet" would be less confusing?

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.