Re^12: Text::CSV encoding parse()

Hi @haukex sorry I can't give you the whole script due to security and privacy concerns, but I can give you the salient parts of it.

# execute the query using Aginity Workbench; output saved to flat file
my($res) = system($cmd);

### read the output
my(@urls);
my($header);
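# "or" binds more loosely than "||", so the die applies to open() rather than to the file name string
open my $fh, "<:encoding(utf8)", "$resultsFile" or die("cannot open results file $resultsFile for reading.\n cmd: $cmd");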
my($c)=0;   # just here for counting
my($d)=0;   # just here for counting
while(<$fh>){
 $c++;
 if($c == 1) {    # get header row
  $header = $_;
 }
 if ($_ =~ /\/search\//){
  push(@urls, $_);
 }
 else{
  $d++;
 }
}
close($fh);

# sort @urls based on the search string 
# e.g. https://www.ibm.com/support/knowledgecenter/es/search/¿Cuales son las partes de una cadena de conexión??scope=SSGU8G_12.1.0|https://www.ibm.com/support/knowledgecenter/es/SSGU8G_12.1.0/com.ibm.jdbc_pg.doc/ids_jdbc_011.htm|0|1|1|0

my @sorted_urls =
  map  { $_->[0] }
  sort { $a->[1] cmp $b->[1] }
  map  { m|/search/\s*([^\?]+)\?|; [$_, $1] }
  @urls;

# parse and print
print $q->header(-charset    => 'utf-8');
print $q->start_html( -title      => 'SearchME',
                -style=>{'src'=>$stylesheet});

print $q->start_table();
foreach my $row (@sorted_urls){
# print TEMP $row;
 $csv->parse($row);
 print "<tr>";
 $count++;


 my @els = $csv->fields;

 my(@splits) = split('\|',$row);

 $els[0] =~ /\/search\/(.+)\?scope=/i;
 my($term) = $1;

 my($link) = $els[0];

 print "<td>";
 # print $link;
 print $q->a({-href=>$link,-target=>'_blank'},$term);
 print "</td>";

# print other @fields here inside <td></td>
}
print $q->end_table,
$q->end_html;

Oh, and I reinstalled both modules.
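The sort step in the listing above is a Schwartzian transform: the search term is extracted once per row, the rows are sorted on that key, and the key is then discarded. A minimal standalone sketch of the same idea follows. The sample rows and the `search_term` helper are hypothetical, made up for illustration, and the helper returns an empty key when the pattern does not match so the sort never depends on an undefined or leftover `$1`.

```perl
use strict;
use warnings;

# Hypothetical sample rows, shaped like the pipe-separated lines in the post.
my @urls = (
    "https://example.com/search/zebra?scope=X|https://example.com/a|0|1\n",
    "https://example.com/search/apple?scope=X|https://example.com/b|1|1\n",
    "https://example.com/search/ mango?scope=X|https://example.com/c|0|2\n",
);

# Hypothetical helper: returns the search term, or '' if the row has none.
sub search_term {
    my ($row) = @_;
    return $row =~ m{/search/\s*([^?]+)\?} ? $1 : '';
}

# Schwartzian transform: pair each row with its key, sort on the key,
# then keep only the row.
my @sorted_urls =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] }
    map  { [ $_, search_term($_) ] }
    @urls;

my @terms = map { search_term($_) } @sorted_urls;
print join(",", @terms), "\n";   # prints: apple,mango,zebra
```

Note that `cmp` compares by code point, so accented or punctuated terms such as "¿Cuales ..." sort after plain ASCII letters unless a locale-aware collation is used.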


Re^13: Text::CSV encoding parse() by haukex (Archbishop) on Aug 21, 2019 at 20:12 UTC
"Hi @haukex sorry I can't give you the whole script due to security and privacy concerns, but I can give you the salient parts of it."

I understand, but please understand that we do need to be able to reproduce the issue you're having on our end, which doesn't require you to disclose any secrets, but it does require you to give us something representative that is runnable as-is (standalone). For example, in what you've posted here, I don't see whether you've changed `STDOUT` to UTF-8, I don't see any of the Data::Dumper output that I provided in my example code (which is essential to debugging encoding issues), you don't show the output this script is producing on your end, and so on. If you take the time to read and understand "Short, Self-Contained, Correct Example" and "I know what I mean. Why don't you?", we might be able to help you further, but I'm sorry, as it stands there simply isn't enough coherent information to help you.
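The two checks mentioned in this reply can be sketched roughly as follows. This is not the example code haukex provided earlier in the thread (that code is not shown here); it is a minimal illustration using only core modules, with a made-up sample string. It sets an encoding layer on `STDOUT` and uses Data::Dumper with `Useqq` so that non-ASCII characters show up as visible escapes, which makes it easier to tell a decoded character string from raw UTF-8 bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Data::Dumper;
$Data::Dumper::Useqq = 1;             # dump non-ASCII characters as escapes

binmode STDOUT, ':encoding(UTF-8)';   # encode characters when printing

# Made-up sample: "conexión" written with an escape so this file stays ASCII.
my $chars = "conexi\x{f3}n";            # decoded character string
my $bytes = encode('UTF-8', $chars);    # the same text as raw UTF-8 bytes

print Dumper($chars, $bytes);           # the two dumps differ visibly
printf "%d characters, %d bytes\n", length $chars, length $bytes;
print "$chars\n";                       # correct only because of the binmode
```

If a string that should hold characters dumps like the byte version, it was probably never decoded on input; if the page shows doubled-up garbage, it was probably encoded twice.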
Re^14: Text::CSV encoding parse() by slugger415 (Monk) on Aug 21, 2019 at 22:22 UTC
OK, here's a full program that produces the same result. It pulls in a $resultsFile called "results.txt".

#! /strawberry/Perl/bin/perl
use CGI;
use CGI::Carp qw( fatalsToBrowser );
use Text::CSV;
use Excel::Writer::XLSX;
use utf8;
use strict;

### read the output
my(@urls);
my($header);
my($resultsFile) = "results.txt";
# "or" binds more loosely than "||", so the die applies to open() rather than to the file name string
open my $fh, "<:encoding(utf8)", "$resultsFile" or die("cannot open results file $resultsFile for reading.");
my($c)=0;   # just here for counting
my($d)=0;   # just here for counting
while(<$fh>){
 $c++;
 if ($_ =~ /\/search\//){
  push(@urls, $_);
 }
 else{
  $d++;
 }
}
close($fh);

# sort @urls based on the search string
my @sorted_urls =
  map  { $_->[0] }
  sort { $a->[1] cmp $b->[1] }
  map  { m|/search/\s*([^\?]+)\?|; [$_, $1] }
  @urls;

my($count) = -1;
my $csv = Text::CSV->new ({ binary => 1, sep_char => "|" });
my $q = new CGI;

# parse and print
print $q->header(-charset    => 'utf-8');
print $q->start_html( -title      => 'SearchME');

print $q->start_table();
foreach my $row (@sorted_urls){
# print TEMP $row;
 $csv->parse($row);
 print "<tr>";
 $count++;

 my @els = $csv->fields;

 my(@splits) = split('\|',$row);

 $els[0] =~ /\/search\/(.+)\?scope=/i;
 my($term) = $1;

 my($link) = $els[0];

 print "<td>";
 # print $link;
 print $q->a({-href=>$link,-target=>'_blank'},$term);
 print "</td>";

 for(my $i=1; $i <= 4; $i++){
  print "<td>";
  print $els[$i];
  print "</td>";
 }
 print "</tr>\n";
}
print $q->end_table,
$q->end_html;

And here's a short results file (not sure how to keep it from wrapping, so it's not code):

PAGE_COMPL_URL|PAGE_REFRL_COMPL_URL|IBMER|VIEWS|VISITORS|ENGAGED_VISITS
https://www.ibm.com/support/knowledgecenter/es/search/¿Cuales son las partes de una cadena de conexión??scope=SSGU8G_12.1.0|https://www.ibm.com/support/knowledgecenter/es/SSGU8G_12.1.0/com.ibm.jdbc_pg.doc/ids_jdbc_011.htm|0|1|1|0
https://www.ibm.com/support/knowledgecenter/search/onsmsync?scope=SSGU8G_12.1.0|https://www.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.sec.doc/ids_lb_002.htm|1|1|1|1

Thank you.
Re^15: Text::CSV encoding parse() by hippo (Bishop) on Aug 22, 2019 at 08:49 UTC
Thank you for providing an SSCCE. There's a lot which could be removed from it but the first line is the one setting off the klaxons. `#! /strawberry/Perl/bin/perl` Are you running this on Microsoft Windows? If so, what have you done to confirm that your input data (your `results.txt` file) is genuinely UTF-8 encoded?
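One way to answer that question, sketched here under the assumption that the file is small enough to read into memory, is to read it as raw bytes and attempt a strict UTF-8 decode with the core Encode module. The helper names `is_valid_utf8` and `slurp_raw` are made up for this sketch; only `results.txt` comes from the thread.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: true if the byte string decodes as strict UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # the caller's string unmodified.
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

# Hypothetical helper: read a whole file as raw bytes, with no decoding layer.
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "cannot open $path: $!";
    local $/;                 # slurp mode
    my $bytes = <$fh>;
    close $fh;
    return $bytes;
}

my $file = 'results.txt';     # file name taken from the thread
if (-e $file) {
    my $ok      = is_valid_utf8(slurp_raw($file));
    my $verdict = $ok ? "decodes cleanly as UTF-8" : "is NOT valid UTF-8";
    print "$file $verdict\n";
}
```

A file exported on Windows in a single-byte code page such as Windows-1252 stores "ó" as the single byte 0xF3, which fails this check. A pure-ASCII file passes trivially, so the test is only informative when the file actually contains non-ASCII text.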
Re^16: Text::CSV encoding parse() by slugger415 (Monk) on Aug 22, 2019 at 18:25 UTC
Re^17: Text::CSV encoding parse() by hippo (Bishop) on Aug 23, 2019 at 13:26 UTC
Some notes below your chosen depth have not been shown here

