bluray has asked for the wisdom of the Perl Monks concerning the following question:
Hi Monks,
I am having trouble removing the double quotes from one particular column (column #2), which is numeric. I tried the code below, but the results are not written to the output file; instead they are displayed on the terminal.
#!/usr/bin/perl -w
use strict;
use warnings;
open (my $FILE1, '<', "quote1.csv") or die "cannot open file1 $!\n";
open (my $FILE2, '>', "quoteremoved.csv") or die "cannot open file3 $!\n";
my %custom;
while (my $line=<$FILE1>){
chomp $line;
my @columns=split(/\t/,$line);
my $Uni=$columns[2];
$Uni=~tr/"//d;
$custom{$Uni}=$line;
}
foreach my $Uni (keys %custom){
print $FILE2, "$custom{$Uni}\n";
}
Re: Doublequotes in column
by Eliya (Vicar) on Dec 03, 2011 at 05:35 UTC
|
print $FILE2 "$custom{$Uni}\n";
Re: Doublequotes in column
by davido (Cardinal) on Dec 03, 2011 at 05:38 UTC
|
...instead it is displaying on the terminal.
print $FILE2, "$custom{$Uni}\n";
Should be:
print $FILE2 "$custom{$Uni}\n";
That ill-placed comma is an issue.
|
Hi Dave,
Thanks. I am now getting the result in the output file, but the double quotes (e.g. "645632") still appear around the numbers when I open it as a text file.
|
That's because you're performing your transliteration on the value of $Uni, and using it as a hash key, but then when you output to the new file, you're outputting the hash's value as indexed by the key. The value itself started out as $line, and was never altered in your code (aside from being chomped).
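A minimal sketch of that point in core Perl, assuming tab-separated input as in the original split: modify the field inside the array, then rebuild and print the line from the array instead of printing the untouched $line. The helper name strip_quotes_in_column and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): delete double quotes
# from one tab-separated field and return the rebuilt line.
sub strip_quotes_in_column {
    my ($line, $index) = @_;
    my @columns = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    # Change the array element itself, not a copy of it.
    $columns[$index] =~ tr/"//d if defined $columns[$index];
    return join "\t", @columns;
}

# Made-up sample line; index 2 is the third column.
my $line = qq{"bill"\t"fred"\t"645632"};
print strip_quotes_in_column($line, 2), "\n";
```

In the original loop, the returned string is what would be printed to the output handle (with no comma after the handle).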
Why not use Text::CSV instead?
Re: Doublequotes in column
by JavaFan (Canon) on Dec 03, 2011 at 10:54 UTC
|
Let's see:
- You're acting on the third column, not the second.
- You never print the modified column.
- You actually never modify the line.
- There's a comma after the file handle.
- Your indentation is borderline insane.
The latter really irks me. The least you can do when asking people for help is take a handful of seconds and indent your code in a reasonable way.
|
My apologies. It was 2 am. You are right, I was acting on the third column. The comma after the file handle is a mistake. I could have indented it in a more readable way.
Re: Doublequotes in column
by Marshall (Canon) on Dec 03, 2011 at 11:30 UTC
|
#!/usr/bin/perl -w
use strict;
use warnings;
open (my $IN, '<', "quote1.csv") or die "cannot open file1 $!\n";
open (my $OUT, '>', "quoteremoved.csv") or die "cannot open file3 $!\n";
while (my $line=<$IN>)
{
$line =~ tr/"//d; #delete all "
print $OUT $line;
}
Update: I'm sure that some Monk will point out the silly error of my ways, but it doesn't appear to me that the %custom hash accomplishes much.
Update2: Oh, I see I misinterpreted some other posts. I guess the idea is to only remove the " characters from the 3rd field? I'm up too late... tomorrow is already here.
|
$Uni =~ tr/"//; # trailing 'd' has no meaning here
$line =~ tr/"//; # this deletes all " characters
Sorry, but that is just wrong.
Without the "trailing d", tr/// counts, but otherwise does nothing:
$s = '"bill","1","fred"';;
$s =~ tr["][];;
print $s;;
"bill","1","fred"
$s =~ tr["][]d;;
print $s;;
bill,1,fred
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
|
I'm sorry too. The 'd' does matter.
I hit the post button too soon.
I still haven't figured out what this %custom hash does. But it is 4 AM here....
|
Hi Marshall,
I tried the code, but it didn't remove the double quotes when the output was opened as a text file. I tried it on the whole $line and on $Uni (3rd column). No effect. Then I had to manually cut the 3rd column into a text file and open it in a spreadsheet to get rid of the double quotes. Basically, this file was created using Text::CSV_XS with the option always_quote=>1. The reason I wanted to get rid of the double quotes is that they created problems when column 3 is used to match against another file.
|
Well, tr will remove the " characters, like below...
tr is a simple-minded critter, but it is faster than a substitution.
If you created this with Text::CSV, then I would use that to parse it back in. Normally, I don't think that you have to specify the XS version; if it's there, then it gets used, to the best of my knowledge.
I'm still on a marathon DB project, but my machine has to think for 4-5 hours about what I've done so far. But on this project, | is used as the CSV delimiter instead of "," and that often works out very well. Sometimes I also see || and that is ok too (in the split, you can specify /\|\|/ as the splitting regex). In the DB that I'm working with, | is explicitly not allowed as a valid data field value and hence it can be used as a simple delimiter in the CSV format and the Text::CSV module is not needed. But mileage varies...
#!/usr/bin/perl -w
use strict;
while (<DATA>)
{
tr/"//d;
print;
}
# prints:
# some,always,quoted stuff
# more,stuff,like that
__DATA__
"some","always","quoted stuff"
"more","stuff","like that"
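The pipe-delimited variant mentioned above can be sketched the same way in core Perl. The records here are invented for illustration, and the sketch assumes, as stated above, that | can never occur inside a field value.

```perl
use strict;
use warnings;

# Hypothetical sample record; no real data from the DB project is shown.
my $record = 'some|always|quoted stuff';

# '|' is the regex alternation operator, so it has to be escaped.
my @fields = split /\|/, $record, -1;    # -1 keeps trailing empty fields
print join(',', @fields), "\n";          # prints: some,always,quoted stuff

# The same idea with the two-character '||' delimiter.
my @wide = split /\|\|/, 'more||stuff||like that', -1;
print scalar(@wide), "\n";               # prints: 3
```

If a field could ever contain the delimiter or a newline, this simple split stops being safe and a real CSV parser is the better choice.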