PerlMonks  

rename duplicate data

by blacknight (Initiate)
on Jun 21, 2012 at 21:25 UTC ( #977715=perlquestion )
blacknight has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone,

I am a new user of Perl and of PerlMonks. I am trying to rename the duplicated elements of a table in a file. The table has this format:

xxxx . yyy 521 916 . + . ID=OSCAR1028v1rpkm9.67

xxxx . xxx 521 567 . + . Parent=OSCAR1028v1rpkm9.67

...

xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67

xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67

...

I want to change only the duplicated ID entries, such as ID="OSCAR10281v1rpkm9.67". The first occurrence of each ID="XXXXXX" should stay unchanged, but the second, third, and later occurrences should be renamed like this:

ID=OSCAR10281v1rpkm9.67".1" ID=OSCAR10281v1rpkm9.67".2" ... ".4" ... "n"

where n counts the occurrences of a specific ID="xxxx".

Thank you in advance for any suggestions.

Re: rename duplicate data
by ww (Bishop) on Jun 21, 2012 at 21:34 UTC
    Cliches ... but true and applicable:
    • What did you try? (Code)
    • How did it fail? (Details, error message)
    • We try to help you learn. (This is not code-a-matic)
    • Read the markup instructions around the text-entry box. (<p>...</p> and <c>...</c> or even Markup in the Monastery!)
Re: rename duplicate data
by monsoon (Pilgrim) on Jun 22, 2012 at 02:06 UTC

    You mean something like this?

    while(<>){
        chomp;
        if(/(ID=.+)$/){
            if(++$ids{$1} > 1){
                say $_, $ids{$1};
                next;
            }
        }
        say;
    }

    Try to get yourself a copy of the Perl Cookbook. It usually comes in handy for such tasks.

      #!/usr/bin/perl;
      use strict;
      use warnings;
      my $filename = $ARGV[0];
      my $debug = $ARGV1;
      die "\n\tUSAGE: perl $0 output debug\n\n" unless $ARGV[0];
      die "\n\tERROR: Cannot find the file $ARGV[0]\n\n" unless -e $ARGV[0];
      open(IN,$filename);
      my $ids;
      while($filename){
      chomp;
      if(/(ID=.+)$/){
      if(++$ids{$1} > 1){
      say $_, $ids{$1};
      next;
      }
      }
      say;
      }
      print say;

      I am sorry for my English and for the errors in my script.

      see you

        It would help if your code was properly formatted. Anyway, to use 'say' you need to

        use v5.10;

        For the "my $debug" declaration, you probably meant

        my $debug = $ARGV[1];

        'ids' needs to be declared as a hash, not a scalar

        my %ids;

        The while loop needs to read from the file handle that you opened instead of checking the truth value of the $filename variable, which causes an infinite loop whenever $filename is anything other than 0 or the empty string

        while(<IN>)

        'print say' at the end doesn't really do any good
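Putting the corrections above together, the script might look like the following sketch. It is only an illustration: the helper name rename_duplicates is hypothetical, and the lexical file handle and three-argument open are substitutions for the bareword IN used in the thread. The output format is the one from the earlier loop, which appends the raw occurrence count to each repeated ID line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use v5.10;    # enables say

# rename_duplicates() is a hypothetical helper name, not from the thread.
# It reads lines from an open file handle and appends the occurrence
# count to every ID= line that has already been seen.
sub rename_duplicates {
    my ($fh) = @_;
    my %ids;    # a hash keyed by the "ID=..." text, not a scalar
    my @out;
    while (my $line = <$fh>) {    # read from the handle, not the file name
        chomp $line;
        if ($line =~ /(ID=.+)$/ && ++$ids{$1} > 1) {
            push @out, $line . $ids{$1};
            next;
        }
        push @out, $line;
    }
    return @out;
}

# Only touch the file system when a file name was actually given.
if (@ARGV) {
    my $filename = $ARGV[0];
    die "\n\tERROR: Cannot find the file $filename\n\n" unless -e $filename;
    open(my $in, '<', $filename) or die "Cannot open $filename: $!";
    say for rename_duplicates($in);
    close $in;
}
```

Run it as perl script.pl data.txt; the renamed lines go to standard output, so they can be redirected to a new file.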

        Continuing with the issues and advice:

        Please surround your code and data listings with <c> (at the beginning) and </c> (at the end). It makes your nodes much easier to read, and thus, more likely to draw help.

        Your indentation style may have been different, but surely you saw that your code didn't look 'right' (for some value of 'right' meaning 'easy to read'):

        #!/usr/bin/perl;
        use strict;
        use warnings;
        my $filename = $ARGV[0];
        my $debug = $ARGV1;
        die "\n\tUSAGE: perl $0 output debug\n\n" unless $ARGV[0];
        die "\n\tERROR: Cannot find the file $ARGV[0]\n\n" unless -e $ARGV[0];
        open(IN,$filename);
        my $ids;
        while($filename){
        chomp;
        if(/(ID=.+)$/){
        if(++$ids{$1} > 1){
        say $_, $ids{$1};
        next;
        }
        }
        say;
        }
        print say;

        The say in Line 18 probably doesn't do what you intended; print say; in Line 20 is meaningless.

        say is "Just like print, but implicitly appends a newline", to quote the 5.014 doc. To use it, either include use 5.010 (or higher; 5.016 is current), as monsoon suggested, or include use feature qw(switch say); (feature is a relatively new pragma that enables new features which are not available unless specifically enabled).
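As a small illustration of the point about say, the sketch below writes one line with say and one with print into an in-memory buffer, so the two can be compared. The buffer and handle names are made up for the example.

```perl
use strict;
use warnings;
use feature 'say';    # alternatively: use v5.10;

# Write into an in-memory buffer so the result can be inspected.
my $buffer = '';
open(my $out, '>', \$buffer) or die "Cannot open in-memory handle: $!";

say   {$out} "ID=a";      # say appends the newline itself
print {$out} "ID=b\n";    # print needs the newline written out

close $out;
print $buffer;
```

Both lines end up terminated by a newline, which is the only difference between the two calls.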

        Other issues range from the unnecessary (but harmless) semi-colon at the end of the hashbang to, in order of severity, your incorrect attempt to invoke debug mode, as noted by monsoon.

Re: rename duplicate data
by Kenosis (Priest) on Jun 22, 2012 at 02:58 UTC

    Or a variation of the above solution:

    use Modern::Perl;

    my %hash;
    do {chomp; $_ = qq|$_"$hash{$1}"| if /(ID=.+)$/ and ++$hash{$1}; say} for <DATA>;

    __DATA__
    xxxx . yyy 521 916 . + . ID=OSCAR1028v1rpkm9.67
    xxxx . xxx 521 567 . + . Parent=OSCAR1028v1rpkm9.67
    xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67
    xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67
    xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67
    xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67

    Results:

    xxxx . yyy 521 916 . + . ID=OSCAR1028v1rpkm9.67"1"
    xxxx . xxx 521 567 . + . Parent=OSCAR1028v1rpkm9.67
    xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67"1"
    xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67
    xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67"2"
    xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67

      This is what I want, but without the double quotes surrounding the number.

      As I said in my last message, I have very little programming experience. I wrote this script to use your code, but I cannot get it to work correctly. Where is my mistake?

      My data are in a txt file, so I wrote this script to read them:

      #!/usr/bin/perl;
      use strict;
      use warnings;
      use Modern::Perl;
      my $filename = $ARGV[0];
      my $debug = $ARGV1;
      die "\n\tUSAGE: perl $0 exonerate output debug\n\n" unless $ARGV[0];
      die "\n\tERROR: Cannot find the file $ARGV[0]\n\n" unless -e $ARGV[0];
      open(IN,$filename);
      my $ids;
      my %hash;
      do {chomp; $_ = qq|$_"$hash{$1}"| if /(ID=.+)$/ and ++$hash{$1}; say} for $filename;

      but I get the error "Can't locate Modern::Perl .."

      I suppose that I don't have this module. Do you have a suggestion to resolve it? If I want to use the code by monsoon:

      while(<>){
          chomp;
          if(/(ID=.+)$/){
              if(++$ids{$1} > 1){
                  say $_, $ids{$1};
                  next;
              }
          }
          say;
      }

      is it correct to insert it in my script in this way?

      #!/usr/bin/perl;
      use strict;
      use warnings;
      my $filename = $ARGV[0];
      my $debug = $ARGV1;
      die "\n\tUSAGE: perl $0 exonerate output debug\n\n" unless $ARGV[0];
      die "\n\tERROR: Cannot find the file $ARGV[0]\n\n" unless -e $ARGV[0];
      open(IN,$filename);
      my $ids;
      while($filename){
      chomp;
      if(/(ID=.+)$/){
      if(++$ids{$1} > 1){
      say $_, $ids{$1};
      next;
      }
      }
      say;
      }
      print say;

        Hi, blacknight.

        In case the Modern::Perl error is still occurring, try the following to produce the results you wanted w/o "quotes":

        use strict;
        use warnings;

        my %hash;
        do {
            chomp;
            $_ = qq|$_.$hash{$1}| if /(ID=.+)$/ and ++$hash{$1};
            print "$_\n"
        } for <DATA>;
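Note that ++$hash{$1} is always true, so the snippet above also appends a counter to the first occurrence of each ID. If the first occurrence should stay untouched and only the later ones get .1, .2, and so on, as the original question describes, a variation along these lines may be closer. It is a sketch only: add_suffixes is a hypothetical helper name, and the sample lines are copied from the data earlier in the thread.

```perl
use strict;
use warnings;

# add_suffixes() is a hypothetical helper name, not from the thread.
# The first occurrence of each ID= entry is returned unchanged; the
# second occurrence gets ".1", the third ".2", and so on.
sub add_suffixes {
    my @lines = @_;    # copy, so the caller's data is not modified
    my %seen;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /(ID=.+)$/;
        my $n = $seen{$1}++;    # 0 for the first occurrence
        $line .= ".$n" if $n;
    }
    return @lines;
}

# Sample lines taken from the thread's data.
my @sample = (
    'xxxx . yyy 521 916 . + . ID=OSCAR1028v1rpkm9.67',
    'xxxx . xxx 521 567 . + . Parent=OSCAR1028v1rpkm9.67',
    'xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67',
    'xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67',
    'xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67',
    'xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67',
);

print "$_\n" for add_suffixes(@sample);
```

With this sample, only the second ID=OSCAR10281vrpkm9.67 line changes; it gains a trailing .1. To read from a file instead, pass the lines from an opened file handle to add_suffixes.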

Node Type: perlquestion [id://977715]
Approved by Corion