Re: How to make unique entries

by hv (Prior)
on Jun 02, 2023 at 00:12 UTC

in reply to How to make unique entries

Let's assume you have split out the 4 elements into the variables $id1, $id2, $sequence and $label. The next thing you need is a signature that represents a "unique" value, built by combining $id2 and $sequence. The simplest approach is to join them with some character known not to appear in either value. From the example above I will guess that the pipe character '|' is safe to use:

my $signature = join '|', $id2, $sequence;

Now you can use this signature as the key in a hash. For simplicity, I'll use this to store the entire structure:

my %hash;    # somewhere before you start to loop over the data

...

# within the loop over your data
my $signature = join '|', $id2, $sequence;
my $structure = {
    id1      => $id1,
    id2      => $id2,
    sequence => $sequence,
    label    => $label,
};
$hash{$signature} = $structure;    # save it

If a signature occurs more than once, each assignment overwrites the previous one, so you end up with the structure for the last record seen with that signature. Other strategies are possible.
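To illustrate two of those other strategies, here is a minimal sketch that keeps the first structure seen for each signature, and separately collects every structure under its signature. The sample records and the names %first_seen and %all_seen are made up for the example, and the `//=` operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical sample records: [id1, id2, sequence, label].
my @records = (
    [ 'a1', 'x', 'ACGT', 'first'  ],
    [ 'a2', 'x', 'ACGT', 'second' ],
    [ 'a3', 'y', 'TTTT', 'third'  ],
);

my (%first_seen, %all_seen);
for my $record (@records) {
    my ($id1, $id2, $sequence, $label) = @$record;
    my $signature = join '|', $id2, $sequence;
    my $structure = {
        id1      => $id1,
        id2      => $id2,
        sequence => $sequence,
        label    => $label,
    };

    # Strategy 1: keep only the first structure seen for each signature.
    $first_seen{$signature} //= $structure;

    # Strategy 2: keep every structure, grouped by signature.
    push @{ $all_seen{$signature} }, $structure;
}

printf "first label for x|ACGT: %s\n", $first_seen{'x|ACGT'}{label};
printf "records for x|ACGT: %d\n", scalar @{ $all_seen{'x|ACGT'} };
```

Which one fits depends on whether you want a single representative per signature or need to see all the duplicates.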

You can then emit the data with a loop over the hash, something like:

for my $signature (keys %hash) {
    my $structure = $hash{$signature};
    printf "%s|%s\n%s\n%s\n",
        $structure->{id1}, $structure->{id2},
        $structure->{sequence}, $structure->{label};
}
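One thing to be aware of is that keys returns the hash keys in no particular order, so the loop above can emit records in a different order from run to run. If a stable order matters, a possible variation is to sort the keys first. The sketch below uses made-up data in the same shape as %hash, and the @output array is only there for the example:

```perl
use strict;
use warnings;

# Hypothetical data in the same shape as %hash in the post.
my %hash = (
    'y|TTTT' => { id1 => 'a3', id2 => 'y', sequence => 'TTTT', label => 'third' },
    'x|ACGT' => { id1 => 'a2', id2 => 'x', sequence => 'ACGT', label => 'second' },
);

my @output;
for my $signature (sort keys %hash) {
    my $structure = $hash{$signature};

    # A hash slice pulls the four fields out in the order they are printed.
    push @output, sprintf "%s|%s\n%s\n%s\n",
        @{$structure}{qw(id1 id2 sequence label)};
}
print @output;
```

This sorts by signature string. Recovering the original input order would need something extra, such as storing a counter in each structure and sorting on that.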

Re^2: How to make unique entries
by Fletch (Bishop) on Jun 02, 2023 at 11:54 UTC

    One additional note: if you’re having a hard time finding a “safe” unused character, remember that Perl can handle nulls in strings just fine, so "\0" is an option for the join character.
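A small sketch of why this helps, using made-up values in which '|' appears in the data itself. The variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical values where '|' occurs inside one of the fields.
my ($id2, $sequence) = ('x|y', 'ACGT');

# Join on NUL, which is assumed never to occur in the data.
my $signature = join "\0", $id2, $sequence;

# The NUL separator splits back into the original two fields.
my ($got_id2, $got_sequence) = split /\0/, $signature;
print "id2=$got_id2 sequence=$got_sequence\n";

# With '|' as the separator, two different pairs produce the same signature.
my $pipe_a = join '|', 'x|y', 'ACGT';
my $pipe_b = join '|', 'x',   'y|ACGT';
print "pipe signatures collide\n" if $pipe_a eq $pipe_b;
```

The same caveat applies as with any separator: it is only safe as long as the chosen character really cannot appear in the values.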

    The cake is a lie.