PerlMonks  

create hashes using regex

by bingalee (Acolyte)
on Jun 12, 2013 at 17:33 UTC (#1038525)
bingalee has asked for the wisdom of the Perl Monks concerning the following question:

So basically I have a file that looks like this:

    TCONS_00000047 XLOC_000039
    TCONS_00000718 XLOC_000456
    TCONS_00000938 XLOC_000610
    TCONS_00004086 XLOC_002872
    TCONS_00004252 XLOC_003003
    TCONS_00004975 XLOC_003624
    TCONS_00004976 XLOC_003624
    TCONS_00005492 XLOC_004020

There are more lines like that. How do I declare a hash for all the keys and values at once? This is what I did:

    open(IN, $file)||die "Can not open file: $file\n";
    %hash =( 'TCONS_00[0-9]+' => 'XLOC_[0-9]+');
    @keys= keys %hash;
    @values= values %hash;
    print @keys;
    print @values;
    close(IN);

The hash took the key as a literal string, not as a regex operation. What can I do?

Re: create hashes using regex
by hdb (Parson) on Jun 12, 2013 at 17:45 UTC

    • Well, for one, you do not link your file or file handle in any way to the declaration of the hash. How should Perl know that you want to apply those expressions to the contents of your file?
    • Also, regular expression matching is done using the m// operator or /regexp/. Nowhere do you tell Perl that you expect matching.
    • Thirdly, you need to use capture groups (...) in regular expressions to retrieve information.
    • And then you need to loop somehow over the lines of your file, or apply the regexes repeatedly in some way.
    If you put these points into action, it could look like the code below. I am reading from the DATA handle for convenience and read all lines into an unnamed list.

        use strict;
        use warnings;

        my %hash = map { /(TCONS_00[0-9]+)\s+(XLOC_[0-9]+)/ } <DATA>;
        my @keys= keys %hash;
        my @values= values %hash;
        print "@keys\n";
        print "@values\n";

        __DATA__
        TCONS_00000047 XLOC_000039
        TCONS_00000718 XLOC_000456
        TCONS_00000938 XLOC_000610
        TCONS_00004086 XLOC_002872
        TCONS_00004252 XLOC_003003
        TCONS_00004975 XLOC_003624
        TCONS_00004976 XLOC_003624
        TCONS_00005492 XLOC_004020
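
    For a file on disk rather than the DATA handle, the same idea can be written as an explicit loop. This is only a minimal sketch and not part of the reply above: the sub name read_pairs and the sample string are made up for illustration, and the demo reads from an in-memory filehandle so that it runs as is.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash from lines of the form "TCONS_... XLOC_...".
# Lines that do not match (blank lines, for instance) are skipped.
sub read_pairs {
    my ($fh) = @_;
    my %hash;
    while ( my $line = <$fh> ) {
        # In list context a successful match returns the captured groups;
        # a failed match returns an empty list, so the condition is false.
        if ( my ( $key, $value ) = $line =~ /(TCONS_[0-9]+)\s+(XLOC_[0-9]+)/ ) {
            $hash{$key} = $value;
        }
    }
    return %hash;
}

# Demo input held in a string. For a real file, something like
#   open( my $fh, '<', $file ) or die "Can not open file: $file: $!\n";
# would take the place of the in-memory open below.
my $sample = "TCONS_00000047 XLOC_000039\n\nTCONS_00000718 XLOC_000456\n";
open( my $fh, '<', \$sample ) or die "Can not open in-memory file: $!\n";
my %hash = read_pairs($fh);
close($fh);

print "$_ => $hash{$_}\n" for sort keys %hash;
```

    The loop keeps only one line in memory at a time, which may matter if the file is large; the map version is shorter but reads every line into a list first.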
      thank you so much..
Re: create hashes using regex
by Laurent_R (Vicar) on Jun 12, 2013 at 17:49 UTC

    A quick try under the Perl debugger:

      DB<1> $c = "TCONS_00000047 XLOC_000039";
      DB<2> %h = map { $1, $2 if /(TCONS_00[0-9]+) (XLOC_[0-9]+)/; } ($c);
      DB<3> x %h
    0  'TCONS_00000047'
    1  'XLOC_000039'

    EDIT: HDB typed faster than me. ;-)

Re: create hashes using regex
by frozenwithjoy (Curate) on Jun 12, 2013 at 17:54 UTC

    Here is a non-regex solution, too:

        #!/usr/bin/env perl
        use strict;
        use warnings;

        # The two fields on each data line must be separated by a single tab.
        my %hash = map { chomp; split /\t/; } <DATA>;

        use Data::Printer;
        p %hash;

        __DATA__
        TCONS_00000047	XLOC_000039
        TCONS_00000718	XLOC_000456
        TCONS_00000938	XLOC_000610
        TCONS_00004086	XLOC_002872
        TCONS_00004252	XLOC_003003
        TCONS_00004975	XLOC_003624
        TCONS_00004976	XLOC_003624
        TCONS_00005492	XLOC_004020

    Output:

        {
            TCONS_00000047   "XLOC_000039",
            TCONS_00000718   "XLOC_000456",
            TCONS_00000938   "XLOC_000610",
            TCONS_00004086   "XLOC_002872",
            TCONS_00004252   "XLOC_003003",
            TCONS_00004975   "XLOC_003624",
            TCONS_00004976   "XLOC_003624",
            TCONS_00005492   "XLOC_004020"
        }
Re: create hashes using regex
by Cristoforo (Deacon) on Jun 12, 2013 at 19:07 UTC
    Another way (slurping the entire file into 1 string).
        #!/usr/bin/perl
        use strict;
        use warnings;

        my $file = <<EOF;
        TCONS_00000047 XLOC_000039
        TCONS_00000718 XLOC_000456
        TCONS_00000938 XLOC_000610
        TCONS_00004086 XLOC_002872
        TCONS_00004252 XLOC_003003
        TCONS_00004975 XLOC_003624
        TCONS_00004976 XLOC_003624
        TCONS_00005492 XLOC_004020
        EOF

        open my $fh, "<", \$file;
        my $slurp;
        do {local $/; $slurp = <$fh>};

        my %hash = split ' ', $slurp;

        use Data::Dumper;
        print Dumper \%hash;

        $VAR1 = {
                  'TCONS_00004086' => 'XLOC_002872',
                  'TCONS_00004976' => 'XLOC_003624',
                  'TCONS_00004975' => 'XLOC_003624',
                  'TCONS_00000718' => 'XLOC_000456',
                  'TCONS_00000047' => 'XLOC_000039',
                  'TCONS_00005492' => 'XLOC_004020',
                  'TCONS_00000938' => 'XLOC_000610',
                  'TCONS_00004252' => 'XLOC_003003'
                };
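
    One thing to keep in mind with splitting the whole file at once: the result is a flat list, so a single line with a missing or extra field shifts every later key/value pairing. A per-line variant can catch that. The following is only a sketch under that assumption; the sub name pairs_from_lines and the sample lines are hypothetical and not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: split each line on whitespace and keep only lines
# with exactly two fields and a key not seen before.
# Returns a reference to the hash and a reference to the rejected lines.
sub pairs_from_lines {
    my (@lines) = @_;
    my ( %hash, @rejected );
    for my $line (@lines) {
        my @fields = split ' ', $line;    # ' ' also ignores leading/trailing whitespace
        next unless @fields;              # skip blank lines
        if ( @fields == 2 && !exists $hash{ $fields[0] } ) {
            $hash{ $fields[0] } = $fields[1];
        }
        else {
            push @rejected, $line;        # wrong field count or repeated key
        }
    }
    return ( \%hash, \@rejected );
}

my ( $hash, $rejected ) = pairs_from_lines(
    "TCONS_00004975 XLOC_003624\n",
    "TCONS_00004976\tXLOC_003624\n",
    "\n",
    "TCONS_00005492\n",                   # deliberately malformed: value missing
);

print "$_ => $hash->{$_}\n" for sort keys %$hash;
print "rejected lines: ", scalar(@$rejected), "\n";
```

    With a filehandle, the lines could be passed in as pairs_from_lines(<$fh>). Whether a repeated key should be rejected or should overwrite the earlier value is a design choice; plain hash assignment, as in the replies above, silently keeps the last value.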
