PerlMonks  

Re: extract line

by kcott (Abbot)
on Jul 21, 2013 at 06:45 UTC (#1045490)


in reply to extract line

G'day lallison,

Welcome to the monastery.

This code does what you describe:

    $ perl -Mstrict -Mwarnings -e '
        use autodie;
        use Tie::File;

        my $re = qr{^((\d+).+$)}s;
        my %data_for_part;

        open my $f1, "<", "pm_1045452_file1.txt";
        while (<$f1>) {
            /$re/;
            $data_for_part{$2} = $1;
        }
        close $f1;

        tie my @file2, "Tie::File", "pm_1045452_file2.txt";
        print $data_for_part{$_} for @file2;
        untie @file2;
    '
    3478749:AA:1D:AAA:DescriptionsY:C:2
    3633731:AA:3E:AAA:DescriptionsZ:C:2

I made a minor change to "File1" to show different Descriptions:

    $ cat pm_1045452_file1.txt
    3478748:AA:1D:AAA:DescriptionsX:C:2
    3478749:AA:1D:AAA:DescriptionsY:C:2
    3633731:AA:3E:AAA:DescriptionsZ:C:2

"File2" data is as you show it:

    $ cat pm_1045452_file2.txt
    3478749
    3633731

Notes:

  • You don't need to chomp any input nor add any newlines to the output.
  • There are no temporary arrays to process.
  • Tie::File comes standard with Perl: you won't need to install it.
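
To see why no chomp is needed, it may help to look at what the regex actually captures. This is only a minimal sketch using one sample line from "File1"; the variable names $line, $whole and $part are made up for illustration and are not in the one-liner above.

```perl
use strict;
use warnings;

# Same pattern as in the one-liner above.
my $re = qr{^((\d+).+$)}s;

# One record as read from File1, trailing newline included.
my $line = "3478749:AA:1D:AAA:DescriptionsY:C:2\n";

my ($whole, $part);
if ($line =~ $re) {
    # $1 is the outer group: the entire line. Because of the /s
    # modifier, "." also matches the newline, so the newline is kept.
    # $2 is the inner group: the leading run of digits (the part number).
    ($whole, $part) = ($1, $2);
}

print "part:  $part\n";
print "whole: $whole";    # no "\n" added: $whole already ends with one
```

Tie::File strips the record separator from each record by default, so the elements of @file2 are bare part numbers and match the $2 keys directly.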
"This file has over a million lines and is running very slow."

Given that you've been provided with a number of solutions, use Benchmark to determine which works best for you. (That module also comes standard with Perl.)
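
As a rough sketch of how such a comparison might be set up: the data below is generated in memory rather than read from your files, and by_hash and by_scan are hypothetical stand-ins for whichever solutions you want to compare. Only cmpthese is the real Benchmark function.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Small in-memory stand-ins for File1 and File2 (made-up data).
my @file1  = map { "$_:AA:1D:AAA:Description$_:C:2\n" } 1000001 .. 1001000;
my @wanted = map { 1000001 + 10 * $_ } 0 .. 99;

# Approach 1: build a hash keyed on part number, then look each one up.
sub by_hash {
    my %data_for_part;
    for (@file1) {
        $data_for_part{$2} = $1 if /^((\d+).+$)/s;
    }
    return [ map { $data_for_part{$_} } @wanted ];
}

# Approach 2: rescan all of File1 for every wanted part number.
sub by_scan {
    my @found;
    for my $part (@wanted) {
        push @found, grep { /^\Q$part\E:/ } @file1;
    }
    return \@found;
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, { hash => \&by_hash, scan => \&by_scan });
```

The numbers will depend on your machine and, more importantly, on your real data sizes, so treat a run like this as a starting point and repeat it with representative input.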

[Aside: The code you posted is difficult to read due to the <code> tag issue. You appear to have made an effort but were unsuccessful: see Writeup Formatting Tips for how, where and why to do it.]

-- Ken


Re^2: extract line
by lallison (Novice) on Jul 21, 2013 at 17:28 UTC
    Are you running the file with the cat pm_1045452_file2.txt statement?

      If you're unfamiliar with *nix OSes, perhaps what I posted requires a little further explanation:

      • The actual code I ran is the "perl -Mstrict -Mwarnings -e ' ... '" part (see perlrun).
      • The two lines immediately following that second single quote are the output produced by the print statement.
      • cat is a commonly used *nix command (unrelated to Perl) that prints the contents of file(s). You can read "$ cat pm_1045452_file1.txt" as "Here are the contents of the file pm_1045452_file1.txt:". This is entirely unrelated to the Perl code; it merely shows the data the Perl code is using (which, as stated, I had slightly modified).

      [In case you didn't know, "*nix" is just an umbrella term for any UNIX-like OS.]

      -- Ken

Re^2: extract line
by lallison (Novice) on Jul 21, 2013 at 17:46 UTC
    What should $2 refer to?
