
Re: extract line

by kcott (Chancellor)
on Jul 21, 2013 at 06:45 UTC

in reply to extract line

G'day lallison,

Welcome to the monastery.

This code does what you describe:

$ perl -Mstrict -Mwarnings -e '
    use autodie;
    use Tie::File;

    my $re = qr{^((\d+).+$)}s;
    my %data_for_part;

    open my $f1, "<", "pm_1045452_file1.txt";
    while (<$f1>) {
        /$re/;
        $data_for_part{$2} = $1;
    }
    close $f1;

    tie my @file2, "Tie::File", "pm_1045452_file2.txt";
    print $data_for_part{$_} for @file2;
    untie @file2;
'
3478749:AA:1D:AAA:DescriptionsY:C:2
3633731:AA:3E:AAA:DescriptionsZ:C:2

I made a minor change to "File1" to show different Descriptions:

$ cat pm_1045452_file1.txt
3478748:AA:1D:AAA:DescriptionsX:C:2
3478749:AA:1D:AAA:DescriptionsY:C:2
3633731:AA:3E:AAA:DescriptionsZ:C:2

"File2" data is as you show it:

$ cat pm_1045452_file2.txt
3478749
3633731


  • You don't need to chomp any input nor add any newlines to the output.
  • There are no temporary arrays to process.
  • Tie::File comes standard with Perl: you won't need to install it.
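
If the one-liner is easier to follow as a script, here is a minimal sketch of the same idea. It works on in-memory sample data instead of the two files, and the helper names index_by_part and lookup_lines are made up for illustration. It also shows what the two captures hold: $1 is the whole line (trailing newline included, because the /s modifier lets "." match "\n"), and $2 is the leading run of digits used as the hash key. One deliberate difference from the one-liner: lines that fail to match are skipped rather than stored.

```perl
use strict;
use warnings;

# Same pattern as in the one-liner: $1 is the whole line, $2 the leading digits.
# With /s, "." also matches "\n", so $1 keeps the line's trailing newline.
my $re = qr{^((\d+).+$)}s;

# Build a hash of part number => full line (hypothetical helper name).
sub index_by_part {
    my @lines = @_;
    my %data_for_part;
    for my $line (@lines) {
        # Skip lines that do not match, so stale $1/$2 values
        # from an earlier successful match are never reused.
        next unless $line =~ $re;
        $data_for_part{$2} = $1;
    }
    return \%data_for_part;
}

# Return the stored lines for the wanted part numbers, in the order requested.
# Part numbers with no entry are silently skipped (hypothetical helper name).
sub lookup_lines {
    my ($data_for_part, @wanted) = @_;
    return map { $data_for_part->{$_} }
           grep { exists $data_for_part->{$_} } @wanted;
}

# Sample data copied from the two files shown above.
my @file1 = (
    "3478748:AA:1D:AAA:DescriptionsX:C:2\n",
    "3478749:AA:1D:AAA:DescriptionsY:C:2\n",
    "3633731:AA:3E:AAA:DescriptionsZ:C:2\n",
);
my @file2 = qw(3478749 3633731);    # Tie::File returns records without newlines

# The stored lines still carry their newlines, so no "\n" is added here.
print lookup_lines(index_by_part(@file1), @file2);
```

Because the hash is built once and each lookup is a single key access, the work grows with the size of the two inputs added together rather than multiplied.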
"This file has over a million lines and is running very slow."

Given that you've been provided with a number of solutions, use Benchmark to determine which works best for you. (That module also comes standard with Perl.)
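
As a rough sketch of what such a comparison might look like, the script below uses cmpthese from Benchmark to time two candidates against made-up data: a hash lookup like the one above, and a naive rescan of every line for each wanted part number. The data sizes and the sub names via_hash and via_scan are assumptions for illustration only; substitute the actual solutions from this thread and a realistic sample of your own files.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample data: 1,000 "File1" lines and 100 part numbers to look up.
my @file1 = map { sprintf("%d:AA:1D:AAA:Description%d:C:2\n", 3_000_000 + $_, $_) } 1 .. 1_000;
my @wanted = map { 3_000_000 + $_ * 10 } 1 .. 100;

# Candidate 1: index File1 in a hash once, then do direct lookups.
sub via_hash {
    my %data_for_part;
    for my $line (@file1) {
        $data_for_part{$1} = $line if $line =~ /^(\d+)/;
    }
    return [ map { $data_for_part{$_} } grep { exists $data_for_part{$_} } @wanted ];
}

# Candidate 2: rescan all of File1 for every wanted part number.
sub via_scan {
    my @found;
    for my $part (@wanted) {
        push @found, grep { /^\Q$part\E:/ } @file1;
    }
    return \@found;
}

# A negative count means "run each candidate for at least that many CPU seconds".
# cmpthese then prints a table of rates and relative speeds.
cmpthese(-1, { hash => \&via_hash, scan => \&via_scan });
```

The numbers will vary from machine to machine; what matters is the relative difference on data shaped like yours.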

[Aside: The code you posted is difficult to read due to the <code> tag issue. You appear to have made an effort but were unsuccessful: see Writeup Formatting Tips for how, where and why to do it.]

-- Ken

Re^2: extract line
by lallison (Novice) on Jul 21, 2013 at 17:28 UTC
    Are you running the file with the "cat pm_1045452_file2.txt" statement?

      If you're unfamiliar with *nix OSes, perhaps what I posted requires a little further explanation:

      • The actual code I ran is the "perl -Mstrict -Mwarnings -e ' ... '" part (see perlrun).
      • The two lines immediately following that second single quote are the output produced by the print statement.
      • cat is a commonly used *nix command (unrelated to Perl) that prints the contents of file(s). You can read "$ cat pm_1045452_file1.txt" as "Here are the contents of the file pm_1045452_file1.txt:". It plays no part in the Perl code; it merely shows the data the Perl code is using (which, as stated, I had slightly modified).

      [In case you didn't know, "*nix" is just an umbrella term for any UNIX-like OS.]

      -- Ken

Re^2: extract line
by lallison (Novice) on Jul 21, 2013 at 17:46 UTC
    What should $2 refer to?
