PerlMonks  

Re: Search and Copy

by kennethk (Monsignor)
on Nov 23, 2012 at 15:35 UTC (#1005281)


in reply to Search and Copy

What have you tried? What didn't work? Also, please wrap sample input in <code> tags so that formatting is preserved. See How do I post a question effectively?

Your spec can be met with a regular expression, perhaps something like /(web site.{250})/i. You can then open your output file and print the contents of the capture buffer ($1).
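As a rough sketch of that idea, the snippet below runs the suggested pattern against a made-up string and prints the capture. The sample text and the variable names are assumptions for illustration only; the real data would come from the input file, and the print would go to the output filehandle rather than STDOUT.

```perl
use strict;
use warnings;

# Hypothetical sample text; the real input would come from the poster's file.
my $text = 'Intro. Visit our Web Site' . ('x' x 300) . ' trailing text';

my $captured;

# /i makes the key phrase case-insensitive; .{250} demands exactly 250
# further characters (none of them a newline, since /s is not used).
if ($text =~ /(web site.{250})/i) {
    $captured = $1;    # only rely on $1 after a successful match
    print "$captured\n";
}
```

Testing the match in the `if` matters: after a failed match, $1 still holds whatever an earlier successful match left there.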


#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.


Re^2: Search and Copy
by Athanasius (Prior) on Nov 23, 2012 at 15:53 UTC

    Assuming the specification is, “print the key phrase together with the following text up to 250 characters,” the regex would be better as /(web site.{0,250})/, which also matches when the key phrase is followed by fewer than 250 characters of text before the end of the file. As this match is greedy, it will match the largest number of characters up to 250.
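The difference between the two quantifiers can be seen in a small sketch. The input string here is invented for the demonstration: the key phrase is followed by only 20 characters, so the fixed-width pattern has nothing to match.

```perl
use strict;
use warnings;

# Hypothetical input: the key phrase is followed by only 20 characters.
my $short = 'See our web site' . ('y' x 20);

# In list context a failed match returns an empty list, leaving $fixed undef.
my ($fixed)   = $short =~ /(web site.{250})/;      # needs exactly 250 more
my ($bounded) = $short =~ /(web site.{0,250})/;    # takes up to 250, greedily

print defined $fixed ? "fixed: matched\n" : "fixed: no match\n";
print "bounded: ", length($bounded), " characters\n";
```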

    Athanasius <°(((><contra mundum

      This is what I have so far...

      open (OUTPUT, ">Results.txt") || die ("Could not open file results.txt; $OS_ERROR");
      open( INFILE, "Textfile.txt" )or die("Can not open input file: $!");
      while (<INFILE>) {
          if ($ARG =~ /Something/ ) {
              print OUTPUT $ARG ;
          }
      }
      close (OUTPUT);

        Again, please use <code> tags and read How do I post a question effectively?

        I have found strict to be very helpful in identifying and avoiding bugs in my code; see Use strict warnings and diagnostics or die. If I were going to write your posted code, it might look more like:

        use strict;
        use warnings;

        open (my $out, ">", "Results.txt") or die ("Could not open file Results.txt; $!");
        open (my $in, "<", "Textfile.txt") or die ("Can not open input file: $!");

        local $/;    # slurp mode: read the whole file in one go
        while (<$in>) {
            if (/(web site.{250})/is) {    # s lets . match newlines too
                print $out $1;
            }
        }

        Changes that I made include:

        1. I swapped to lexical file handles and three-argument open, which are considered better practice for a number of reasons. See perlopentut. In particular, this gives strict more power to help, and a lexical handle is closed automatically when it goes out of scope, which removes the need for an explicit close.
        2. I corrected the inconsistency between your file name and error message; file names are generally case sensitive.
        3. I swapped to slurp mode by localizing $/ (local $/;). Given the large number of characters you are interested in, it is unlikely they will all fall on the same line.
        4. Your while (<INFILE>) loop reads each line into $_, not $ARG ($ARG is only an alias for $_ when you use English), so I corrected that.
        5. I swapped your regular expression for the one I posted above, with the addition of the s modifier. This makes . match newlines as well, and is essential when working in slurp mode.
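Points 3 and 5 can be illustrated together with a short sketch. The two-line input and the in-memory filehandle are assumptions made so the example does not need a file on disk, and it uses the .{0,250} form from Athanasius's reply so that the short sample text can match.

```perl
use strict;
use warnings;

# Hypothetical two-line input, read through an in-memory filehandle.
my $data = "Our web site\nspans two lines here.\n";
open(my $in, '<', \$data) or die "Cannot open in-memory handle: $!";

# With $/ undefined, a single read returns the whole file as one string.
my $content = do { local $/; <$in> };

my ($without_s) = $content =~ /(web site.{0,250})/i;     # . stops at "\n"
my ($with_s)    = $content =~ /(web site.{0,250})/is;    # . crosses "\n"

print "without /s: [$without_s]\n";
print "with /s:    [$with_s]\n";
```

Without the s modifier the capture ends at the first newline, which is why the modifier matters once the whole file is held in one string.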

        You may consider going to http://learn.perl.org to gather some learning resources before trying to run too far.


        #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

        open (OUTPUT, ">Results.txt")
        open( INFILE, "Textfile.TXT" )
        while (<INFILE>) {
            if ($ARG =~ /(Something.{0,250})/ ) {
                print OUTPUT $ARG ;
            }
        }
        close (OUTPUT);

        The "regexp" suggested is not working.

        Thoughts?
