http://www.perlmonks.org?node_id=940316

rockstar99 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Gurus, could someone help me with the logic for reading a particular paragraph from the middle of a file line by line and storing each string in a line as an element of an array? Thanks in advance.


Replies are listed 'Best First'.
Re: how to read a particular paragraph from middle of a file line by line and enter each string in the line as an element of an array
by CountZero (Bishop) on Nov 28, 2011 at 07:06 UTC
    Assuming the end of a paragraph is indicated by an empty line (in other words, "\n\n"), you can set the special variable $/ to "\n\n"; each read from <filehandle> will then return a whole paragraph. Stop reading when you have reached the target paragraph.

    Next you split that paragraph on "\n" to get each line, and then you split each line on \s (whitespace) to get each word.

    use Modern::Perl;
    use Data::Dump qw/dump/;

    my $paragraph;
    {
        local $/ = "\n\n";
        $paragraph = <DATA> for 1 .. 3;    # we need the third paragraph
    }
    my @words;
    push @words, map {[split /\s/]} split /\n/, $paragraph;
    say dump(@words);

    __DATA__
    This is the start of the first paragraph.
    This is the second line of this first paragraph.
    And this is the last line of it.

    Here starts the second paragraph.
    This is the second line of this second paragraph.
    And this is the last line of it.

    Here starts the third paragraph.
    This is the second line of this third paragraph.
    And this is the last line of it.

    Here starts the fourth paragraph.
    This is the second line of this fourth paragraph.
    And this is the last line of it.

    Output:
    (
      ["Here", "starts", "the", "third", "paragraph."],
      [
        "This",
        "is",
        "the",
        "second",
        "line",
        "of",
        "this",
        "third",
        "paragraph.",
      ],
      ["And", "this", "is", "the", "last", "line", "of", "it."],
    )

    Update: added an example program.
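
    A variation on the program above, offered only as a minimal sketch: Perl's paragraph mode ($/ set to the empty string) treats any run of blank lines as a single separator, which may help when paragraphs are separated by more than one empty line. The sub name nth_paragraph_words and the in-memory sample text are assumptions made for illustration; they are not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: return the words of paragraph number $n (1-based),
# one array reference per line, or an empty list if there is no such paragraph.
sub nth_paragraph_words {
    my ($fh, $n) = @_;
    local $/ = "";    # paragraph mode: one or more blank lines end a record
    while (my $paragraph = <$fh>) {
        next unless $. == $n;    # $. counts records, which are paragraphs here
        return map { [ split ' ', $_ ] } split /\n/, $paragraph;
    }
    return;
}

# Sample text held in a string so the sketch runs without an external file.
my $text = "one two\nthree\n\n\nfour five\nsix seven eight\n\nnine\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @lines = nth_paragraph_words($fh, 2);
close $fh;

# Prints: four,five | six,seven,eight
print join(' | ', map { join ',', @$_ } @lines), "\n";
```

    Reading stops as soon as the wanted paragraph has been returned, so the rest of the file is never read.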

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: how to read a particular paragraph from middle of a file line by line and enter each string in the line as an element of an array
by Anonymous Monk on Nov 28, 2011 at 07:59 UTC

    Pretend a file is a book that you're holding in your hand

    You would use the same logic as reading a book

    The first step is writing that logic down in plain English (this is what the program does, this is how it does it).

    No, scratch that: the first step is explaining out loud what the problem is (read book to blah blah) and how you're going to solve it (open book, turn page, read line, blah blah).

    Before you write any code (or perl-like pseudocode) you should at least read perlintro (and try out all the examples).

    See an annotated example of pseudocode

    Based on your post Read block of file and print to html table, I highly recommend reading a few chapters from Beginning Perl

    See also How do I post a question effectively?

Re: how to read a particular paragraph from middle of a file line by line and enter each string in the line as an element of an array
by ww (Archbishop) on Nov 28, 2011 at 12:25 UTC
Re: how to read a particular paragraph from middle of a file line by line and enter each string in the line as an element of an array
by ansh batra (Friar) on Nov 28, 2011 at 06:31 UTC

    Open the file, read it with @lines = <FILE>, and process @lines in a foreach loop.
    Track the start and end of the paragraph using a regex.
    Process a line only if it is part of the paragraph.
    Split each line, treating spaces as separators.
    Push the split line into the main array.
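
    The steps above can be sketched roughly as follows. The sample lines and the two patterns ($start_re for the first line of the wanted paragraph, $end_re for the blank line that ends it) are assumptions for illustration; a real script would fill @lines from a file handle instead.

```perl
use strict;
use warnings;

# Hypothetical sample input; in practice @lines would come from something like
#   open my $fh, '<', $path or die "Cannot open $path: $!";  my @lines = <$fh>;
my @lines = map { "$_\n" } (
    'First paragraph, line one.',
    'First paragraph, line two.',
    '',
    'Second paragraph starts here.',
    'It has a second line.',
    '',
    'Third paragraph.',
);

my $start_re = qr/^Second paragraph/;    # assumed marker for the wanted paragraph
my $end_re   = qr/^\s*$/;                # assumed end marker: the next blank line

my @words;
my $inside = 0;
foreach my $line (@lines) {
    $inside = 1 if !$inside && $line =~ $start_re;    # the paragraph begins
    next unless $inside;
    last if $line =~ $end_re;                         # the paragraph is over
    push @words, split ' ', $line;                    # whitespace-separated strings
}

print "$_\n" for @words;
```

    Slurping the whole file into @lines is simple but keeps every line in memory; for large files, reading line by line with while (<$fh>) uses the same flag logic.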

Re: how to read a particular paragraph from middle of a file line by line and enter each string in the line as an element of an array
by JediMasterT (Sexton) on Nov 28, 2011 at 15:59 UTC

    Interesting question. It really depends on what starts or ends a paragraph. For example, if one were using MLA style (because I'm in college and am forced to think in MLA style), you could split the text on "\n\t" into an array of paragraphs, then split each paragraph on a period followed by a space (I forgot the exact pattern) into a two-dimensional array. Then you could look at each sentence arbitrarily.
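
    One possible reading of that idea, as a rough sketch: it assumes every paragraph begins with a tab and every sentence ends with a period followed by whitespace. The sample text and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text: each paragraph is introduced by a tab character.
my $text = "\tFirst one. First two.\n\tSecond one. Second two. Second three.\n";

$text =~ s/^\t//;       # drop the indent of the very first paragraph
$text =~ s/\s+\z//;     # drop trailing whitespace, including the final newline

# Split into paragraphs on newline + tab, then split each paragraph into
# sentences on the whitespace that follows a period (the period is kept).
my @paragraphs = split /\n\t/, $text;
my @sentences  = map { [ split /(?<=\.)\s+/, $_ ] } @paragraphs;

# $sentences[$p][$s] is sentence $s of paragraph $p (both 0-based).
print $sentences[1][2], "\n";    # prints: Second three.
```

    A period also appears in abbreviations and numbers, so this sentence split is only an approximation.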

    Hope that helped!

Re: how to read a particular paragraph from middle of a file line by line and enter each string in the line as an element of an array
by TJPride (Pilgrim) on Nov 28, 2011 at 10:52 UTC
    Why people are upvoting vinian's obviously broken and insufficient solution, I don't know.

    use strict;
    use warnings;

    my ($handle, @arr);
    open($handle, '<', 'test.txt') or die "Cannot open test.txt: $!";

    BREAKPOINT:
    while (<$handle>) {
        if (m/Here starts the second paragraph/) {
            chomp;
            push @arr, $_ for split /\s+/, $_;
            while (<$handle>) {
                chomp;
                last BREAKPOINT if !$_;    # an empty line ends the paragraph
                push @arr, $_ for split /\s+/, $_;
            }
        }
    }

    print join "\n", @arr;

    Where test.txt contains:

    This is the start of the first paragraph.
    This is the second line of this first paragraph.
    And this is the last line of it.

    Here starts the second paragraph.
    This is the second line of this second paragraph.
    And this is the last line of it.

    Here starts the third paragraph.
    This is the second line of this third paragraph.
    And this is the last line of it.

    Here starts the fourth paragraph.
    This is the second line of this fourth paragraph.
    And this is the last line of it.

    And output is:

    Here
    starts
    the
    second
    paragraph.
    This
    is
    the
    second
    line
    of
    this
    second
    paragraph.
    And
    this
    is
    the
    last
    line
    of
    it.

Re: how to read a particular paragraph from middle of a file line by line and enter each string in the line as an element of an array
by vinian (Beadle) on Nov 28, 2011 at 08:23 UTC

    open my $fh, '<', $file or die "failed open file: $!";
    my $arr = [];
    my $beginParagraph = 'begin';    # mark start the particular paragraph
    my $endParagraph   = 'end';      # mark end the particular paragraph
    while ( <$fh> ) {
        if ( /$beginParagraph/ .. /$endParagraph/ ) {
            push @{ $arr->[$.] }, split /\./;    # i assume string end '.'
        }
    }

    update

    use strict;
    use warnings;
    use Data::Dumper;

    my $arr;
    my $beginParagraph = 'Here starts the second paragraph';     # mark start the particular paragraph
    my $endParagraph   = 'And this is the second line of it';    # mark end the particular paragraph
    while ( <DATA> ) {
        if ( /$beginParagraph/ .. /$endParagraph/ ) {
            chomp;
            push @$arr, split /\s+/;
        }
    }
    print Dumper($arr);

    __DATA__
    This is the start of the first paragraph.
    This is the second line of this first paragraph.
    And this is the last line of it.

    Here starts the second paragraph.
    And this is the second string.
    This is the second line of this second paragraph.
    And this is the second line of it.

    Here starts the third paragraph.
    This is the second line of this

    Output:
    $VAR1 = [
              'Here',
              'starts',
              'the',
              'second',
              'paragraph.',
              'And',
              'this',
              'is',
              'the',
              'second',
              'string.',
              'This',
              'is',
              'the',
              'second',
              'line',
              'of',
              'this',
              'second',
              'paragraph.',
              'And',
              'this',
              'is',
              'the',
              'second',
              'line',
              'of',
              'it.'
            ];