Read Between the Lines

SuzuBell has asked for the wisdom of the Perl Monks concerning the following question:

I have many text files of the same format that I must read. The lines of interest in the text files are between two lines that are consecutive asterisks (************************). What would be an efficient way for me to only consider the lines between these two lines of asterisks?

Re: Read Between the Lines by choroba (Cardinal) on Jul 05, 2013 at 00:21 UTC
You can use the flip-flop operator (see Range Operators): `while (<>) { if (/^\*+$/ ... /^\*+$/) { print "Inside: $_"; } }`
Re^2: Read Between the Lines by rjt (Curate) on Jul 05, 2013 at 00:53 UTC
I like it. I think the OP didn't want the asterisks themselves printed, so a small addition would clear that up: `while (<>) { print if /^\*+$/ ... /^\*+$/ and not /^\*+$/; }`
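A self-contained sketch of the flip-flop idea above, for trying it without a file on disk. The helper name `lines_between` and the sample data are made up for illustration. It assumes each delimiter line consists only of asterisks and that delimiters come in pairs.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that sit between asterisk-only lines.
sub lines_between {
    my @lines = @_;
    my @inside;
    for (@lines) {
        # The three-dot flip-flop switches on at an opening delimiter and
        # does not test the closing pattern against that same line.
        next unless /^\*+$/ ... /^\*+$/;
        next if /^\*+$/;    # drop the delimiter lines themselves
        push @inside, $_;
    }
    return @inside;
}

my @sample = ("header\n", "****\n", "alpha\n", "beta\n", "****\n", "footer\n");
print lines_between(@sample);    # prints the "alpha" and "beta" lines
```

The three-dot form matters here because the opening and closing patterns are identical. With the two-dot form, the opening line would also be tested as the closing line, so the range would end immediately.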
Re: Read Between the Lines by parv (Parson) on Jul 05, 2013 at 00:32 UTC
Try a custom input record separator, which works much like paragraph mode as long as the number of asterisks does not change (see the entry for the `$/` variable in perlvar). For example, set `$/ = "***\n";` and fetch the file chunks in a `while` loop.
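A minimal sketch of the record-separator approach, assuming the delimiter line is always exactly the same string. The helper name `sections_between` and the sample text are hypothetical. It reads from an in-memory filehandle so that it runs as-is.

```perl
use strict;
use warnings;

# Hypothetical helper: read $text in records that end at $delim and return
# the records that lie between a pair of delimiters.
sub sections_between {
    my ($text, $delim) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = $delim;    # each read now stops after the next delimiter
    my @sections;
    my $i = 0;
    while (my $record = <$fh>) {
        # chomp removes a trailing $/ and returns how many characters it removed,
        # so a zero means the record was not closed by a delimiter.
        my $closed = chomp $record;
        push @sections, $record if $i++ % 2 && $closed;
    }
    close $fh;
    return @sections;
}

my $text = "header\n****\nalpha\nbeta\n****\nfooter\n";
print sections_between($text, "****\n");
```

Because `$/` is matched as a plain string, the delimiter passed in must have the same number of asterisks as the lines in the file. A shorter string would leave stray asterisks at the end of each record.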
Re: Read Between the Lines by rjt (Curate) on Jul 05, 2013 at 01:07 UTC
If your file is small enough (i.e., not many megabytes in size), slurping the whole thing and using a regexp will be efficient enough, and will allow you to capture multiple groups, if necessary, without matching unbalanced asterisks: `$_ = do { local $/; <> }; print "Match: $_" for /^\*+$ (.+?) ^\*+$/smxg;`
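The slurp-and-match idea can be sketched as a small function. This is a variant rather than the code from the post: the pattern also consumes the newline after the opening delimiter so that captured blocks do not start with a blank line. The name `blocks_between` and the sample text are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: return every block of text enclosed by a pair of
# asterisk-only lines in an already-slurped string.
sub blocks_between {
    my ($text) = @_;
    # /m lets ^ and $ match at line boundaries, /s lets . match newlines,
    # /x ignores the spaces in the pattern, /g collects every pair.
    my @blocks = $text =~ /^\*+\n (.*?) ^\*+$/smxg;
    return @blocks;
}

my $text = "intro\n***\nalpha\nbeta\n***\nskipped\n***\ngamma\n***\n";
print "Match: $_" for blocks_between($text);
```

Each match consumes both its opening and closing delimiter, so the text between the second and third delimiter lines (`skipped` in the sample) is not reported as a block.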
Re: Read Between the Lines by sundialsvc4 (Abbot) on Jul 05, 2013 at 01:54 UTC
A more generalized way to handle such requirements is with a Finite-State Machine (FSM) approach. The algorithms consider, not only the current line of input, but the `$state` that the FSM is “in” at the time, where the current value of `$state` is determined by recent history of lines seen. It is probably overkill for a requirement as trivial as this one seems to be.
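As a rough illustration of what such a state machine might look like for this problem (not code from the thread), here is a two-state sketch. The state names `OUTSIDE` and `INSIDE` and the helper name `fsm_extract` are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: a two-state machine that collects the lines seen
# while the machine is in the INSIDE state.
sub fsm_extract {
    my @lines = @_;
    my $state = 'OUTSIDE';    # current state, driven by the lines seen so far
    my @inside;
    for my $line (@lines) {
        my $is_delim = $line =~ /^\*+$/;
        if ($state eq 'OUTSIDE') {
            # An asterisk-only line opens a section; anything else is ignored.
            $state = 'INSIDE' if $is_delim;
        }
        else {
            # In INSIDE: a delimiter closes the section, other lines are kept.
            if ($is_delim) { $state = 'OUTSIDE' }
            else           { push @inside, $line }
        }
    }
    return @inside;
}

print fsm_extract("header\n", "****\n", "alpha\n", "beta\n", "****\n", "footer\n");
```

For two states this does the same job as the flip-flop operator. The explicit `$state` variable mainly pays off when more states are needed, for example a header section followed by a body section.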
Re^2: Read Between the Lines by eyepopslikeamosquito (Archbishop) on Jul 05, 2013 at 12:18 UTC
> A more generalized way to handle such requirements is with a Finite-State Machine (FSM) approach. The algorithms consider, not only the current line of input, but the `$state` that the FSM is “in” at the time, where the current value of `$state` is determined by recent history of lines seen.

Sounds cool. I found a lot of interesting hits using Super Search:

- Re: Comparing two hashes-help: "I often find it useful to describe logic like this in terms of a finite-state machine (FSM)"
- Re: Spliting file + removing column: "For this, I use “finite-state machine (FSM)” logic"
- Re: ADSI groups users: "I naturally look at such problems with an eye toward so-called finite state machine logic"
- Re: IO::Socket client does not detect when server network connection dies: "Logic like this is sometimes well designed using Finite State-Machine (FSM) logic"
- Wisdom on how to build a "stressful simulation test" with Selenium & POE: "Each actor is basically an individual finite-state machine"
- Re: A way to avoid repeated conditional loops: "Call it “a flag variable” if you want to, but this is a classic place for a finite-state machine (FSM) algorithm"
- Re: Perl/Tk code structure: "A typical design for the shepherd process is a Finite-State Machine (FSM), or more likely, two FSMs"
- Re: how did blocking IO become such a problem?: "The entire life cycle of a request, and much of the outer request-handling heuristics, is most easily described using a finite-state machine (FSM) algorithm"
- Re: Clubbing array elements together:: "I prefer to solve such problems using a Finite-State Machine (FSM) algorithm"
- Re: File Find/Replace with the replacement coming from part of earlier matched string: "This is an absolutely classic case for a “finite-state machine (FSM)” algorithm"
- Re: How to check if successfully logged in?: "it must be a finite-state machine (FSM) design, because in the final analysis the host web-site is driving the bus ... Fact of the matter is, a production mechanize-script is often two FSMs"
- Re: Reading concurrently two files with different number of lines: "It might be useful for you to look at the concept of Finite-State Machine (FSM) algorithms as a source of ideas for generalized solutions to these problems"
- Re: RFC: Simulating Ruby's "yield" and "blocks" in Perl: "these can be used to implement finite-state machines (FSMs)"
- Re: Selecting HL7 Transactions: "this sort of thing is most-easily handled by finite-state machine (FSM) techniques"
- Re^2: Too much recursion: "For dealing with very complicated inputs, the notion of a Finite-State Machine (FSM) can be useful"
- Re: Sorting through a file with multiple tables and extracting data: "The general approach is that of a finite-state machine (FSM)"
- Re: Is this a simple, robust, and maintainable design?: "sounds like a Finite-state machine. If you haven't coded a finite state machine in Perl before, this article may be helpful."

All these glowing endorsements have got me excited, yet I couldn't find any sample code in any of these nodes. So I was wondering if you could post some of the excellent FSM code you've implemented over the years? It would really help me to better understand FSMs.
