Re: Selecting HL7 Transactions

You have a number of issues here. I've included a fair amount of detail below but refer to perlre for the full story.

You're not actually showing a regex but just a fragment of one (I'll assume "/PV1\|1\|O\|(.*?\|){3}\|/"). I'm not trying to be pedantic but I can only respond to what you've written: for all I know, "PV1\|1\|O\|(.*?\|){3}\|" may be part of a larger regex. Also, I have no idea what modifiers, if any, you've used.
The "." in ".*?" matches any character including a pipe ("|") character which isn't what you want. (That's a slight oversimplification: it doesn't match a newline character unless you used the "s" modifier.) So, ".*?" would be better as "[^|]*" (zero or more characters that aren't pipe characters).
You don't anchor the regex so it could match anywhere in the string. To match at the beginning of the string you'll need to prepend "^" or "\A".
You've used capturing parentheses "( ... )" here. This won't break anything as it currently stands but could become an issue if you do want to capture fields later: "(?: ... )" (for clustering, not capturing) would be better.
Purely as a matter of style and personal taste, replacing the escaped pipe "\|" with the character class "[|]" may reduce what's been referred to as backslashitis and improve readability. Either is fine, it's up to you.

Putting all that together, you end up with a few options. Minimal changes would give: "/^PV1\|1\|O\|(?:[^|]*\|){3}\|/".
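
As a quick check of the points above, here is a minimal sketch comparing the fragment as posted with the revised version. It assumes the fragment is the whole regex with no modifiers; the two sample segments and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# The fragment as posted, assumed to be the whole regex with no modifiers.
my $posted  = qr/PV1\|1\|O\|(.*?\|){3}\|/;
# The revised version: anchored, clustering, and a field cannot contain a pipe.
my $revised = qr/^PV1\|1\|O\|(?:[^|]*\|){3}\|/;

# Hypothetical sample segments, invented for this sketch.
my $wanted   = 'PV1|1|O|F3|F4|F5||F7';      # the field after F5 is empty
my $unwanted = 'PV1|1|O|F3|F4|F5|F6||F8';   # the field after F5 holds "F6"

for my $line ($wanted, $unwanted) {
    printf "%-26s posted: %-3s revised: %s\n",
        $line,
        ($line =~ $posted  ? 'yes' : 'no'),
        ($line =~ $revised ? 'yes' : 'no');
}
```

The posted pattern accepts both lines because the third ".*?" is allowed to grow across a pipe (to "F5|F6") when that is the only way to reach the final "\|"; the revised pattern accepts only the first line.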

Having said all that, I'm wondering if splitting the lines on pipe characters might just be a whole lot easier in terms of general readability and future maintenance. Something along these lines:

my @fields = split /[|]/ => $line;
...
if ($fields[0] eq 'MSH' and $fields[8] eq 'ADT^A02') { ... }
...
if ($fields[0] eq 'PV1' and $fields[6] eq '') { ... }
...
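
A slightly fuller sketch of the same idea, under a couple of assumptions: the sample segments are invented, and the sub name and the labels it returns are only illustrative. It passes a limit of -1 to split because, by default, split drops trailing empty fields, so a segment ending in "|" would otherwise leave $fields[6] undefined rather than empty.

```perl
use strict;
use warnings;

# Classify one segment line. The labels returned are illustrative only.
sub classify_segment {
    my ($line) = @_;

    # A limit of -1 keeps trailing empty fields, which split drops by default.
    my @fields  = split /[|]/, $line, -1;
    my $segment = $fields[0] // '';

    # Guard with defined() so short segments do not trigger "uninitialized" warnings.
    if ($segment eq 'MSH') {
        return 'ADT^A02 message' if defined $fields[8] and $fields[8] eq 'ADT^A02';
    }
    elsif ($segment eq 'PV1') {
        return 'PV1 with empty field 6' if defined $fields[6] and $fields[6] eq '';
    }
    return 'other';
}

# Hypothetical sample segments, invented for this sketch.
my @samples = (
    'MSH|^~\&|APP|FAC|APP2|FAC2|201305011200||ADT^A02|123|P|2.3',
    'PV1|1|O|F3|F4|F5||F7',
    'PV1|1|O|F3|F4|F5|',
    'PV1|1|O|F3|F4|F5|F6|F7',
);

print classify_segment($_), ": $_\n" for @samples;
```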

-- Ken


Re^2: Selecting HL7 Transactions by BillDowns (Novice) on May 01, 2013 at 23:22 UTC
Thanks, but I guess I did not make it clear - this is a utility script that extracts transactions from an archive based on the regular expressions I give it at run time. That's all it does - extracts transactions to a file. `/PV1\|1\|O\|(.*?\|){3}\|/` was one of several regexes, each evaluated by itself. If all are true, the transaction is extracted to an output file. I know about anchors - PV1 segments are a ways into the transaction as I showed in the sample, so I could not use an anchor. The parentheses are used for the repeat factor. All my research on the internet indicates a multi-character pattern that needs to be repeated multiple times should be enclosed in parentheses. Is this not correct?
Re^3: Selecting HL7 Transactions by kcott (Archbishop) on May 02, 2013 at 00:20 UTC
"The parentheses are used for the repeat factor. All my research on the internet indicates a multi-character pattern that needs to be repeated multiple times should be enclosed in parentheses. Is this not correct?"

Here's a test showing clustering and capturing. Both match as expected. Capturing also sets `$1`.

$ perl -Mstrict -Mwarnings -E '
    my $re1 = qr{PV1\|1\|O\|(?:[^|]*\|){3}\|};
    my $re2 = qr{PV1\|1\|O\|([^|]*\|){3}\|};

    my $x = q{PV1|1|O|F3|F4|F5|F6|F7};
    my $y = q{PV1|1|O|F3|F4|F5||F7};

    say "------- Clustering -------";
    say "Match in \$x" if $x =~ /$re1/;
    say $1 if $1;
    say "Match in \$y" if $y =~ /$re1/;
    say $1 if $1;

    say "------- Capturing -------";
    say "Match in \$x" if $x =~ /$re2/;
    say $1 if $1;
    say "Match in \$y" if $y =~ /$re2/;
    say $1 if $1;
'
------- Clustering -------
Match in $y
------- Capturing -------
Match in $y
F5|

-- Ken
Re^4: Selecting HL7 Transactions by BillDowns (Novice) on May 02, 2013 at 01:02 UTC
On a different note, since you seem quite knowledgeable about Perl, and again referring to google searches, non-greedy matching is usually defined in a manner that `(.*?\|)` and `([^|]*\|)` should be equivalent. And I think so, too. Why are they not?
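
One way to see where the two differ is a minimal sketch on a made-up three-field string. Non-greedy means "prefer the shortest match", not "never cross a pipe": if the rest of the regex cannot match otherwise, `.*?` is allowed to grow past a pipe, whereas `[^|]*` can never contain one.

```perl
use strict;
use warnings;

my $line = 'a|b|c';   # hypothetical data, invented for this sketch

# In both patterns the group must be followed directly by "c".
my ($lazy)    = $line =~ /^(.*?\|)c/;     # .*? may expand across the first pipe
my ($negated) = $line =~ /^([^|]*\|)c/;   # [^|]* stops at the first pipe, so no match

print 'lazy:    ', (defined($lazy)    ? $lazy    : '(no match)'), "\n";
print 'negated: ', (defined($negated) ? $negated : '(no match)'), "\n";
```

Here the lazy version captures "a|b|", while the negated character class fails to match at all.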
Re^5: Selecting HL7 Transactions by kcott (Archbishop) on May 02, 2013 at 02:41 UTC
Re^6: Selecting HL7 Transactions by BillDowns (Novice) on May 02, 2013 at 03:07 UTC
Some notes below your chosen depth have not been shown here
Re^4: Selecting HL7 Transactions by Anonymous Monk on May 02, 2013 at 00:46 UTC
Thanks for that info - I did not know about clustering. It does not show up in the first half-dozen google searches on regular expressions. In this case, it doesn't matter - the utility script doesn't care about any capturing. It is simply qualifying.
Re^5: Selecting HL7 Transactions by BillDowns (Novice) on May 02, 2013 at 00:48 UTC
