I would suggest setting the input record separator ($/) to paragraph mode (the empty string) and taking the product id from the beginning of every paragraph.
Some explanation (if needed): a text file is really just one long string of characters, e.g.:
line 1\nline 2\nline 3\n
By default, perl reads a file line by line, where a line is all the characters up to and including a newline (\n). A paragraph, however, ends with two newlines (\n\n):
line1\nline2\n\nline1\nline2\n
The double newline is what creates the blank line. Try it: type some text and at the end of the line hit RETURN, then hit RETURN again--you'll get a paragraph. Each time you hit RETURN when you are typing some text, a newline is entered in your text.
Conveniently, perl allows you to change the definition of what a line is. You can tell perl that you want a line to consist of all the characters up to and including two consecutive newlines. That is known as paragraph mode, and you set paragraph mode by setting $/ to the empty string (yeah, "\n\n" looks like the more obvious choice, and it does work, but the empty string additionally treats any run of blank lines as a single separator).
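Here is a minimal sketch of that suggestion. The sample data, the product_ids name, and the assumption that the product id is the first whitespace-delimited token of each paragraph are all made up for illustration, so adjust the regex to whatever your real file looks like:

```perl
use strict;
use warnings;

# Hypothetical sample data: each paragraph starts with a made-up product id.
# Note the three newlines before P300: paragraph mode treats any run of
# blank lines as one separator.
my $text = "P100 Widget\nprice: 5\n\nP200 Gadget\nprice: 7\n\n\nP300 Gizmo\nprice: 9\n";

# Return the first token of every paragraph in the given string.
sub product_ids {
    my ($input) = @_;
    # Open the string as an in-memory file so the example is self-contained.
    open my $fh, '<', \$input or die "Cannot open in-memory file: $!";
    local $/ = '';    # paragraph mode; the old value comes back when the sub returns
    my @ids;
    while (my $para = <$fh>) {
        chomp $para;                      # in paragraph mode this strips all trailing newlines
        my ($id) = $para =~ /\A(\S+)/;    # assumption: the id is the first token
        push @ids, $id if defined $id;
    }
    close $fh;
    return @ids;
}

print join(',', product_ids($text)), "\n";
```

Using local $/ inside the sub keeps the change from leaking into the rest of the program, which matters because $/ is a global variable.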
The neat thing about being able to set the definition of a line is that you can also read chunks of files that look like this:
aaaaa
bbbb
ccccc
..
ddddd
eeeee
fffffff
ggggg
..
For instance:
use strict;
use warnings;
use 5.012;    # enables say
# A record now ends at ".." followed by a newline.
$/ = "..\n";
while (my $line = <DATA>) {
    say '-' x 20;
    print $line;
    say '=' x 20;
}
__DATA__
aaaaa
bbbb
ccccc
..
ddddd
eeeee
fffffff
ggggg
..
--output:--
--------------------
aaaaa
bbbb
ccccc
..
====================
--------------------
ddddd
eeeee
fffffff
ggggg
..
====================
The other common mode besides paragraph mode is slurp mode: if you set $/ to undef, perl reads the whole file into a single string.
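A small sketch of slurp mode, in case it helps. The slurp sub and the sample string are invented for the example, and it reads from an in-memory filehandle only so that it runs without a real file:

```perl
use strict;
use warnings;

# Read everything remaining on a filehandle into one string.
sub slurp {
    my ($fh) = @_;
    local $/;    # undef for the duration of this sub only: slurp mode
    my $content = <$fh>;
    return $content;
}

# Hypothetical data standing in for a file on disk.
my $data = "line 1\nline 2\n\nline 3\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $all = slurp($fh);
close $fh;

# The blank line and every newline are still there; nothing was split.
print $all;
```

Writing local $/; with no assignment is the usual idiom: local resets the variable to undef and restores the previous separator when the enclosing block exits.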