http://www.perlmonks.org?node_id=1020900


in reply to Removing Duplicates from a multiline entry

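I would suggest setting the input record separator ($/) to paragraph mode (the empty string) and taking the product ID from the beginning of each paragraph. In paragraph mode every read returns one blank-line-delimited block, so each product arrives as a single record.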
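To see what paragraph mode does in isolation, here is a minimal sketch. The sample text and the variable names ($text, @records) are made up for illustration; it reads from an in-memory filehandle so it runs without any input file.

```perl
use strict;
use warnings;

# Hypothetical sample: two records separated by a run of blank lines.
my $text = "Product 1\ncity = A\n\n\nProduct 2\ncity = B\n";

open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @records;
{
    local $/ = "";    # paragraph mode: one or more blank lines end a record
    while (my $para = <$fh>) {
        chomp $para;  # in paragraph mode, chomp removes all trailing newlines
        push @records, $para;
    }
}
close $fh;

printf "%d records\n", scalar @records;
print "[$_]\n" for @records;
```

Localizing $/ inside a bare block keeps the change from leaking into other reads in the same program.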

See the code below:
use strict;
use warnings;

local $/ = "";    # paragraph mode: read one blank-line-delimited block at a time

my %product;      # product id => { address => ... }

while (<DATA>) {
    # \d+ rather than \d, so that ids with more than one digit are captured whole
    if (/^Product\h+(\d+)/) {
        my $id = $1;
        # /m lets ^ match at the start of any line inside the paragraph
        my ($address) = /^fullStreet\h*=\h*(.+)/m;
        if (exists $product{$id}) {
            print "ID <$id> already exists. Address is <$product{$id}{address}>.\n";
            # do some other stuff
        }
        else {
            print;    # first time this id is seen: print the whole paragraph
        }
        $product{$id} = { address => $address };
    }
    else {
        warn "Invalid paragraph: <$_>\n";
    }
}

__END__
Product 1
------------------------------------------------------------------
storeId = 1001
phoneNumber = (111) 111-1111
availbilityCode = 1
stockStatus = Limited stock
distance = 9.12
city = some city
fullStreet = some address

Product 2
------------------------------------------------------------------
storeId = 2117
phoneNumber = (111) 111-1111
availbilityCode = 2
stockStatus = In stock
distance = 7.49
city = some city
fullStreet = some address

Product 3
------------------------------------------------------------------
storeId = 2123
phoneNumber = (111) 111-1111
availbilityCode = 1
stockStatus = Limited stock
distance = 8.83
city = some city
fullStreet = some address

Product 1
------------------------------------------------------------------
storeId = 1001
phoneNumber = (111) 111-1111
availbilityCode = 1
stockStatus = Limited stock
distance = 8.56
city = some city
fullStreet = some address